So your client is on Squarespace, and all of their files were uploaded there. Squarespace, however, doesn’t provide FTP, so you can’t just go in and grab them all. Only images are listed in the /sitemap.xml file, so we need another way to get a list of all of the PDF files that have been uploaded to yoursite.ca. Fret not. There’s a solution.
Google-Fu Those Squarespace Files
Okay, so we all know Google sees pretty much everything. Fortunately, in spite of Squarespace using a CDN, Google picks up file uploads on Squarespace sites. They typically look something like:
http://yoursite.ca/s/the_file_name.pdf
All we need to do is perform an advanced Google search – type this into Google:
site:yoursite.ca filetype:pdf
It goes without saying that you should replace yoursite.ca with the domain you’re grabbing files from.
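If you end up doing this for more than one client site, the query can be built in code instead of typed by hand. Here is a minimal sketch in Python, assuming the usual https://www.google.com/search?q=... URL form. The function name build_file_search_url is made up for this example.

```python
from urllib.parse import urlencode


def build_file_search_url(domain, filetype="pdf"):
    """Return a Google search URL for one file type on one site."""
    query = f"site:{domain} filetype:{filetype}"
    # urlencode percent-escapes the colons and turns the space into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})


print(build_file_search_url("yoursite.ca"))
# prints https://www.google.com/search?q=site%3Ayoursite.ca+filetype%3Apdf
```

Paste the printed URL into your browser and you land on the same results page as typing the query yourself.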
Modify Search Results for Batch Download
Next up, we need to change the search settings to show more than the default 10 results per page. In the top right corner of the search results you should see a “Search Settings” button (it may be behind a gear icon). Once in there, turn off instant search and enable 100 search results per page.
Bring in the Scraper and Download Squarespace Files
Now we need a way to automatically download all of those PDF files from the Squarespace URLs that Google found. I installed Download Master on Chrome (though there are other download managers for Chrome/Firefox/Safari out there). It came up first while Googling and it seemed to work just fine. Run it on the results page and set it to download only PDFs.
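If you’d rather not install a browser extension, a short script can do roughly the same job. The sketch below is not what Download Master does internally. It assumes you have saved the results page from your browser as an HTML file, and that result links appear either as direct URLs or wrapped as /url?q=<target>, which is an assumption about Google’s markup that may not hold. The names LinkCollector, extract_pdf_links and download_all are invented for this example.

```python
from html.parser import HTMLParser
from pathlib import Path, PurePosixPath
from urllib.parse import parse_qs, urlparse
from urllib.request import urlretrieve


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_pdf_links(html, domain):
    """Return unique PDF URLs hosted on `domain`, in page order."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        parsed = urlparse(href)
        # Assumption: some result links are wrapped as /url?q=<target>.
        if parsed.path == "/url":
            href = parse_qs(parsed.query).get("q", [""])[0]
            parsed = urlparse(href)
        host = parsed.hostname or ""
        on_site = host == domain or host.endswith("." + domain)
        is_pdf = parsed.path.lower().endswith(".pdf")
        if on_site and is_pdf and href not in found:
            found.append(href)
    return found


def download_all(urls, dest_dir="downloads"):
    """Save each URL into dest_dir, named after the last path segment."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for url in urls:
        name = PurePosixPath(urlparse(url).path).name
        urlretrieve(url, str(dest / name))
```

With the results page saved as, say, results.html, you would read the file into a string, pass it to extract_pdf_links along with "yoursite.ca", and hand the returned list to download_all.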
Marvel at your hard work
And there you have it: all of the files you were after, sitting in your downloads folder. Keep in mind that if more than 100 files are found, you may have to run Download Master on each page of Google’s search results. But I’ll take running that scraper a few times over manually downloading each file any day of the week!
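If you collect the links from several results pages yourself, the same file can show up more than once. A tiny sketch for merging per-page URL lists while dropping repeats (merge_unique is a made-up helper name):

```python
from itertools import chain


def merge_unique(*pages):
    """Merge per-page URL lists, keeping the first-seen order."""
    # dict keys preserve insertion order, so repeats are dropped stably.
    return list(dict.fromkeys(chain.from_iterable(pages)))
```

Pass it one list per results page and download the merged list once.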