This is a simple script that scrapes the website using `wget` in order to maintain a static fallback. The script requires pretty permalinks unless it is used on a single-page site.
The script runs three download steps (sketched after the list):

- Download all files on the website using `wget`, waiting 1 second between requests. The script ignores any file with a query parameter unless that parameter is `?ver`.
- Download the website's 404 page. This fails if the website has a page at `404.html` for some reason.
- Download any extra URLs specified in `extra-urls.txt`.
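
A minimal sketch of what this stage might look like. `SITE_URL`, the `./mirror` output directory, and the use of `wget`'s PCRE regex support for the query-string filter are all assumptions for illustration, not details confirmed by the description above:

```sh
#!/usr/bin/env sh
# SITE_URL and ./mirror are placeholders, not names from the real script.
SITE_URL="https://example.org"

# Mirror the whole site, waiting 1 second between requests.
# The reject pattern drops any URL whose query string is not ?ver...;
# the lookahead needs a wget built with PCRE support (--regex-type=pcre).
wget --mirror --page-requisites --adjust-extension --convert-links \
     --no-host-directories --directory-prefix=./mirror \
     --wait=1 \
     --regex-type=pcre --reject-regex='\?(?!ver)' \
     "$SITE_URL"

# Fetch the 404 page. The server answers 404, so wget exits non-zero;
# --content-on-error keeps the body and the failure is tolerated.
wget --content-on-error --output-document=./mirror/404.html \
     "$SITE_URL/404.html" || true

# Fetch any extra URLs listed one per line in extra-urls.txt.
wget --adjust-extension --wait=1 \
     --no-host-directories --directory-prefix=./mirror \
     --input-file=extra-urls.txt
```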
The download is followed by three post-processing steps (a sketch follows the list):

- Remove all query parameters (only `?ver` remains after the download) from the downloaded files using `.github/bin/cleanup-querystrings.py`.
- Use `sed` to replace the website's URL with the GitHub Pages URL in all files.
- Minify all HTML files using `minify`.
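
A rough sketch of this stage, continuing the placeholders above. The directory argument to `cleanup-querystrings.py` and the `PAGES_URL` variable are assumptions about the script's interface:

```sh
#!/usr/bin/env sh
# Placeholders; the real script defines its own names for these.
SITE_URL="https://example.org"
PAGES_URL="https://ftcunion.github.io"

# Strip the leftover ?ver query strings from the downloaded file names.
# (Passing the mirror directory is an assumption about the helper's CLI.)
python3 .github/bin/cleanup-querystrings.py ./mirror

# Rewrite absolute links so they point at the GitHub Pages copy.
# Assumes GNU sed; the | delimiter avoids escaping slashes in the URLs.
find ./mirror -type f \
  -exec sed -i "s|${SITE_URL}|${PAGES_URL}|g" {} +

# Minify each HTML file with tdewolff/minify, going through a temporary
# file so only the basic `minify -o OUT IN` invocation is relied on.
find ./mirror -type f -name '*.html' | while read -r f; do
  minify -o "$f.tmp" "$f" && mv "$f.tmp" "$f"
done
```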
After these steps, the files are deployed to GitHub Pages at ftcunion.github.io.
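
The deployment mechanism itself is not described here. Purely as an illustration, one way to publish the mirror is to push it to the repository behind ftcunion.github.io; the repository URL, branch, commit identity, and authentication are all assumptions, and a GitHub Actions Pages deployment would serve the same purpose:

```sh
#!/usr/bin/env sh
# Illustrative only: commit the mirrored files and push them to the
# Pages repository. Branch name and credentials handling are assumptions.
cd ./mirror
git init -q
git add -A
git -c user.name="fallback-bot" -c user.email="fallback-bot@example.org" \
    commit -q -m "Update static fallback"
git push --force https://github.com/ftcunion/ftcunion.github.io.git HEAD:main
```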