Recently, I needed to publish a slightly modified copy of a website, which meant cloning it and making a few minor changes. Luckily, that’s easy to do with wget. Here’s how I did it.
1. Install wget
I’m on a Mac, so I installed wget with Homebrew:
brew install wget
2. Download site
I wanted to download this small website. I used this command:
wget -p -r https://events.govexec.com/qualys-cyber-risk-conference/
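- The -p flag (--page-requisites) downloads everything needed to display each page, such as images and stylesheets.
- The -r flag (--recursive) follows the links on the starting page and downloads the linked pages too, up to five levels deep by default.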
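A possible variation on that command, as a sketch rather than what I actually ran: it spells out -p and -r in their long forms and adds three optional wget flags (--convert-links, --adjust-extension, --no-parent) that can help when the copy is meant to be browsed or republished. The RUN variable is a hypothetical switch added here so the command can be previewed before anything is downloaded.

```shell
url="https://events.govexec.com/qualys-cyber-risk-conference/"

# Long-form equivalents of -r and -p, plus three optional extras:
#   --convert-links     rewrite links in the saved pages so they work locally
#   --adjust-extension  add .html to saved HTML pages whose names lack it
#   --no-parent         never climb above the starting directory
set -- --recursive --page-requisites --convert-links --adjust-extension --no-parent "$url"

# Build a preview of the command line.
cmd_preview="wget $*"
echo "$cmd_preview"

# RUN is a hypothetical switch: set RUN=1 to actually start the download.
if [ "${RUN:-0}" = "1" ]; then
  wget "$@"
fi
```

Whether the extra flags are worth it depends on the site. For a small site like this one, the plain -p -r command may be all that is needed.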
3. Search and replace
Since the download produced a bunch of HTML files, the easiest way to change an element that appears on multiple pages was a search and replace across all of them. In Visual Studio Code, with regex mode turned on in the search panel, a multi-line regex can match an entire HTML block for a particular tag. Here are some example regexes:
<footer(.|\n)*?</footer>
<script(.|\n)*?</script>
<a class="popup(.|\n)*?</a>
Note: these regexes only work if the tags don’t have any nested tags with the same name, because the lazy match stops at the first closing tag it finds.