AWS S3, CloudFront, static site updates: Remember to invalidate cache!
tl;dr: If you're hosting a static site on AWS S3 behind CloudFront, objects stay in the CloudFront edge caches for 24 hours by default before they expire and get refreshed. That's why you may see changes in your S3 bucket, but not at your domain: CloudFront is still serving up a cached version.
If you need the update to happen sooner, you need to run an invalidation in CloudFront on the objects that you've updated. Keep in mind there may be a small charge for invalidations once you pass a free threshold, which was 1000 paths per month at the time of writing.
I personally don’t have a great understanding of cloud, so I was baffled when I updated the About page on this blog in my S3 bucket but failed to see the change. What’s more, the changes showed up at /about/index.html, but not at /about/. What gives?
A little context helps. What CloudFront actually does is store a cached version of your website in edge caches, which are servers spread around the world. This is why CloudFront helps with load times. If your S3 bucket is in Northern Virginia, but your user is in Europe, chances are that there is a CloudFront edge cache close to them which can serve up a copy of your site more quickly. So even though I had updated my S3 bucket, the edge cache closest to me was still serving the old cached result for /about/.
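The /about/ versus /about/index.html puzzle makes more sense once you remember that CloudFront caches by URL path, so those two are separate cache entries even though they come from the same file in S3. Presumably one of them was already cached and the other was not. Here is a minimal sketch of a helper that lists both forms for any index.html you have changed. The function name `invalidation_paths` is my own invention, not part of any AWS tool.

```shell
# Hypothetical helper: print the CloudFront paths that correspond to each
# site-relative file name given as an argument.
invalidation_paths() {
  for file in "$@"; do
    # Normalise to exactly one leading slash.
    path="/${file#/}"
    printf '%s\n' "$path"
    case "$path" in
      */index.html)
        # Also emit the directory-style URL by stripping "index.html".
        printf '%s\n' "${path%index.html}"
        ;;
    esac
  done
}

invalidation_paths about/index.html css/main.css
# prints:
#   /about/index.html
#   /about/
#   /css/main.css
```

The output is one path per line, which is the same set of paths you would type into an invalidation.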
But that raises the question: how often should those edge caches be updated? There’s obviously a tradeoff between serving up stale data to users vs. constantly pushing updates. By default, an object expires from an edge cache in 24 hours.
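If you are curious whether a page is coming from an edge cache, the response headers will tell you. CloudFront adds an X-Cache header (for example "Hit from cloudfront" or "Miss from cloudfront"), and cached responses carry an Age header in seconds. With the default expiration, Age should stay under 86400. Below is a small sketch that picks those two headers out of a response. `cache_status` is a name I made up, and example.com stands in for your own domain.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# X-Cache and Age values if they are present.
cache_status() {
  # curl emits CRLF line endings, so strip the carriage returns first.
  tr -d '\r' | awk -F': ' '
    tolower($1) == "x-cache" { print "x-cache: " $2 }
    tolower($1) == "age"     { print "age: " $2 " seconds" }
  '
}

# Usage (example.com is a placeholder):
#   curl -sI https://example.com/about/ | cache_status
```

Header names are compared case-insensitively because they arrive in lower case over HTTP/2.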
If you don’t want to wait that long, there are several ways to get an updated object out early, so that requests for it go back to your S3 bucket for the new version. One way is to rename the file, since a new name means a new cache entry. Another way is to set caching behaviour (how long CloudFront keeps a copy) for a specific file or set of files. The official AWS documentation goes into much more detail about all of these.
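To make the "set caching behaviour" option a bit more concrete: one approach is to upload files with a Cache-Control header, which CloudFront takes into account as long as the distribution's minimum and maximum TTL settings allow it. The sketch below picks a header value by file extension. The function name and the max-age numbers are illustrative assumptions on my part, not recommendations.

```shell
# Hypothetical helper: choose a Cache-Control value by file extension.
# The max-age values are made-up examples.
cache_control_for() {
  case "$1" in
    *.html)     echo "max-age=300" ;;    # pages: 5 minutes
    *.css|*.js) echo "max-age=86400" ;;  # assets: 1 day
    *)          echo "max-age=3600" ;;   # everything else: 1 hour
  esac
}

# Usage, with "mybucket" standing in for your bucket name:
#   aws s3 cp _site/about/index.html s3://mybucket/about/index.html \
#     --cache-control "$(cache_control_for about/index.html)"
```

Short lifetimes for pages and longer ones for assets is a common split, but the right numbers depend on how often your site changes.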
But if your static site doesn’t get updated too often, you probably just want to push the darn changes out and be done with it. By far the easiest thing to do in a pinch is to just go to the CloudFront console and run an invalidation for the files that you’ve updated.
Toss in the paths to your updated files, let your invalidation run for a few seconds, refresh, and your changes should now be visible at your domain.
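The same thing can be done from the command line if you would rather not click through the console. This is a rough sketch, assuming the AWS CLI is installed and configured. `invalidate_and_wait` is a wrapper name I made up, and BLABLA123 is a placeholder distribution ID.

```shell
# Hypothetical wrapper: invalidate the given paths, then wait until
# CloudFront reports the invalidation as completed.
# Usage: invalidate_and_wait DISTRIBUTION_ID PATH...
invalidate_and_wait() {
  dist_id="$1"
  shift
  # --query/--output pull just the invalidation ID out of the JSON response.
  inv_id="$(aws cloudfront create-invalidation \
    --distribution-id "$dist_id" \
    --paths "$@" \
    --query 'Invalidation.Id' --output text)" || return 1
  # Blocks until the invalidation has finished.
  aws cloudfront wait invalidation-completed \
    --distribution-id "$dist_id" --id "$inv_id"
}

# Example (BLABLA123 is a placeholder):
#   invalidate_and_wait BLABLA123 "/about/" "/about/index.html"
```

Quote any path containing a wildcard, such as "/*", so the shell does not expand it. As far as I know, a wildcard path counts as a single path toward the monthly free allowance.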
Seems like too much trouble? Here’s a little bash script that I use to do everything at once:
    # Build the site, do dryrun, get user confirmation, do S3 sync of new files,
    # and create a Cloudfront cache invalidation.
    # Remember that to assume a Role (i.e. "some_admin_user"), a Trust
    # relationship for that user must be added under the Role in IAM console.
    # Usage:
    # ./deploy.sh

    JEKYLL_ENV=production bundle exec jekyll build

    echo "-----DRYRUN-----"
    aws s3 sync _site s3://mybucket --profile some_admin_user --dryrun

    read -p "Deploy files? [y/N]" deploy
    if [[ "$deploy" =~ ^([yY][eE][sS]|[yY])+$ ]] ; then
      # sync files
      aws s3 sync _site s3://mybucket --profile some_admin_user
      # Run an invalidation to push changes to all CloudFront edge caches
      aws cloudfront create-invalidation --distribution-id BLABLA123 --paths "/*"
      echo "Done."
    fi