Automatically Deploying Website from Git to AWS S3

I am a big fan of Amazon AWS – this blog has been running on it for a few years now. Since moving to AWS S3 (for storage) and CloudFront (as a Content Delivery Network) to host static websites, such as my homepage, I have been trying to work out how to get them to deploy automatically when I update the Git repository I use to manage the source code. I looked into it in some detail last year and concluded that AWS CodePipeline would get me close, but would require a workaround as it did not support deploying to S3. In the end I decided that a custom AWS Lambda function was needed.

Lambda is a service that hosts your code, keeping it ready to run when triggered, without you needing to manage a server. You are only billed for the time your code is running (above a free threshold), so it is perfect for small, infrequent jobs, such as deploying changes to a website or even using it with Alexa for home automation. It seemed like an interesting area to explore and gain some knowledge in, but I think I went in at the deep end, trying to develop a complex function in an unfamiliar language (Node.js) on an unfamiliar platform. Then other tasks popped up and it fell by the wayside.

Then earlier this year I saw an announcement from AWS that CodePipeline would now support deploying to S3 and thought my problem had been solved – although I must admit I was a bit disappointed not to have the challenge of coding it myself. Fast forward a few months and I had the opportunity to set up the CodePipeline, which was very easy. However, it only supported copying the code from the Git repository to the S3 bucket. It did not refresh CloudFront, so my problem remained unsolved.

The CodePipeline did allow for an extra step to be added at the end of the process, which could be a Lambda function, so I went off in search of a Lambda function to trigger an invalidation on CloudFront when an S3 bucket has been updated. The first result I found was a blog post by Miguel Ángel Nieto, which explained the process well, but was designed to work for one S3 bucket and one CloudFront distribution. As I have multiple websites, I wanted a solution that I could deploy once and use for all of them, so my search continued. Next I came across a blog post by Yago Nobre, which looked to do exactly what I needed – except that I could not get the source code to work. I tried debugging it for a while, but was not making much progress. It did give me an understanding of how to link a bucket to a CloudFront distribution, trigger the Lambda function from the bucket, and use the Boto3 AWS SDK for Python to extract the bucket name and matching CloudFront distribution from the triggering event – all the things that were lacking from the first blog post and sample code. Fortunately both were written in Python, using the Boto3 AWS SDK, so I was able to start work on merging them.
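The finished code ended up on GitHub (more on that below), but as a rough sketch, the merged approach looks something like this. It is my own simplified illustration rather than the published function, and it assumes each distribution's origin domain name begins with the name of the bucket that triggered the event:

```python
import time
import boto3

def lambda_handler(event, context):
    # The S3 event notification tells us which bucket was updated
    bucket = event['Records'][0]['s3']['bucket']['name']

    cloudfront = boto3.client('cloudfront')

    # Look through all CloudFront distributions for one whose origin
    # points at the triggering bucket
    paginator = cloudfront.get_paginator('list_distributions')
    for page in paginator.paginate():
        for distribution in page['DistributionList'].get('Items', []):
            for origin in distribution['Origins']['Items']:
                if origin['DomainName'].startswith(bucket + '.'):
                    # Invalidate everything so CloudFront serves the new files
                    cloudfront.create_invalidation(
                        DistributionId=distribution['Id'],
                        InvalidationBatch={
                            'Paths': {'Quantity': 1, 'Items': ['/*']},
                            'CallerReference': str(time.time()),
                        },
                    )
                    return
```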

I was not terribly familiar with the Python language, to the point of having to search for how to write comments in the code, but I saw it as a good learning experience. What I actually found harder than the new-to-me language was coding in the Lambda Management Console, which I had to do because both the inputs and outputs of the function were other AWS services, meaning I could not develop locally on my Mac. Discovering the CloudWatch logs console did make things easier, as I could use the print() function to check the values of variables at various stages of the function's execution and work out where problems were. The comprehensive AWS documentation, particularly the Python code samples for S3, was also helpful. Another slight difficulty was the short delay between the bucket being updated and the Lambda function triggering – it was only a few minutes, but enough to add some confusion to the process.
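For anyone attempting something similar: anything the handler prints ends up in the function's CloudWatch log stream, so even crude debugging like the sketch below is enough to see what the incoming S3 event actually contains.

```python
import json

def lambda_handler(event, context):
    # print() output appears in the function's CloudWatch log group,
    # which is the easiest way to inspect the S3 event that triggered the run
    print(json.dumps(event))
    bucket = event['Records'][0]['s3']['bucket']['name']
    print('Triggered by bucket:', bucket)
```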

Eventually I got to a point where adding or removing a file in an S3 bucket would trigger an invalidation in the correct CloudFront distribution. In the end I did not need to link it to the end of the CodePipeline, as the Lambda function is triggered by the update to the S3 bucket (which itself is done by CodePipeline). All that was left to do was to tidy up the code, write some documentation, and share it on GitHub for anyone to use or modify. I have kept this post more about the background to the project; the code and instructions to use it are all on GitHub.

This code probably only saves a few minutes each time I update one of my websites, and may take a number of years to cancel out the time I spent working on it – even more if I factor in the time spent on the original version prior to the CodePipeline to S3 announcement – but I find coding so much more rewarding when I am solving an actual problem. I also feel like I have levelled up as a geek by publishing my first repository on GitHub. Now, with this little project out of the way, I can start work on a new server and WordPress theme for this blog, which was one of my goals for 2019.

Google Authenticator – How to Backup for Moving to a New Device

Recently I’ve had to start using two-factor authentication (2FA), both for my AWS account and my Bitcoin wallets. There seemed to be two main options for apps to handle this: Google Authenticator and Authy. Initially Authy looked like a good bet, as it can sync across multiple devices, including smart watches, but it turns out this convenience means the security is weakened – to the point that Coinbase advised users not to use it! Google Authenticator goes the other way: it is extremely secure, but if you lose or reset your device the settings, and potentially access to your accounts, are lost.

The only way to avoid this situation is to make a backup of your access codes at the time you add them to Authenticator. You can do this either by writing down the seed key or by taking a screenshot of the QR code. It is not advisable to keep these backups with your phone or readily accessible on an online computer, as they are effectively keys to your accounts. I prefer to print off a couple of copies, write – with a pen – which account each QR code is for, and file them away separately. I also keep another copy on an encrypted memory stick. If you are using 2FA to access an online account and have not backed up your access codes – you should do it now!!!

When you get a new device, or wipe your existing one, it is just a case of re-scanning the QR code into Google Authenticator from your backup. You can test your backups by scanning them into Authenticator again, either on your existing device or on a separate one – they will give the same six-digit code as the original. To check that nothing was tied to my iPhone, I also installed Authenticator on my old iPhone and was able to log into my AWS account – AWS is ideal for testing 2FA, as you can create a dummy account with 2FA enabled without running the risk of losing access to your main account.
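That behaviour is not an accident: the six-digit code is derived purely from the seed key and the current time, so any device holding the same seed produces the same code. A quick illustration using the third-party pyotp Python library and a made-up seed (not something you need to run to use your backups):

```python
import pyotp

# Made-up base32 seed key - in practice this is what the QR code encodes
seed = 'JBSWY3DPEHPK3PXP'

# Two "devices" restored from the same backup seed...
device_one = pyotp.TOTP(seed)
device_two = pyotp.TOTP(seed)

# ...always agree on the current six-digit code
print(device_one.now(), device_two.now())
```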

WordPress Backups Using UpdraftPlus and Amazon S3

I had a bit of a disaster the other day – I went to link to a blog post from a few months ago and it wasn’t there! I remember writing it, and knew it had been published, because I remembered some of the comments from when it appeared on my Facebook profile. I then remembered that there had been some funny goings-on with the WordPress Mac app: I’d had a duplicate post and deleted it manually. However, it now seems the original had been deleted along with the duplicate.

Of course, it was at this point that I realised my latest backup was from a couple of months before the post, and I couldn’t recover it from anywhere. I was particularly annoyed at myself because I have a thorough backup routine for my Macs, and especially my photography work, yet virtually nothing for my blog. However, it was the kick up the backside I needed to sort out a decent backup routine for the blog!

Given that I was the weak link when it came to backing up my blog, I wanted something automatic that would run regularly and email me when it had completed. As with most things WordPress, there seemed to be loads of plugins available, most of them paid services. In my research I’d read good things about UpdraftPlus, so I was pleased to find its free option, which is more than powerful enough for a small blog like mine.

To see if UpdraftPlus lived up to the hype, I downloaded it onto my WordPress development environment (Chassis running on my iMac) and had a play. Looking at the list of remote storage services, Amazon S3 was the obvious choice, as I already use Amazon Web Services to host my blog. Knowing the basics of cyber security, I only wanted UpdraftPlus to have minimal access to AWS, but I had got myself lost in a maze of IAM, S3 buckets, users, groups and permissions. I was on the right track, but this post on the UpdraftPlus blog told me exactly what I needed to do. The IAM Policy Simulator on AWS was also a huge help in making sure my policies were both written and applied correctly. I went for the maximum security option, which also gave me a chance to delve into the workings of S3, setting up rules to archive and then delete the data after set periods of time.
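The UpdraftPlus post has the exact policy to use, but the general shape of the setup is something like the sketch below: a dedicated IAM user locked down to a single backup bucket, plus a lifecycle rule to archive and then expire old backups. The bucket name, user name, actions and retention periods here are all illustrative, not the real values from my setup.

```python
import json
import boto3

# Hypothetical names - substitute your own backup bucket and IAM user
BUCKET = 'example-wordpress-backups'
IAM_USER = 'updraftplus-backup'

iam = boto3.client('iam')
s3 = boto3.client('s3')

# Restrict the backup user to just this one bucket
policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Action': ['s3:ListBucket', 's3:GetObject', 's3:PutObject', 's3:DeleteObject'],
        'Resource': [f'arn:aws:s3:::{BUCKET}', f'arn:aws:s3:::{BUCKET}/*'],
    }]
}
iam.put_user_policy(
    UserName=IAM_USER,
    PolicyName='updraftplus-backup-bucket-only',
    PolicyDocument=json.dumps(policy),
)

# Archive backups to Glacier after 30 days and delete them after a year
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'archive-then-expire',
            'Filter': {'Prefix': ''},
            'Status': 'Enabled',
            'Transitions': [{'Days': 30, 'StorageClass': 'GLACIER'}],
            'Expiration': {'Days': 365},
        }]
    },
)
```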

Once deployed and tested on my development environment, it only took a matter of minutes to get it working on my live blog, giving me regular, automated backups. Now the only task left to do is to rewrite the post that got lost…