How I Moved From WordPress to S3

I recently took the plunge and moved to a static site hosted on AWS S3, so I thought I would write a post outlining the high-level process for future reference. There are quite a few blogs on the interwebs covering this process, but if this helps someone else too then it’s a win-win. If you are curious as to WHY I migrated, you can find a short bit about that in this post. I purposely made this one long post covering the entire process, rather than breaking it into smaller ones, so that everything is in the same place and easy to find. Use the table of contents to help navigate. If you happen to be reading this and have a question, please contact me and I’ll be as forthcoming as I can. Let’s jump in.

Choosing a Static Platform

I started this process knowing I wanted to continue diving into AWS. That meant using their file storage platform, Simple Storage Service (S3), which supports hosting static websites. I also knew I didn’t want to build everything entirely from scratch, so I started searching for a good static site builder. I already had some experience with Jekyll on GitHub, but wasn’t sure I wanted to use it for this site. I had a few criteria I wanted supported out of the box, such as the following:

  • Contact form
  • Categories
  • Static pages
  • Blog focused
  • Social icons
  • Search
  • Responsive
  • About section
  • Comments on posts
  • RSS
  • Tag cloud
  • Featured images

While most builders don’t cover all of those items, some do get close enough. Ultimately, I decided on Hugo with Jekyll still being a close second.

Pulling Data from WordPress

Once I knew what platform and builder I was going to use, I could start exporting my data from WordPress and importing it into Hugo. Fortunately, Hugo has some great instructions here. The two native plugins didn’t work well for me, so I went with the other suggested method: exporting to Jekyll and then converting the data to Hugo’s format (a rough sketch of that route follows). Either way is relatively simple if you follow their steps. I imagine this will change over time, so I’m not going to offer more detail.
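
For reference, here is a minimal sketch of the Jekyll-to-Hugo route. The paths are hypothetical; this assumes you already exported your WordPress content with the Jekyll exporter plugin into ~/jekyll-export.

# Convert a Jekyll site (exported from WordPress) into a new Hugo site.
# ~/jekyll-export and ~/blog are hypothetical paths; adjust for your setup.
hugo import jekyll ~/jekyll-export ~/blog

The importer creates the Hugo directory structure and converts posts into the content/ tree, though front matter and shortcodes usually still need some hand-editing afterward.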

Setting up Hugo

I then followed Hugo’s instructions to install it and get started. The install and setup went smoothly on my PC (a sketch follows). Importing the data was just a matter of dragging and dropping the files into Hugo’s folder structure.
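
As a minimal sketch, here is roughly what the setup looks like on Windows using the Chocolatey package manager (one of several install options; the site name "blog" is hypothetical):

# Install Hugo via Chocolatey.
choco install hugo -confirm

# Scaffold a new site; this creates config.toml plus content/, static/, themes/, etc.
hugo new site blog

After that, the imported content files just get dropped into the content/ directory.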

Selecting an Editor

With Hugo installed and my data imported, I wanted a better editor. One of my bigger complaints with WordPress was its built-in editor. I had to log in to my website to use it, and much of the time the formatting did not align with my theme and what visitors actually saw. This made it challenging to create a consistent feel across the site.

The new editor needed to provide syntax help and highlighting, be usable offline, and work with Git. Two of the most prominent editors in this space are Microsoft’s VS Code and GitHub’s Atom. I have used Atom before, so I thought I’d try VS Code this time. So far I am pleased, but either one should fulfill most of your requirements.

Editing From Anywhere

Another capability I really wanted was to be able to edit my blog easily from anywhere. To do this I’m using CodeHub and MWeb on the iPad, along with FastHub on Android. Both sync with GitHub (I’ll talk more about that below) and allow me to edit easily while I’m on the go. I find this very convenient. In fact, I’m editing this very post on the iPad while sitting on the couch, on my phone at lunch, and at my PC for screenshots. The only drawback is that I can’t preview my edits as easily, but that’s why a dev site/bucket is useful!

Picking a Theme

Using VS Code I found I could view and edit my files and even build my site using Hugo right from its GUI. Of course, the site won’t build properly without a theme. One of the great things about Hugo is the many themes available to help get started quickly. I browsed https://themes.gohugo.io and eventually settled on the Tranquilpeak theme. Each theme comes with its own documentation and customizations, so it took a little time to find one I really wanted. Adding a theme looks roughly like the sketch below.
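
A minimal sketch of wiring up a theme; the repository URL and theme directory name here are assumptions, so verify them against the theme’s own documentation:

# Pull the theme into themes/ as a git submodule (run from the site root).
# Repo URL is from memory; double-check it on themes.gohugo.io.
git submodule add https://github.com/kakawait/hugo-tranquilpeak-theme.git themes/tranquilpeak

# Tell Hugo which theme to use.
echo 'theme = "tranquilpeak"' >> config.toml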

Previewing the Build

Another positive for Hugo is the ability to preview themes, changes, and add-ons quickly. All you have to do is run the hugo server command, as shown below. You can leave it running and it will rebuild automatically any time a file is modified and saved. You can then preview your site on the device running Hugo by opening a web browser and entering http://127.0.0.1:1313/. Once satisfied, I built the site and prepared it for upload.
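
A quick sketch of the preview loop (the -D flag is optional and simply includes draft posts in the preview):

# Start Hugo's built-in dev server with live reload; serves at http://127.0.0.1:1313/
hugo server -D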

Building the Site

Building the site is even easier than previewing it! Just run hugo, which builds the site using all of your custom configuration and content and saves the result to the public/ folder. The contents of this folder can then be copied to your web server (or, in this case, an AWS S3 bucket), for example as sketched below.
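
As a sketch of a one-off manual deploy (the bucket name is mine; swap in your own). The automated pipeline described below does essentially the same thing:

# Build the site into public/ ...
hugo
# ... then mirror public/ to the S3 bucket, deleting remote files that no longer exist locally.
aws s3 sync public s3://chris.theserenos.com/ --delete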

Setting up GitHub

One of my other big reasons for moving to a static site was version control. With WordPress everything is live. Sure, you can have drafts, but once you publish there is no going back. If you want backups of previous versions you have to handle that yourself, or find a plugin if one even exists.

Enter GitHub.

Being that I’m not a developer, I came to the GitHub scene late. Now that I’ve discovered its potential, I’m not sure how I ever lived without it! It is built around version control and continuous deployment. That means my edits are safe and reversible, I have built-in backups, I can draft new posts in separate dedicated branches, and I can seamlessly integrate with S3! So, I created a new repo for this site and began uploading my data (the basic commands are sketched below).
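
For completeness, a minimal sketch of that first push; the repo URL is hypothetical:

# Turn the local Hugo site into a git repo and push it to GitHub.
cd ~/blog
git init
git add .
git commit -m "Initial import from WordPress"
git remote add origin https://github.com/USERNAME/blog.git
git push -u origin master

New posts can then live on their own branches (git checkout -b some-post) and be merged into master when they are ready to publish.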

Now, you might be wondering why I didn’t use AWS CodeCommit to host the repo. The truth is I was already comfortable with GitHub and knew I could set this site up quickly there. I am using AWS CodePipeline to integrate with GitHub and deploy to S3. I have plans to look into CodeCommit further and decide whether moving the site’s code there makes more sense.

Uploading Data

Back in VS Code I configured it to sync with GitHub. This was rather trivial, and I was able to commit my code in no time. Git does need a little configuration of its own, though, so here are the contents of my .gitignore file telling Git what to leave out of the repo. public/ is excluded because it is regenerated on every build, and themes/ because the theme is pulled in separately.

.gitignore File

/public
LICENSE
README.md
/themes

With my files on GitHub I could rest easy knowing I had versioning and a backup, but GitHub isn’t where I was actually going to publish the site. I needed a way to automatically build the site when a commit was made and push the results to my S3 bucket. This is where CodePipeline becomes useful.

CI/CD with AWS CodePipeline

AWS CodePipeline is a continuous delivery service that automates building, testing, and deploying code, and it integrates seamlessly with GitHub. I had the integration working within minutes and a fully working solution shortly after. Now, when I make a commit to the master branch in GitHub, CodePipeline builds and tests the code and then deploys the public/ directory to S3. The build itself is configured in the buildspec.yml file. Here is my working copy:

version: 0.2

env:
  variables:
    hugo_version: "0.53"

phases:
  install:
    commands:
      # Download the Hugo release matching hugo_version and put it on the PATH.
      - curl -Ls https://github.com/spf13/hugo/releases/download/v${hugo_version}/hugo_${hugo_version}_Linux-64bit.tar.gz -o /tmp/hugo.tar.gz
      - tar xf /tmp/hugo.tar.gz -C /tmp
      - mv /tmp/hugo /usr/bin/hugo
  build:
    commands:
      # Build the site into public/.
      - hugo
  post_build:
    commands:
      # Mirror public/ to the bucket; --size-only skips files whose size hasn't changed.
      - aws --region us-east-2 s3 sync --delete --size-only public s3://chris.theserenos.com/

Here are a couple of screenshots from my CodePipeline console:

[Screenshot: AWS CodePipeline Source stage]
[Screenshot: AWS CodePipeline Build stage]

Note: Make sure you check the local cache option in the artifacts section, or your entire GitHub repo will upload to S3 every time!

Travis-CI

Before AWS CodePipeline I used Travis-CI for a couple of weeks. It works really well and is what I would recommend if you are not using anything else on the AWS platform. It is a little slower than CodePipeline, but not enough to truly complain about. Travis-CI reads its configuration from the .travis.yml file. Here is a copy of the config I used:

language: go
go:
  - master
branches:
  only:
  - master
install:
  - go get github.com/spf13/hugo
script:
  - hugo
deploy:
  provider: s3
  bucket: chris.theserenos.com
  skip_cleanup: true
  local_dir: public
  region: us-east-1
  access_key_id: MY_ACCESS_KEY_ID
  secret_access_key:
    secure: MY_ENCRYPTED_KEY
notifications:
  email:
    recipients:
      - MY_EMAIL
    on_success: always # default: change
    on_failure: always # default: always

Configuring S3

Once the GitHub integration was complete and I could see CodePipeline receiving updates and passing builds, I moved on to locking down my S3 bucket. I use CloudFlare as my CDN and wanted an SSL connection to my S3 bucket, which meant using CloudFront to expose the site to the world. If prices turn out to be cheap enough I might move to using CloudFront alone, but for now this is my setup. My S3 bucket only allows connections from CloudFront and CodePipeline; everything else is denied.

S3 Bucket Policy

I’ve stripped out the more sensitive bits.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Deny",
            "NotPrincipal": {
                "AWS": [
                    "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity ******",
                    "arn:aws:iam::*******:role/service-role/codebuild-Blog-CodeBuild-service-role",
                    "arn:aws:iam::*******:role/service-role/AWSCodePipelineServiceRole-us-east-2-Blog-CodePipeline",
                    "arn:aws:iam::*******:root"
                ]
            },
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::chris.theserenos.com/*",
            "Condition": {
                "StringNotLike": {
                    "aws:userid": "arn:aws:iam::*********:role/*"
                }
            }
        },
        {
            "Sid": "2",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity ******"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::chris.theserenos.com/*"
        },
        {
            "Sid": "3",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::********:role/service-role/codebuild-Blog-CodeBuild-service-role",
                    "arn:aws:iam::********:role/service-role/AWSCodePipelineServiceRole-us-east-2-Blog-CodePipeline",
                    "arn:aws:iam::********:root"
                ]
            },
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::chris.theserenos.com/*"
        }
    ]
}

S3 CORS Configuration

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <MaxAgeSeconds>3600</MaxAgeSeconds>
    <AllowedHeader>Authorization</AllowedHeader>
</CORSRule>
</CORSConfiguration>

Configuring CloudFront

For CloudFront I just needed it to serve the content over HTTPS, support IPv6, and redirect HTTP to HTTPS. So, I enabled those options and elected to use the default SSL cert. I don’t do any special caching. The only other thing CloudFront does is invoke a Lambda@Edge function that rewrites directory-style URL requests (those ending in “/”) to fetch the underlying index.html, making them compatible with Hugo’s output.

Configuring Lambda to Modify GET Requests

This is the code in the Lambda@Edge function that appends index.html to URIs ending in “/”, making them suitable for Hugo and a static site.

'use strict';
exports.handler = (event, context, callback) => {
    
    // Extract the request from the CloudFront event that is sent to Lambda@Edge 
    var request = event.Records[0].cf.request;

    // Extract the URI from the request
    var olduri = request.uri;

    // Match any '/' that occurs at the end of a URI. Replace it with a default index
    var newuri = olduri.replace(/\/$/, '\/index.html');
    
    // Log the URI as received by CloudFront and the new URI to be used to fetch from origin
    console.log("Old URI: " + olduri);
    console.log("New URI: " + newuri);
    
    // Replace the received URI with the URI that includes the index page
    request.uri = newuri;
    
    // Return to CloudFront
    return callback(null, request);

};

Configuring CloudFlare

Last, but not least, I set up CloudFlare to point to the given CloudFront URL and enabled caching. This is also where I take advantage of CloudFlare’s additional security offerings: strict SSL, HSTS options, newer TLS versions, extra challenges for nefarious countries and IPs, and more. CloudFlare also helps optimize images and improve loading times. It is relatively simple to set up if you have done any DNS configuration in the past, and even if you haven’t, they walk you through quite a bit of it.

Issues I Encountered

  • Categories needed cleanup
  • Lost the built-in search feature
  • Contact forms and RSS feeds needed updating
  • Had to fix some URL structures
  • Lost comments, but I knew this going in and was okay with it
  • Some images didn’t convert correctly, and WordPress had created a lot of duplicates I had to delete
  • I’m still not 100% satisfied with my Hugo file structure; I might have to revisit this
  • Still learning some tricks for editing from mobile devices
  • Everything is public on GitHub with free repos, so I can’t have a personal or private area if I wanted one

Helpful Websites

… and so many more!