February 01, 2020     3min read

Migrating WordPress to GatsbyJS - Blog Posts



Blogging on GatsbyJS isn't an uncommon pattern; in fact, if you search for "gatsbyjs blog" you will find a number of fantastic tutorials on how to get started. Most of these examples are greenfield projects though, meaning they assume you're starting your blogging life on GatsbyJS from scratch.

In this post we are going to cover a more complicated example, where we want to migrate existing WordPress blog posts to GatsbyJS in an automated way. We will make use of open source tooling for each step, so you will be able to follow a similar path for your own migration.

GatsbyJS Blogging

To start with, it's beneficial to understand how blogging on GatsbyJS usually works. The most common pattern is to write blog posts in Markdown, a lightweight plain-text formatting language commonly used for GitHub README files.

Markdown is also used by many other static site generators, such as Hugo and Jekyll, so it is becoming something of a standard for blog content that needs to stay portable between them.

Below is an example of a snippet from the Markdown file that is actually being used to generate the blog post you are reading right now!

---
title: Migrating WordPress to GatsbyJS - Blog Posts
slug: migrating-wordpress-to-gatsby-js-blog-posts
description: There were 48 WordPress powered blog posts currently hosted that need to be recreated in GatsbyJS. We look into ways of converting these posts to Markdown in an automated way.
date: 2020-02-01 00:00:00
published: true
author: Nathan Glover
tags: ["gatsbyjs", "migration"]
featuredImage: img/migrating-wordpress-to-gatsby-js-blog-posts.jpg
ogImage: img/migrating-wordpress-to-gatsby-js-blog-posts-seo.jpg

---

Hi, this is blog content!

I won't go into too much detail on the semantics of writing Markdown because there are already great guides available online.

What I really want to get across is that, in order to display our blog content in GatsbyJS, our blog posts need to conform to this Markdown format: a block of metadata between the --- lines, followed by the post body.
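To make that structure concrete, here is a minimal sketch of splitting a post into its metadata and body. It is not how GatsbyJS itself reads the files; parseFrontMatter and parseValue are hypothetical helpers that only understand flat `key: value` lines like the example above, whereas real tooling uses a full YAML parser.

```javascript
// Minimal front matter splitter: a sketch, not a full YAML parser.
// It only understands flat `key: value` lines like the example above.
function parseFrontMatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(markdown);
  if (!match) {
    // No front matter block: treat the whole file as body content.
    return { data: {}, body: markdown };
  }

  const data = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // skip blank or malformed lines
    const key = line.slice(0, separator).trim();
    const rawValue = line.slice(separator + 1).trim();
    data[key] = parseValue(rawValue);
  }
  return { data, body: match[2].trim() };
}

// Convert booleans and JSON-style arrays; leave everything else as a string.
function parseValue(raw) {
  if (raw === "true") return true;
  if (raw === "false") return false;
  if (raw.startsWith("[") && raw.endsWith("]")) {
    try {
      return JSON.parse(raw);
    } catch (err) {
      return raw;
    }
  }
  return raw;
}

const example = [
  "---",
  "title: Migrating WordPress to GatsbyJS - Blog Posts",
  "slug: migrating-wordpress-to-gatsby-js-blog-posts",
  "date: 2020-02-01 00:00:00",
  "published: true",
  'tags: ["gatsbyjs", "migration"]',
  "---",
  "",
  "Hi, this is blog content!",
].join("\n");

const post = parseFrontMatter(example);
console.log(post.data.title, post.data.tags, post.body);
```

Whatever tool produces the Markdown, every field your templates rely on (title, slug, date and so on) has to end up in that top block.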

WordPress Blogging

On the flip side, WordPress blogging is highly structured, and posts are saved in a completely different format. This is due to the graphical nature of the blog designer, where posts are built through the GUI.

WordPress blog designer

Behind the scenes, WordPress stores your blog posts (and all other WordPress content, for that matter) in its database, and can export them as XML. This XML contains fields that are similar to the metadata at the top of a Markdown post, however they don't map one-to-one.

The good news is that you can get access to this XML data pretty easily, and there are a couple of fantastic open source projects dedicated to converting it to Markdown.
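To give a feel for the mapping, the sketch below pulls a few fields out of a single item from a WordPress export. The sample item is hand-written for illustration, and readTag and itemToFrontMatter are hypothetical helpers; this is not how the converter used in the next section works, and a real conversion should rely on a proper XML parser rather than regular expressions.

```javascript
// Illustrative only: reads a few fields out of one <item> in a WordPress export.
function readTag(xml, tagName) {
  // Escape any characters in the tag name that are special in a RegExp.
  const escaped = tagName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`<${escaped}>([\\s\\S]*?)</${escaped}>`);
  const match = pattern.exec(xml);
  if (!match) return null;
  // Strip the CDATA wrapper WordPress puts around most text values.
  return match[1]
    .trim()
    .replace(/^<!\[CDATA\[/, "")
    .replace(/\]\]>$/, "")
    .trim();
}

// Map export fields onto the front matter names used earlier in this post.
function itemToFrontMatter(itemXml) {
  return {
    title: readTag(itemXml, "title"),
    slug: readTag(itemXml, "wp:post_name"),
    date: readTag(itemXml, "wp:post_date"),
    published: readTag(itemXml, "wp:status") === "publish",
  };
}

// Hand-written sample item, not taken from a real export.
const sampleItem = `
<item>
  <title>Hello World</title>
  <wp:post_name><![CDATA[hello-world]]></wp:post_name>
  <wp:post_date><![CDATA[2020-02-01 00:00:00]]></wp:post_date>
  <wp:status><![CDATA[publish]]></wp:status>
  <content:encoded><![CDATA[<p>Hi, this is blog content!</p>]]></content:encoded>
</item>`;

console.log(itemToFrontMatter(sampleItem));
console.log(readTag(sampleItem, "content:encoded"));
```

Note that the post body comes back as HTML, which is why a dedicated converter is still needed to turn it into Markdown.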

WordPress to Markdown

We are going to be making use of the project wordpress-export-to-markdown by Will Boyd for this section.

I will also mention that I made my own changes to the repository to better suit my needs; my fork can be found at t04glovern/wordpress-export-to-markdown.

Pull down a copy of the repository locally and be sure you have Node.js v12.14 or later installed. Next we're going to get a copy of our WordPress export; this can be acquired from the Tools > Export menu. Export either All content or just Posts; you will receive an XML dump with your site content embedded.

WordPress exporting content

Copy the export into the same folder as the wordpress-export-to-markdown project and rename it to something simple like export.xml. Then run the following and select the options that make the most sense for you:

npm install && node index.js

# Starting wizard...
# ? Path to WordPress export file? export.xml
# ? Path to output folder? output
# ? Create year folders? No
# ? Create month folders? No
# ? Create a folder for each post? Yes
# ? Prefix post folders/files with date? No
# ? Save images attached to posts? Yes
# ? Save images scraped from post body content? Yes

# Parsing...
# 47 posts found.
# 495 attached images found.
# 318 images scraped from post body content.

# Saving posts...
# [OK] streamline-your-ssh-workflow-with-ssh-config
# [OK] deploying-a-private-vpn-to-aws-ec2-using-cloudformation
# [OK] flutter-ci-cd-deployments-publication-to-google-play
# [OK] create-a-private-vpn-using-aws-iot-button-sns-cloudformation
# ...

The process may take a while, depending on whether you chose to download remote images. Once it finishes you should have a full dump of all your blog posts, already in the correct format for use within GatsbyJS!
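If you want a quick sanity check that the number of converted posts matches what the wizard reported, a small Node.js script along these lines can count the Markdown files. This is a sketch that assumes the output folder was named output, as in the wizard answers above; findMarkdownFiles is a hypothetical helper and is not part of wordpress-export-to-markdown.

```javascript
const fs = require("fs");
const path = require("path");

// Recursively collect every .md file under a directory.
function findMarkdownFiles(dir) {
  const results = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      results.push(...findMarkdownFiles(fullPath));
    } else if (entry.isFile() && entry.name.endsWith(".md")) {
      results.push(fullPath);
    }
  }
  return results.sort();
}

// "output" is the folder name chosen in the wizard; adjust if yours differs.
const outputDir = "output";
if (fs.existsSync(outputDir)) {
  const posts = findMarkdownFiles(outputDir);
  console.log(`${posts.length} Markdown files found in ${outputDir}`);
}
```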

WordPress to Markdown output

GatsbyJS Remark

For rendering blog posts I used gatsby-transformer-remark to index all the data within our Markdown files. I won't spend too much time explaining this process as there are already great tutorials on using it.

I will however provide the configuration I used for the content above.

gatsby-config.js

...
{
  resolve: `gatsby-transformer-remark`,
  options: {
    plugins: [
      `gatsby-remark-embedder`,
      {
        resolve: `gatsby-remark-autolink-headers`,
        options: {
          className: `gatsby-remark-autolink`,
          maintainCase: true,
          removeAccents: true,
        },
      },
      {
        resolve: `gatsby-remark-prismjs`,
        options: {
          classPrefix: "language-",
          inlineCodeMarker: null,
          aliases: {},
          showLineNumbers: true,
          noInlineHighlight: false,
        }
      },
      {
        resolve: `gatsby-remark-images`,
        options: {
          maxWidth: 1200,
          showCaptions: true
        }
      },
      {
        resolve: `gatsby-remark-copy-linked-files`,
        options: {
          ignoreFileExtensions: [`png`, `jpg`, `jpeg`],
        },
      },
    ]
  }
},
...

The configuration above will expose your blog data through the allMarkdownRemark source.

Markdown blog posts from GraphQL

NOTE: The query you can use to confirm this will depend on the fields you provided at the top of each post's .md file.
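As a rough sketch, a query over the title, slug and date fields from the example earlier in this post could look like the string below. The field selection is an assumption based on that example, toPostList is a hypothetical helper, and the sample result is hand-written in the edges/node shape that allMarkdownRemark returns.

```javascript
// GraphQL query text; the frontmatter fields must match what your .md files define.
const postsQuery = `
  query {
    allMarkdownRemark {
      edges {
        node {
          frontmatter {
            title
            slug
            date
          }
        }
      }
    }
  }
`;

// Flatten a query result into a plain list of posts.
function toPostList(data) {
  return data.allMarkdownRemark.edges.map(({ node }) => ({
    title: node.frontmatter.title,
    slug: node.frontmatter.slug,
    date: node.frontmatter.date,
  }));
}

// Hand-written sample in the shape of a result for the query above.
const sampleResult = {
  allMarkdownRemark: {
    edges: [
      {
        node: {
          frontmatter: {
            title: "Migrating WordPress to GatsbyJS - Blog Posts",
            slug: "migrating-wordpress-to-gatsby-js-blog-posts",
            date: "2020-02-01",
          },
        },
      },
    ],
  },
};

console.log(postsQuery);
console.log(toPostList(sampleResult));
```

The query text on its own can be pasted into the GraphiQL explorer that the Gatsby development server provides, which is a quick way to confirm the posts were picked up.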

Summary

In this post you've learnt how to take existing blog data from a WordPress instance and convert it to Markdown for use with GatsbyJS and gatsby-transformer-remark. Not only will your posts now be easier to keep under version control, but they are also portable to other static site generators that use Markdown.

This means you are future-proofing your amazing content!

Check out the next post to learn how we will be deploying our GatsbyJS static site and incorporating CI/CD.

DevOpStar by Nathan Glover | 2020