The High-Level Problem:
The client had a legacy custom website built on Ruby on Rails. The original developers were no longer available, and the only remaining knowledge of the site was its source repository. Could the content be extracted from the custom website and moved to a standard, low-cost CMS platform?
Selecting a CMS:
Despite my misgivings about WordPress, it is still the most widely used CMS. Even if it gained the lead position by being free and quick to set up, many large, high-traffic media sites run on it. Also, most CMS platforms that offer an import tool at all offer one for WordPress, so WordPress could end up being an intermediate step.
Importing to WordPress:
Common methods used to import into WordPress:
- Migrate using available WordPress import tools
- Migrate using database import tools
- Migrate by importing a CSV export into WordPress
- Migrate by importing an existing RSS feed into WordPress
The free WordPress import tools can import from various sources, but they were not built to import from a custom website. Importing directly from database to database is problematic, and not an approach that interested me. Of the four choices, the RSS import looked the most promising. However, the RSS feed on the custom website contained only excerpts of the original posts, which ruled out this option.
The WordPress REST API
I had already investigated the WordPress REST API and found it fairly straightforward to use. The API allows for the import of:
- posts
- authors
- categories
- tags
- images
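Each of these content types corresponds to a collection route under the wp/v2 namespace, with authors exposed as users and images as media. As a rough sketch (the WP_ROUTES constant and collection_url helper are hypothetical names, and the base URL is the local test instance used in this post):

```ruby
# Hypothetical lookup from the content types above to WordPress REST routes.
WP_ROUTES = {
  posts:      "posts",
  authors:    "users",       # authors are WordPress users
  categories: "categories",
  tags:       "tags",
  images:     "media"        # images are uploaded as media items
}.freeze

# Build the collection URL for a content type (the base URL is an assumption).
def collection_url(type, base: "http://localhost:8000/wp-json/wp/v2")
  route = WP_ROUTES.fetch(type) { raise ArgumentError, "unknown type: #{type}" }
  "#{base}/#{route}"
end

puts collection_url(:images)
```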
First Connection to REST API from Rails
Some simple tests from the Rails console:
bundle exec rails c
res = HTTParty.get("http://localhost:8000/wp-json/wp/v2/categories")
Next, add authentication for requests that require it:
auth = {:username => "user", :password => "password"}
res = HTTParty.get("http://localhost:8000/wp-json/wp/v2/users", basic_auth:auth)
Then post some data in the request body:
res = HTTParty.post("http://localhost:8000/wp-json/wp/v2/tags",
  basic_auth: auth, body: {name: 'Yoga', slug: 'yoga', description: 'Yoga'})
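To see what such an authenticated POST looks like on the wire, here is a minimal sketch of the same request built with Ruby's standard library only. The build_tag_request helper is a hypothetical name, and the request is built but not sent. Whether a given WordPress install accepts basic authentication depends on its setup (for example an application password or a basic-auth plugin), so treat that part as an assumption to verify.

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical helper: build (but do not send) an authenticated JSON POST.
def build_tag_request(base_url, username, password, attrs)
  uri = URI.join(base_url, "/wp-json/wp/v2/tags")
  req = Net::HTTP::Post.new(uri)
  req.basic_auth(username, password)        # sets the Authorization header
  req["Content-Type"] = "application/json"
  req.body = JSON.generate(attrs)
  [uri, req]
end

uri, req = build_tag_request("http://localhost:8000", "user", "password",
                             { name: "Yoga", slug: "yoga", description: "Yoga" })
puts req.method
puts uri
puts req.body

# To send it against a running site:
#   res = Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
```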
Build a minimal WordPress REST API accessor class using HTTParty
# model/wordpress.rb
require "httparty"
require "json"

USERNAME = "user"
PASSWORD = "password"
ENDPOINT = "wp-json/wp/v2"

class Wordpress
  include HTTParty
  format :json
  base_uri "localhost:8000"

  def initialize(endpoint, username, password)
    @endpoint = endpoint
    @options = {
      basic_auth: {
        username: username,
        password: password
      },
      headers: {
        "Content-Type" => "application/json"
      }
    }
  end
end
To test, open the Rails console and load this class (or just paste it in), then run:
wp = Wordpress.new(ENDPOINT, USERNAME, PASSWORD)
All this does is initialize HTTParty. Now extend the class to access WordPress via its API.
Extend the Wordpress class to post a tag to WordPress
Given a tag from the Rails application, load that tag into WordPress. The mapping for tags is straightforward: name to name and slug to slug.
# model/wordpress.rb
...
  # Post a Rails tag into WordPress
  #
  # * *Arguments* :
  #   - +Tag+ - instance of a Rails tag to load
  # * *Returns* :
  #   - +HTTParty::Response+, or nil if the request itself raised
  # * *Raises* :
  #   - nothing; errors are logged and rescued
  #
  def create_tag(tag)
    body = {
      name: tag.name,
      slug: tag.slug
    }
    options = @options.merge(body: body.to_json)
    response = self.class.post("/#{@endpoint}/tags/", options)
    case response.code
    when 200..201
      Rails.logger.info "Created tag for id:#{tag.id}"
    else
      Rails.logger.warn "Error posting the tag, id:#{tag.id} #{tag.slug}"
      puts "Error posting the tag: #{response.code} #{response.message}"
    end
    response
  rescue => err
    # response is still nil here if the POST raised (e.g. connection refused)
    Rails.logger.warn "Rescued create_tag, id:#{tag.id} #{tag.slug}"
    puts "create_tag Error: #{err}"
    response
  end
To test, open the Rails console and load this class (or just paste it in), then run:
wp = Wordpress.new(ENDPOINT, USERNAME, PASSWORD)
tag = Tag.first
res = wp.create_tag(tag)
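The mapping and the status check can also be exercised without a running WordPress instance by pulling them out into plain functions. This is only a sketch: FakeTag stands in for the Rails Tag model, and tag_payload and classify_create are hypothetical names. It also assumes WordPress reports a duplicate tag with the error code "term_exists", which is worth verifying against your WordPress version.

```ruby
require "json"

# Stand-in for the Rails Tag model so the sketch runs outside Rails.
FakeTag = Struct.new(:id, :name, :slug)

# Map a tag to the JSON body sent to the tags route (name to name, slug to slug).
def tag_payload(tag)
  { name: tag.name, slug: tag.slug }.to_json
end

# Classify a create response from its status code and parsed body.
# Assumption: a duplicate tag comes back with the error code "term_exists".
def classify_create(status, parsed_body)
  return :created if (200..201).cover?(status)
  return :already_exists if parsed_body.is_a?(Hash) && parsed_body["code"] == "term_exists"
  :failed
end

tag = FakeTag.new(1, "Yoga", "yoga")
puts tag_payload(tag)
puts classify_create(201, { "id" => 7 })
puts classify_create(400, { "code" => "term_exists" })
```

Treating an already-existing tag as non-fatal can be convenient when the same export is run more than once.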
Extending Wordpress class to get last tag loaded
Now that a tag has been loaded, I need a way to find the last tag loaded. This is used later to make the export restartable.
# model/wordpress.rb
...
# add @last_query to the initialize method
...
    @last_query = {
      "orderby" => "id",
      "order" => "desc",
      "per_page" => 1
    }
...
  # Get the last Rails tag loaded into WordPress
  #
  # * *Returns* :
  #   - the Rails Tag object last loaded into WordPress, or nil if
  #     nothing has been loaded yet or the request failed
  #
  def get_last_tag
    options = @options.merge(query: @last_query)
    response = self.class.get("/#{@endpoint}/tags/", options)
    if response && response.success?
      wordpress_last_tag = response.parsed_response.first
      # An empty list means no tags have been loaded yet
      return nil if wordpress_last_tag.nil?
      Tag.find_by(slug: wordpress_last_tag["slug"])
    else
      # Interpolate here: response.code is an Integer, so String#+ would raise
      Rails.logger.warn "Problem getting last tag: #{response&.code} #{response&.message}"
      nil
    end
  rescue => err
    Rails.logger.warn "Problem getting last tag"
    puts "get_last_tag Error: #{err}"
    nil
  end
To test, open the Rails console and load this class (or just paste it in), then run:
wp = Wordpress.new(ENDPOINT, USERNAME, PASSWORD)
last_tag = wp.get_last_tag
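The query asks WordPress for a single tag, highest id first. The sketch below uses only the standard library to show the resulting query string and how the slug is picked out of the response. The helper names are hypothetical, and the sample body is made up and trimmed to the fields used here.

```ruby
require "json"
require "uri"

# The same query parameters as @last_query.
LAST_QUERY = { "orderby" => "id", "order" => "desc", "per_page" => 1 }.freeze

# Query string appended to the tags URL.
def last_tag_query_string
  URI.encode_www_form(LAST_QUERY)
end

# Pull the slug of the newest tag out of a response body.
# Returns nil when the list is empty (nothing loaded yet).
def last_slug(json_body)
  first = JSON.parse(json_body).first
  first && first["slug"]
end

# Made-up sample body for illustration.
sample = '[{"id": 42, "name": "Yoga", "slug": "yoga"}]'

puts last_tag_query_string
puts last_slug(sample)
puts last_slug("[]").inspect
```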
Add a service class to export tags (posts, categories, etc.)
The PostExportWorker class ties everything together. It loads the tags, can be restarted, and logs any errors to a log file for later review. To be restartable, on each restart it will:
- find the last tag posted to WordPress
- find its corresponding tag in Rails
- start the loading process from the next tag
Making the export process restartable means I can start the process, interrupt it at any point, and resume later, without starting from the beginning again. In this case there are almost 25K tags, so restarting is very helpful.
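The restart steps can be reduced to plain Ruby to make the logic easier to follow. In this sketch TagRow stands in for Rails Tag records and remaining_tags is a hypothetical helper. It assumes tags are always exported in ascending id order and that WordPress stores each slug unchanged.

```ruby
# Stand-in for Rails Tag records, so the sketch runs without a database.
TagRow = Struct.new(:id, :slug)

# Given all Rails tags and the slug of the last tag found in WordPress,
# return the tags still to be exported, in id order.
def remaining_tags(all_tags, last_exported_slug)
  sorted = all_tags.sort_by(&:id)
  last = sorted.find { |t| t.slug == last_exported_slug }
  last_id = last.nil? ? -1 : last.id   # -1 means start from the beginning
  sorted.select { |t| t.id > last_id }
end

tags = [TagRow.new(3, "pilates"), TagRow.new(1, "yoga"), TagRow.new(2, "zen")]

puts remaining_tags(tags, "zen").map(&:slug).inspect
puts remaining_tags(tags, nil).map(&:slug).inspect
```

If a slug were altered on the WordPress side, the lookup would find no match and the export would start over from the first tag, so it is worth checking a few slugs by hand after the first run.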
# services/post_export_worker.rb
#
class PostExportWorker
  USERNAME = "user"
  PASSWORD = "password"
  ENDPOINT = "wp-json/wp/v2"
  @@wp = Wordpress.new(ENDPOINT, USERNAME, PASSWORD)

  # Append a JSON record for a failed item to log/PostExport.log
  def log_error(response, item)
    # response is nil when the request raised, and the body may not be a Hash
    parsed = response&.parsed_response
    parsed = {} unless parsed.is_a?(Hash)
    msg = ""
    msg += parsed["code"].to_s if parsed.key?("code")
    msg += ": " + parsed["message"].to_s if parsed.key?("message")
    slug = item.slug? ? item.slug : "?"
    err = {"class" => item.class.to_s, "id" => item.id, "slug" => slug, "message" => msg}
    File.open('log/PostExport.log', 'a') do |f|
      f.puts JSON.pretty_generate(err)
    end
  end

  # Export the tags that come after the last one already in WordPress
  def export_create_tags
    Rails.logger.info "export creating tags"
    last_tag = @@wp.get_last_tag
    last_tag_id = last_tag.nil? ? -1 : last_tag.id
    Rails.logger.info "export starting tag: #{last_tag_id}"
    # Bind the id as a parameter instead of concatenating it into the SQL.
    # limit(3) caps each run at three tags; raise or remove it for a full export.
    tags = Tag.where("id > ?", last_tag_id).order(:id).limit(3)
    tags.each do |t|
      export_create_tag(t)
    end
  end

  # Export one tag; returns true on success, false after logging the error
  def export_create_tag(tag)
    Rails.logger.info "export creating tag for id:#{tag.id}"
    response = @@wp.create_tag(tag)
    if response && response.success?
      tag_id = response.parsed_response["id"]
      Rails.logger.info "export created tag for id:#{tag.id}, new id:#{tag_id}"
      true
    else
      log_error(response, tag)
      false
    end
  end
end
Extending for other Objects
The same pattern is then repeated for categories, users, images, and posts. Each of these objects had its own wrinkles. Posts were the most complicated: in my case, I created a dedicated view to render each post together with its related post_items.
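One way to keep that repetition manageable is a table of per-resource mappers, so the posting code stays the same and only the field mapping changes. The sketch below is not the code used for this migration: MAPPERS, export_request, and PostRow are hypothetical names, the source attribute names (title, body) are assumptions about the Rails models, and importing posts as drafts is just one possible choice.

```ruby
require "json"

# Hypothetical per-resource mappers: WordPress route => lambda that turns a
# source record into a request body. The source attribute names are assumed.
MAPPERS = {
  "tags"       => ->(r) { { name: r.name, slug: r.slug } },
  "categories" => ->(r) { { name: r.name, slug: r.slug } },
  "posts"      => ->(r) { { title: r.title, content: r.body, slug: r.slug, status: "draft" } }
}.freeze

# Build the request path and JSON body for one record of the given resource.
def export_request(endpoint, resource, record)
  mapper = MAPPERS.fetch(resource)
  ["/#{endpoint}/#{resource}/", mapper.call(record).to_json]
end

# Stand-in for a Rails post record.
PostRow = Struct.new(:title, :body, :slug)

path, body = export_request("wp-json/wp/v2", "posts",
                            PostRow.new("Hello", "<p>Hi</p>", "hello"))
puts path
puts body
```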