During DebConf this year, I converted my blog from Typo to Ikiwiki. After mucking around with various bits of bit-rotting export code from the web (I considered going Typo -> WordPress WXR -> Ikiwiki WordPress import), I threw up my hands in exasperation and decided to roll my own migration script.
I have my doubts as to the probability of anyone else in the world ever migrating between these two pieces of software, but maybe, just maaybe, this will save someone else a bit of a headache. Here's the route I took.
First, I went to the Typo admin page. I went to the comment moderation page, and clicked "delete all marked as spam". I've found Typo's spam filtering to be pretty good, so it was fine just assuming it was always right without going through its decisions manually.
This purged all spam comments from the database. I then noted the "Total posts" and "Total comments" numbers from the dashboard. These are useful for later verifying that the conversion script doesn't lose posts or comments. Drafts don't show up in the "Total posts" number, though they do show up in the "Your posts" number, so while the script doesn't migrate drafts, any that exist won't affect the number.
Next I dumped the database in tab-delimited form (easier to parse than a single file full of SQL):
(on my webserver) mysqldump -T /tmp <name-of-my-blog's db> -uroot -p
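For some reason, mysqldump pukes when trying to dump to anywhere that's not /tmp. I'm not sure why, though a likely culprit is that with -T the MySQL server itself writes the .txt files, so the target directory has to be writable by the user the server runs as.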
I then copied the dump to my laptop where I was going to run the conversion:
(from my laptop) scp <my-webserver>:/tmp/*.txt dump/
(from my laptop) scp <my-webserver>:/tmp/*.sql dump/
You really only need contents.txt, tags.txt, and feedback.txt. But I wanted to keep the entire dump, just as a backup / in case I need any of the data later.
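If you'd rather poke at the dump yourself, here's a minimal Python sketch of reading one of those .txt files. It assumes mysqldump's default -T format: tab-separated fields, one row per line, a backslash in front of any tab, newline, or backslash inside a field, and \N for NULL. The name parse_dump is made up for illustration; this is not an excerpt from my script.

```python
def parse_dump(text):
    """Parse the contents of a mysqldump -T .txt file into rows of fields.

    Assumes the default format: fields end at a tab, rows end at a newline,
    and a backslash escapes the character that follows it. NULL columns
    (written as \\N) come back as None.
    """
    rows, row, buf = [], [], []
    is_null = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == "\\" and i + 1 < n:
            nxt = text[i + 1]
            if nxt == "N":
                is_null = True          # \N marks a NULL column
            elif nxt == "0":
                buf.append("\0")        # \0 stands for an ASCII NUL byte
            else:
                buf.append(nxt)         # escaped tab, newline or backslash
            i += 2
            continue
        if ch == "\t" or ch == "\n":
            # An unescaped tab ends the field; an unescaped newline ends the row.
            row.append(None if is_null else "".join(buf))
            buf, is_null = [], False
            if ch == "\n":
                rows.append(row)
                row = []
        else:
            buf.append(ch)
        i += 1
    if buf or row or is_null:
        # The last row had no trailing newline.
        row.append(None if is_null else "".join(buf))
        rows.append(row)
    return rows
```

To use it, read the file with something like open("contents.txt", encoding="utf-8", newline="").read() and pass the result in. The column order in each .txt file follows the CREATE TABLE statement in the matching .sql file.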
Then, I ran this script on the dump. You can clone the repo with:
git clone git://git.spang.cc/git/typo2ikiwiki.git
Invoke it from the dump directory with no arguments.
(You can also throw in a -v switch and it will be much more verbose about what it's doing. I mostly used this for debugging since it generates quite a lot of scroll. It will tell you what directories it has generated when it's done; they'll be in the current directory.)
Then I copied the generated directories into the git checkout of my local test ikiwiki blog:
cp -r posts tags 200* ~/blog
And ran:
ikiwiki --setup ~/blog.setup --rebuild
and browsed the blog on my local webserver to make sure everything went alright.
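This is also the point where the "Total posts" number from the Typo dashboard comes in handy. Here's a rough Python sketch of that sanity check. It assumes one generated file per post under posts/ and a .mdwn extension, neither of which I'm promising about the script's output, and the function names are made up for illustration.

```python
from pathlib import Path


def count_pages(directory, suffix=".mdwn"):
    """Count files ending in `suffix` under `directory`, recursively."""
    return sum(1 for p in Path(directory).rglob("*" + suffix) if p.is_file())


def check_total(label, expected, actual):
    """Return a one-line report comparing a dashboard total to a converted count."""
    status = "ok" if expected == actual else "MISMATCH"
    return f"{label}: expected {expected}, found {actual} [{status}]"
```

For example, print(check_total("posts", 123, count_pages("posts"))), with 123 swapped for whatever your dashboard said.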
The script preserves old permalinks by creating redirects in the locations of the old posts. There's probably a better strategy, but at this point I just don't care. It works well enough to not want to bother spending any more time on it.
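To give an idea of what such a redirect can look like, here's a small Python sketch that builds a stub page for one old permalink using ikiwiki's meta directive with redir. It's an illustration under assumptions, not necessarily what the script does: I'm assuming old permalinks of the form YYYY/MM/DD/slug, new posts living at posts/slug, and .mdwn source files. redirect_stub is a hypothetical name.

```python
from datetime import date


def redirect_stub(published, slug, new_dir="posts"):
    """Return (path, body) for a page that redirects an old permalink.

    `published` is the post's publication date and `slug` its permalink
    name. The YYYY/MM/DD/slug layout and the `new_dir` default are
    assumptions made for this sketch.
    """
    old_path = f"{published:%Y/%m/%d}/{slug}.mdwn"
    body = f'[[!meta redir="{new_dir}/{slug}"]]\n'
    return old_path, body
```

Writing body to old_path inside the wiki's source directory leaves a page at the old location that sends visitors on to the new one.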
After making sure I got a good conversion, I installed ikiwiki on my webserver via a simple
aptitude install ikiwiki
Then I disabled my typo vhost, purged a whole lot of ruby packages and mysql, and copied the dump into the newly created blog git checkout on the webserver. (Good directions for initially setting up the wiki are here.) I also followed the [non-meta-directive part of] this tip to make sure that my old posts don't show up in my new RSS/atom feeds.
After that, it was basically just some lighttpd configuration:
$HTTP["host"] =~ "blog\.spang\.cc" {
    server.document-root = "/home/spang/public_html/blog"
    # Map the old Typo feed URLs onto their ikiwiki equivalents.
    # The planet rule has to come before the generic tag rule.
    url.rewrite-once = (
        "^/xml/rss20/tag/planet/feed\.xml$" => "/tags/planet-debian/index.rss",
        "^/xml/rss20/tag/([^/]+)/feed\.xml$" => "/tags/$1/index.rss",
        "^/articles\.atom$" => "/index.atom",
        "^/articles\.rss$" => "/index.rss",
    )
}
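If you want to sanity-check rules like these before reloading lighttpd, here's a small Python sketch that mimics the first-match-wins behaviour of url.rewrite-once. It's only an approximation: Python's re module stands in for lighttpd's regex engine, \1 replaces lighttpd's $1, and the RULES list and rewrite_once function are names invented for this example.

```python
import re

# The same mappings as the lighttpd config, in the same order.
RULES = [
    (r"^/xml/rss20/tag/planet/feed\.xml$", "/tags/planet-debian/index.rss"),
    (r"^/xml/rss20/tag/([^/]+)/feed\.xml$", r"/tags/\1/index.rss"),
    (r"^/articles\.atom$", "/index.atom"),
    (r"^/articles\.rss$", "/index.rss"),
]


def rewrite_once(path, rules=RULES):
    """Return the target of the first rule matching `path`, else `path` itself."""
    for pattern, target in rules:
        match = re.search(pattern, path)
        if match:
            # Fill in any captured groups, then stop at this rule.
            return match.expand(target)
    return path
```

Because the first match wins, the planet rule has to sit above the generic tag rule; otherwise the old planet feed would land on /tags/planet/index.rss instead of /tags/planet-debian/index.rss.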
These days, I laugh in joy now that my webserver with 512 MB of RAM no longer thrashes on Rails processes or sends me logwatch messages telling me that it's killed some processes because it ran out of memory (okay, perhaps that only happened before I upgraded it from ~380 MB of RAM to 512). Also, I wrote this post in vim on a train. And I could clean up my posts' tags by just munging a bunch of text files. I don't know if Typo being memory-heavy and slow is Rails' fault, but I'm glad I'm done with it. My blog should not hose my webserver.