Importing Movable Type Markdown into WordPress

Once again I’ve up and moved my whole little site here to a new platform — WordPress 3, this time around, on top of Nginx + PHP-FPM + (obviously) MySQL on FreeBSD 8.1. I think I’m making a tradition of tearing down the site and rebuilding it from scratch once every, what, two years? Or maybe I’ll actually manage to keep it fresh this time around…

Heh.

Anyway, I could ramble on forever about why I ditched Movable Type and went with friggin’ WordPress. (What kind of wannabe hipster web developer doesn’t roll his or her own Django / Rails / whatever blogging software? And PHP?! Son, I am disappoint.) But that’s not what this post is about. This post is about a minor detail of how I exported the old site from MT.

You see, Movable Type supports text entry in Markdown format. And it’s cool and groovy and way better than typing out HTML by hand, but the problem is that when you ask MT to export your site, it exports your Markdown posts as unprocessed Markdown, not as HTML. So when you then attempt to import this into a program that doesn’t understand Markdown (e.g., WordPress), you end up with something resembling an explosion in a punctuation factory.
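For illustration, a Markdown-authored entry in an MT export file looks something like this (the field values here are made up, but the layout — colon-delimited metadata fields and body sections terminated by `-----` lines — is what the importer expects):

```
AUTHOR: Mark Shroyer
TITLE: An example post
-----
BODY:
Here is some *emphasized* text and a [link](http://www.example.com/),
none of which WordPress will render unless it's converted to HTML first.
-----
```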

But as usual, Perl to the rescue. Pipe your Movable Type export file through the following simple script (you may need to install Text::Markdown from CPAN or your friendly neighborhood package manager first):

#!/usr/bin/perl

# mt-export-markdown.pl: Process Movable Type export files
# containing Markdown posts into pure HTML.
#
# Mark Shroyer
# Mon Oct 11 01:37:52 EDT 2010

use warnings;
use strict;

use Text::Markdown qw(markdown);

while (<>) {
    print $_;
    if ( /^(?:EXTENDED )?BODY:$/ ) {
        my $source = '';
        while (<>) {
            last if ( /^-{5}$/ );
            $source .= $_;
        }

        if ( $source =~ /^</ ) {
            # If the body looks like it's already HTML, just echo it
            print $source;
        }
        else {
            # Otherwise run it through the Markdown formatter
            print markdown($source);
        }

        # Re-emit the "-----" field separator consumed by the inner loop,
        # so the export file's structure stays intact for the importer
        print $_ if defined $_;
    }
}

That done, simply import the resulting file as usual with WordPress’s Movable Type and TypePad Importer.

Firefox 3.5 on Debian Lenny AMD64

Now that Firefox 3.5 is out, I wanted to get it running on my 64-bit Debian 5.0 laptop—but there’s no 3.5 package in Sid yet, as of July 2, and mozilla.com only carries 32-bit binary tarballs for Linux. So I had to build Firefox 3.5 from source. Fortunately this turned out to be a lot less painful than I had imagined; this post will show you how I did it, in case you’re in a similar spot.

Build dependencies

The first thing to do is to make sure you’ve installed everything for the build process. Since you’re on a nice, civilized operating system like Debian, the following commands should have you covered:

$ sudo apt-get install build-essential libidl-dev autoconf2.13
$ sudo apt-get build-dep iceweasel

Obtaining the Mozilla source code

Mozilla’s official developer guide recommends downloading individual source archives for those who simply want to build a Firefox release, but I decided to get the source via Mercurial checkout, since this way there will be less to download and re-compile each time a security update comes along. The downside to this is that the initial download takes longer, and the full Mozilla source repository takes up about 686 MB on my hard drive.

First you’ll need to have the Mercurial source control manager installed on your computer, of course. If you don’t have it already, just type the following…

$ sudo apt-get install mercurial

Now change to whichever directory you want to keep the repository in, and run

$ hg clone http://hg.mozilla.org/releases/mozilla-1.9.1 \
    mozilla-1.9.1

This will copy the full Firefox 3.5 / Mozilla 1.9.1 development branch to your computer. If you’re on a slow U.S. residential Internet connection like I am, this might be a good time to go do something else.

Once this is done, check out the particular Firefox 3.5 release that you’d like to build. Since there have not yet been any dot releases at the time of this writing, I just did:

$ cd mozilla-1.9.1
$ hg checkout -r FIREFOX_3_5_RELEASE

Configuring, building, and installing

The Mozilla build is controlled by a file called .mozconfig, which can live in your home directory, outside of the source tree. This makes it easy to “set” a particular Firefox build configuration, then reuse it repeatedly in the future without having to dig through old configure.log files or take notes elsewhere.

Create ~/.mozconfig with the following contents:

mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-firefox
ac_add_options --prefix=/opt/firefox
ac_add_options --enable-application=browser
ac_add_options --with-system-zlib
ac_add_options --with-system-jpeg
ac_add_options --enable-optimize
ac_add_options --enable-official-branding
ac_add_options --enable-canvas
ac_add_options --enable-strip
ac_add_options --disable-tests
ac_add_options --disable-installer
ac_add_options --disable-accessibility
ac_add_options --enable-xinerama
ac_add_options --with-default-mozilla-five-home=/usr/lib/firefox-3.5

Edit: Some problems have been reported with these particular configuration options; see below if this doesn’t work for you…

Now you’re ready to actually build Firefox—another good opportunity to go find something else to do for a little while:

$ make -f client.mk build

When the build is done, install it with

$ sudo make -f client.mk install

Close any running instances of Iceweasel, then try starting up your new browser:

$ /opt/firefox/bin/firefox

Later on: getting Firefox updates

Suppose that two weeks from now there is a security update release, Firefox 3.5.1. Since you already have a copy of the Mozilla source repository on your computer and you’ve already completed a full build, downloading and building any patch releases will take significantly less time. For a hypothetical Firefox 3.5.1 build, you would do:

$ hg pull
$ hg checkout -r FIREFOX_3_5_1_RELEASE
$ make -f client.mk build
$ sudo make -f client.mk install

And then restart your web browser.

Mobile LAN-oriented filtering in iptables

One of the things that I really like about pf, the OpenBSD firewall, is how it lets you define dynamic packet filtering rules — rules that filter based on your network interfaces’ current addresses at the time of filtering. For instance, if I want to allow SSH connections to my laptop only from my local network:

pass in on xl0 inet proto tcp from (xl0:network) to any \
    port ssh flags S/SFRA

(xl0:network) is not resolved to a specific address block at configuration load time; if you switch networks — say, if you go from home to work — the rule’s behavior will change accordingly.

Unless I have overlooked some recent change in Linux, this cannot be achieved in a direct fashion with iptables. You can insert a rule to reject non-LAN source addresses, but such a rule is static. When you change network addresses, the rule must be explicitly updated.
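For comparison, the closest static iptables equivalent bakes the source network into the rule at creation time (the 192.168.1.0/24 block below is a hypothetical example of a current LAN), and has to be deleted and re-added by hand after every move:

```shell
# Static iptables approximation of the pf rule above; unlike (xl0:network),
# the source block is fixed when the rule is created, not evaluated at
# match time.  (Run as root; 192.168.1.0/24 is a hypothetical current LAN.)
iptables -A INPUT -i eth0 -s 192.168.1.0/24 -p tcp --dport 22 \
    -m state --state NEW -j ACCEPT
```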

In lieu of rewriting all of netfilter to accommodate this use case (*cough*), I wrote a shell script to mitigate the pain of manually updating my laptop’s firewall rules. It’s merely a shortcut to cut down on the amount of typing I do on any given day, but if you tend to move around as much as I do, all those keystrokes can add up :) With this script you can, in one fell swoop, start and open up global access to an SSH server:

# ssh-serve any

Or only allow local access from the networks you’re connected to:

# ssh-serve lan

Or only local network access on a specific interface:

# ssh-serve lan eth0

Or only access from a given set of IP addresses and/or CIDR blocks:

# ssh-serve addr 192.168.0.104 10.18.0.0/16

Better yet, you can make the whole process automagical by hooking into your Linux distribution’s DHCP client. For instance, in Ubuntu Hardy Heron you can automate ssh-serve by creating a file /etc/dhcp3/dhclient-exit-hooks.d/ssh-serve:

# Allow SSH access from your local network only, and keep these filter
# rules up-to-date as you move from one network to another.
case $reason in
    BOUND|REBIND|REBOOT)
        ssh-serve lan
        ;;
esac

This script was written (and named) with Secure Shell in mind, but it could just as easily govern any other service controlled by a standard SysV init script. See below for the code…
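Repurposing it for another service only requires changing the configuration variables at the top of the script; for instance, a hypothetical rsync variant (assuming you’ve created a matching filter chain whose INPUT jump rule matches TCP port 873 instead of 22) might look like:

```shell
# Hypothetical reconfiguration of the script's variables to guard rsync
# (TCP 873) instead of SSH; chain and init-script names are assumptions.
SSH_CHAIN=RSYNC_ACTION
JUMP_ACCEPT=ACCEPT
JUMP_REJECT=REJECT
SSH_INIT_SCRIPT=/etc/init.d/rsync
```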

#!/bin/sh
# ssh-serve - Manage SSH server status and IP-based access control.
#
# Use this script to manage the state of the system's OpenSSH server, and
# the iptables rules allowing or denying remote access to it, in one fell
# swoop.  Synopsis:
#
# ssh-serve any
#   Start the server and allow access from anywhere.
#
# ssh-serve lan ( <iface-name> )*
#   Start the server and restrict access to clients connecting from
#   local network addresses on one of any number of the computer's network
#   interfaces.  If no interface names are specified, then all non-loopback
#   interfaces listed by ifconfig will be provisioned for.
#
# ssh-serve addr ( <cidr> )+
#   Start the server and restrict access to clients connecting from one of
#   any number of specified IP addresses or CIDR blocks.
#
# ssh-serve off
#   Shut down the SSH server and close off access in the firewall.
#
# In order to use this, you will need to set up a separate iptables INPUT
# chain to which this script has exclusive write access; it will overwrite
# any other rules that may exist in its chain.  For example, you could do
# the following:
#
# $ sudo iptables -N SSH_ACTION
# $ sudo iptables -A SSH_ACTION -j REJECT
# $ sudo iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
#   -j SSH_ACTION
#
# Then set the variable SSH_CHAIN to 'SSH_ACTION' in the configuration
# section below.
#
# Mark Shroyer
# Tue Sep 23 17:11:25 EDT 2008

### BEGIN CONFIGURATION ###################################################

# SSH connection logic iptables chain.  Only new, TCP port 22 connections
# in the INPUT table should be jumped to this chain.  WARNING: Any
# pre-existing rules in this chain will be overwritten by this script.
SSH_CHAIN=SSH_ACTION

# Jump target for accepted connections (typically ACCEPT)
JUMP_ACCEPT=SSH_ACCEPT

# Jump target for rejected connections (typically DROP or REJECT)
JUMP_REJECT=REJECT

# Where is our ssh init script?
SSH_INIT_SCRIPT=/etc/init.d/ssh

# Where is our iptables?
IPTABLES=iptables

# Where is our ifconfig?
IFCONFIG=ifconfig

### END CONFIGURATION #####################################################

run() {
    echo $@
    $@
}

usage() {
    echo "Usage: $0 ( any | lan (<iface-name>)* | addr (<cidr>)+ | off )"
    exit 1
}

mask_to_cidr() {
    mask=$( echo "$1" | tr . ' ' )
    sum=0
    error=0
    for part in $mask
    do
        case $part in
            255) sum=$(( $sum+8 )) ;;
            254) sum=$(( $sum+7 )) ;;
            252) sum=$(( $sum+6 )) ;;
            248) sum=$(( $sum+5 )) ;;
            240) sum=$(( $sum+4 )) ;;
            224) sum=$(( $sum+3 )) ;;
            192) sum=$(( $sum+2 )) ;;
            128) sum=$(( $sum+1 )) ;;
            0)   sum=$(( $sum )) ;;
            *)   error=1 ;;
        esac
    done
    if [ $error -eq 0 ]
    then
        echo -n $sum
    else
        return 1
    fi
}

ssh_serve_off() {
    run $SSH_INIT_SCRIPT stop
    run $IPTABLES -F $SSH_CHAIN
    run $IPTABLES -A $SSH_CHAIN -j $JUMP_REJECT
}

ssh_serve_lan() {
    ifaces=$( $IFCONFIG | awk '
        /^[a-zA-Z0-9]+/ {
            if ( $1 !~ /^lo[0-9]*$/ ) {
                printf "%s ", $1;
            }
        }
    ' )
    ifaddrs=''
    for iface in $ifaces
    do
        info=$( $IFCONFIG $iface | awk '
            /inet addr:/    { info=$0; }
            /UP/            { up=1; }
            /RUNNING/       { running=1; }
            END             {
                if ( up && running ) {
                    printf info;
                }
            }
        ' )
        # Skip interfaces not named on the command line, if any were given
        if [ $# -gt 0 ]
        then
            if ! echo " $@ " | grep -q " $iface "
            then
                continue
            fi
        fi
        if [ ! -z "$info" ]
        then
            vals=$( echo "${info}" \
                | sed -e 's/Bcast:[^\ ]\+//g' -e 's/[a-zA-Z]\+:\?//g' )
            addr=$( echo "${vals}" | awk '{ printf "%s", $1; }' )
            mask=$( echo "${vals}" | awk '{ printf "%s", $2; }' )
        fi
        if [ \( ! -z "$addr" \) -a \( ! -z "$mask" \) -a \( ! -z "$info" \) ]
        then
            cidr=$( mask_to_cidr $mask )
            if [ $? -ne 0 ]
            then
                echo "CIDR conversion error on netmask ${mask}.  Aborting."
                exit 1
            fi
            ifaddrs="${ifaddrs}${iface}:${addr}/${cidr} "
        fi
    done
    run $IPTABLES -F $SSH_CHAIN
    for ifaddr in $ifaddrs
    do
        iface=$( echo ${ifaddr} | awk 'BEGIN { FS=":"; } { printf "%s", $1 }' )
        addr=$( echo ${ifaddr} | awk 'BEGIN { FS=":"; } { printf "%s", $2 }' )
        run $IPTABLES -A $SSH_CHAIN -i $iface -s $addr -j $JUMP_ACCEPT
    done
    run $IPTABLES -A $SSH_CHAIN -j $JUMP_REJECT
    run $SSH_INIT_SCRIPT start
}

ssh_serve_addr() {
    if [ $# -lt 1 ]
    then
        usage
    fi
    run $IPTABLES -F $SSH_CHAIN
    for addr in "$@"
    do
        run $IPTABLES -A $SSH_CHAIN -s "$addr" -j $JUMP_ACCEPT
    done
    run $IPTABLES -A $SSH_CHAIN -j $JUMP_REJECT
    run $SSH_INIT_SCRIPT start
}

ssh_serve_any() {
    run $IPTABLES -F $SSH_CHAIN
    run $IPTABLES -A $SSH_CHAIN -j $JUMP_ACCEPT
    run $SSH_INIT_SCRIPT start
}

command="$1"
if [ $# -gt 0 ]
then
    shift
fi

case "$command" in
    off|lan|addr|any)
        ssh_serve_${command} $@
        ;;
    *)
        usage
        ;;
esac
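As an aside, the netmask-to-prefix-length conversion is the only mildly tricky part of the script. Here is that mask_to_cidr logic pulled out on its own — the same per-octet bit-counting as above — so you can sanity-check it without touching iptables:

```shell
#!/bin/sh
# Standalone copy of ssh-serve's mask_to_cidr: sums the number of set bits
# in each dotted-quad octet to produce a CIDR prefix length.
mask_to_cidr() {
    sum=0
    error=0
    for part in $( echo "$1" | tr . ' ' )
    do
        case $part in
            255) sum=$(( $sum+8 )) ;;
            254) sum=$(( $sum+7 )) ;;
            252) sum=$(( $sum+6 )) ;;
            248) sum=$(( $sum+5 )) ;;
            240) sum=$(( $sum+4 )) ;;
            224) sum=$(( $sum+3 )) ;;
            192) sum=$(( $sum+2 )) ;;
            128) sum=$(( $sum+1 )) ;;
            0)   ;;
            *)   error=1 ;;
        esac
    done
    [ $error -eq 0 ] || return 1
    echo $sum
}

mask_to_cidr 255.255.255.0   # prints 24
mask_to_cidr 255.255.254.0   # prints 23
mask_to_cidr 255.0.0.0       # prints 8
```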

Chrome

This evening I had the chance to download Google’s newly-released (and by “released” I mean “beta”… hey, it’s Google) web browser, Chrome, and give it a try. They weren’t kidding when they said V8, the new JavaScript virtual machine in Chrome, should raise the bar for next-generation JavaScript implementations: it’s fast. How fast?

[Chart: Dromaeo benchmark scores for Chrome, Firefox 3.0.1, IE 7, and the IE 8 beta]

The above results are from Mozilla’s Dromaeo JavaScript performance test suite, so there’s little worry of this test being intentionally biased in Chrome’s favor. The scores above are the averages of five test executions on each web browser, running in the same Windows XP virtual machine on the same computer. Some notes:

  • Each run of the test was performed in a fresh browser instance.
  • IE 7 was unable to complete the test suite without crashing, although I am using a special, standalone version of IE 7 so this may be particular to my installation.
  • In order to prevent IE 8 from complaining about the long JavaScript execution time, I set the registry value MaxScriptStatements = (DWORD) 0xffffffff in the key \HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Styles.

So yes, Chrome does in fact have a much faster JavaScript engine than any other current web browser: in this test, more than eight times as fast as Firefox 3.0.1’s engine, and more than thirty times as fast as the latest beta of Internet Explorer 8. But how does the rest of the user experience stack up?

I want to love Chrome, I really do. Although currently a Firefox user, I am a huge fanboy of the KHTML / WebKit rendering engine due to its speed and superior standards compliance, and I was thrilled to see it put to good use as Chrome’s HTML renderer.

But as of yet, the user interface is far too constricting to make this a good general-purpose web browser. Here are some things that one cannot yet do in Chrome:

  • Manage cookie and scripting settings on a per-domain basis…
  • …or heck, disable JavaScript and plugins at all.
  • Synchronize one’s bookmarks with copies of Chrome on other computers, à la Foxmarks or Opera Sync.
  • Interactively inspect a web page’s DOM as with Firefox’s Firebug, or Opera’s Dragonfly.

The dearth of advanced features may be a real gotcha here: Chrome lacks both Firefox’s infinite extensibility and Opera’s rich built-in feature set, so power users spoiled by Opera or Firefox may never be satisfied with Google’s new browser, no matter how well it performs.

But even those of us with no interest in using Chrome itself stand to benefit from it in the long run. My hope is that Mozilla and others will take the best ideas in Chrome — most notably, V8’s performance optimizations and the browser’s comprehensive sandboxing model — and adopt them for future releases of their own web browsers. That way, we all win.

Patch for segfault in OpenBSD 4.3’s pfctl

A couple of months ago, I upgraded an old PowerPC machine from OpenBSD 4.2 to 4.3, and I discovered that the new version of pfctl in 4.3 would segfault when reading my old pf.conf file. Some brief poking around with GDB revealed the root of the problem, an uninitialized variable in the new configuration file parser.

If you’ve been bitten by this as well, here’s a patch with the minor change that solved the problem for me:

--- sbin/pfctl/parse.y  Sat Feb 23 15:31:08 2008
+++ sbin/pfctl/parse.y  Thu May 15 08:55:38 2008
@@ -3487,9 +3487,11 @@
 qname		: QUEUE STRING				{
 			$$.qname = $2;
+			$$.pqname = NULL;
 		}
 		| QUEUE '(' STRING ')'			{
 			$$.qname = $3;
+			$$.pqname = NULL;
 		}
 		| QUEUE '(' STRING comma STRING ')'	{
 			$$.qname = $3;

To apply this patch, perform the following (assuming that you have the OpenBSD 4.3 source code tree at /usr/src on your system):

# cd /usr/src
# patch -p0 </path/to/above/patch
# cd sbin/pfctl
# make && make install

My ISP blocks outbound SMTP traffic, unfortunately, and I didn’t feel like setting up Sendmail relaying just so I could submit a sendbug report, so I couldn’t open a ticket for the bug. I did send this patch to the bugs@ mailing list, but it failed to generate any interest there; if someone stumbles across this who has a functional sendbug on their system, I’d be grateful if you could submit this patch in a proper bug report.

The segmentation fault doesn’t occur on the i386 port of OpenBSD (as far as I can tell), nor does it occur on the macppc port unless you use the “queue ( qname, pqname )” ALTQ syntax, so it’s easy to see why the hordes aren’t exactly beating down the OpenBSD folks’ doors about this one. So I figured I should post this here, where people might find it, until someone gets around to committing an official fix.