Wellfire Interactive // Expertise for established Django SaaS applications

Content migration: site links and SEO

Protecting valuable links for users and search engines is an important step in any site migration

Once you’ve migrated your content to a brand new content management system, you may very well end up saving your valuable content but hiding it from hungry users. You finally got rid of those ugly parameter-based links (you know, http://mysite.com/?product=espressomakers), but two of those old links were top hits on Google, and who knows how many inbound links from other sites point at them? You don’t control those.

There are a couple of solutions for this problem. The first and preferred method is to redirect all of the old links to the new locations. The second is to maintain a parallel and linked copy of the old content. More on that later.

If your site has a manageable number of pages (let’s say 2-3 dozen for now, excluding a blog), you could map each of those pages to its new home, if it has one. What you’ll need to do is add some rewrite rules to your Apache .htaccess file that create permanent redirects from the old locations to the new ones. Note: you could do this elsewhere, too, e.g. by using HttpResponsePermanentRedirect in a Django-based system. Whatever you do, you want to set up permanent redirects. We’ll just use .htaccess here.
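For comparison, here is a minimal sketch of what the application-side lookup might look like in plain Python. The REDIRECTS table and the new_location function are hypothetical names used only for illustration, and the second mapping entry is made up. In a Django view, the returned path is what you would hand to HttpResponsePermanentRedirect.

```python
from urllib.parse import urlsplit

# Hypothetical table: (old path, old query string) -> new path.
REDIRECTS = {
    ("/", "product=espressomakers"): "/products/espressomakers",
    ("/", "product=grinders"): "/products/grinders",  # made-up second entry
}


def new_location(old_url):
    """Return the new path for an old URL, or None if it has no new home."""
    parts = urlsplit(old_url)
    # Treat an empty path the same as the site root.
    return REDIRECTS.get((parts.path or "/", parts.query))


print(new_location("http://mysite.com/?product=espressomakers"))
```

Keying on both the path and the query string matters here, because the old links differ only in their parameters.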

For our earlier example link, you might set up a redirect like so:

RewriteCond %{QUERY_STRING} ^product=(espressomakers)$
RewriteRule ^(.*)$ http://mysite.com/products/espressomakers? [R=301,L]

This rule consists of two lines. Both are necessary because a RewriteRule pattern is matched only against the URL path and ignores the query string, so the RewriteCond line has to test the query string separately. If we were redirecting a file name or a path without a query string, it might look more like this:

RewriteRule ^espressomakers\.html$ http://mysite.com/products/espressomakers [R=301,L]

Notice we removed that trailing question mark from the new URL. In the first rule, the question mark tells Apache not to append the original query string to the new URL. Without it, the old query string is carried over to the new location, which might be useful in some cases. In both rules the R=301 flag makes Apache respond with an HTTP 301 status code, which means a permanent redirect, and the L flag tells Apache that this is the last rule to evaluate. No need to keep checking other rewrite rules if we don’t need to.
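If you have a few dozen pages to map, writing each rule by hand gets tedious. The following is a rough sketch, assuming Python 3.7 or later, of how the two rule shapes above could be generated from pairs of old and new URLs. The rewrite_rules function is a hypothetical helper, not part of Apache or any library.

```python
import re
from urllib.parse import urlsplit


def rewrite_rules(old_url, new_url):
    """Return the .htaccess lines that permanently redirect old_url to new_url."""
    parts = urlsplit(old_url)
    # In .htaccess the pattern is matched against the path without its leading slash.
    path = re.escape(parts.path.lstrip("/"))
    if parts.query:
        # Old URL has a query string: test it in a RewriteCond and end the
        # target with "?" so the old query string is not carried over.
        return [
            f"RewriteCond %{{QUERY_STRING}} ^{re.escape(parts.query)}$",
            f"RewriteRule ^{path}$ {new_url}? [R=301,L]",
        ]
    return [f"RewriteRule ^{path}$ {new_url} [R=301,L]"]


for line in rewrite_rules(
    "http://mysite.com/?product=espressomakers",
    "http://mysite.com/products/espressomakers",
):
    print(line)
```

Unlike the hand-written rule above, which matches any path with ^(.*)$, this sketch anchors the pattern to the exact old path, so a query string on some other page will not trigger the redirect.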

You have two options: map every URL, or map only the ones that matter most. If you have a lot of pages to map, and the URLs don’t follow a consistent pattern that can be applied as a rule, you might not care to map them all. What you will want to do at the very least is map the pages that serve as popular landing pages. You can discover these through your analytics reporting (which hopefully you were able to set up before migration) or by using Google Webmaster Tools (now Google Search Console). The latter will show you the top queries to your site (from which you can find those key pages) as well as the pages receiving lots of inbound links. Those are the pages that you need to redirect.
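As a rough illustration of that triage, the sketch below ranks pages from an analytics export and keeps only the busiest ones. The CSV layout, the column names page and entrances, and the sample figures are all assumptions made for this example; a real export will look different.

```python
import csv
import io


def top_landing_pages(report_csv, limit=10):
    """Return the old URLs with the most entrances, busiest first."""
    rows = list(csv.DictReader(io.StringIO(report_csv)))
    rows.sort(key=lambda row: int(row["entrances"]), reverse=True)
    return [row["page"] for row in rows[:limit]]


# Made-up sample data in the assumed format.
SAMPLE_REPORT = """page,entrances
/?product=espressomakers,1200
/?product=grinders,950
/?page=about,40
"""

for page in top_landing_pages(SAMPLE_REPORT, limit=2):
    print(page)
```

The pages this returns are the ones to map first. Everything below the cutoff can wait or go unmapped.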