Wellfire Interactive // Expertise for established Django SaaS applications

The Old Pony: But first the backups, or losing data

This week’s tip isn’t Django specific, but it’s important to every Django project nonetheless. When you approach an existing production app for the first time, what’s one of the first things you should do before starting new feature development? Run the tests? Well, yes, but also make sure there is a database backup strategy in place and that it’s tested!

Surely this is old hat, but it can’t be repeated enough, especially for legacy (for any flavor of “legacy”) applications.

Let me share a story - a bit of a mea culpa - that should drive home the importance of this and illustrate what I mean by strategy.

We were building one of the first SaaS-style applications we worked on for a customer; this was back in the Django 1.1 days, mind you. A partner was managing the hosting on co-located machines - this was before cloud computing became the de facto choice - using a production server and a staging server that shared a networked drive mounted for PostgreSQL’s file storage. The idea was to allow reasonably easy failover to the staging machine if the production machine went down. This is a big no-no, as we were soon to discover.

A few weeks into running the application with our customer’s own “beta” customer, the drive’s network connection went down, which in turn resulted in a corrupted database. As in, entirely unusable. In tandem with our partner we raced to identify solutions for bringing it back up, only to learn later from a PostgreSQL core developer that what happened meant the database was totally, completely, and utterly hosed: “Hope you have good backups.”

We had made occasional manual backups for development, but our partner had taken responsibility for setting up automated backups. Turns out they got busy… and there were no other backups. With our development backups and by replaying stored copies of incoming API data we were able to restore a substantial amount of the data, but not all of it. Needless to say, we made sure to set up backups ourselves from then onwards.

So that’s the story, but what did I mean about “strategy”? Well, you have a number of decisions to make, like what kind of backup to make, the frequency of backup, how to restore, and also what other data to keep around. Much of the core and most valuable data in the application in question was sent over API to our customer’s application, and we made the wise decision to maintain copies of all of this data in file format. This turned out to be helpful later on for other reasons (more about that in another email), and it allowed us to fill some of the gaps between the last backup and the failure state by scripting the data back into the system. That should not be your primary backup strategy(!) but it’s a useful backup.
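To make the “keep copies in file format” idea concrete, here is a minimal sketch of what archiving and replaying incoming API payloads could look like. It is not the code we ran back then; the function names (`archive_payload`, `replay_since`), the one-JSON-file-per-payload layout, and the `apply` callback are all assumptions made for illustration.

```python
import json
import time
from pathlib import Path


def archive_payload(archive_dir, payload, received_at=None):
    """Write one incoming API payload to its own JSON file and return the path."""
    received_at = time.time() if received_at is None else received_at
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Zero-padded microsecond timestamp keeps filenames sortable by arrival time.
    # (Hypothetical naming scheme; two payloads in the same microsecond would collide.)
    name = f"{int(received_at * 1_000_000):020d}.json"
    path = archive_dir / name
    path.write_text(json.dumps({"received_at": received_at, "payload": payload}))
    return path


def replay_since(archive_dir, last_backup_at, apply):
    """Re-apply every archived payload received after the last good backup."""
    replayed = 0
    for path in sorted(Path(archive_dir).glob("*.json")):
        record = json.loads(path.read_text())
        if record["received_at"] > last_backup_at:
            # `apply` is whatever your application uses to ingest a payload.
            apply(record["payload"])
            replayed += 1
    return replayed
```

After restoring the most recent database backup, you would call `replay_since` with that backup’s timestamp and your normal ingestion function. This only works cleanly if ingesting the same payload twice is harmless, which is worth checking before you need it.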

Make a backup! Ensure it’s automated! And test periodically that you can bring your database back from whatever backups you make.
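As a rough sketch of what “automated and tested” might look like for PostgreSQL, the script below dumps a database with `pg_dump` and then immediately restores that dump into a scratch database with `pg_restore`. The database names, the backup directory, and the function names are hypothetical, and it assumes the PostgreSQL client tools are on the path and that something like cron runs it on a schedule.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def dump_command(db_name, backup_dir, now=None):
    """Build a pg_dump command that writes a timestamped custom-format archive."""
    now = now or datetime.now(timezone.utc)
    out_file = Path(backup_dir) / f"{db_name}-{now:%Y%m%dT%H%M%SZ}.dump"
    return ["pg_dump", "--format=custom", f"--file={out_file}", db_name], out_file


def restore_commands(dump_file, scratch_db):
    """Build the commands that restore a dump into a throwaway database."""
    return [
        ["dropdb", "--if-exists", scratch_db],
        ["createdb", scratch_db],
        ["pg_restore", "--no-owner", f"--dbname={scratch_db}", str(dump_file)],
    ]


def backup_and_verify(db_name, backup_dir, scratch_db, run=subprocess.run):
    """Dump the database, then check that the dump restores into a scratch database."""
    command, out_file = dump_command(db_name, backup_dir)
    # check=True raises CalledProcessError if any step exits non-zero.
    run(command, check=True)
    for command in restore_commands(out_file, scratch_db):
        run(command, check=True)
    return out_file
```

A successful restore only tells you the dump is readable. Running a sanity query against the scratch database, and copying the dump file somewhere other than the database server, are sensible next steps.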

Safe restorations,
Ben
