Wellfire Interactive // Expertise for established Django SaaS applications

This Old Pony: SELECT * FROM DATABASE_LOCKIN;

After a week off, we’re back to wrap up the topic of “lock-in” again this week.

Oh, congratulations! You have a perfectly abstracted application with swappable payment systems and an elegantly modular architecture such that you can make changes willy-nilly. Not only that but it’s super fast because you’ve managed to identify key operations which could be moved to database stored procedures… 

And there’s the rub.
 

Your database and Roman chariots

You’ve probably heard the backstory about the space shuttle boosters and how they’re sized based on the width of horse posteriors. If not it goes like this in brief: the size of these rockets was based on the need to travel through mountain tunnels sized for trains which run on train tracks, and track gauge was sized for carriage wheels which in turn were sized for ruts which were created by Roman chariots which were wide enough to be pulled by two horses side-by-side.

Chariots of fire

It’s a just-so story but it illustrates a point: there’s a lot of path dependence based on infrastructure decisions.

Your application doesn’t run on rails (I mean, c’mon!). Instead backing systems, specifically databases, including search engines, caches, and file systems are the main infrastructural components in your Django app.
 

Beyond the veil of abstraction

The nice thing about using an ORM like Django’s is that it provides a layer of abstraction between your application and the database. This allows reusable Django apps to be included in projects regardless of the project’s database of choice, and even for individual Django projects to switch databases.

However there are always features that don’t fit into this shared abstraction layer and once you head down this path that you’re accepting lock-in.

Again, this isn’t necessarily bad! But it helps to understand the scope of the benefits, the tradeoffs, and the anticipated transition costs. ##
The forms of database lock-in There are two obvious ways of introducing database lock-in into your application:

  1. Using unique databases or database features
  2. Directly accessing database features without using an application abstraction layer

The former might be supported by Django (as we’ll see below) or it could mean relying on some new ancillary backing service, like a specialized time series database[0]. A less typical but equally significant example is pushing application logic to the database, in the form of triggers and stored procedures.

The latter typically looks like raw SQL queries. Given the subtle differences in SQL between databases this usually points to using a single database or family of databases for the project lifetime.
 

The geospatial database (a short example)

The functionality required of most Django applications use a pretty narrow set of functionality that can be ported across databases, from MySQL to Sqlite (no, really). There’s always an exception.

What happens when you want to start adding geospatial data? If you’re adding latitude and longitude pairs you can get away adding a couple of numeric columns. But for indexing and creating more complex data like lines and shapes, you’ll need to use a geospatial module.

There are several choices, but most signs point to PostGIS, the PostgreSQL module for supporting geospatial data types and indexes. It’s considered among the best geospatial databases and is the only geospatial DB to support all of Django’s spatial lookups[1]. 

The upshot is that if you move to PostGIS and start using these features in the database, you risk getting locked-in unless you want to abandon a lot of functionality. Or if you’ve committed to a different database you’re now locked-_out _of a lot of functionality.
 

So what?

“I don’t want to switch databases, stored procedures are faster than application code, and I can do everything with standard SQL.”

Great! For most applications database choice is a simple one-and-done affair. But these decisions always set you down a path that can be very hard to deviate from. You’ll be using those features, and hooked onto that release schedule, for a long time (that’s the goal!).

SELECT * FROM,
Ben

 
[0] As an example https://github.com/influxdata/influxdb
[1] GeoDjango compatability tables https://docs.djangoproject.com/en/1.11/ref/contrib/gis/db-api/#spatial-lookup-compatibility

Learn from more articles like this how to make the most out of your existing Django site.