Wellfire Interactive // Expertise for established Django SaaS applications

Faster, better, happier Django development with project onboarding (This Old Pony #41)

Chances are you’re already familiar with the term user onboarding. In brief, user onboarding is how new users are brought on board to new applications, usually web-based SaaS apps, including the visual, process, and communication design before, during, and after. A well or poorly designed onboarding experience can significantly affect a customer’s perceptions of your app and as a result what people in the SaaS “industry” refer to as churn.[0]

But that’s not what I want to talk with you about today.

Today we’re going to discuss project onboarding, which is how developers get started with and familiarize themselves with your application. It doesn’t have the outsized - and obvious - business impact that user onboarding does, and yet it still represents real, material costs.

Depending on how complex your project is and how much developer turnover or addition you have, sub-optimal project setup could cost anything from minor headaches to tens of thousands of dollars in cumulative lost developer time.

What costly project onboarding looks like

  • “It says psycopg2 needs to be installed… ?”
  • “I’m not sure which directory is the one to use”
  • “Can I borrow the AWS credentials to use S3 for uploads?”
  • “Yep, got started Monday, should have the app running by the end of the week” Wat.

Most people expect to need a little setup time with a new app, of course. The problem is that this compounds every time a developer starts on an application, including the very first time, setting up on a new computer, and sometimes even restarting after a hiatus.

The half-baked prototype app that only ever ran on SQLite with no dependencies took little time at all to get set up. The half million LOC monolith took a couple of days for a small team, not including the time to configure the dedicated computer to access the VPN. That’s time, money, and persistent developer frustration.

The triangle of documentation, configuration, and automation

It all comes down to the right optimization of documentation, automation, and configuration.

Every project will need at least some documentation to get started, even if it’s just a list of a few commands. And in many cases, documentation by itself will do the trick. This includes what Python requirements the project has or where to find them, what service dependencies the project has and how to run them, how to configure third party services, how to change local settings, etc.

Configuration here refers to those settings, including what can and cannot be changed. This includes things like database names, logging levels, and various backends (e.g. email). Some projects present everything completely statically without even any modularity by environment, and others still require everything be defined locally.

For any but the simplest projects, automation ties together the prose of documentation with the data of configuration to ensure that it’s easy to follow the steps time and time again and get the project up and into the same state, every time. This can range from scripts, including shell scripts and Fabric files, to configuration management tooling like Ansible, to some form of virtualization or containerization (e.g. Vagrant or Docker).

Pick three

Unlike the iron triangle, “good, fast, or cheap, pick two”, the titanium triangle lets us pick all three. The key point is optimizing how much effort you put into each.

Your starting point should always be documentation. Absent any kind of automation, start by describing and writing the steps necessary to get the project installed and running - including tests! Someone should be able to follow the instructions without any other help and get the app running. E.g.

  1. Create a new virtual environment using Python 3.5
  2. Install project requirements, pip install -r requirements.txt
  3. Create a new PostgreSQL database, psql -c 'CREATE DATABASE myapp;'
  4. Migrate the database, ./manage.py migrate
  5. Set up some sample data using the sample script, ./manage.py seed_data

And remember: it’s better to have complex, exact documentation that’s correct than an automated system that’s wrong.

Once the documentation is accurate you can and should start automating. This can be as simple as a script that executes the exact steps like above. And you should quickly start thinking about using a VM or containers if you need more than one or two external services or processes, or if your team has heterogeneous workstations.
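As a minimal sketch of such a script, assuming the five example steps listed earlier: the file name bootstrap.py, the .venv directory, the POSIX-style paths, and the --run flag are all illustrative assumptions, not part of any real project. The steps are kept as data so the script mirrors the written instructions line for line, and by default it only prints what it would do.

```python
#!/usr/bin/env python3
"""bootstrap.py: run the documented setup steps in order (illustrative sketch)."""
import shlex
import subprocess
import sys

# Each entry mirrors one line of the written setup instructions.
# Paths assume a POSIX-style virtual environment in ./.venv (an assumption).
STEPS = [
    ("Create a virtual environment", ["python3", "-m", "venv", ".venv"]),
    ("Install project requirements", [".venv/bin/pip", "install", "-r", "requirements.txt"]),
    ("Create the PostgreSQL database", ["psql", "-c", "CREATE DATABASE myapp;"]),
    ("Migrate the database", [".venv/bin/python", "manage.py", "migrate"]),
    ("Load sample data", [".venv/bin/python", "manage.py", "seed_data"]),
]


def run_steps(steps, dry_run=True):
    """Run each step in order, stopping at the first failure.

    Returns the shell-quoted command lines that were run (or would be run).
    """
    executed = []
    for description, command in steps:
        line = " ".join(shlex.quote(part) for part in command)
        print("==> {}: {}".format(description, line))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(command, check=True)
        executed.append(line)
    return executed


if __name__ == "__main__":
    # Safe default: only print the commands unless --run is passed explicitly.
    run_steps(STEPS, dry_run="--run" not in sys.argv[1:])
```

Running python3 bootstrap.py prints the commands; python3 bootstrap.py --run executes them. As written, the sketch is not safe to rerun (creating the database fails the second time), so a real script would probably check for existing state first.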

I saved configuration for last because it’s the most complicated. Or more accurately, lends itself to the most bike shedding. There’s a lot of room to try to decide how to set up your settings and what should work how in a given environment.

The keys to good configuration are knowing which settings to make configurable, picking sane and safe defaults, and deciding which defaults to use for specific environments.

A sane default is one that makes sense if you don’t have anything specified. Using the Django template system is a sane default, as is having DEBUG enabled. A safe default is one that keeps you out of trouble if you forget to set it, even if it causes you a little trouble. Requiring that SECRET_KEY be specified for each individual deployment is a safe default; disabling DEBUG - especially in production - is a safe default. Sane and safe defaults are not necessarily the same.
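One way to express safe defaults in code, as a rough sketch: the environment variable names DJANGO_SECRET_KEY and DJANGO_DEBUG and the helpers MissingSetting, env_bool, and load_settings are hypothetical names chosen for illustration, not Django APIs. The idea is that a forgotten SECRET_KEY stops the app at startup, and a forgotten DEBUG flag leaves debugging off.

```python
import os


class MissingSetting(Exception):
    """Raised when a required setting has no value (hypothetical helper)."""


def env_bool(environ, name, default):
    """Read a boolean flag from an environment-style mapping."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def load_settings(environ=os.environ):
    """Build core settings, failing loudly instead of guessing."""
    secret_key = environ.get("DJANGO_SECRET_KEY")
    if not secret_key:
        # Safe default: no fallback key, so every deployment must set its own.
        raise MissingSetting("Set DJANGO_SECRET_KEY for this deployment")
    return {
        "SECRET_KEY": secret_key,
        # Safe default: DEBUG stays off unless someone turns it on explicitly.
        "DEBUG": env_bool(environ, "DJANGO_DEBUG", default=False),
    }
```

In a settings module the returned values would be assigned to SECRET_KEY and DEBUG. A local development environment could opt in to the sane default by exporting DJANGO_DEBUG=true.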

A good measurement of defaults is how many individual settings someone is required to fill out when they start up the app and whether it can be sensibly run without extensive individual settings. Developer friendly defaults do not require using S3 or other cloud storage for local development, for example; they use the console email backend, the dummy caching backend, etc. Again, if you’re going to start relying on more services all the time then you should give serious consideration to running your app across VMs or containers with maintained Ansible playbooks (or whatever flavor of automated configuration management you prefer).
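For illustration, developer friendly defaults along those lines might look like the following settings fragment. The backend paths are Django’s built-in console email and dummy cache backends; the module name settings/local.py and the media directory are assumptions made for the example.

```python
# settings/local.py (hypothetical module name): defaults for local development.
import os

# Print outgoing email to the console instead of sending it.
EMAIL_BACKEND = "django.core.mail.backends.console.EmailBackend"

# Accept cache calls without storing anything, so no cache server is needed.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.dummy.DummyCache",
    }
}

# Keep uploads on the local filesystem, which Django uses unless told
# otherwise, rather than S3. "media" is an illustrative directory name.
MEDIA_ROOT = os.path.abspath("media")
```

With defaults like these a new developer needs no AWS credentials, mail server, or cache service just to get the app running.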

Make it work remotely

My unscientific observation is that distributed teams are better at this for reasons of necessity. When you can’t sit beside a coworker for an afternoon to guide them, you need to anticipate problems.

One problem we worked on recently was taking a Django app that runs in AWS and moving it so that a ministry of health can run it on their own server. The app needed to be coupled with another Django app and a PHP app, too, and set up so that the systems administrators can get it going without back and forth across five time zones. Now it’s slightly different since the primary target is deployment rather than development, but it’s a forcing function of the same flavor. All of the configuration values need defaults and documentation, and the automation process allows any developer to generate a working and fully configured VM without having to follow the detailed instructions.

Idempotently yours,

[0] Check out Samuel Hulick’s epic onboarding teardowns
