Wellfire Interactive // Expertise for established Django SaaS applications

Robots in the Cloud, or, Automating Django with Build Systems (This Old Pony #83)

In the last edition I alluded to my deep love for that otherwise banal kitchen appliance, the dishwasher.

So in the next issue we’re going to examine how my favorite home appliances, like the dishwashing machine, are related to automation services (and some strategies and tips for working with them to test and deploy your Django application).

I’ve not always had a dishwasher available. It’s not really that big of a deal. If you develop a system with the right sink and the right drying setup then hand washing dishes isn’t that bad. But it’s tedious. And if you’re collecting dishes over the course of a day you either need to wash them throughout the day or let them pile up in the sink for later. Neither is particularly appealing. And in terms of pure time to completion, using a dishwasher is slower than washing by hand. I haven’t timed it, but I’m pretty confident our own dishwasher needs at least 2 hours to complete the full cleaning and drying cycle. We could wash several fold more dishes than that by hand in the same amount of time!

However, I can take a full dishwasher, start the cycle, and then do other things like start my day or even go to bed. And then later at a convenient time I just have to take the two and a half minutes to unload the dishwasher (I have timed this).

The dishwasher performs the task in parallel, no matter who runs it, using the same exact process each time, allowing me to focus on other things. Like writing a newsletter or trying to figure out how a handful of Cheerios ended up in the dog’s water bowl. What a mystery, indeed.

Build servers >>> dishwashers

I’m not going to make the argument that you can stop running tests locally once you have a continuous integration service in place running your tests on every check-in. However, most production test suites take more than a minute or two to run. During development you can focus on running only the tests related to the core feature you’re working on, while the CI server runs the full test suite when you push changes to ensure that you have not created any regressions in unrelated code or “downstream” uses.

It’s hopefully obvious, but this also ensures that the tests are always run, whether you remember to run them locally or not. This is a little fringe benefit if you’re working on your own, but it’s a benefit that scales up with each additional developer. You no longer need to worry if someone else ran the tests before pushing up their code or submitting a pull request; you can see the results for yourself.

This probably reads a little bit like an exercise in automated accountability, and to a degree it is, but it also reduces a significant element of psychological friction. That’s a benefit of any such automation. It removes the need to remember, and the regret of forgetting; it removes the need to nag and the problems that creates on teams; and it moves the “correction” away from another team member or team lead to the result of an impersonal rule system that simply says, “Here are your errors” without any risk of damaging an ego.

What to build

What should you have your system run? As much as you can.

Tests are the obvious choice. I’d add in deployment, too. And testing should include any other code quality checks including code linting and type hint checking, even spell checking (we’ve found it beneficial at least to introduce scripts looking for things like company or product name variations to ensure these are not misspelled in public facing code).
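A name-spelling check like the one mentioned above can be very small. The following is a minimal sketch in Python, not the script we actually used: the product name "AcmeCloud", its misspellings, and the *.html file pattern are all made up for illustration.

```python
import re
from pathlib import Path

# Hypothetical product name and the variants we never want to publish.
CORRECT_NAME = "AcmeCloud"
BAD_VARIANTS = re.compile(r"\b(Acme Cloud|Acmecloud|ACMECloud)\b")


def find_misspellings(text):
    """Return (line_number, matched_text) pairs for each bad variant found."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in BAD_VARIANTS.finditer(line):
            hits.append((number, match.group(0)))
    return hits


def check_tree(root, pattern="*.html"):
    """Scan matching files under root and return human-readable reports."""
    problems = []
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8")
        for number, found in find_misspellings(text):
            problems.append(
                f"{path}:{number}: found {found!r}, expected {CORRECT_NAME!r}"
            )
    return problems


def main(root="."):
    """Print any problems and return an exit code a CI step can use."""
    problems = check_tree(root)
    for problem in problems:
        print(problem)
    return 1 if problems else 0
```

A CI step would exit with the value returned by main(), so that any hit fails the build.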

There are a few ways to look at automating deployment. You could have all passing builds deployed directly to a single [production] environment. Or you could push them automatically to a staging environment and then promote them to production based on either an “eyeball” decision or some automated rules (e.g. successfully running a set of live-environment tests).

A Heroku deployment is among the simplest to include in a CI system since in its most basic form it just involves pushing to another Git repo. This still requires some configuration since you don’t want to include your username and password in the CI configuration to log into Heroku for Git SSH access.
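One common way around this, sketched here in Python, is to push over Heroku’s HTTP Git endpoint using an API key that the CI service injects as an environment variable. This is a substitute for the SSH route rather than a description of it, and the variable names HEROKU_APP_NAME and HEROKU_API_KEY are assumptions; use whatever names your CI settings define.

```python
import os


def heroku_push_command(app_name, api_key, branch="master"):
    """Build the argv for pushing the current checkout to a Heroku app."""
    # Heroku's HTTP Git endpoint takes the API key as the password;
    # the username part ("heroku") is only a placeholder.
    remote = f"https://heroku:{api_key}@git.heroku.com/{app_name}.git"
    return ["git", "push", remote, f"HEAD:refs/heads/{branch}"]


def command_from_environment(environ=os.environ):
    """Read credentials from CI-provided variables (names are assumptions)."""
    return heroku_push_command(
        environ["HEROKU_APP_NAME"],
        environ["HEROKU_API_KEY"],
    )
```

The resulting list can be handed to subprocess.run(). Keeping the key in the CI service’s settings rather than in the repository means it never appears in version control.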

If you’re deploying on a self-managed virtual server (or even “bare metal” as it were) you can deploy with Fabric or Ansible (or a host of other options) directly from your CI server. The critical thing here is to either create a system user for your CI system or give a deployment user a CI-system-specific SSH key.

An actual example

The following is an example adapted from an older client project. This is an out of date format from Circle CI[0] - version 2 of their config is container based and on the whole is more sensible, but this one is handy to show a few of the things you can include, and it’s slightly less verbose than version 2 configuration. We won’t dive too deep into the nuts and bolts of the format anyhow. And for context, this was for a large CMS site, so there was a significant amount of human review required for moving from staging to production.

 

# Build machine setup: timezone, Node version, environment variables
machine:
  timezone: America/New_York
  node:
    version: "6.1.0"
  environment:
    DEBUG: True
    DATABASE_URL: postgres://localhost/circle_test
    COMPRESS_ENABLED: "False"
    COMPRESS_OFFLINE: "False"
    THUMBNAIL_DEBUG: "False"

# Runtime, test, and deployment tooling
dependencies:
  override:
    - pip install -r requirements.txt
    - pip install -r requirements/test.txt
    - pip install ansible==2.4.0.0
    - npm install qa-screenshots -g

# Cheap checks first, then the full Django test suite
test:
  override:
    - CIRCLECI=True flake8 app
    - ./scripts/check-customer-name-spelling.sh
    - cd app && python manage.py test --settings=settings.testing
  # Build the docs and keep them as a dated tarball artifact
  post:
    - sphinx-build -b html docs $CIRCLE_ARTIFACTS/docs
    - tar -cvzf $CIRCLE_ARTIFACTS/docs.$(date +"%Y-%m-%d").tar.gz -C $CIRCLE_ARTIFACTS docs
    - rm -rf $CIRCLE_ARTIFACTS/docs

# Passing builds on master go to staging, followed by smoke tests and screenshots
deployment:
  staging:
    branch: master
    commands:
      - /bin/bash deploy.sh staging -vv
      - pyresttest http://$STAGING_USERNAME:$STAGING_PASSWORD@$STAGING_HOST scripts/postdeploy.yaml
      - qa-screenshots -c scripts/screenshots-staging.yaml -u $STAGING_USERNAME -p $STAGING_PASSWORD --path $CIRCLE_ARTIFACTS

 

The first thing you should be able to do in any CI system is configure your environment (and for what it’s worth, that’s a lot more straightforward and predictable if you’re configuring Docker containers, even just on the CI platform). This should include setting environment variables at some point.
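Environment variables always arrive in your application as strings, which matters for flags like DEBUG and COMPRESS_ENABLED in the configuration above: in Python, bool("False") is True. Here is a minimal sketch of how a settings module might read them. The env_bool helper and its list of accepted values are my own assumptions, not something Django provides.

```python
import os

# Strings we treat as "on"; anything else counts as False.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(name, default=False, environ=os.environ):
    """Interpret an environment variable as a boolean."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


# In a settings module (variable names mirror the CI configuration):
DEBUG = env_bool("DEBUG")
COMPRESS_ENABLED = env_bool("COMPRESS_ENABLED")
```

With this in place the same settings file behaves predictably on a laptop, on the CI server, and in production, with only the environment differing.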

Install dependencies

You’ll also want dependency management. This includes installing runtime dependencies and test dependencies. In this case, you’ll see that Ansible is also included for deployment, and then a single Node dependency which is used later for a deployment step.

Test

The most important phase is the test phase. This would be run every single time code was pushed to the hosted Git repository. This one runs three steps:

  1. Linting and static analysis with flake8.
  2. A custom script that looks for common misspellings of key names related to the customer.
  3. The full Django test suite.

The order here is such that the test suite isn’t even run if the linter or spell check fails. This is intentional. It ensures faster feedback and doesn’t waste build time. These are run on the CI server mainly as a backstop, and your team should be able to run these before pushing code.
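If you want the same fail-fast behavior locally, before pushing, the idea fits in a few lines. This is a sketch of the ordering principle only, not of how Circle CI implements it; the run_steps helper is hypothetical and the commands are copied from the configuration above.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, argv, cwd) steps in order and stop at the first failure.

    Returns the names of the steps that ran and the last exit code.
    """
    ran = []
    for name, argv, cwd in steps:
        ran.append(name)
        result = subprocess.run(argv, cwd=cwd)
        if result.returncode != 0:
            # Skip the remaining (slower) steps entirely.
            return ran, result.returncode
    return ran, 0


# Cheapest checks first, the slow test suite last.
STEPS = [
    ("lint", ["flake8", "app"], None),
    ("spelling", ["./scripts/check-customer-name-spelling.sh"], None),
    ("tests", [sys.executable, "manage.py", "test", "--settings=settings.testing"], "app"),
]
```

Calling run_steps(STEPS) from the project root runs the linter, then the spelling script, and only then the test suite.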

In this configuration there’s a separate phase called post that runs after the tests. Here we build the associated docs with Sphinx, which ends up creating a gzipped tarball with the fully built project documentation. It’s available as a downloadable link when the build is complete.
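The archiving half of that phase is just "tar up a directory with today’s date in the name". For the curious, here is roughly the same thing in Python using only the standard library. The archive_docs function is a hypothetical stand-in for the tar command in the configuration, not something from the project.

```python
import tarfile
from datetime import date
from pathlib import Path


def archive_docs(docs_dir, artifacts_dir, today=None):
    """Pack docs_dir into artifacts_dir/docs.YYYY-MM-DD.tar.gz; return the path."""
    today = today or date.today()
    target = Path(artifacts_dir) / f"docs.{today.isoformat()}.tar.gz"
    with tarfile.open(target, "w:gz") as archive:
        # arcname keeps paths inside the archive relative ("docs/...").
        archive.add(str(docs_dir), arcname="docs")
    return target
```

The date in the filename makes it easy to tell downloaded archives apart once you have a few of them.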

Deploy

And lastly the deployment step. In this project there’s only one build system managed deployment, to the staging server. Every time a build passes on the master branch it’s deployed right to the staging server. It’s a little unfair - and for this I’m sorry - that the deployment basically just aliases to a bash script here, but suffice it to say the script essentially just picks which hosts file Ansible uses and then calls the right playbook.
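Since the real script isn’t shown, here is a rough sketch of the idea in Python: map an environment name to an inventory file and build the ansible-playbook command. The file names hosts/staging, hosts/production, and deploy.yml are made up, and this is not the project’s actual deploy.sh.

```python
# Hypothetical layout: one inventory file per environment, one shared playbook.
INVENTORIES = {
    "staging": "hosts/staging",
    "production": "hosts/production",
}


def deploy_command(environment, extra_args=()):
    """Build the ansible-playbook argv for the named environment."""
    try:
        inventory = INVENTORIES[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None
    # -i selects the inventory (hosts) file; extra_args passes flags like -vv.
    return ["ansible-playbook", "-i", inventory, "deploy.yml", *extra_args]
```

Failing loudly on an unknown environment name is deliberate: a typo should stop the deployment rather than fall through to some default set of hosts.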

After the main deployment step, you’ll notice two more steps though.

The first uses pyresttest, an API testing library, to run a series of smoke tests on the deployment. The staging site used HTTP basic auth, so this line uses environment variables provided via the CI system itself (thus kept out of source code) to build up the full URL. This provided an opportunity to verify not just the expected healthy responses (complex CMSs provide myriad opportunities for page configuration to cause problems) but also critical redirections as new campaigns launched or closed.
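pyresttest reads its checks from a YAML file, which isn’t reproduced here. To give a sense of what such smoke tests amount to, here is a sketch using only Python’s standard library rather than pyresttest itself. The paths in the usage note are hypothetical.

```python
import base64
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so we can inspect them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch(url, username, password):
    """Return (status, Location header) for url using HTTP basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(request, timeout=10) as response:
            return response.status, None
    except urllib.error.HTTPError as error:
        # Unfollowed redirects and error statuses both arrive here.
        return error.code, error.headers.get("Location")


def run_checks(base_url, checks, fetcher):
    """Compare expected (path, status, location) triples with fetcher results."""
    failures = []
    for path, want_status, want_location in checks:
        status, location = fetcher(base_url + path)
        if (status, location) != (want_status, want_location):
            failures.append(
                f"{path}: expected {want_status} {want_location}, "
                f"got {status} {location}"
            )
    return failures
```

A check list might look like [("/", 200, None), ("/old-campaign/", 301, "/new-campaign/")], with the fetcher built as lambda url: fetch(url, username, password) from the CI-provided credentials. Passing the fetcher in as an argument also means the comparison logic can be tested without touching the network.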

The second step made use of that Node dependency installed earlier. This step takes screenshots of the rendered target system using a matrix including screen dimensions and URLs. This meant we could immediately see unexpected visual changes for critical sections of the site, and had a reference for changes for subsequent deployments.
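The "matrix" is simply every combination of page and screen size. A sketch of how that expands, in Python: the paths and viewport sizes are hypothetical, and this illustrates the idea rather than the qa-screenshots tool’s own configuration format.

```python
from itertools import product

# Hypothetical pages and viewport sizes (width, height).
PATHS = ["/", "/about/", "/donate/"]
VIEWPORTS = [(1440, 900), (768, 1024), (375, 667)]


def screenshot_jobs(base_url, paths=PATHS, viewports=VIEWPORTS):
    """Yield one job per (page, viewport) pair with a stable filename."""
    for path, (width, height) in product(paths, viewports):
        slug = path.strip("/").replace("/", "-") or "home"
        yield {
            "url": base_url.rstrip("/") + path,
            "width": width,
            "height": height,
            "filename": f"{slug}-{width}x{height}.png",
        }
```

Stable filenames are what make the screenshots useful as a reference: the same page at the same size always lands in the same file, so two deployments can be compared side by side.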

That’s a lot of steps. In the grand scheme of things it’s pretty simple! But there are still a lot of steps to go through if you have to remember to do all of this. And it’s only a few dozen lines of configuration for the build system!

When you start thinking about doing this with containers it makes it easier to start thinking about how you might build and deploy static assets from your build system, or build and deploy containers, or even update a full suite of services.

Got more questions? Ask away! And if your team could use some professional guidance, happy to help.

Repeatedly, repeatedly, repeatedly yours,
Ben

[0] We’ve been happy customers of Circle CI for a while (https://circleci.com/). I’ve also heard positive things about Semaphore CI (https://semaphoreci.com/) but not had a recent opportunity to check them out. I looked at CodeShip (https://cms.codeship.com/) when we originally looked at Circle CI, but at least at that time CodeShip managed all the configuration in their app and I find that *very unappealing*. I want that version controlled in human and machine readable format. That said, many people seem to be happy with them. GitLab (https://about.gitlab.com/) is another great option, provided you’re using GitLab for your Git hosting as well. GitHub Actions (https://github.com/features/actions) will add a lot of build server like functionality, but my understanding so far is that GitHub are not aiming to displace companies providing services like CI.
[1] Even in this last section I’ve been careless about “CI”, “CI system”, and “build system”. Really we should be considering all of these systems “build systems”. They can be used for continuous integration, but technically continuous integration means continually working off a main (master) branch. You don’t have to do that to use a build system. And I think the value of CI *as originally proposed* is far less today with DVCS like Git since you can and should be practicing what I’d call “CI local” by continually pulling down upstream changes into your feature branches so that your eventual PR already has all of the conflicts worked out well in advance.
