A fistful of refactorings: practical improvements for Django apps (This Old Pony #32)

Like a modern factory, a Django-based web application offers a multitude of places to improve and optimize. Some improvements have more impact than others, and some are significantly easier to introduce.

When it comes to improvements designed to reduce risk, enhance readability, and improve testability, there’s a common basket of issues that crop up in Django projects. Today we’re going to highlight just a few common ones that have straightforward fixes, including what they are, why they matter, and how to implement them (albeit in brief).

By the way, these issues come from projects we’ve worked on that were built by solo developers, dedicated product development teams, and everything in between.

Manager/queryset methods

The most common issue with manager methods is that they’re not used sufficiently. If a non-trivial queryset expression or specific filter is used more than once, it belongs in a named queryset method.

The reason is threefold:

By using queryset methods[0] specifically, you gain the ability to chain logical groupings. This often means breaking up a single custom manager method into two or more individually reusable and testable methods.
Encapsulating the logic in a named queryset method makes the intent clearer.
The logic can be tested more reliably.

The how is straightforward: ensure you have methods defined for the filters found in forms and views.

As an end note, methods for _creating and updating_ data are grossly underused from what I’ve seen. If you have any kind of logic for creating a new model instance in a form or view, it could very well be moved into a manager/queryset method.

Enforce data rules in models

The common issue here is that data constraints in the database don’t match the application. This includes null fields[1], uniqueness, and value constraints.

It makes sense to test for these things in forms, for instance, when validating data, but form validation shouldn’t be the crutch your application relies on. It’s too easy to accidentally create workarounds or miss validation elements and end up with sort-of-not-quite-right data.

The solution is to update your models: add field-level constraints first, then Meta options, and lastly, as necessary, hook cleaning logic into the save method. When using specific constraints (e.g. minimums and maximums), avoid literals; use named values that forms and other classes can reference as necessary.

It’s natural to be worried about this causing problems compared to the lax constraints you had before. In most cases it won’t, and the problems that stem from rejecting invalid data are typically far less pernicious than those caused by storing bad data.

If you already have data in your database that doesn’t meet the hard constraints you’ve set up, you can either bring the existing data into line so the constraints can be applied, or fall back on model-level validation methods. Actually getting the rules into the database schema is preferred, however.

Add logging to tasks (and commands)

Tasks and management commands run outside of the request/response cycle. While this is probably not news to you, it does come with some implications. As a user, you’re not getting immediate feedback about an action, and as a developer, admin, or product owner you’re often not getting insight into what’s going on. This leads to false assumptions that things are “working okay” and to developer time lost debugging black boxes.

Async tasks, whether run by a task queue or management commands over cron, need logging!

At a base level, this is really simple. The challenge here is usually deciding what data to include and how to structure it. At a minimum you want to know:

That a task started - preferably with an ID unique to that task execution instance (see below)
That a task completed, and how - again, with the same task ID
If there’s a web request ID that you can pass through to tie everything together, do so

Sure, there’s lots of other information you’ll want to glean from your tasks, but having the knowledge that tasks are getting kicked off and successfully finishing is a significant win.
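As a rough illustration of that minimum, here is a sketch using only the standard library’s logging module. The `logged_task` decorator, the `send_digest` task, the `"tasks"` logger name, and the `request_id` keyword are all hypothetical names chosen for this example; a real task queue will have its own hooks and IDs you may prefer to use.

```python
import logging
import uuid
from functools import wraps

logger = logging.getLogger("tasks")  # hypothetical logger name


def logged_task(func):
    """Log the start and the outcome of every execution of `func`."""

    @wraps(func)
    def wrapper(*args, request_id=None, **kwargs):
        run_id = uuid.uuid4().hex  # unique per execution, not per task name
        fields = "task=%s run_id=%s request_id=%s"
        values = (func.__name__, run_id, request_id)
        logger.info("task started " + fields, *values)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the traceback along with the message.
            logger.exception("task failed " + fields, *values)
            raise
        logger.info("task succeeded " + fields, *values)
        return result

    return wrapper


@logged_task
def send_digest(user_count):
    # Stand-in for real work such as sending email.
    return f"sent {user_count} digests"


logging.basicConfig(level=logging.INFO)
send_digest(3, request_id="req-42")
```

Each run produces a "started" line and either a "succeeded" or "failed" line sharing the same `run_id`, so a start with no matching finish stands out when you search the logs.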

Happy refactoring,
Ben

[0] Django docs: https://docs.djangoproject.com/en/2.0/ref/models/querysets/#django.db.models.query.QuerySet.as_manager
[1] There _are_ reasons to allow nullable fields for data that should be required, including as part of the migration process.

Wellfire Interactive // Expertise for established Django SaaS applications
