CNK's Blog

Trimming Wagtail Migration Cruft

Django creates migrations for Django model changes that do not alter the database, for example, changes to help text or verbose names. In my previous post, I shared code for telling Django not to track non-database attributes in its migrations. This post is about something similar for Wagtail’s migrations.

At work, we are using Wagtail as our Content Management System (CMS). The Wagtail core team decided to follow Django’s example and record all model changes in migrations - including ones that do not change the database schema. Unfortunately for us, this means that when we add new blocks to our pages, “makemigrations” thinks it should make a new version of our StreamField - even though no SQL will be run when the migration is installed. We have a lot of blocks and they change fairly frequently, so these StreamField migrations take up a lot of space. And because they are large, they are nearly impossible to diff, so even if we kept them, it would be hard to use them to track down changes to our StreamField definitions.

For the most part, we just ignore it when “manage.py migrate” tells us we have changes in our code that are not reflected in our migrations. But when we do need to create a migration for database schema changes, we either need to accept a new large chunk of code that doesn’t do anything - or we have to manually remove those lines before committing the migration to version control.

I have read the discussion about the issue on the Wagtail issue queue. And, while I tend to agree with the policy decision, I still want to see what life is like without including StreamField definitions in our migrations. So I added the following monkey patch to the app we already have for all of our monkey patches.

  # wagtail_patches/monkey_patches.py

  import wagtail.core.fields

  def deconstruct_without_block_definition(self):
      name, path, _, kwargs = super(wagtail.core.fields.StreamField, self).deconstruct()
      block_types = list()
      args = [block_types]
      return name, path, args, kwargs
  wagtail.core.fields.StreamField.deconstruct = deconstruct_without_block_definition

This is simply a copy of the StreamField deconstruct method but I replaced “block_types = self.stream_block.child_blocks.items()” with an empty list. Now any field defined as:

  body = wagtail.core.fields.StreamField([ <large list of blocks here> ])

will be represented in the migration file as the following - with no list of blocks:

  ('body', wagtail.core.fields.StreamField([]))

Then I went through all the apps in our project and squashed migrations. This automatically ‘removed’ the StreamField definitions in the migrations included in the squash. Then I manually edited any migrations prior to the current squashing to remove the StreamField definitions from them. Deploying the squashed migrations went smoothly. Now we just need to do some development and see if there is any reason to want to change our minds and start tracking StreamField definitions in our migration files once more.


Adendum

2022-04-12 Per @tbrlpld on the Wagtail Slack: overriding the StreamField deconstruct method as I did above breaks StreamField data migrations. In the context of a data migration the body field in the example above will always return an empty list - so you will not be able to itterate over it to make changes like those seen in this example from the CFPB.

At work, we have been moving away from using data migrations (which stick around and are run each time you build your test database) towards writing “one time” management commands that we run in the needed places and then delete. I haven’t had occaision to do any StreamField manipulations using this technique since we added these monkey patches. So I don’t know if we will face the same problem in that context.