CNK's Blog

Wagtail: Dynamically Adding Admin Menu Items

I am in the process of upgrading to Wagtail 2.16. One of the new features is a slim admin menu which I am sure many of my laptop users will really like - or would really like - if I had not just added a chunk of code that violates the last item in the exceptions list: MenuItem can no longer be sub-classed to customise its HTML output or load additional JavaScript

I had had an item that was restricted to be “one page of this type per site” and so it was easy to construct a menu item to display all the subpages that could be under that page - I just need to find the PersonIndexPage2 for the current site, and then create a url for the page explorer for that page.

  class PeoplePages2MenuItems(MenuItem):
       def __init__(self):
           super().__init__(
               label="People Pages",
               url=None,
               classnames="icon icon-user",
               order=300,
           )

       def is_shown(self, request):
           """
           The PeoplePages2MenuItem is only shown if there is a PersonPage2Template in the site.
           """
           return PersonPage2Template.objects.in_site(Site.find_for_request(request)).exists()

      def get_context(self, request):
          """
          Constructs the url for listing PersonPage2 pages
          """
          page = PersonIndexPage2.objects.descendant_of(Site.find_for_request(request).root_page).first()
          self.url = reverse('wagtailadmin_explore', args=[page.id]) + "?ordering=title&people_pages_only=True"
          return super().get_context(request)


  @hooks.register('register_people_admin_menu_item')
  def register_people_pages_v2_template_menu_item():
      return PeoplePages2TemplateMenuItem()

But then someone asked me if they could add more than one PersonIndexPage2 per site. So we will need more than one menu item for “People Pages” - and we’ll need more than one link per site. So I had a look at the MenuItem class and there is the render code, just begging me to hijack it. so I removed the get_context method above and did all the dirty work in the render_html method.

      def render_html(self, request):
          pages = PersonIndexPage2.objects.descendant_of(Site.find_for_request(request).root_page).all()
          items = []
          for page in pages:
              context = self.get_context(request)
              context['url'] = reverse('wagtailadmin_explore', args=[page.id]) + "?ordering=title&people_pages_only=True"
              context['label'] = page.title
              items.append(render_to_string(self.template, context, request=request))
          return (' ').join(items)

That was great - for about 2 weeks. Then I started my Wagtail 2.16 upgrade and suddenly my “People Pages” links go to /admin/null.

So I went poking around in the Wagtail source code and found what I probably should have been using all the time. The Menu class has a method menu_items_for_request. This is where the is_shown rules are enforced - but more important for my current issue is the section where it executes any hooks registered by a menu’s construct_hook_name. I have lots of code that uses hooks configured with register_hook_name but it hadn’t occurred to me to look for a request-time equivalent.

So, first I need to define a construct hook:

  class PeopleAdminMenu(Menu):
      def __init__(self):
          super().__init__(
              register_hook_name='register_people_admin_menu_item',
              construct_hook_name='construct_people_admin_menu_item',
          )

Then I replaced my PeoplePages2MenuItems class and the register_people_admin_menu_item hook that added it to the correct top level menu item with a method to add the menu items.

  @hooks.register('construct_people_admin_menu_item')
  def add_people_pages2_menu_items(request, items):
      site = Site.find_for_request(request)
      if PersonPage2Template.objects.in_site(site).exists():
          for page in PersonIndexPage2.objects.descendant_of(site.root_page).all():
              pp2_menu_item = MenuItem(
                  page.title,
                  reverse('wagtailadmin_explore', args=[page.id]) + "?ordering=title&people_pages_only=True",
                  icon_name='icon icon-user',
                  order=300,
              )
              items.append(pp2_menu_item)

This contains all the same logic as the previous class. The if clause contains the logic from the is_shown method and the class’s init parameters are combined with the dynamic url and label items from the render_html method to instantiate a MenuItem. So much cleaner! I should have been doing it like this all along.

Trimming Wagtail Migration Cruft

Django creates migrations for Django model changes that do not alter the database, for example, changes to help text or verbose names. In my previous post, I shared code for telling Django not to track non-database attributes in its migrations. This post is about something similar for Wagtail’s migrations.

At work, we are using Wagtail as our Content Management System (CMS). The Wagtail core team decided to follow Django’s example and record all model changes in migrations - including ones that do not change the database schema. Unfortunately for us, this means that when we add new blocks to our pages, “makemigrations” thinks it should make a new version of our StreamField - even though no SQL will be run when the migration is installed. We have a lot of blocks and they change fairly frequently, so these StreamField migrations take up a lot of space. And because they are large, they are nearly impossible to diff, so even if we kept them, it would be hard to use them to track down changes to our StreamField definitions.

For the most part, we just ignore it when “manage.py migrate” tells us we have changes in our code that are not reflected in our migrations. But when we do need to create a migration for database schema changes, we either need to accept a new large chunk of code that doesn’t do anything - or we have to manually remove those lines before committing the migration to version control.

I have read the discussion about the issue on the Wagtail issue queue. And, while I tend to agree with the policy decision, I still want to see what life is like without including StreamField definitions in our migrations. So I added the following monkey patch to the app we already have for all of our monkey patches.

  # wagtail_patches/monkey_patches.py

  import wagtail.core.fields

  def deconstruct_without_block_definition(self):
      name, path, _, kwargs = super(wagtail.core.fields.StreamField, self).deconstruct()
      block_types = list()
      args = [block_types]
      return name, path, args, kwargs
  wagtail.core.fields.StreamField.deconstruct = deconstruct_without_block_definition

This is simply a copy of the StreamField deconstruct method but I replaced “block_types = self.stream_block.child_blocks.items()” with an empty list. Now any field defined as:

  body = wagtail.core.fields.StreamField([ <large list of blocks here> ])

will be represented in the migration file as the following - with no list of blocks:

  ('body', wagtail.core.fields.StreamField([]))

Then I went through all the apps in our project and squashed migrations. This automatically ‘removed’ the StreamField definitions in the migrations included in the squash. Then I manually edited any migrations prior to the current squashing to remove the StreamField definitions from them. Deploying the squashed migrations went smoothly. Now we just need to do some development and see if there is any reason to want to change our minds and start tracking StreamField definitions in our migration files once more.


Adendum

2022-04-12 Per @tbrlpld on the Wagtail Slack: overriding the StreamField deconstruct method as I did above breaks StreamField data migrations. In the context of a data migration the body field in the example above will always return an empty list - so you will not be able to itterate over it to make changes like those seen in this example from the CFPB.

At work, we have been moving away from using data migrations (which stick around and are run each time you build your test database) towards writing “one time” management commands that we run in the needed places and then delete. I haven’t had occaision to do any StreamField manipulations using this technique since we added these monkey patches. So I don’t know if we will face the same problem in that context.

Trimming Django Migration Cruft

Django creates migrations for Django model changes that do not alter the database, for example, changes to help text or verbose names. In most cases when I see a migration for a change I am pretty sure doesn’t run any SQL, I check my assumption using python manage.py sqlmigrate <app> <migration_name>, and if it does not produce any SQL, then I edit the most recent migration to have touched that column to match the “changes” Django wants to make. For the most part that isn’t difficult but it is sometimes annoying. Other people have a similar opinion and one of them shared the following code on a Slack channel I am on.

WARNING: I have included the code as it was from the shared file, but my application had some data migrations with RunPython commands that invoke related_name. So in our application, we deleted the code below that removed attributes in MIGRATION_IGNORE_RELATED_FIELD_ATTRS.

  # app/management/commands/__init__.py

  """
  Django creates redundant migrations for Django model changes that do not alter the database.
  Here we patch Django's migration machinery to ignore attrs.

  The management commands `makemigrations` and `migrate` will ignore the attrs defined in:

      - MIGRATION_IGNORE_MODEL_ATTRS
      - MIGRATION_IGNORE_FIELD_ATTRS
      - MIGRATION_IGNORE_FILE_FIELD_ATTRS
      - MIGRATION_IGNORE_RELATED_FIELD_ATTRS

  This will reduce the number of migrations and therefore speed-up development
  """

  import logging
  from functools import wraps

  from django.db.migrations.operations import AlterModelOptions
  from django.db.models import Field, FileField
  from django.db.models.fields.related import RelatedField

  logger = logging.getLogger(__name__)

  MIGRATION_IGNORE_MODEL_ATTRS = ["verbose_name", "verbose_name_plural"]
  MIGRATION_IGNORE_FIELD_ATTRS = ["validators", "choices", "help_text", "verbose_name"]
  MIGRATION_IGNORE_FILE_FIELD_ATTRS = ["upload_to", "storage"]

  MIGRATION_IGNORE_RELATED_FIELD_ATTRS = ["related_name", "related_query_name"]

  for attr in MIGRATION_IGNORE_MODEL_ATTRS:
      logger.info(f"Model {attr} attr will be ignored.")

  for attr in MIGRATION_IGNORE_FIELD_ATTRS:
      logger.info(f"Field {attr} attr will be ignored.")

  for attr in MIGRATION_IGNORE_FILE_FIELD_ATTRS:
      logger.info(f"File field {attr} attr will be ignored.")

  for attr in MIGRATION_IGNORE_RELATED_FIELD_ATTRS:
      logger.info(f"Related field {attr} attr will be ignored.")


  def patch_ignored_model_attrs(cls):
      for attr in MIGRATION_IGNORE_MODEL_ATTRS:
          cls.ALTER_OPTION_KEYS.remove(attr)


  def patch_field_deconstruct(old_func):
      @wraps(old_func)
      def deconstruct_with_ignored_attrs(self):
          name, path, args, kwargs = old_func(self)
          for attr in MIGRATION_IGNORE_FIELD_ATTRS:
              kwargs.pop(attr, None)
          return name, path, args, kwargs

      return deconstruct_with_ignored_attrs


  def patch_file_field_deconstruct(old_func):
      @wraps(old_func)
      def deconstruct_with_ignored_attrs(self):
          name, path, args, kwargs = old_func(self)
          for attr in MIGRATION_IGNORE_FILE_FIELD_ATTRS:
              kwargs.pop(attr, None)
          return name, path, args, kwargs

      return deconstruct_with_ignored_attrs


  def patch_related_field_deconstruct(old_func):
      @wraps(old_func)
      def deconstruct_with_ignored_attrs(self):
          name, path, args, kwargs = old_func(self)
          for attr in MIGRATION_IGNORE_RELATED_FIELD_ATTRS:
              kwargs.pop(attr, None)
          return name, path, args, kwargs

      return deconstruct_with_ignored_attrs


  Field.deconstruct = patch_field_deconstruct(Field.deconstruct)
  FileField.deconstruct = patch_file_field_deconstruct(FileField.deconstruct)
  RelatedField.deconstruct = patch_related_field_deconstruct(RelatedField.deconstruct)
  patch_ignored_model_attrs(AlterModelOptions)

And now, create override files for the two manage commands we need to load our patches: migrate and makemigrations.

  # app/management/commands/makemigrations.py

  """
  Override of Django makemigrations. When we use this version, we
  will load the __init__ file above that patches models.Field.
  """

  from django.core.management.commands.makemigrations import Command  # noqa
  # app/management/commands/migrate.py

  """
  Override of Django migrate. When we use this version, we
  will load the __init__ file above that patches models.Field.
  """

  from django.core.management.commands.migrate import Command  # noqa

Django manage.py Commands

sendtestemail

Someone on the PyDev Slack channel was having trouble getting emails from the Django admin system and another member mentioned there is a sendtestemail manage.py command that can help one debug your email settings. That made me wonder what else is available that I didn’t know about.

diffsettings

The Django settings object is a bit odd in that you can’t do print(settings.__dict__) to figure out what is available - and even if you know (or guess) the name of a setting, how do you know if the value is the default or something you have overridden in your app? There is a manage.py command for that! The most useful version is python manage.py diffsettings --output unified. That gives you all the settings - with the overridden versions in red.

showmigrations

I have used the default version to check to see if I have applied all the existing migrations in my test and prod environments - that’s the python manage.py showmigrations --list version. But there is also a python manage.py showmigrations --plan version. That will show you the order in which Django will apply migrations.

inspectdb

If you run python manage.py inspectdb against an existing database, it will generate the Django models that would have created those tables (and indexes and constraints). This command is for projects that must use a legacy database so all of the models are created with a Meta class with managed = False.

ping_google

If your site has had a bunch of changes that you want Google to recrawl, you can use this command to submit your sitemap: python manage.py ping_google. If you have the Django sitemaps app installed and a url configured for your sitemap, this command will figure out what the url should be.

New Django Query Tricks

Union queries

Union queries are surprisingly easy to create. I need a list of ids and content type ids from a bunch of different models. I was very surpised at how straightforward it is in Django 3.2 to create the UNION query I want.

  union_query = None
  for content_type in <queryset of content types>:
      model = apps.get_model(content_type.app_label, content_type.model)
      query = model.objects.filter(<criteria>).values('pk', <content_type.id>)
      if union_query is None:
         union_query = query
      else:
          union_query = union_query.union(query, all=True)

Note: I used all=True because I will never have duplicates in my (id, content_type_id) tuples and UNION ALL is faster than UNION in this case because we can skip the DISTINCT operation on the final result.

The observant among you will have noticed a bit of pseudocode in the example above. I want to insert the content_type_id from python into my query. In SQL this would be something like:

  SELECT id, 99 FROM myapp_model;

In the Django ORM, that turns out to be something I didn’t know how to do. I can’t leave it as a bare name and I can’t quote it or the ORM tries to turn it into a column name or relation that could be turned into a column name. Turns out I need to use Value:

  query = model.objects \
               .filter(<criteria>) \
               .values('pk', Value(content_type.id, output_field=IntegerField()))

OK so that now will give me a queryset that produces a list of dicts like: [{pk: 3, content_type_id: 44}, {pk: 3, content_type_id: 48}] But when I tried to use those results in the filter section of another query… I had my next problem.

Querying by value - without Foreign Key relationships

So now I need to use those ids and content_type_ids to filter another model that has rows with content_type_id and object_id columns. I want all the lines in the table for the ModelLogEntry model where the (object_id, content_type_id) tuple is in the list of (pk, content_type_id) tuples created by our UNION query above.

If I only needed to match on a single value, I would probably evaluate the UNION query, and then do something like .filter(pk__in=<list of pks>) - as I did to get the list of content types I need. But I need to match the id and content_type_id fields. In SQL, I would do:

  SELECT wagtailcore_modellogentry.*
    FROM wagtailcore_modellogentry
    INNER JOIN (
    (
    ((SELECT `link_farms_audience`.`id`, 104 AS `content_type_id`  FROM `link_farms_audience` WHERE `link_farms_audience`.`site_id` =  12)
     UNION
     (SELECT `link_farms_collection`.`id`, 105 AS `content_type_id` FROM `link_farms_collection` WHERE `link_farms_collection`.`site_id` = 12))
     UNION
     (SELECT `link_farms_link`.`id`, 106 AS `content_type_id` FROM `link_farms_link` WHERE `link_farms_link`.`site_id` = 12))
     UNION
     (SELECT `core_didyouknowitem`.`id`, 110 AS `content_type_id` FROM `core_didyouknowitem` WHERE `core_didyouknowitem`.`site_id` = 12 ORDER BY `core_didyouknowitem`.`text` ASC)
    ) AS models
    ON models.id = wagtailcore_modellogentry.object_id
    AND models.content_type_id = wagtailcore_modellogentry.content_type_id

This was relatively straightforward to write in SQL, so I tried using raw SQL, e.g. ModelLogQuery.objets.raw('<query here>'). That definitely gave me the data I was looking for when I ran it in shell_plus. But when I tried to use it in my monkey patch, the calling function wanted to use values(), which is a function only defined on real ORM QuerySets - and not available when using raw.

At this point I suspect I won’t want to use this in production. Goodness only knows how big the union query is likely to get. But it is bothering me that I don’t know how to get Django to let me do a relatively straightforward join without having defined a ForeignQuery relationship in my Python model code.

I still don’t know how to tell Django “do this join damn it!”, but after some reading and thinking of alternate ways to write the SQL, I think I have found a way to write this in the ORM using Exists to create a correlated subquery.

    from django.apps import apps
    from django.db.models import Exists, IntegerField, OuterRef, Value
    from django.contrib.contenttypes.models import ContentType

    request = get_current_request()
    site = Site.find_for_request(request)
    union_query = None
    content_types = (
        ContentType.objects
                   .filter(id__in=ModelLogEntry.objects.values_list('content_type_id', flat=True).distinct())
    )
    for content_type in content_types:
        model = apps.get_model(content_type.app_label, content_type.model)
        query = (model.objects.filter(site_id=site.id)
                 .values('pk', content_type_id=Value(content_type.id, output_field=IntegerField()))
                 )
        if union_query is None:
            union_query = query
        else:
            union_query = union_query.union(query, all=True)

    return ModelLogEntry.objects.filter(Exists(
        union_query.filter(pk=OuterRef('object_id'), content_type_id=OuterRef('content_type_id'))
    ))

Sigh. One can’t combine .filter with a union query.

    NotSupportedError at /admin/reports/site-history/
    Calling QuerySet.filter() after union() is not supported.

I tested the Exists query by setting the union_query to be just one type and it works fine. So I learned something useful about the Django ORM - even if I can’t apply that knowledge in the context in which I wanted to to use it this time.