CNK's Blog

Site Creator

The key to running so many sites in a single Wagtail installation is that they all need to be the same (or nearly the same) except for content. And the best way to make something uniform is to manage it in code. The code that manages our site setup (and tear down) lives in our site creator. This is a Django app that overrides Wagtail’s site management forms to add the logic we use to enforce our ideas about multitenancy.

Our site_creator app doesn’t have any models of its own and it only does a little bit of customization to Wagtail’s SiteViewSet. The vast majority of our customizations are implemented via our create and edit forms.

The Wagtail SiteForm has the following fields: “hostname”, “port”, “site_name”, “root_page”, and “is_default_site”. In our multitenanted environment all sites are created as subdomains of the instance. So if our instance is called sites.example.com, then all new sites will have names like foo.sites.example.com. Instead of asking for the hostname, our form asks for the subdomain and then builds the hostname by joining it to the base domain for the instance, e.g. “foo” plus “sites.example.com” becomes “foo.sites.example.com”. We don’t ask for the port; we use port 443 and a wildcard SSL certificate everywhere. And we don’t ask for a root page because we are going to create that as part of our setup script. Our SiteCreationForm does some basic validation on the subdomain and site name and then passes that information to our create_site script.
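As a rough illustration of the subdomain-to-hostname step, the sketch below validates a subdomain and joins it to the base domain. The function name `build_hostname`, the `BASE_DOMAIN` constant, and the validation rule (a single DNS label) are assumptions made for this example; they are not taken from the actual SiteCreationForm.

```python
import re

# Assumed base domain for the instance; a real project would read this from settings.
BASE_DOMAIN = "sites.example.com"

# One DNS label: letters, digits and hyphens, 1-63 characters,
# with no leading or trailing hyphen.
_LABEL_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def build_hostname(subdomain, base_domain=BASE_DOMAIN):
    """Return the full hostname for a subdomain, or raise ValueError if it is invalid."""
    label = subdomain.strip().lower()
    if not _LABEL_RE.match(label):
        raise ValueError(f"{subdomain!r} is not a valid subdomain")
    return f"{label}.{base_domain}"


print(build_hostname("foo"))
```

In a Django form, the same check would most likely live in a `clean_` method for the subdomain field, raising `ValidationError` instead of `ValueError`.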

The nice thing about having all our site creation logic in a separate function is that we can use it from non-form, non-view contexts. So we can use this exact same script to create sites in tests or use it from manage.py commands to create new sites as part of an export/import process.

    def create_site(owner, form_data):
        """
        Create a new Site with all the default content and content specified by
        various hooks. "form_data" should be a dict with the following information:

        hostname: full hostname including subdomain, e.g. foo.sites.example.com
        site_name: string for site name
        theme: one of v7.0, v6.5 or v6.1
        """
        # If anything fails, make sure it ALL gets rolled back, so the db won't be corrupted with
        # partially created sites
        with transaction.atomic():
            site = Site()
            # Generate the Site object from the form fields.
            site.hostname = form_data['hostname']
            site.site_name = form_data['site_name']
            site.port = 443

            # Generate the default Page that will act as the Homepage for this Site.
            home_page = get_homepage_model(form_data['theme'])()
            home_page.title = generate_homepage_title(site.site_name)
            home_page.show_title = False
            home_page.nav_title = site.site_name
            home_page.breadcrumb_title = 'Home'
            home_page.owner = owner
            home_page.show_in_menus = False
            home_page.latest_revision_created_at = now()
            home_page.first_published_at = now()

            # We save the home_page by adding it as a child to Page 1, the ultimate root of the page tree.
            tree_root = Page.objects.first()
            home_page = tree_root.add_child(instance=home_page)
            site.root_page = home_page
            site.save()

            site.settings = get_settings_model()()
            site.settings.save()

            # Execute all registered site_creator_settings_post hooks.
            # This allows apps to do additional work after the site settings object has been created.
            #
            # All implementations of site_creator_settings_post must accept one positional parameter:
            # site: a Wagtail Site object
            for func in hooks.get_hooks('site_creator_settings_post'):
                func(site)

            # Generate a blank Features for this Site.
            Features.objects.get_or_create(
                site=site,
                site_theme=form_data['theme']
            )

            # Generate a Collection for this Site.
            collection = Collection()
            collection.name = site.hostname
            # Much like the homepage, we need to create this Collection as a child of the root Collection.
            collection_root = Collection.objects.first()
            collection_root.add_child(instance=collection)

            admins = Group.objects.create(name=f'{site.hostname} Admins')
            apply_default_permissions(admins, site, 'admin')
            admins.save()

            editors = Group.objects.create(name=f'{site.hostname} Editors')
            apply_default_permissions(editors, site, 'editor')
            editors.save()

            # Viewers group doesn't get any permissions; they can log in and look at pages but can't access admin interface.
            Group.objects.create(name=f'{site.hostname} Viewers')

            # Execute all registered site_creator_default_objects hooks. This hook allows apps to tell
            # site_creator to create pages or other objects the site may need. All implementations of
            # site_creator_default_objects will receive the newly created Site (from which the function
            # can derive site.root_page)
            for func in hooks.get_hooks('site_creator_default_objects'):
                func(site)

            return site

If you read the code above, you will notice us creating an associated Features record for each site, and that the record contains a site_theme. As much as we would like to have a single idea of what a site is, that isn’t the real world. Our multitenanted CMS was created as a proof of concept a year or two before our last redesign and uses a variation on what was then our main web site’s design. Since that was the sixth iteration of our main web site, it was known as Theme 6, and the redesign, when it happened, was called Theme 7. We didn’t want to change Wagtail’s Site model, so we created a 1:1 model named Features to keep track of the site theme (and some feature flags for sites).

The other thing you will have noticed is that create_site delegates the hard work of assigning group permissions to apply_default_permissions. This is where the real work of setting up our standard groups takes place.

    def apply_default_permissions(group, site, group_type):
        """
        Applies the default permissions to the given Group.
        """
        assert group_type in ('admin', 'editor')

        # Allow all groups to access the Wagtail Admin.
        wagtail_admin_permission = Permission.objects.get(codename='access_admin')
        group.permissions.add(wagtail_admin_permission)

        # Gives Admins and Editors full permissions for pages on this Site EXCEPT Bulk Delete. This prevents
        # anyone from accidentally erasing the entire site by deleting the homepage.
        if group_type in ('admin', 'editor'):
            for perm_type, short_label, long_label in PAGE_PERMISSION_TYPES:
                if perm_type != 'bulk_delete_page':
                    permission = Permission.objects.get(content_type__app_label="wagtailcore", codename=perm_type)
                    GroupPagePermission.objects.get_or_create(group=group, page=site.root_page, permission=permission)

        perm_types = ['add', 'change', 'view', 'delete', 'choose']
        # Note we are using the built in image/document content types; this is
        # because the CollectionOwnershipPermissionPolicy uses those models in its checks
        image_ct = ContentType.objects.get(app_label='wagtailimages', model='image')
        doc_ct = ContentType.objects.get(app_label='wagtaildocs', model='document')

        # Give all groups full permissions on the Site's Image and Document Collections.
        collection = Collection.objects.get(name=site.hostname)
        if group_type in ('admin', 'editor'):
            # images
            for perm in perm_types:
                perm = Permission.objects.get(content_type=image_ct, codename=f'{perm}_image')
                GroupCollectionPermission.objects.get_or_create(group=group, collection=collection, permission=perm)
            # documents
            for perm in perm_types:
                perm = Permission.objects.get(content_type=doc_ct, codename=f'{perm}_document')
                GroupCollectionPermission.objects.get_or_create(group=group, collection=collection, permission=perm)

        # Give site admins permission to manage collections under their site's root collection
        if group_type == 'admin':
            for codename in ['add_collection', 'change_collection', 'delete_collection']:
                perm = Permission.objects.get(content_type__app_label='wagtailcore', codename=codename)
                GroupCollectionPermission.objects.get_or_create(group=group, collection=collection, permission=perm)

        # Apply all model-level permissions for the new groups
        if group_type in ('admin', 'editor'):
            permissions = default_model_permissions(group, group_type, settings.SITE_TYPE)
            group.permissions.set(permissions)


    def default_model_permissions(group, group_type, site_type):
        """
        Collects the model permissions for the given group type.
        """
        wagtail_admin_permission = Permission.objects.get(codename='access_admin')
        group_permissions = [wagtail_admin_permission]

        # Omitted: lots of model permissions that are assigned to both admin and editor groups

        if group_type == 'admin':
            admin_models = [
                ('core', 'DisplayLocation', 'all'),
                ('core', 'SyncTag', 'all'),
                ('custom_auth', 'User', 'all'),
                ('www', 'Settings', ['view', 'change']),
            ]
            group_permissions.extend(__permission_objects(admin_models))

        if group_type == 'editor':
            editor_models = [
                ('core', 'DisplayLocation', ['view']),
                ('core', 'SyncTag', ['view']),
            ]
            group_permissions.extend(__permission_objects(editor_models))

        return group_permissions


    def __permission_objects(config_list):
        """
        Look up the correct permissions objects and return a list of them
        """
        output = []
        for app_label, model_name, perms in config_list:
            try:
                # ContentType stores model names in lowercase, so normalize before the lookup.
                ct = ContentType.objects.get(app_label=app_label, model=model_name.lower())
            except ContentType.DoesNotExist:
                logger.error(f'Could not find content type for {app_label} {model_name}')
                continue

            if perms == 'all':
                output.extend(Permission.objects.filter(content_type=ct).all())
            else:
                for perm in perms:
                    try:
                        perm = Permission.objects.get(content_type=ct, codename__startswith=perm)
                        output.append(perm)
                    except Permission.DoesNotExist:
                        logger.error(f'Could not find permission {perm} for {app_label} {model_name}')
        return output
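To make the configuration format concrete, here is a small sketch that expands the same `(app_label, model_name, perms)` tuples into permission codenames without touching the database. It assumes only Django's default codename convention (`<action>_<lowercased model name>`) and treats `'all'` as the four default actions, whereas the real `__permission_objects` returns every Permission row for the content type, custom ones included. The helper name `expand_codenames` is hypothetical.

```python
# Django's default per-model actions; custom permissions are ignored in this sketch.
DEFAULT_ACTIONS = ("add", "change", "delete", "view")


def expand_codenames(config_list):
    """Expand (app_label, model_name, perms) tuples into 'app_label.codename' strings."""
    output = []
    for app_label, model_name, perms in config_list:
        actions = DEFAULT_ACTIONS if perms == "all" else perms
        for action in actions:
            # Default codenames use the lowercased model name, e.g. 'view_synctag'.
            output.append(f"{app_label}.{action}_{model_name.lower()}")
    return output


editor_models = [
    ("core", "DisplayLocation", ["view"]),
    ("core", "SyncTag", ["view"]),
]
print(expand_codenames(editor_models))
```

Keeping the configuration as plain tuples like this makes it easy to compare what admins and editors receive without reading any query code.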

Because create_site sets up collections and user groups based on the site hostname, our edit form has to do some work to keep those names in sync.

    class SiteEditForm(SiteForm):
        def save(self, commit=True):
            instance = super().save(commit)
            if 'hostname' in self.changed_data:
                # The hostname has been changed, so we need to do a bunch of internal renames to account for that.
                old_hostname = self['hostname'].initial
                new_hostname = instance.hostname

                # Change all the places where the old hostname appears which wouldn't otherwise be changed by this form.
                update_db_for_hostname_change(old_hostname, new_hostname)

                messages.success(
                    get_current_request(),
                    "{} has been moved from {} to {}.".format(instance.site_name, old_hostname, new_hostname)
                )
            return instance


    def update_db_for_hostname_change(old_hostname, new_hostname):
        """
        This function updates all the tables in the database that utilize the string value of a Site's hostname.
        Those tables are:

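        auth_group - We can't define a custom Group class, so we need to use their name as a connection to the related Site.
        wagtailcore_collection - Same as above.
        custom_auth_user - Local users created for a Site have usernames prefixed with the Site's hostname.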

        Note: This function DOES NOT rename the Sites themselves. The code that calls this function is expected to do that.
        """
        commands = [
            "UPDATE auth_group SET `name` = REPLACE(`name`, %(old_hostname)s, %(new_hostname)s)",
            "UPDATE wagtailcore_collection SET `name` = REPLACE(`name`, %(old_hostname)s, %(new_hostname)s)",
            "UPDATE custom_auth_user SET `username` = REPLACE(`username`, %(old_hostname)s, %(new_hostname)s)",
        ]

        # Add the commands returned by all registered site_hostname_change_additional_sql hooks.
        # This allows apps to add commands to rename fields in their own tables.
        for func in hooks.get_hooks('site_hostname_change_additional_sql'):
            commands.extend(func())

        with connection.cursor() as cursor:
            for command in commands:
                try:
                    cursor.execute(command, {'old_hostname': old_hostname, 'new_hostname': new_hostname})
                except Exception as e:
                    logger.error(f'Exception raised in update_db_for_hostname_change while running "{command}": {e}')
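To see what a REPLACE-based UPDATE does to the stored names, the sketch below runs the same kind of statement against an in-memory SQLite database. SQLite stands in for MySQL here purely so the example is self-contained (both provide a `REPLACE(string, from, to)` function); the table contents and the `rename_hostname` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_group (name TEXT)")
conn.executemany(
    "INSERT INTO auth_group (name) VALUES (?)",
    [
        ("foo.sites.example.com Admins",),
        ("foo.sites.example.com Editors",),
        ("bar.sites.example.com Admins",),
    ],
)


def rename_hostname(conn, old_hostname, new_hostname):
    """Rewrite every group name containing old_hostname with one UPDATE."""
    conn.execute(
        "UPDATE auth_group SET name = REPLACE(name, :old_hostname, :new_hostname)",
        {"old_hostname": old_hostname, "new_hostname": new_hostname},
    )


rename_hostname(conn, "foo.sites.example.com", "baz.sites.example.com")
names = [row[0] for row in conn.execute("SELECT name FROM auth_group ORDER BY name")]
print(names)
```

Rows that do not contain the old hostname pass through REPLACE unchanged, which is why the statement needs no WHERE clause.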

Maintenance Considerations

Note that because we use the full hostname in our naming convention, when we copy data between environments, for example from prod to test or test to dev, we need to call update_db_for_hostname_change to change the base domain. Because we are using MySQL’s REPLACE function, a single call can replace the base domain substring in every row at once; notice the lack of a loop around update_db_for_hostname_change in the code below.

    # core/management/commands/convert_server_domain.py
    from django.conf import settings
    from django.core.management.base import BaseCommand
    from wagtail.models import Site

    from ...utils import update_db_for_hostname_change


    class Command(BaseCommand):
        help = ("After loading a dump from a test/staging/prod DB, this command converts all domains to SERVER_DOMAIN.")

        def add_arguments(self, parser):
            parser.add_argument(
                dest='old_server_domain',
                action='store',
                default=None,
                help="OPTIONAL. The script will auto-detect the old domain, but you can force it to use a different value"
                     " if needed. This is useful if you need to run convert_server_domain after it's already run once.",
                # This setting makes old_server_domain an _optional_ positional argument. This maintains backwards
                # compatibility with the old way of calling this command: `manage.py convert_server_domain old.domain`.
                nargs='?'
            )

        def handle(self, **options):
            old_server_domain = options['old_server_domain']
            if not old_server_domain:
                # The user didn't specify an old domain, so auto-detect it from the existing default Site.
                old_server_domain = Site.objects.get(is_default_site=True).hostname

            print(f"Converting from {old_server_domain} to {settings.SERVER_DOMAIN}...")

            # Update the database to match the new SERVER_DOMAIN
            update_db_for_hostname_change(old_server_domain, settings.SERVER_DOMAIN)

            # update_db_for_hostname_change() was designed to be called from within the Site change form, so it doesn't
            # do the last thing we need, which is renaming every Site's hostname.
            for site in Site.objects.filter(hostname__contains=old_server_domain):
                site.hostname = site.hostname.replace(old_server_domain, settings.SERVER_DOMAIN)
                site.save()
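The optional positional argument is the least obvious part of the command above, so here is a standalone sketch of how `nargs='?'` behaves, using plain argparse (which Django's command parser is built on). The `resolve_old_domain` helper and the `detected_domain` parameter are hypothetical stand-ins for the auto-detection from the default Site.

```python
import argparse

parser = argparse.ArgumentParser(prog="convert_server_domain")
# nargs='?' lets a positional argument be omitted; argparse then uses the default.
parser.add_argument("old_server_domain", nargs="?", default=None)


def resolve_old_domain(argv, detected_domain):
    """Use the domain given on the command line, else fall back to the detected one."""
    options = parser.parse_args(argv)
    return options.old_server_domain or detected_domain


# Called with no arguments: fall back to the auto-detected domain.
print(resolve_old_domain([], "sites.example.com"))
# Called the old way, with an explicit domain: the argument wins.
print(resolve_old_domain(["old.example.com"], "sites.example.com"))
```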

Because our associations between sites and collections depend on a naming convention, we added a check to the CollectionForm to prevent renaming a site’s top-level collection.

    def patched_CollectionForm_clean_name(self):
        """
        Monkey patch Wagtail's Collection mechanism to prevent Collections created through the Site
        Creator from being renamed or deleted before their associated Site is deleted. This is
        necessary because several mechanisms assume that a Collection named "blah.example.com" will
        exist alongside the site hosted as "blah.example.com".

        NOTE: There's no "original" CollectionForm.clean_name() function. We are adding it from scratch.
        """
        if Site.objects.filter(hostname=self.instance.name).exists():
            raise ValidationError('Collections named after Sites cannot be renamed.')
        return self.cleaned_data['name']


    # Import the module or class we're patching, then patch it with the above function(s).
    from wagtail.admin.forms.collections import CollectionForm
    CollectionForm.clean_name = patched_CollectionForm_clean_name

Site Deletion

In the Wagtail data model, pages may belong to more than one site, so deleting a site does not automatically delete the site’s root_page (and its subpages). In our multitenanted setup, we never allow pages to belong to more than one site, so we want to delete the pages along with the site. And because our groups and collections do not have foreign keys to the Site model, when we delete a site, we also need to delete the related objects. We use a post_delete signal to do this work.

    # site_creator/signals.py
    from django.apps import apps
    from django.contrib.auth import get_user_model
    from django.db.models.signals import post_delete


    def post_site_delete_cleanup(sender, instance, **kwargs):
        """
        Makes sure Site-specific Collections and Groups are removed after deleting the associated Site.
        """
        hostname = instance.hostname
        Group = apps.get_model('auth', 'Group')
        Collection = apps.get_model('wagtailcore', 'Collection')
        Page = apps.get_model('wagtailcore', 'Page')

        # Delete Local users that were created for this Site. They are identified by having a username prefixed
        # with the Site's hostname.
        for user in get_user_model().objects.filter(username__startswith=hostname):
            user.delete()

        # Delete the Groups and Collections for this Site, which also deletes the contents of those Collections.
        Group.objects.filter(name__startswith=instance.hostname).all().delete()
        Collection.objects.filter(name__startswith=instance.hostname).all().delete()

        # Delete the homepage and all its children.
        Page.objects.descendant_of(instance.root_page, inclusive=True).all().delete()


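    # Model signals accept a lazy "app_label.ModelName" sender, so Site need not be imported here.
    post_delete.connect(post_site_delete_cleanup, sender='wagtailcore.Site')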