Multiple storages for the same FileField in Django

When you need to support user-uploaded files in Django (usually called media) you will probably use a FileField in your models. This is translated to a simple varchar (text) column in the database that contains a unique identifier for the file. This identifier is usually the path to the file, however this is not always the case!

What really happens is that Django has an underlying concept called a File Storage, which is a class that knows how to talk to the actual storage backend, and particularly how to translate the unique identifier stored in the db to an actual file object. By default Django stores files in the file system using the FileSystemStorage, however it is possible to use a different backend through an add-on (for example Amazon S3) or even to write your own.

Each FileField can be configured to use a different storage backend by passing the storage parameter; if you don’t use this parameter then the default storage backend is used. So you can easily configure a FileField that would upload files to your filesystem and another one that would upload files to S3.
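
For example, a minimal sketch (the model and field names are made up; the S3 backend comes from the django-storages package):

from django.core.files.storage import FileSystemStorage
from django.db import models
from storages.backends.s3boto3 import S3Boto3Storage  # from django-storages


class Document(models.Model):
    # stored on the local filesystem
    local_file = models.FileField(storage=FileSystemStorage())
    # stored on S3 (or any S3-compatible service) through django-storages
    s3_file = models.FileField(storage=S3Boto3Storage())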

One thing that is not supported, though, is using multiple storages for the same FileField depending on some attribute of the model instance. Unfortunately, in a recent project I had to do exactly that: we had a FileField on a model that contained hundreds of GBs of files stored on the filesystem; we wanted the files of new instances of that model to be uploaded to S3, but we also wanted to keep the old files on the filesystem to avoid moving all of them to S3 (which would result in a lot of downtime). I also wanted a way to be “flexible” here, i.e. to be able to change the storage backend again for some instances if needed, and definitely not to move/copy all these files!

If you take a peek at the FileField options you’ll see that there’s a storage parameter that can be a callable. However, this callable is evaluated when the model classes are loaded and is not evaluated again until the app is restarted, so it can’t be used to decide on the storage for each model instance.
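
For reference, the callable form looks something like this (a minimal sketch, the model is made up; note that the callable takes no arguments, so it has no access to the instance):

from django.conf import settings
from django.core.files.storage import FileSystemStorage
from django.db import models
from storages.backends.s3boto3 import S3Boto3Storage  # from django-storages


def select_storage():
    # takes no arguments and is evaluated when the model classes are loaded,
    # so it cannot vary per model instance
    return FileSystemStorage() if settings.DEBUG else S3Boto3Storage()


class MyModel(models.Model):
    file = models.FileField(storage=select_storage)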

The only thing that is evaluated each time a file is uploaded through the FileField is upload_to, when it is a function. This function receives the model instance and the original filename and returns the path that the file will be uploaded to.

The idea is to use this upload_to function to return a different path depending on the model instance and then use a custom storage backend that will use the path to decide on the actual storage backend to use.

This is the code I ended up with for the upload_to function:

def file_upload_path(instance, filename):
    dt_str = instance.created_on.strftime("%Y/%m/%d")
    file_storage = ""

    # Instances with id >= STORAGE_CHANGE_ID get the storage selection string in their path
    if instance.id >= settings.STORAGE_CHANGE_ID:
        file_storage = settings.STORAGE_SELECTION_STR + "/"

    return "protected/{0}{1}/{2}/{3}".format(file_storage, dt_str, instance.id, filename)

class Model(models.Model):
    file = models.FileField(upload_to=file_upload_path)

What happens here is that I have a setting STORAGE_CHANGE_ID that is the id of the instance after which all instances will use the new storage backend. You can use whatever method you want to decide which storage will be used; the only thing to keep in mind is to put the storage selection string somewhere in the returned path.

I also have a setting STORAGE_SELECTION_STR that is the string that will be used in the path to differentiate the storage backend. The STORAGE_SELECTION_STR has the value of minios3 for this project.
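
In settings.py these look something like the following (the STORAGE_CHANGE_ID value here is made up):

# settings.py
STORAGE_CHANGE_ID = 10000          # hypothetical cutoff; instances from this id on go to minio/S3
STORAGE_SELECTION_STR = "minios3"  # the marker that ends up in the file path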

Using this function, the paths of the instances whose id is >= STORAGE_CHANGE_ID will be of the form protected/minios3/2021/04/11/1234/filename.ext while the old files will keep paths of the form protected/2021/04/11/1234/filename.ext. Notice the minios3 string in between.

Of course this is not enough. We also need to tell Django to use the different storage backend for the new files. In order to do this we have to implement a custom storage class like this:

from django.core.files.storage import FileSystemStorage, Storage
from storages.backends.s3boto3 import S3Boto3Storage
from django.conf import settings


class FilenameBasedStorage(Storage):
    minio_choice = settings.STORAGE_SELECTION_STR

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def _open(self, name, mode="rb"):
        if self.minio_choice in name:
            return S3Boto3Storage().open(name, mode)
        else:
            return FileSystemStorage().open(name, mode)

    def _save(self, name, content):
        if self.minio_choice in name:
            return S3Boto3Storage().save(name, content)
        else:
            return FileSystemStorage().save(name, content)

    def delete(self, name):
        if self.minio_choice in name:
            return S3Boto3Storage().delete(name)
        else:
            return FileSystemStorage().delete(name)

    def exists(self, name):
        if self.minio_choice in name:
            return S3Boto3Storage().exists(name)
        else:
            return FileSystemStorage().exists(name)

    def size(self, name):
        if self.minio_choice in name:
            return S3Boto3Storage().size(name)
        else:
            return FileSystemStorage().size(name)

    def url(self, name):
        if self.minio_choice in name:
            return S3Boto3Storage().url(name)
        else:
            return FileSystemStorage().url(name)

    def path(self, name):
        if self.minio_choice in name:
            return S3Boto3Storage().path(name)
        else:
            return FileSystemStorage().path(name)

This class should be self-explanatory: it uses the settings.STORAGE_SELECTION_STR we mentioned above to decide which storage backend to use and then forwards each method to the corresponding backend (either the filesystem storage or the S3 storage).

One thing to notice is that the django.core.files.storage.Storage class this class inherits from has more methods that can be implemented (and that will raise if called without being implemented), however this implementation works fine for my needs.

One question some readers may have is what happens if the user uploads a file named test-minios3.pdf (i.e. a filename containing the STORAGE_SELECTION_STR). Well, you may just as well ignore it; such a file will always be saved on the minio-s3 storage backend. Or you can make sure to strip that string from the filename in the file_upload_path function. I chose to ignore it since it doesn’t matter for my use case.
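
If you prefer the second option, a minimal sketch of that variation of file_upload_path could be:

def file_upload_path(instance, filename):
    # strip the storage-selection marker from user-supplied filenames so that e.g.
    # "test-minios3.pdf" can't accidentally select the S3 backend for an old instance
    filename = filename.replace(settings.STORAGE_SELECTION_STR, "")
    dt_str = instance.created_on.strftime("%Y/%m/%d")
    file_storage = ""

    if instance.id >= settings.STORAGE_CHANGE_ID:
        file_storage = settings.STORAGE_SELECTION_STR + "/"

    return "protected/{0}{1}/{2}/{3}".format(file_storage, dt_str, instance.id, filename)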

Finally, we need to tell Django to use this storage class for the file field. We can do this by adding it to the FileField like:

file = models.FileField(upload_to=file_upload_path, storage=FilenameBasedStorage)

or we can configure it on the DEFAULT_FILE_STORAGE setting (for Django < 4.2) or on the STORAGES dict (for Django >= 4.2).
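
For example, something like this (a sketch; the dotted path assumes the storage class lives in a myapp/storage.py module):

# settings.py for Django >= 4.2
STORAGES = {
    "default": {"BACKEND": "myapp.storage.FilenameBasedStorage"},
    "staticfiles": {"BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage"},
}

# settings.py for Django < 4.2
# DEFAULT_FILE_STORAGE = "myapp.storage.FilenameBasedStorage"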

I hope this helps someone else that needs to do something similar!

Using Unpoly with Django

Over the past few years, there has been a surge in the popularity of frontend frameworks, such as React and Vue. While there are certainly valid use cases for these frameworks, I believe that they are often unnecessary, as most web applications can be adequately served by traditional request/response web pages without any frontend framework. The high usage of these frameworks is largely driven by FOMO and lack of knowledge about alternatives. However, using such frameworks can add unnecessary complexity to your project, as you now have to develop two projects in parallel (the frontend and the backend) and maintain two separate codebases.

That being said, I understand that some projects may require extra UX enhancements such as modals, navigation and form submissions without full page reloads, immediate form validation feedback, page fragment updates etc. If you want some of this functionality but do not want to hop on the JS framework train, you can use the Unpoly library.

Unpoly is similar to other libraries like intercooler, htmx or turbo, however I find it the easiest to use for the kind of projects I work on. These libraries allow you to write dynamic web applications with minimal changes to your existing server-side code.

In this guide, we’ll go over how to use Unpoly with Django. Specifically, we’ll cover the following topics:

  • An unpoly demo
  • Integrating unpoly with Django
  • Navigation improvements
  • Form improvements
  • Modal improvements (layers)
  • Integration with (some) django packages
  • More advanced concepts

The unpoly demo

Unpoly provides a demo application written in Ruby. You can go on and play with it for a bit to understand what it offers compared to a traditional web app.

I’ve re-implemented this in Django so you can compare the code with a non-unpoly Django app. It can be found on https://github.com/spapas/django-unpoly-demo and the actual demo in Django is at: https://unpoly-demo.spapas.net or https://unpoly-demo.fly.dev/ (deployed on fly.io) or https://unpoly-demo.onrender.com/ (deployed on render.com; notice the free tier of render.com is very slow, this isn’t related to the app). The demo app uses an ephemeral database so the data may be deleted at any time.

Try navigating the demo site and you’ll see things like:

  • Navigation feedback
  • Navigation without page reloads
  • Forms opening in modals
  • Modals over modals
  • Form submissions without page reloads
  • Form validation feedback without page reloads

All this is implemented mostly with traditional Django class based views and templates in addition to a few unpoly attributes.

To understand how much of a difference this makes, after you have taken a peek at the “companies” functionality in the demo, take a look at the actual code of that implementation (I’m only pasting the views, the other components are exactly the same as in a normal Django app):

class FormMixin:
    def form_valid(self, form):

        if form.is_valid() and not self.request.up.validate:
            if hasattr(self, "success_message"):
                messages.success(self.request, self.success_message)
            return super().form_valid(form)
        return self.render_to_response(self.get_context_data(form=form))

    def get_initial(self):
        initial = super().get_initial()

        initial.update(self.request.GET.dict())
        return initial


class CompanyListView(ListView):
    model = models.Company


class CompanyDetailView(DetailView):
    model = models.Company


class CompanyCreateView(FormMixin, CreateView):
    success_message = "Company created successfully"
    model = models.Company
    fields = ["name", "address"]


class CompanyUpdateView(FormMixin, UpdateView):
    model = models.Company
    success_message = "Company updated successfully"
    fields = ["name", "address"]


class CompanyDeleteView(DeleteView):
    model = models.Company

    def get_success_url(self):
        return reverse("company-list")

    def form_valid(self, form):
        self.request.up.layer.emit("company:destroyed", {})
        messages.success(self.request, "Company deleted successfully")
        return super().form_valid(form)

Experienced Django developers will immediately recognize that the above code has only two small differences from what a traditional Django app would have:

  • the check for self.request.up.validate on the form_valid of the FormMixin
  • the self.request.up.layer.emit on the DeleteView form_valid

We’ll explain these later. However, the thing to keep in mind is that this is the same as a good old Django app, without the need to implement special functionality like checks for ajax views, fragments, special form handling etc.

Integrating unpoly with Django

To integrate unpoly with Django you only need to include the unpoly JavaScript and CSS files in your project. These are normal static files that you can retrieve from the unpoly install page. Also, if you are using Bootstrap 3, 4 or 5 I recommend also downloading the corresponding unpoly-bootstrapX.js file.

Unpoly communicates with your backend through custom X-Up-* HTTP headers. You could use these headers directly, however it is also possible to install the python-unpoly library to make things easier. After installing that library you’ll add unpoly.contrib.django.UnpolyMiddleware to your MIDDLEWARE list, resulting in an extra up attribute on your request. You can then use this up attribute through the API for easier access to the unpoly headers.

To access up through your Django templates you can use request.up or add it to the default context using a context processor so you can access it directly.
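
A minimal sketch of such a context processor (the core.context_processors module path is an assumption; register it in the context_processors list of your TEMPLATES setting):

# core/context_processors.py
def unpoly(request):
    # expose the `up` attribute added by UnpolyMiddleware to all templates
    return {"up": getattr(request, "up", None)}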

To make sure that everything works, add the up-follow to one of your links, i.e change <a href='linkto'>link</a> to <a up-follow href='linkto'>link</a>. When you click on this link you should observe that instead of a full-page reload you’ll get the response immediately! What really happens is that unpoly will make an AJAX request to the server, retrieve the response and render it on the current page making the response seem much faster!

Unpoly configuration

The main way to use unpoly is to add up-x attributes to your html elements to enable unpoly behavior. However it is possible to use the unpoly js API (window.up or up) to set some global configuration. For example, you can use up.log.enable() and up.log.disable() to enable/disable the unpoly logging to your console. I recommend enabling it for your development environment because it will help you debug when things don’t seem to be working.

To use up to configure unpoly you only need to add it on a <script> element after loading the unpoly library, for example:

<script src="{% static 'unpoly/unpoly.min.js' %}"></script>
<script src="{% static 'unpoly/unpoly-bootstrap4.min.js' %}"></script>
<script src="{% static 'application.js' %}"></script>

And in application.js you can use up directly, for example to enable logging:

  up.log.enable()

We’ll see more up configuration directives later, however keep in mind that for a lot of up-x attributes it is possible to use the config to automatically add that attribute to multiple elements using a selector.

Using the up-follow directive you can start adding up-follow to all your links and you’ll get a much more responsive application. This is very simple and easy.

One interesting thing is that we didn’t need to change anything on the backend. The whole response will be retrieved by unpoly and will replace the body of the current page. Actually, it is possible to instruct unpoly to replace only a specific part of the page using a css selector (i.e replace only the #content div). To do this you can add the up-target attribute to the link, i.e <a up-target='#content' up-follow href='linkto'>link</a>. When unpoly retrieves the response, it will make sure that it has an #content element and put its contents to the original page #content element.

This technique is called linking to fragments in the unpoly docs. To see this in action, try going to the tasks in the demo and adding a couple of new tasks. Then try to edit one of those tasks. You’ll notice that the edit form of the task replaces the task show card! To do that, unpoly loads the edit task form and matches the .task element there with the current .task element and does the replacement (see here for rules on how this works).

Beyond the up-follow, you can also use two more directives to further improve the navigation:

  • up-instant to follow the link on mousedown (without waiting for the user releasing the mouse button)
  • up-preload to follow the link when the mouse hovers over the link

Using the up-main

To make things simpler, you can declare an element to be the default replacement target. This is done by adding the up-main attribute to an element. This way, all up-follow links will replace that particular element by default unless they have an up-target attribute themselves.

What I usually do is that I’ve got a base.html template looking something like this:

    {% include "partials/_nav.html" %}
    <div up-main class="container">
      {% include "partials/_messages.html" %}
      {% block content %}
      {% endblock %}
    </div>
    {% include "partials/_footer.html" %}

See the up-main on the .container? This way, all my up-follow links will replace the contents of the .container element by default. If I wanted to replace a specific part of the page, I could add the up-target attribute to the link.

If there’s no up-main element, unpoly will replace the whole body element.

It is possible to make all links (or links matching a selector) followable by default by using the up.link.config.followSelectors option. I would recommend doing this only on greenfield projects where you’ll test the functionality anyway. For existing projects I think it’s better to add the up-follow attribute explicitly to the links you want to make followable.

I recommend this because there are cases where using unpoly will break some pages, especially if you have JavaScript code that relies on the page being loaded. We’ll talk about this in the up.compiler section.

If you have made all the links followable but you want to skip some links and do a full page reload instead, add the up-follow=false attribute to the link or use the up.link.config.noFollowSelectors config to make multiple links non-followable.

You can also make all links instant or preload for example by using up.link.config.instantSelectors.push('a[href]') to make all followable links load on mousedown. This should be safe because it will only work on links that are already followable.

One very useful feature of unpoly is that it adds more or less free navigation feedback. This can be enabled by adding an [up-nav] attribute to the navigation section of your page. Unpoly will then add an up-current class to the links in that section that match the current URL. This works whether you are using up-follow or not. You can then style .up-current links as you want.

If you are using Bootstrap along with the unpoly-bootstrap integrations you’ll get all that without any extra work! The unpoly-bootstrap has the following configuration:

up.feedback.config.currentClasses.push('active');
up.feedback.config.navSelectors.push('.nav', '.navbar');

So it will automatically add the up-nav behavior to .nav and .navbar elements and will add the active class to the current link (in addition to the .up-current class). This is what happens in the demo; if you take a peek you’ll see that there are no up-nav elements in the navigation bar (since these are handled by the unpoly-bootstrap integration) and we style the .active nav links.

Aliases for navigation feedback

Unpoly also allows you to add aliases for the navigation feedback. For example, you may have /companies/ and /companies/new and you want the companies nav link to be active on both of them. To allow that you need to use the up-alias attribute on the link like

<a class='nav-item nav-link' up-follow href='{% url "company-list" %}' up-alias='{% url "company-list" %}new/'>Companies</a>

(notice that in my case the url of company-list is /companies/ that’s why I added {% url "company-list" %}new/ on the alias so the resulting alias path would be /companies/new/), or even add multiple links to the alias

<a class='nav-item nav-link' up-follow href='{% url "company-list" %}' up-alias='{% url "company-list" %}*'>Companies</a>

This will add the up-current class to the a element whenever the url starts with /companies/ (i.e /companies/, /companies/new, /companies/1/edit etc).

Please notice that it is recommended to have a proper url hierarchy for this to work well. For example, if you have /companies_list/ and /add_new_company/ you’ll need to add the aliases like up-alias='/companies_list/ /add_new_company/' (notice the space between the urls to add two aliases). Also, if you want to also handle URLs with query parameters, i.e. /companies/?name=foo, then you’ll need to add ?*, i.e. /companies/?*. These urls aren’t aliased by default, so /companies/ doesn’t match /companies/?name=foo unless you add an alias.

One final remark is that it is possible to do some trickery to automatically add up-alias to all your nav links. This is useful in case you have many nav elements and you don’t want to add aliases to each one of them, for example, using this code:

  up.compiler('nav a[href]', (link) => {
    if(!link.href.endsWith('#')) link.setAttribute('up-alias', link.href + '*')
  })

an up-alias attribute will be added to all nav links. The callback of the compiler is called when the selector is matched and in this case adds the up-alias attribute to the link. We’ll talk more about compilers later.

Handling forms

Unpoly can also be used to handle forms without page reloads, similar to following links. This is simple to do by adding an up-submit attribute to your form. Also, similar to links, you can make all your forms handled by unpoly, but I recommend being cautious before doing this on existing projects to make sure that stuff doesn’t break.

When you add an up-submit to a form unpoly will do an AJAX post to submit the form and replace the contents of the up-target element with the response (if you don’t specify an up-target element, it will use the up-main element in a similar way as links). This works fine with the default Django behavior, i.e when the form is valid Django will do a redirect to the success url, unpoly will follow that link and render the response of the redirect.

Integrating with messages

Django has the messages framework that can be used to add one-time flash messages after a form is successfully submitted. You need to make sure that these messages are actually rendered! For example, in the base.html template I mentioned before, we’ve got the following:

    {% include "partials/_nav.html" %}
    <div up-main class="container">
      {% include "partials/_messages.html" %}
      {% block content %}
      {% endblock %}
    </div>
    {% include "partials/_footer.html" %}

please notice that we’ve got the partials/_messages.html template included in the up-main element (inside the container). This means that when unpoly replaces the contents of the up-main element with the response of the form submission, the messages will be rendered as well. So it will work fine in this case.

However, if you are using up-target to render only particular parts of the page, the flash messages will actually be lost! This happens because unpoly will load the page with the flash messages normally, so these messages will be consumed; then it will match the up-target selector and display only that part of the response.

To resolve that you can use the up-hungry attribute on your messages. For example, in the partials/_messages.html template we’ve got the following:

<div id='flash-messages' class="flash-messages" up-hungry>
    {% for message in messages %}
        <div class="alert fade show {% if message.tags %} alert-{% if 'error' in message.tags %}danger{% else %}{{ message.tags }}{% endif %}{% endif %}">
            {{ message }}
        </div>
    {% endfor %}
</div>

The up-hungry attribute will make unpoly refresh that particular part of the page on every page load even if it’s not on the target. For example notice how the message is displayed when you edit or mark as done an existing task in the demo.

However also notice that no messages are displayed if you create a new task! This happens because the actual response is “eaten” by the layer and the messages are discarded! We’ll see how to fix that later.

Immediate form validation

Another area in which unpoly helps with our forms is that if we add the up-validate attribute to our form, unpoly will do an AJAX post to the server whenever the input focus changes and will display the errors in the form without reloading the page. For this we need a small modification to our views to check if unpoly wants to validate the form. I’m using the following form_valid on a form mixin:

def form_valid(self, form):

    if form.is_valid() and not self.request.up.validate:
        if hasattr(self, "success_message"):
            messages.success(self.request, self.success_message)
        return super().form_valid(form)
    return self.render_to_response(self.get_context_data(form=form))

So if the form is not valid or this is an unpoly validation request we’ll render the response - this will render the form with or without errors. However, if the form is actually valid and this is not an unpoly validation request we’ll do the usual form save and redirect to the success url. This is enough to handle all cases and is very simple and straightforward. It also works fine without unpoly since up.validate will always be False in that case.

One thing to keep in mind is that this works fine in most cases but may result in problematic behavior if you use components that rely on javascript onload events. The up-validate will behave more or less the same as up-follow links in this regard.

Other form helpers

Beyond these, unpoly offers a bunch of form helpers to run callbacks or auto-submit a form when a field is changed. Most of this functionality can be replicated by other js libraries (e.g. jquery) or even by vanilla JS and is geared towards the front-end, so I won’t cover it further here.

Understanding layers

One of the most powerful features of unpoly is layers. To understand the terminology, a layer is any page that is stacked on top of another. The initial page is called the root layer; all other layers are called overlays. Layers can be arbitrarily opened and stacked, and there’s no limit on the number of layers that can be opened.

An overlay can be rendered like a modal / popup / drawer. The simplest way to use an overlay is to add an up-layer='new' attribute to a link. For example, in the demo app, the link to open a company is like this:

  <a
    up-layer='new'
    up-on-dismissed="up.reload('.table', { focus: ':main' })"
    up-dismiss-event='company:destroyed'
    href="{% url 'company-detail' company.id %}">{{ company.name }}</a>

(ignore the dismiss-related attributes for now). This opens a new modal dialog with the contents of the company detail. It will render the whole contents of the up-main element inside the modal since we don’t provide an up-target. If we added an up-target='.projects' attribute to this it would render only the .projects element inside the modal (but remember that it will retrieve the whole response since the /companies/detail/id is a normal django DetailView). So with up-layer='new' we open a page on a new overlay/modal. If we also add an up-target to it we’ll open only a particular part of that page.

You can use the up-mode attribute to change the kind of overlay; the default is a modal. Also, if you want to configure the ways this modal closes you can use the up-dismissable attribute, for example add up-dismissable='button' to allow closing only with the X button on the top right. Another useful thing is the up-size attribute for changing the size of the overlay. I recommend playing a bit with these options to get a feel for how they work and what you can do with them.

Static overlay content

An overlay can also contain “static” content (i.e not follow a link but display some html) by using the up-content attribute. This is how the green dots are implemented, their html is similar to this:

<a href="#" class="tour-dot viewed" up-layer="new popup" up-content="<p>Navigation links have the <code>[up-follow]</code> attribute. 
    <p>
        <a href=&quot;#&quot; up-dismiss class=&quot;btn btn-success btn-sm&quot;>OK</a>
    </p>
    " up-position="right" up-align="top" up-class="tour-hint" up-size="medium">
</a>

Notice that the up-content contains a whole html snippet. This is implemented in Django using the following template tag:

@register.tag("tourdot")
def do_tourdot(parser, token):
    nodelist = parser.parse(("endtourdot",))
    parser.delete_first_token()
    return TourDotNode(nodelist)


class TourDotNode(template.Node):
    def __init__(self, nodelist):
        self.nodelist = nodelist

    def render(self, context):
        rendered = self.nodelist.render(context).strip()
        size = "medium"
        if len(strip_tags(rendered)) > 400:
            size = "large"
        if not rendered.startswith("<p"):
            rendered = "<p>{}</p>".format(rendered)

        rendered += """
        <p>
            <a href="#" up-dismiss class="btn btn-success btn-sm">OK</a>
        </p>
        """
        output = escape(rendered)
        return """
        <a 
            href="#" class="tour-dot" up-layer="new popup" 
            up-position="right" up-align="top" up-class="tour-hint"
            up-content="{}"
            up-size="{}"
            >
        </a>
        """.format(
            output, size
        )

So we can do something like this in our Django templates:

{% tourdot %}
  <p>Navigation links have the <code>[up-follow]</code> attribute. Clicking such links only updates a <b>page fragment</b>. The remaining DOM is not changed.</p>
{% endtourdot %}

Advanced layers/overlays

Opening overlays for popups or for modals to view links that don’t have interactivity is simple. However, when you open forms in overlays and need to handle them (i.e. close the modals only when the form is submitted successfully) the situation unfortunately starts to get more complex. I recommend starting by reading the subinteractions section of the unpoly documentation to understand how these things work. In the following subsections we’ll talk about specific cases and how to handle them with unpoly layers and Django.

Opening new layers over existing ones

How would opening a new layer over an existing layer (i.e. a modal inside a modal) work? All links and forms that are handled in an existing layer will be handled in the same layer. So if we have opened a layer and there are up-follow links in the html of the layer, the user will be able to follow them normally inside that layer (of course if there are non up-follow links then a full page reload will be performed and the layer will disappear without a trace).

If we want to open a new layer we need to use the up-layer='new' attribute on that link; it doesn’t matter if this is inside an already opened layer, it will work as expected and open a layer-in-a-layer. If the parent layer is an overlay then it will open an overlay-in-an-overlay.

In the demo, if you click on an existing company to see its details you’ll get an overlay. If you try to edit that company the edit form will be opened in the same layer (notice that if you press the X button to close it you’ll go back to the company list without layers). Compare this with the behavior when adding a new project or viewing an existing one. You’ll get an overlay inside the parent overlay (both overlays should be visible). You need to close both overlays to go back to the company detail at the root layer.

Even more impressive: go to the company detail layer, click an existing project to get to the project detail layer, click edit; the edit form will be opened on the project detail layer. You can edit the project or even delete it; when you close that overlay the company overlay will be updated with the new data and work fine! All this also works fine from the project detail list without any modifications to the Django code.

The thing to remember here is that the layer behavior is very intuitive and is compatible with how a server side application works. Everything should work the same no matter if the link is opened in an overlay or in a new page or even an overlay over an overlay.

Closing overlays

There are three main ways to close an overlay (beyond of course using the (X) button or esc etc):

  • Visiting a pre-defined link
  • Explicitly closing the overlay from the server
  • Emitting an unpoly event

Also, when an overlay is closed we can decide if the overlay did something (i.e. the user saved the form) or not (i.e. the user clicked the X button). These cases are called accepted and dismissed respectively. We can use this to do different things. All the methods of closing an overlay have a version for accepting or dismissing the overlay.

To close the overlay on visiting a link we’ll use the up-accept-location and up-dismiss-location respectively. For example, let’s take a peek on the new company link:

  <a
    class='btn btn-primary'
    up-layer='new'
    up-on-accepted="up.reload('.table', { focus: ':main' })"
    up-accept-location='/core/companies/detail/$id/'
    href='{% url "company-create" %}'>New company</a>

The important thing here is the up-accept-location. When Django creates a new object it redirects to the detail view of that object. In our case this detail view is '/core/companies/detail/$id/'; the $id is an unpoly thingie that will be replaced by the id of the new object and will be the result value of the overlay. This value (the id) can then be used on the up-on-accepted callback if we want.

Now, let’s suppose that we want to close the overlay when the user clicks on a cancel button that returns to the list of companies. We can do that by adding the up-dismiss-location attribute to that <a>

up-dismiss-location='{% url "company-list" %}'

The difference between these two is that the up-on-accepted callback will only be called when the overlay is accepted and not when it is dismissed.

Handling hardcoded urls

One thing that Django developers may not like is that the url is hardcoded here. This is because using {% url "company-detail" "$id" %} will not work with our urls, since we have the following path for the company detail: "companies/detail/<int:pk>/". We could change it to "companies/detail/<str:pk>/" to make it work, but then it would allow strings in the url and throw a 500 error instead of a 404 when the user puts a string there (to improve that we would have to override the get_object of the DetailView to handle the string case). Another way to improve this is to create a urlid template tag like this:

from django.urls import reverse

@register.simple_tag
def urlid(path, arg):
    # reverse the url with a numeric placeholder and then put unpoly's $id back in its place
    if arg == "$id":
        arg = 999

    url = reverse(path, args=[arg])
    return url.replace("999", "$id")

And then using it like this on the up-accept-location:

up-accept-location='{% urlid "company-detail" "$id" %}'
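
For completeness, if you went with the <str:pk> route mentioned above instead, the get_object override could look something like this (a sketch):

from django.http import Http404
from django.views.generic import DetailView


class CompanyDetailView(DetailView):
    model = models.Company

    def get_object(self, queryset=None):
        # with "companies/detail/<str:pk>/" return a 404 instead of a 500
        # when the pk in the url is not numeric
        if not str(self.kwargs.get(self.pk_url_kwarg, "")).isdigit():
            raise Http404("Invalid company id")
        return super().get_object(queryset)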

Explicitly closing the layer

To close the layer from the server you can use the X-Up-Accept-Layer or X-Up-Dismiss-Layer response header. When unpoly sees one of these headers in a response it will close the overlay by accepting or dismissing it.

To do that from Django if you have integrated the unpoly middleware, call request.up.layer.accept() and request.up.layer.dismiss() respectively (passing an optional value if you want).
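
For example, a hypothetical create view could close the overlay itself instead of relying on up-accept-location; a sketch using the request.up API described above:

class CompanyCreateOverlayView(CreateView):
    model = models.Company
    fields = ["name", "address"]

    def form_valid(self, form):
        response = super().form_valid(form)
        # sends the X-Up-Accept-Layer header; the dict becomes the overlay's acceptance value
        self.request.up.layer.accept({"id": self.object.id})
        return response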

The same feature can be used to close the overlay from the client side. For example, if you want to close the overlay when the user clicks on a cancel button that returns to the list of companies you can do that by adding the up-accept or up-dismiss attribute, like:

<a href='{% urlid "company-detail" "$id" %}' up-dismiss>Return</a>

Please notice that the href here could be like href='#' since this is javascript only to close the overlay, however we added the correct href to make sure the return button will also work when we open the link in a new page (without any overlays).

Please notice the difference between this and the up-accept-location or up-dismiss-location we mentioned before. In this case the up-accept/dismiss directive is placed on the link that closes the overlay. In the former case the up-accept/dismiss-location directive is placed on the link that opens the overlay.

Closing the layer by emitting an unpoly event

The final way to close an overlay is by emitting an event. Unpoly can emit events both from the server, using the X-Up-Event response header or using request.up.emit(event_type, data) from the unpoly Django integration. Also events can be emitted from the client side using the up-emit attribute.

To close the overlay from an event we need to use up-accept-event and up-dismiss-event on the link that opens the overlay.

Let’s see what happens when we delete a company. We’ve got a form like this:

<form up-submit up-confirm='Really?' class="d-inline" method='POST' action='{% url "company-delete" company.id %}'>
  {% csrf_token %}
  <input type='submit' value='Delete' class='btn btn-danger mr-3' />
</form>

This form asks the user for confirmation (using the up-confirm directive) and then submits the form to the company delete view. The CompanyDeleteView is like this:

class CompanyDeleteView(DeleteView):
    model = models.Company

    def get_success_url(self):
        return reverse("company-list")

    def form_valid(self, form):
        self.request.up.layer.emit("company:destroyed", {})
        return super().form_valid(form)

So, it will emit the company:destroyed event and redirect to the list of companies (this is needed to make sure that delete works fine if we call it from a full page instead of an overlay). The company detail view overlay is opened from the following link:

<a
  up-layer='new'
  up-on-dismissed="up.reload('.table', { focus: ':main' })"
  up-dismiss-event='company:destroyed'
  href="{% url 'company-detail' company.id %}">{{ company.name }}</a>

Notice that we have the up-dismiss-event here. If we didn’t have that then the overlay wouldn’t be closed when we deleted the company; instead we’d see the list of companies in the overlay because of the redirect on the Django side! Also, instead of the up-dismiss-event we could use up-dismiss-location='{% url "company-list" %}' similar to what we discussed before. If we did it that way we wouldn’t even need to do anything unpoly related in our DeleteView, however using events for this is useful for educational reasons and we’ll see later how events will help us display a message when companies are deleted.

Doing stuff when a layer is closed

After a layer is closed (and depending on whether it was accepted or dismissed) unpoly allows us to use callbacks to do stuff. The most obvious things are to reload the list of results if a result is added/edited/deleted or to choose a result in a form if we used the overlay as an object picker.

The callbacks are up-on-accepted and up-on-dismissed.

Let’s see some examples from the demo.

On the new company link we’ve got up-on-accepted="up.reload('.table', { focus: ':main' })". However on the show details company link we’ve got up-on-dismissed="up.reload('.table', { focus: ':main' })". This seems a little strange at first (why up-on-accepted on the new vs up-on-dismissed on the detail) but we can explain it.

First of all, the up.reload method will do an HTTP request and reload that specific element from the server (in our case the .table element that contains the list of companies). The focus option that is passed instructs unpoly to move the focus to that element (see https://unpoly.com/focus-option).

For the “Add new” company we reload the companies when the form is accepted (when the user clicks on the “Save” button). However for the show details overlay we reload every time the overlay is dismissed, because when the user edits a company the layer will not be closed but will display the edited company data. Also, when we delete the company the layer will be dismissed.

Notice that if the user clicks the company details and then presses the (X) button we’ll still do a reload even though it may not be needed because we can’t know if the user actually edited the company or not before closing the overlay. This is a little bit of a tradeoff but it’s not a big deal.

Actually, it is possible to know whether the overlay was dismissed because the user clicked the (X) button (or pressed escape) or because the object was deleted. This is useful if we want to display a message to the user when the company is deleted, since we need to differentiate between these cases. We’ll see how in the section about overlays and messages.

On the company detail we’ve got up-on-accepted='up.reload(".projects")' for adding a new project but same as before we’ve got up-on-dismissed='up.reload(".projects")' for viewing the project detail. The .projects element is the projects holder inside the company detail. This is exactly the same behavior we explained before.

On the project form we’ve got up-on-accepted both on the suggest name and on the new company button. In the first case, we are opening the name suggestion overlay like this:

<a
    up-layer='new popup'
    up-align='left'
    up-size='large'
    up-accept-event='name:select'
    up-on-accepted="up.fragment.get('#id_name').value = value.name"
    href='{% url "project-suggest-name" %}'>Suggest name</a>

Notice that this overlay will be accepted when it receives the name:select event. This event passes the selected name, which is then put in the #id_name input. The up.fragment.get is used to retrieve the input. To understand how this works we also need to see the name suggestion overlay. It is more or less similar to:

  {% for n in names %}
    <a up-emit="name:select"
       up-emit-props='{"name": "{{ n }}"}'
       class="btn btn-info text-light mb-2 mr-1"
       tabindex="0">
      {{ n }}
    </a>
  {% endfor %}

So we are using the up-emit directive here to emit the name:select event and we pass it some data which must be a json object. Then this data will be available as a javascript object named value on the up-on-accepted callback.

So the flow is:

  1. When we click the suggest name link we open a new overlay and wait for the name:select event to be emitted. We don’t care if we are on a full page or already inside an overlay
  2. The suggest name overlay displays a list of <a> elements that emit the name:select event when clicked and also pass the selected name as data on the event
  3. The overlay opener receives the name:select event and closes the overlay. It then uses the data to fill the #id_name input

The second case is similar but instead of filling an input it opens a new overlay to create and select a new company. This is the create company link from inside the project form:

  <a href='{% url "company-create" %}'
      up-layer='new'
      up-accept-location='{% urlid "company-detail" "$id" %}'
      up-on-accepted="up.validate('form', { params: { 'company': value.id } })"
  >
      New company
  </a>

Nothing extra is needed from the company form side! We use the up-accept-location to accept the overlay when the company is created (so that the user would be redirected to the company-detail view). Then we call up.validate('form', { params: { 'company': value.id } }) after the overlay is accepted. First of all, please remember that when we use the up-accept-location the overlay result will be an object with the captured parts of the url. In this case we capture the new company id. Then, we call up.validate passing it the form and the company id we just retrieved (i.e. the id of the newly created company).

It is important to understand that we do up.validate here instead of simply setting the value of the select to the newly created id (similar to what we did before with the name), because the newly created value is not in the options that this select contains so it can’t be picked at this time; when the validate request returns, though, the options will contain the newly created company so it can be selected then.

If we wanted to select the newly created company without doing the validate, we’d instead need to first add a new option to the select with the correct id and then set it to that value (which is a little bit more complex since we don’t know the name of the new company at this point). To properly implement that, and to further understand how unpoly works, we’d need to emit a company:create event from our CompanyCreateView which would contain as data both the id and the name of the newly created company. Then we’d change our accept condition to up-accept-event='company:create'. Finally, our up-on-accepted would add a new option with the value.name and value.id it received from the event and select that option.

Overlays and messages

This is probably the most complex part of integrating unpoly with Django. The problem is that when we do an action the messages will be displayed on the page that our response redirects to. If we don’t display that page but only use it as an up-accept-location we’ll miss these messages. There are various solutions for how this can be fixed, and there is also a long discussion in the unpoly repo discussions about it.

We’ve already discussed the up-hungry attribute on your messages container element that will re-read its contents from every response. This resolves all the cases where the response is not discarded. For example, try to edit a project and you’ll see the edit message in the overlay (instead of the main page). This is because the overlay renders the up-main element (which includes the messages partial), so the message is rendered within the overlay’s response.

The problematic behavior is when creating a new project or deleting one. In both these cases we discard the response, so the messages are lost. The simplest way to actually fix this is to ignore the server-side message and render the message again from unpoly. This avoids changing anything in your server-side code. So, in order to implement this, we’ll use the following function, which we add to application.js (after a suggestion in the aforementioned discussion):

async function reloadWithFlash(selector, flash) {
  await up.reload(selector)
  up.element.affix(document.getElementById('flash-messages'), '.alert.fade.show.alert-success', { text: flash })
}

This function calls up.reload with a selector we pass to it (e.g. up.reload('.table')) and waits until the reload has finished. Then it adds a new element to the flash-messages container with the flash text we pass to it. In order to use it, we’ll change the new company link to:

      <a
        class='btn btn-primary'
        up-layer='new'
        up-on-accepted="reloadWithFlash('.table', 'Company created!')"
        up-accept-location='{% urlid "company-detail" "$id" %}'
        href='{% url "company-create" %}'>New company</a>

(remember that before, up-on-accepted was up-on-accepted="up.reload('.table', { focus: ':main' })"; let’s skip focus for now, it’s not important). If we try it this way we’ll notice that we get the Company created! message after the overlay is closed! As I said before, the problem with this is that we ignore the server-side message and duplicate the message both on the server and on the client side. The Django-side message will be used when we open the /companies/new link in a new page (not an overlay), so the overlay functionality won’t be used and the message will be rendered properly in the response. When we use an overlay, the client-side message will be rendered instead.

Another solution would be to change our CompanyCreateView to redirect to the companies list page (instead of the newly created page). In this case, we can change the new company form like:

<form up-submit up-validate method="POST" up-layer='root'>
  ...
</form>

Adding the up-layer='root' will render the response in the root layer, which will close the overlay and render everything in the up-main element. Since we redirect to the companies list, we’ll get the list of companies along with the server-side message. This solution is actually simpler but modifies our server-side app (instead of the usual behavior of redirecting to the new company’s detail page we’ll redirect to the companies list).
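
In code, that change is just a different success url on the create view, something like:

class CompanyCreateView(FormMixin, CreateView):
    model = models.Company
    fields = ["name", "address"]
    success_message = "Company created successfully"

    def get_success_url(self):
        # redirect to the companies list instead of the newly created company's detail page
        return reverse("company-list")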

Let’s now talk about delete. As we’ve already discussed above, the company detail overlay will be closed either when the user closes it explicitly by clicking the (X) button or because the company was deleted. In both cases we want to reload the companies but when the company is deleted we also want to display a message. So we need to know when the overlay was closed because the company was deleted vs when the overlay was closed explicitly by the user.

Right now we’ve got

<a
  up-layer='new'
  up-dismissable='button'
  up-on-dismissed="up.reload('.table', { focus: ':main' })"
  up-dismiss-event='company:destroyed'
  href="{% url 'company-detail' company.id %}">{{ company.name }}</a>

For starters, we’ll add the following function:

async function reloadWithFlashIfEvent(selector, flash, value) {
  await up.reload(selector, { focus: ':main' })
  if(value instanceof Event) {
    up.element.affix(document.getElementById('flash-messages'), '.alert.fade.show.alert-danger', { text: flash })
  }
}

and change up-on-dismissed to up-on-dismissed="reloadWithFlashIfEvent('.table', 'Company deleted!', value)" on the open overlay link.

The up-on-dismissed and up-on-accepted callbacks are passed these parameters by unpoly:

  • this: the link that originally opened the overlay
  • layer: an up.Layer object for the dismissed overlay
  • value: the overlay’s dismissal value
  • event: an up:layer:dismissed event

If the overlay was dismissed because the user clicked the (X) button, the value would be a string like ':button' (there are similar string values for pressing escape or clicking outside the modal). However, if it was dismissed because of the company:destroyed event, the value will be an Event object. So we pass the value to our reloadWithFlashIfEvent callback and check if it is an Event object. If it is, we know that the overlay was dismissed because the company was deleted and we can display the flash message. If it’s not, we know that the overlay was dismissed because the user clicked the (X) button and we won’t display the flash message.

Another way we could implement this would be if we closed the company detail overlay when the company was deleted and returned the response (which is the company list view) to the root layer. Something like this:

<form up-submit up-confirm='Really?' class="d-inline" method='POST' action='{% url "company-delete" company.id %}' up-layer='root'>

(notice we added the up-layer='root' attribute). For this to work we would need to not reload in the up-on-dismissed function, because if we reload the companies list the contents of the flash-messages container will be re-read (since it has the up-hungry attribute) and be immediately cleared out! However, in this case we do need to reload because a company may have been edited!

Improving delete

Right now, the delete button is a form, similar to this:

<form up-submit up-confirm='Really?' class="d-inline" method='POST' action='{% url "company-delete" company.id %}'>
  {% csrf_token %}
  <input type='submit' value='Delete' class='btn btn-danger mr-3'  />
</form>

So this is an unpoly-handled form and will display a Really? javascript prompt to make sure the user really wants to delete the company.

I have to confess that I don’t like javascript prompts because they can’t be styled and seem out of context from the app. However we can improve that behavior with unpoly. Here’s an improved version of the delete functionality:

<a class='btn btn-danger' up-layer="new" up-content='
      <h3>Delete company {{ company.name }}</h3>
      Do you want to delete the company? 
      <form up-submit up-target=".table" up-layer="root" class="d-inline" method="POST" action="{% url "company-delete" company.id %}">
        {% csrf_token %}
        <input type="submit" value="Yes" class="btn btn-danger mr-3"/>
        <a href="#" class="btn btn-secondary"  up-dismiss>No</a>
      </form>
      '>Delete</a>

We changed the delete button to open a new layer. Instead of having a special view for the delete confirmation, we’re using the up-content attribute to directly pass the static HTML for the confirmation, which actually includes the delete form like before. Notice that we also include an up-dismiss button that closes the overlay when the user presses No. The up-layer of the form is root, so when the form is submitted it will close both the confirmation overlay and the company detail overlay! Now, we’ll change the reloadWithFlashIfEvent like this:

async function reloadWithFlashIfEvent(selector, flash, value) {
  await up.reload(selector, { focus: ':main' })
  if(value instanceof Event || value == ':peel') {
    up.element.affix(document.getElementById('flash-messages'), '.alert.fade.show.alert-danger', { text: flash })
  }
}

This checks if the value is an Event or ':peel'; the latter is the value that is passed when the overlay is dismissed because we use up-layer='root' on the delete form.

Improving interaction with Django packages

There are two very important packages that I use on almost all my projects: django-tables2 and django-filter. You can see these in action at the /core/tf/ path on the demo app. You’ll see that:

  • Filtering is instant (when entering a character it filters without the need to explicitly submit the form)
  • The row detail links open in an overlay
  • Sorting and pagination are handled by unpoly (so they don’t do full page reloads)
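
On the Django side, the view behind this page is the standard django-tables2 / django-filter combination; something like this sketch (the filter class and template name are assumptions):

import django_filters
from django_filters.views import FilterView
from django_tables2 import SingleTableMixin


class CompanyFilter(django_filters.FilterSet):
    name = django_filters.CharFilter(lookup_expr="icontains")

    class Meta:
        model = models.Company
        fields = ["name"]


class CompanyTableFilterView(SingleTableMixin, FilterView):
    table_class = CompanyTable  # the table class shown a bit further below
    filterset_class = CompanyFilter
    template_name = "core/company_table_filter.html"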

To have the instant filtering we’ve changed our filter form like this:

<form up-autosubmit up-delay='250' class='form-inline' method='GET' action='' up-target='.form-data'>
    {{ filter.form|crispy }}
    <input class='btn btn-info' type='submit' value='Filter'>
    <a up-follow href='{{ request.path }}' class='btn btn-secondary'>Reset</a>
</form>

Notice the up-autosubmit; this will submit the form whenever a field changes. Also, the up-delay adds a small delay before the form is submitted so when the user types foo it will do 1 query instead of 3 (if they type fast enough of course). The up-target attribute is used to specify the element that will be updated with the response. In this case we’re using a .form-data element that includes the whole table (rendered using django-tables2 of course):

<div class='form-data'>
  {% render_table table %}
</div>

To open the links in a layer we only need to pass the correct parameters to the table field, for example in our case the table is like this:

class CompanyTable(tables.Table):
    id = tables.LinkColumn(
        "company-detail",
        args=[A("id")],
        attrs={
            "a": {
                "class": "btn btn-primary btn-sm",
                "up-on-dismissed": "up.reload('.table', { focus: ':main' })",
                "up-layer": "new",
            }
        },
    )

    class Meta:
        model = models.Company
        template_name = "django_tables2/bootstrap4.html"
        fields = ("id", "name", "address")

So we just pass the attributes directly to the link’s a element. Nothing really fancy is needed.

Furthermore, notice that we use the built-in bootstrap4 template; we don’t change the template at all. The original django-tables2 template has no unpoly integration! So if we leave it like this the pagination and header links will trigger full request/response cycles. To fix that, we could override the template with our own, however this is not an ideal solution for me.

Instead, we can use up.compiler:

up.compiler('.pagination .page-item a.page-link', (link) => {
  link.setAttribute('up-target', ".table-container")
})

up.compiler('th.orderable a[href]', (link) => {
  link.setAttribute('up-target', ".table-container")
})

The up.compiler function takes a CSS selector and a callback function. The callback function is called whenever a snippet matching the selector is added to the DOM. In this case we’re adding the up-target='.table-container' unpoly attribute to both the pagination and the table header ordering links. The .table-container is the element that contains the table (it is added by django-tables2).

This way, when unpoly sees these links it will add the up-target attribute (and functionality) to them without the need to override any templates.

Advanced concepts

We’ll discuss some more advanced concepts of unpoly now.

More about up.compiler

The up.compiler function is very powerful. We already used it to add functionality to all our nav links (see the navigation aliases before) so we don’t forget any, and to add the up-target to our table links to avoid overriding the django-tables2 templates.

Beyond these, the most important use of up.compiler is to replace the javascript on-load (or jquery $(function() {})) event. Most common javascript libraries are initialized when the document is ready. Unfortunately, when a page is loaded through unpoly this event will not be triggered, so the javascript elements will not be initialized! Let’s suppose that we’ve got a bunch of traditional jquery-ui datepicker elements and all these have the .datepicker css class. Normally we’d initialize them like this:

$(function() {
  $('.datepicker').datepicker()
})

If we are to load a form with these elements through unpoly we won’t get the datepicker functionality. To fix this we can use up.compiler:

up.compiler('.datepicker', (element) => {
  $(element).datepicker()
})

So when unpoly sees a .datepicker element it will call that callback function and initialize it! This will work properly if you follow links through up-follow or open new overlays with up-layer='new'.

Passing context from unpoly to server

Unpoly has an up-context attribute that can be used to pass context to the server. This must be a json object and can then be used to change the response based on that context. If we are using the unpoly python package then the context will be available in the request.up.context dictionary.

Let’s see a particular example from the demo. When we create a new task we’ve got the following link:

<a
    class='btn btn-primary'
    up-layer='new'
    up-context='{"new_task": true}'
    up-accept-location='/core/tasks/detail/$id/'
    up-on-accepted="reloadWithFlash('.tasks', 'Task created!')"
    href='{% url "task-create" %}'>New task</a>

Compare this with the edit link:

<a up-target='.task' href='{% url "task-update" task.id %}' class='btn btn-sm btn-outline-secondary'>Edit</a>

Notice that the up-context is only included in the new link. Now, let’s see how the task form is implemented:

<form up-submit {% if not up.context.new_task %}up-target='.task'{% endif %} class='task card' method="POST">
    {% csrf_token %}
    <div class="card-body d-flex flex-column">
        <div class="form-group flex-grow-1 mb-0">
            {{ form|crispy }}
        </div>
        <div class="flex-grow-0">
            <input type='submit' class='btn btn-primary mt-2' value='Save'>
        </div>
    </div>
</form>

So, using Django we check to see if there’s new_task in the context and add an up-target='.task' if not. This way, we’ll get an up-target='.task' in the form only if we click the edit task button. Beyond this the form is the same for both the new and edit links.

This is needed because when we open the edit task it will be loaded in the same .task element we clicked the edit link from (remember that unpoly is smart enough to match closer elements). When the form is submitted we want the detail view of the task to also be rendered on the same .task element so we use the up-target='.task'. This isn’t needed in the create case since the form will be rendered in a new layer and we want to reload the tasks with the flash message when the new task is created.

Please notice that if we were to use up-target='.task' for both the new and edit form we’d get an error when the new task form was submitted because it wouldn’t be able to match the target .task element!
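
The same context is also available on the server side, outside of templates. As a minimal sketch (assuming the unpoly python package mentioned above exposes request.up.context as a dictionary, and using hypothetical view/model names from the demo), a view could branch on it like this:

from django.views.generic.edit import CreateView

from core import models  # hypothetical app from the demo


class TaskCreateView(CreateView):
    model = models.Task
    fields = ["title"]

    def get_context_data(self, **kwargs):
        ctx = super().get_context_data(**kwargs)
        # request.up.context holds whatever the client passed through up-context
        ctx["new_task"] = self.request.up.context.get("new_task", False)
        return ctx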

Listening to unpoly events

For most things happening in unpoly you’ll find out that there are events that you can listen to and add behavior. There are cases where handling these is useful.

For example, I’ve observed that if you’re using bootstrap dropdowns and click a link while a dropdown is open, the dropdown will remain open after the fragment has been loaded! This is very annoying. One simple way to resolve that is to include the navigation inside your up-main element so the dropdowns will be reloaded. However there’s a better way using unpoly events, like in the following snippet:

    up.on('up:link:follow', function(event, link) {
      // Hide visible dropdowns
      const dropdownElementList = document.querySelectorAll('.dropdown-toggle.show')
      const dropdownList = [...dropdownElementList].map(dropdownToggleEl => new bootstrap.Dropdown(dropdownToggleEl))
      dropdownList.forEach(dropdown => dropdown.hide())
    })

Please notice that this code is for bootstrap 5 (not 4 like the rest of the code in the demo, since it’s from a different project). So what happens is that whenever a link is followed from unpoly we’ll close the open dropdowns (the code itself isn’t very important here).

Updating history

One thing to consider when using unpoly is when we actually need to update the browser history and url. By default, unpoly will update the url only if the up-target matches up-main (so if there’s no up-target the url will always be updated).

This can be configured through the up-history attribute. By default this has the value 'auto' and we can set it to 'true' or 'false' if we want unpoly to always or never update the history for a particular link or form submission.

Let’s see a particular example from the demo. Because the up-target of the filter form is set to .form-data:

<form up-autosubmit up-delay='250' class='form-inline' method='GET' action='' up-target='.form-data'>

the url will not be updated when the filter is changed. This is contrary to the usual way these kind of filters work (i.e update the url with the filter parameters). So we can add the up-history attribute:

<form up-autosubmit up-history='true' up-delay='250' class='form-inline' method='GET' action='' up-target='.form-data'>

The same applies for the pagination and header ordering links. They update the .table-container element so we’ll also need to add up-history='true' to them. Thus we’ll change the up.compiler for these elements like this:

up.compiler('.pagination .page-item a.page-link', (link) => {
  link.setAttribute('up-follow', link.href)
  link.setAttribute('up-target', ".table-container")
  link.setAttribute('up-history', "true")
})

up.compiler('th.orderable a[href]', (link) => {
  link.setAttribute('up-follow', link.href)
  link.setAttribute('up-target', ".table-container")
  link.setAttribute('up-history', "true")
})

This way, both the ordering links and the selected page will be reflected on the url history.

The final result is that this filter/table page will have the usual functionality of updating the url when the filter is changed or the table sorting/pagination links are used.

Troubleshooting

As I’ve already mentioned, the most common problem you are going to have with unpoly is when you rely on javascript running on your page ready event. Unfortunately there’s a lot of functionality that relies on that event and pages will break when you use unpoly in these cases. That’s why, for non-greenfield projects, I recommend using up-follow and up-submit on your links and forms on a case-by-case basis so you’ve got more control over what works with unpoly and what doesn’t. Another thing that is very important to notice here is that I’ve stumbled upon libraries that not only rely on the load event but offer no other way to be initialized! For example, there are js libraries that have code like

$(function() {
  let initElement = function(el) {
    ...
  }
  $('.selected-elements').each(function() {
    initElement(this)
  })
})

so the actual function that does the initialization (initElement) isn’t public and you can’t call it from an up.compiler. In these cases you’ll need to somehow make initElement public so you can call it from your up.compiler, or use a different library!

The other major source of headaches in unpoly is overlays. Although they are very powerful I recommend not abusing them and using them only when you feel they are really needed and will improve the UX. For example, I’d recommend using them to add new options to a select list (similar to how the project form works for companies and of course similar to how the django admin does it). You could also use overlays to get functionality similar to django inlines (see how the projects are added to the company), however I’d probably prefer to do that using normal django inlines, especially when the standalone child edit functionality isn’t needed.

Special care must be taken for the integration between layers and messages. I have tried to provide a solution in the previous sections by proposing flashing the message with javascript in the cases where the message will be “eaten” by a discarded response, however I’m afraid that depending on how you’ve architected your app you may still get problems. The important thing is to understand how messages work (or not) and in which cases you may skip using messages at all since the feedback is immediate and the users don’t really need messages.

Conclusion

In conclusion, using unpoly with your Django apps can enhance the experience of your users by reducing page reloads and providing a more responsive and intuitive interface with little work from the developer. I recommend everybody to start integrating unpoly in their projects and see how it improves their apps!

Accessing MS Access databases from Python and Django

Have you ever needed to use data from a Microsoft Access database on a Django project? If so, you might have found that the data is contained in an .accdb file, which can make accessing it a challenge. Actually, Access files are either .mdb (older versions) or .accdb (newer versions); although .mdb files are easier to access from python, most Access databases will be .accdb nowadays.

The naive/simple way to access this data is to bite the bullet, install Access on your computer and use it to export the tables one by one in a more “open” format (i.e xlsx files). After some research I found out that there are ways to connect to an Access database through python and query it directly. Thus I decided to implement a more automatic method of exporting your data. In this article, I’ll walk you through the steps to accomplish this; specifically we’ll cover how to:

  • Connect to an accdb database
  • Export all the tables of that database to a json file
  • Create a models.py based on the exported data
  • Import the json data that was exported into the newly created models

For the first two steps we’ll only use python. For the last two we’ll also need some Django. By the end of this article, you’ll have a streamlined process for accessing your Microsoft Access data in your Django projects.

Connecting to a Microsoft Access database from python

To be able to connect to an Access database from python you can use pypyodbc. This is a pure-python library that can connect to ODBC data sources. To install it run pip install pypyodbc; since it is pure python the installation should always succeed.

However, there is an important caveat when working with .accdb databases: You must use Windows and install the Microsoft Access Database Engine 2010 Redistributable. This is necessary to ensure that the correct drivers are available; you can also take a look at the instructions from pyodbc here.

When installing the Access Redistributable, it’s crucial to remember that you need to install either the 32-bit or the 64-bit version, depending on your python (you can’t install both). To check whether your python is 32-bit or 64-bit, run python and check whether the startup banner says 32 bit or 64 bit. Also please notice that if you have MS Office installed you may only be able to install the 32-bit or only the 64-bit version (the Access Redistributable has to match the MS Office bitness).
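
Another quick way to check the bitness of a python interpreter is the pointer-size trick that the export script below also uses; a minimal sketch:

import struct

# Prints 32 on a 32-bit python and 64 on a 64-bit python
print(struct.calcsize("P") * 8)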

If you’re limited to the bitness that matches your MS Office installation, my recommendation would be to generate a new virtualenv with a python that matches the bitness of the Access Redistributable you’ve installed on your system. Then install pypyodbc on that virtualenv and you should be fine.

If you want to use a .mdb database you should be able to do it without installing anything on Windows and it also should be possible on Unix (I haven’t tested it though).

To ensure that you’ve got the correct drivers, run the following snippet on a python shell:

import pypyodbc
print('\n'.join(pypyodbc.drivers()))

If you have installed the correct Microsoft Access Database Engine 2010 Redistributable you should see .accdb somewhere in the output, like this:

Microsoft Access Driver (*.mdb)
Microsoft dBase Driver (*.dbf)
Microsoft Excel Driver (*.xls)
Microsoft ODBC for Oracle
Microsoft Paradox Driver (*.db )
Microsoft Text Driver (*.txt; *.csv)
SQL Server
Oracle in OraClient10g_home1
SQL Server Native Client 11.0
ODBC Driver 17 for SQL Server
Microsoft Access Driver (*.mdb, *.accdb)
Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)
Microsoft Access dBASE Driver (*.dbf, *.ndx, *.mdx)
Microsoft Access Text Driver (*.txt, *.csv)

If on the other hand you can’t access .accdb files you’ll get far fewer options:

SQL Server
PostgreSQL ANSI(x64)
PostgreSQL Unicode(x64)
PostgreSQL ANSI
PostgreSQL Unicode
SQL Server Native Client 11.0
ODBC Driver 17 for SQL Server

In any case, after you’ve installed the correct drivers you can connect to the database (let’s suppose it’s named access_data.accdb and is in the parent directory) like this:

import pypyodbc

pypyodbc.lowercase = False
conn = pypyodbc.connect(
    r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
    + r"Dbq=..\\access_data.accdb;"
)
cur = conn.cursor()

for row in cur.tables(tableType="TABLE"):
    print(row)

If everything’s ok the above will print all the tables that are contained in the database.

Exporting the data from the Access database to a json file

After you’re able to connect to the database you can export all the data to a json file. Actually we’ll export both the data of the database and a “description” of the data (the names of the tables along with their columns and types). The description of the data will be useful later.

The general idea is:

  1. Connect to the database
  2. Get the names of the tables in a list
  3. For each table
    • Export a description of its columns
    • Export all its data
  4. Write the description and the data to two json files

This is done by running the following snippet:

import pypyodbc
import struct
import json
from datetime import datetime, date
import decimal

print("running as {0}-bit".format(struct.calcsize("P") * 8))

def normalize(s):
    """A simple function to normalize table names"""
    return s.lower().replace(" ", "_")


conn = pypyodbc.connect(
    r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
    + r"Dbq=..\\access_data.accdb;"
)
cur = conn.cursor()
tables = []
for row in cur.tables(tableType="TABLE"):
    # Only get the table names
    tables.append(row[2])

# data will contain the data of all tables. It will have the following structure:
# {"table_1": [{"column_1": value, "column_2": value}, ...], "table_2": ...}
data = {}
# descriptions will contain a description of all the tables. It will have the following structure:
# [
#   {
#       "table_name": "table 1",
#       "fixed_table_name": "table_1",
#       "columns": [
#           {"name": "column_1", "fixed_name": "column_1", "type": "str"},
#           {"name": "column_2", "fixed_name": "column_2", "type": "int"},
#       ],
#   },
#   ...
# ]
descriptions = []

for table_name in tables:
    fixed_table_name = normalize(table_name)
    print(f"~~~~~~~~~~~~~{table_name} {fixed_table_name}~~~~~~~~~~~~~")
    q = f'SELECT * FROM "{table_name}"'
    description = {
        "table_name": table_name,
        "fixed_table_name": fixed_table_name,
        "columns": [],
    }
    descriptions.append(description)

    cur.execute(q)
    # Here we get the description of the columns of the table from the cursor; we'll use that to fill the description.columns list
    columns = cur.description
    for c in columns:
        description["columns"].append(
            {"name": c[0], "fixed_name": normalize(c[0]), "type": c[1].__name__}
        )

    print("")

    # And here we retrieve the data of the whole table
    # Notice we use some double for loop comprehension to 
    # create a json object with a column_name: value structure
    # for each row
    data[fixed_table_name] = [
        {normalize(columns[index][0]): column for index, column in enumerate(value)}
        for value in cur.fetchall()
    ]

cur.close()
conn.close()

# This is a function to serialize datetime and decimal objects 
# to json; without it the json.dump function will fail if the 
# results contain dates or decimals
def json_serial(obj):
    """JSON serializer for objects not serializable by default json code"""

    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    elif isinstance(obj, decimal.Decimal):
        return str(obj)
    raise TypeError("Type %s not serializable" % type(obj))


with open("..\\access_description.json", "w") as outfile:
    # Notice the default=json_serial 
    json.dump(descriptions, outfile, default=json_serial)

with open("..\\access_data.json", "w") as outfile:
    json.dump(data, outfile, default=json_serial)

If you run the above code and don’t see any errors you should have two json files in the parent directory: access_description.json and access_data.json. The dump of your access database is complete!

Creating a models.py based on the exported data

Now that we have the description of the data in our database we can create a small script that will help us generate the models for importing that data into Django. This can be done with a snippet similar to this:

import json

def get_ctype(t):
    """Depending on the type of each column add a different field in the model"""
    if t == "str":
        return "TextField(blank=True, )"
    elif t == "int":
        return "IntegerField(blank=True, null=True)"
    elif t == "float":
        return "FloatField(blank=True, null=True)"
    elif t == "bool":
        return "BooleanField(blank=True, null=True)"
    elif t == "datetime":
        return "DateTimeField(blank=True, null=True)"
    elif t == "date":
        return "DateField(blank=True, null=True)"
    elif t == "Decimal":
        return "DecimalField(blank=True, null=True, max_digits=15, decimal_places=5)"
    else:
        print(t)
        raise NotImplementedError

# Load the descriptions we created in the previous step
descriptions = json.load(open("..\\access_description.json"))

# mlines will be an array of the lines of the models.py file
mlines = ["from django.db import models", "", ""]

for d in descriptions:
    # Create a model for each table
    mname = d["fixed_table_name"].capitalize()
    mlines.append(f"class {mname}(models.Model):")
    for c in d["columns"]:
        ctype = get_ctype(c["type"])
        mlines.append(f"    {c['fixed_name']} = models.{ctype}")
    mlines.append("")
    mlines.append("    class Meta:")
    mlines.append(f"        db_table = '{d['fixed_table_name']}'")
    mlines.append(f"        verbose_name = '{d['table_name']}'")

    mlines.append("")
    mlines.append("")


with open("..\\access_models.py", "w", encoding="utf-8") as outfile:
    outfile.write("\n".join(mlines))

This will generate a file named access_models.py. You should edit this file to add your primary and foreign keys. In an ideal world this would be done automatically, however I couldn’t find a way to extract the primary and foreign keys of the tables from the Access database. Also, by default I’ve set all fields to allow blank and null values; you should fix that according to your needs.

After you edit the file, create a new app in your Django project and copy the file to the models.py file of the new app. Add that app to your INSTALLED_APPS in the settings.py file and run python manage.py makemigrations and python manage.py migrate to create the tables in your database.

Import the json data to Django

The final piece of the puzzle is to import the data we extracted before directly in Django. Because we know the names of all the models and fields this is trivial to do:

from django.core.management.base import BaseCommand
import json
from django.db import transaction
from access_data import models

# This is a list of all the fields that are foreign keys; these need special handling
FK_FIELDS = [
    # ...
]

# You need to add the table names from the access database here. This is required
# if you have relations, so that the tables without dependencies are added first and
# the tables that depend on them last
TABLE_NAMES = [
    # ...
]

def fix_fks(k):
    """Add _id to the end of the field name if it is a foreign key to pass the pk of the
    foreign key instead of the whole object"""
    if k in FK_FIELDS:
        return k + '_id'
    return k

def get_model_by_table(table):
    """Get the model by the table name"""
    return getattr(models, table.capitalize())

class Command(BaseCommand):

    @transaction.atomic
    def handle(self, *args, **options):

        with open("..\\access_data.json") as f:
            j = json.load(f)

        # Delete the existing data before importing. This is optional but I find it useful
        # Notice that we delete the tables in reverse order to avoid foreign key errors
        for table in reversed(TABLE_NAMES):
            get_model_by_table(table).objects.all().delete()

        for table in TABLE_NAMES:
            for row in j[table]:
                # Create a dictionary with the column name: column value;
                # notice the fix_fks to add the _id to the column
                row_ok = {fix_fks(k): v for k,v in row.items()}
                print(row_ok)
                # Create the object; we could add these to an array and do a bulk_create instead
                get_model_by_table(table).objects.create(**row_ok)

The above code may error out if you have missing or bad data in your database. You should fix such issues accordingly.

Conclusion

In conclusion, accessing and using data from Microsoft Access databases in Django may seem daunting at first, but with the right tools and techniques, it can be a straightforward process. By using the pypyodbc library and following the instructions outlined in this post, you can connect to your .mdb or .accdb database and export its tables and schema to JSON files. From there, it is trivial to create a models.py file for Django and a management command to import the data.

Although I’ve presented these steps as separate snippets, you could also combine them into a single management command within Django. The possibilities are endless, and with a little bit of creativity, you can tailor this approach to your specific needs and data.

My essential guidelines for better Django development

Introduction

In this article I’d like to present a list of guidelines I follow when I develop Django projects, especially projects that are destined to be used in a production environment for many years. I have been using Django for more than 10 years as my day-to-day work tool to develop applications for the public sector organization I work for.

My organization has got a number of different Django projects that cover its needs, with some of them running successfully for many years, since Django 1.4. I have been involved in the development of all of them, and I have learned a lot in the process. I understand that some of these guidelines may be controversial but they have served me well over the years.

Model design guidelines

Avoid using choices

Django has the convenient feature of allowing you to define choices for a field by defining a tuple of key-value pairs the field can take. So you’ll define a field like choice_field = models.CharField(choices=CHOICES) with CHOICES being a tuple like

CHOICES = (
    ('CHOICE1', 'Choice 1 description'),
    ('CHOICE2', 'Choice 2 description'),
    ('CHOICE3', 'Choice 3 description'),
)

and your database will contain CHOICE1, CHOICE2 or CHOICE3 as values for the field while your users will see the corresponding description.

This is a great feature for prototyping; however, I suggest using it only on toy/prototyping/MVP projects and using normal relations in production projects instead. So the choice field would be a foreign key and the choices would be tuples on the referenced table. The reasons for this are:

  • The integrity of the choices is enforced only at the application level. So people can go to the database and change a choice field to a random value.
  • More generally, the database design is not normalized; saving CHOICE1 for every row is not ideal.
  • Your users may want to edit the choices (add new ones) or change their descriptions. This is easy with a foreign key through the django-admin but needs a code change with choices.
  • It is almost sure that you will need to add “properties” to your choices. No matter what your current requirements are, they are going to change. For example, you may want to make a choice “obsolete” so it can’t be picked by users. This is trivial when you use a foreign key but not very easy when you use choices.
  • The descriptions of the choices are saved only inside your app. The database has only the 'CHOICE1', 'CHOICE2' etc values, so you’ll need to duplicate the descriptions whenever your app is not involved. For example, you may have reports that are generated directly from database queries, so you’ll need to add the description of each key to your query using something like CASE.
  • It is easier to use the ORM to annotate your queries when you use relations instead of choices.

The disadvantage of relations is of course that you’ll need to follow the relation to display the values. So you must be careful to use select_related to avoid the n+1 queries problem.

So, in short, I suggest using choices only for quick prototyping and converting them to normal relations in production projects. If you are already using choices in your project but want to convert them to normal relations, you can take a look at my Django choices to ForeignKey article.
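
To illustrate the idea, here’s a minimal sketch of what the relation-based alternative could look like (the model names are hypothetical); notice how easy it is to add extra “properties” like an obsolete flag later:

from django.db import models


class OrderStatus(models.Model):
    description = models.CharField(max_length=100)
    # Extra properties of the choice are trivial to add later
    is_obsolete = models.BooleanField(default=False)

    def __str__(self):
        return self.description


class Order(models.Model):
    # Instead of status = models.CharField(choices=CHOICES)
    status = models.ForeignKey(OrderStatus, on_delete=models.PROTECT)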

Always use surrogate keys

A surrogate key is a unique identifier for a database tuple which is used as the primary key. By default Django always adds a surrogate key to your models. However, some people may be tempted to use a natural key as the primary key. Although this is possible and supported in Django, I’d recommend sticking to integer surrogate keys. Why?

  • Django is more or less built upon having integer primary keys. Although non-integer primary keys are supported in core Django, you can’t be sure that this will be supported by the various addons/packages that you’ll want to use.
  • I understand that your requirements say that “the field X will be unique and should be used to identify the row”. This is never true; it can easily change in the future and your primary key may stop being unique! It has happened to me and the solution was not something I’d like to discuss here. If there’s a field in the row that is guaranteed to be unique you can make it unique at the database level by adding unique=True; there’s no reason to also make it the primary key.
  • Relying on all your models having an id integer primary key makes it easier for you to write your code and for other people to read it.
  • Using an auto-increment primary key is the fastest way to insert a new row in the database (when compared to, for example, using a random uuid)

An even worse idea is to use composite keys (i.e define a primary key using two fields of your tuple). There’s actually a 17-year-old open issue about that in Django! This should be enough for you to understand that you shouldn’t touch that with a 10-foot pole. Even if it is implemented somehow in core django, you’ll have something that can’t be used with all the other packages that rely on the primary key being a single field.

Now, I understand that some public facing projects may not want to expose the auto-increment primary key since that discloses information about the number of rows in the database, the number of rows that are added between a user’s tuples etc. In this case, you may want to either add a unique uuid field, or a slug field, or even better use a library like hashid to convert your integer ids to hashes. I haven’t used uuids myself, but for a slug field I had used the django-autoslug library and was very happy with it.

Concerning hashids, I’d recommend reading my Django hashids article.
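
As a sketch of the non-guessable identifier idea, one (hypothetical) option is to keep the integer surrogate primary key and add an extra unique uuid field that is only used in public-facing URLs:

import uuid

from django.db import models


class Invoice(models.Model):
    # The integer surrogate primary key is still added automatically by Django;
    # the uuid is only used in public URLs so row counts aren't disclosed
    public_id = models.UUIDField(default=uuid.uuid4, unique=True, editable=False)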

Always use a through model on your m2m relations

To add a many-to-many relation in Django, you’ll usually do something like toppings = models.ManyToManyField(Topping) (for a pizza). This is very convenient but, similar to the choices I mentioned above, it is not a good practice for production projects. This is because your requirements will change and you’ll need to add properties to your m2m relation. Although this is possible, it definitely is not pretty, so it’s better to be safe than sorry.

When you use the ManyToManyField field, django will generate an intermediate table with a name similar to app_model1_model2, i.e for pizza and topping it will be pizzas_pizza_topping. This table will have 3 fields - the primary key, a foreign key to the pizza table and a foreign key to the topping table. This is the default behavior of Django and it is not configurable.

What happens if you want to add an extra field to the pizzas_pizza_topping table? For example, the amount of each topping on a pizza. Or the fact that some pizzas used to have that topping but it has now been replaced by another one? This is not possible unless you use a through table. As I said it is possible to fix that later, but it’s not something that you’ll want to do.

So, my recommendation is to always add a through table when you use a m2m relation. Create a model that will represent the relation and has foreign keys to both tables along with any extra attributes the relation may have.

class PizzaTopping(models.Model):
    pizza = models.ForeignKey(Pizza, on_delete=models.CASCADE)
    topping = models.ForeignKey(Topping, on_delete=models.CASCADE)
    amount = models.IntegerField()

and define your pizza toppings relation like toppings = models.ManyToManyField(Topping, through=PizzaTopping).

If the relation doesn’t have any extra attributes right now, don’t worry: you’ll be prepared when they are requested!

A bonus to that is that now you can query the PizzaTopping model directly and you can also add an admin interface for it.

There are no real disadvantages to adding the through model (except the minute needed to write it) since Django will create the intermediate table anyway to represent the relation; you’ll still need to use prefetch_related to get the toppings of a pizza and avoid the n+1 queries problem.

Use a custom user model

Using a custom user model when starting a new project is already advised in the Django documentation. This will make it easier to add custom fields to your user model and have better control over it. Also, although you may be able to add a Profile model with a one-to-one relation to the default django.auth.User model, you’ll still need a join to retrieve the profile of each user (something that won’t be necessary when the extra fields are on your custom user model).

Another very important reason to use a custom user model is that you’ll be able to easily add custom methods to your user model. For example, built-in Django has the get_full_name method that returns the first_name plus the last_name with a space in between, so you’re able to call it like {{ user.get_full_name }} in your templates. If you don’t have a custom user model, you’ll need to add template tags for similar functionality; see the discussion about not adding template tags when you can use a method.

There’s no real disadvantage to using a custom user model except the 5 minutes needed to set it up. I actually recommend creating a users app that you’re going to use to keep user related information (see the users app on my cookiecutter project).
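
A minimal sketch of such a users app (the extra field and method are just examples):

# users/models.py
from django.contrib.auth.models import AbstractUser
from django.db import models


class User(AbstractUser):
    # Add any extra fields and methods you need directly on the user
    phone = models.CharField(max_length=32, blank=True)

    def get_display_name(self):
        return self.get_full_name() or self.username

You’d then point the AUTH_USER_MODEL setting to it, i.e AUTH_USER_MODEL = "users.User".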

Views guidelines

Use class based views

I recommend using class-based views instead of function-based views. This is because class-based views are easier to reuse and extend. I’ve written a comprehensive Django CBV guide that you can read to learn everything about class based views!

Also, by properly using CBVs you’ll rely on sensible defaults, and both you and other people reading your code will be able to understand what is going on much more easily. Consider this

class FooDetailView(DetailView):
    model = Foo

vs

def object_detail_view(request, pk):
    foo = get_object_or_404(Foo, pk=pk)
    return render(request, 'foo/foo_detail.html', {'foo': foo})

These are more or less the same. However, in the function-based view you need to actually write the logic for retrieving the Foo instance and then define the name of the template and the context object (at least the get_object_or_404 helper keeps you somewhat DRY). In the class-based view all this is already done for you using well-known defaults, so for example you’ll know the name of the template without needing to check the code.

View method overriding guidelines

It is important to know which method you need to override to add functionality to your class based views. You can use the excellent CBV Inspector app to understand how each CBV is working. Also, I’ve got many examples in my comprehensive Django CBV guide.

Some quick guidelines follow (a combined sketch is shown after the list):

  • For all methods do not forget to call the parent’s method by super().
  • Override dispatch(self, request, *args, **kwargs) if you want to add functionality that is executed before any other method. For example to add permission checks or add some attribute (self.foo) to your view instance. This method will always run on both HTTP GET/POST or whatever. Must return a Response object (i.e HttpResponse, HttpResponseRedirect, HttpResponseForbidden etc)
  • You should rarely need to override the get or post methods of your CBVs since they are called directly from dispatch; any such code can usually go in dispatch instead.
  • To add extra data in your context (template) override get_context_data(self, **kwargs). This should return a dictionary with the context data.
  • To pass extra data to your form (i.e the current request) override get_form_kwargs(self). This data will be passed to the __init__ of your form; you need to remove it using something like self.request = kwargs.pop('request') before calling super().__init__(*args, **kwargs).
  • To override the initial data of your form override get_initial(self). This should return a dictionary with the initial data.
  • You can override get_form(self, form_class=None) to use a configurable form instance or get_form_class(self) to use a configurable form class. The form instance will be generated by self.get_form_class()(**self.get_form_kwargs()) (notice that the kwargs will contain an initial=self.get_initial() value)
  • To do stuff after a valid form is submitted you’ll override form_valid(self, form). This should return an HttpResponse object and more specifically an HttpResponseRedirect to avoid double form submission. This is the place where you can also add flash messages to your responses.
  • You can also override form_invalid(self, form) but this is rarely useful. This should return a normal response (not a redirect)
  • Override get_success_url(self) if you only want to set where you’ll be redirected after a valid form submission (notice this is used by form_valid)
  • You can use a different template based on some condition by overriding get_template_names(self). This is useful to return a partial response on an ajax request (for example the same detail view will return a full html view of an object when visited normally but will return a small partial html with the object’s info when called through an ajax call)
  • For views that return 1 or multiple objects (DetailView, ListView, UpdateView etc) you almost always need to override the get_queryset(self) method, not the get_object. I’ll talk about that a little more later.
  • The get_object(self, queryset=None) method will use the queryset returned by get_queryset to get the object based on its pk, slug etc. I’ve observed that this rarely needs to be overridden since most of the time overriding get_queryset will suffice. One possible use case for overriding get_object is for views that don’t care at all about the queryset; for example you may implement a /profile detail view that will pick the current user and display some stuff. This can be implemented by a get_object similar to return self.request.user.
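
Here’s a minimal sketch that combines several of the above overrides in a single (hypothetical) view, just to show where each piece goes:

from django.contrib import messages
from django.http import HttpResponseForbidden
from django.urls import reverse
from django.views.generic import UpdateView

from . import forms, models  # hypothetical app modules


class ProductUpdateView(UpdateView):
    model = models.Product
    form_class = forms.ProductForm

    def dispatch(self, request, *args, **kwargs):
        # Runs before anything else, on every HTTP method
        if not request.user.is_authenticated:
            return HttpResponseForbidden()
        return super().dispatch(request, *args, **kwargs)

    def get_queryset(self):
        # Users can only edit their own products
        return super().get_queryset().filter(created_by=self.request.user)

    def get_context_data(self, **kwargs):
        ctx = super().get_context_data(**kwargs)
        ctx["title"] = "Edit product"
        return ctx

    def get_form_kwargs(self):
        kwargs = super().get_form_kwargs()
        kwargs["request"] = self.request  # the form must pop this in its __init__
        return kwargs

    def form_valid(self, form):
        messages.success(self.request, "Product saved!")
        return super().form_valid(form)

    def get_success_url(self):
        return reverse("product-list")  # hypothetical url name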

Querying guidelines

Guidelines for the n+1 problem

The most common Django newbie mistake is not considering the n+1 problem when writing your queries.

Because Django automatically follows relations it is very easy to write code that will result in the n+1 queries problem. A simple example is having something like

class Category(models.Model):
    name = models.CharField(max_length=255)

class Product(models.Model):
    name = models.CharField(max_length=255)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)

    def __str__(self):
        return "{0} ({1})".format(self.name, self.category.name)

and doing something like:

for product in Product.objects.all():
    print(product)

or even having products = Product.objects.all() as a context variable in your template:

{% for product in products %}
    {{ product }}
{% endfor %}

If you’ve got 100 products, the above will run 101 queries to the database: The first one will get all the products and the other 100 will return each product’s category one by one! Consider what may happen if you had thousands of products…

To avoid this problem you should add a select_related, so products = Product.objects.all().select_related('category'). This will do an SQL JOIN between the products and categories tables so each product will include its category instance.

Now, when you’ve got a many to many relation the situation is a little different. Let’s suppose you’ve got a tags = models.ManyToManyField(Tag) field in your Product model. If you wanted to do something like {{ product.tags.all|join:", " }} to display the product tags you’d also get an n+1 situation because Django will do a query for each product to get its tags. To avoid this you cannot use select_related; you should use the prefetch_related method instead, so products = Product.objects.all().prefetch_related('tags'). This will result in 2 queries, one for the products and one for their tags, and the joining will be done in python.

One final comment about prefetch_related is that you must be very careful to use exactly what you prefetched. Let’s suppose that we had prefetched the tags but we wanted to display them ordered by name: doing ", ".join([tag.name for tag in product.tags.all().order_by('name')]) will not use the prefetched tags but will do a new query for each product to get its tags, resulting in the n+1 problem! Django has prefetched product.tags.all() for each product, not product.tags.all().order_by('name'). To fix that you need to use Prefetch like this:

from django.db.models import Prefetch

Product.objects.prefetch_related(Prefetch('tags', queryset=Tag.objects.order_by('name')))

The same is true if you wanted to filter your tags etc.

Now, one thing to understand is that this behavior of Django is intentional. Instead of automatically following the relationships, Django could throw an exception when you tried to follow a relationship that wasn’t in a select_related (this is how it works in other frameworks). The disadvantage of this is that it would make Django more difficult to use for new users. Also, there are cases where the n+1 problem isn’t really a big deal; for example you may have a DetailView fetching a single object, so in this case the n+1 problem will be 1+1 and wouldn’t really matter. So, at least for Django, it’s a case of premature optimization: write your queries as well as you can (keeping in mind the n+1 problem), and if you miss some cases that actually make your views slow, you can easily optimize them later.

Re-use your queries

You should re-use your queries to avoid re-writing them. You can either put them inside your models (as instance methods) or in a mixin for the queries of your views or even add a new manager for your model. Let’s see some examples:

Let’s suppose I wanted to get the tags of my product: I’d add this method to my Product model:

class Product(models.Model):
    # ...

    def get_tags(self):
        return self.tags.all().order_by('name')

Please notice that if you haven’t used a proper prefetch this will result in the n+1 queries problem. See the discussion above for more info. To get the products with their tags I could add a new manager like:

class ProductWithTagManager(models.Manager):
    def get_queryset(self):
        return super().get_queryset().prefetch_related(Prefetch('tags', queryset=Tag.objects.order_by('name')))

class Product(models.Model):
    # ...

    products_with_tags = ProductWithTagManager()

Now I could do [p.get_tags() for p in Product.products_with_tags.all()] and not have a n+1 problem.

Actually, if I knew that I would always want to display the product’s tags I could override the default manager like

class Product(models.Model):
    # ...

    objects = ProductWithTagManager()

However I would not recommend that since having a consistent behavior when you run Model.objects is very important. If you are to modify the default manager then you’ll need to always remember what your default manager does. This is very problematic in old projects and when you want to quickly query your database from a shell. Also, even more problematic is if you override your default manager to filter (hide) objects. Don’t do that or you’ll definitely regret it.

The other query re-use option is through a mixin that overrides the get_queryset of your views. Let’s suppose that each user can only see his own products: I could add a mixin like:

class ProductPermissionMixin:
    def get_queryset(self):
        return super().get_queryset().filter(created_by=self.request.user)

Then my ListView, DetailView, UpdateView and DeleteView could inherit from that mixin, i.e ProductListView(ProductPermissionMixin, ListView), and I’d have a consistent behavior on which products each user can view. More on this can be found in my comprehensive Django CBV guide.

Forms guidelines

Always use django-forms

This is a no-brainer: Django forms offer some great class-based functionality out of the box. I’ve seen people creating html forms “by hand” and missing all this. Don’t be that guy; use django forms!

I understand that sometimes the requirements of your forms may be difficult to be implemented with a django form and you prefer to use a custom form. This may seem fine at first but in the long run you’re gonna need (and probably re-implement) most of the django-forms capabilities.

Overriding Form methods guidelines

Your CustomForm inherits from a Django Form so you can override some of its methods. Which ones should you override?

  • The most usual method for overriding is clean(self). This is used to add your own server-side checks to the form. I’ll talk a bit more about overriding clean later.
  • The second most usual method to override is __init__(self, *args, **kwargs). You should override it to “pop” any extra kwargs from the kwargs dict before calling super().__init__(*args, **kwargs). See the view method overriding guidelines for more info. You’ll also use it to change field attributes (see the section about overriding the __init__ below).
  • I usually avoid overriding the form’s save() method. The save() is almost always called from the view’s form_valid method so I prefer to do any extra stuff from the view. This is mainly a personal preference in order to avoid having to hop between the form and view modules; by knowing that the form’s save is always the default the behavior will be consistent. This is personal preference though.

There shouldn’t be a need to override any other method of a Form or ModelForm. However please notice that you can easily use mixins to add extra functionality to your forms. For example, if you had a particular check that would be called from many forms, you could add a

class CustomFormMixin:
    def clean(self):
        super().clean() # Not really needed here but I recommend to add it to keep the inheritance chain
        # The common checks that does the mixin

class CustomForm(CustomFormMixin, Form):
    # Other stuff

    def clean(self):
        super().clean() # This will run the mixin's clean
        # Any checks that only this form needs to do

Proper cleaning

When you override the clean(self) method of a Form you should always use the self.cleaned_data to check the data of the form. The common way to mark errors is to use the self.add_error method, for example, if you have a date_from and date_to and date_from is after the date_to you can do your clean something like this:

def clean(self):

    date_from = self.cleaned_data.get("date_from")
    date_to = self.cleaned_data.get("date_to")

    if date_from and date_to and date_from > date_to:
        error_str = "Date from cannot be after date to"
        self.add_error("date_from", error_str)
        self.add_error("date_to", error_str)

Please notice above that I am checking that both date_from and date_to are not null (or else it will try to compare null dates and will throw). Then I am adding the same error message to both fields. Django will see that the form has errors and run form_invalid on the view and re-display the form with the errors.

Beyond the self.add_error method that adds the error to the field there’s a possibility to add an error to the “whole” form using:

from django.core.exceptions import ValidationError

def clean(self):
    if form_has_error:
        raise ValidationError(u"The form has an error!")

This kind of error won’t be correlated with a field. You can use this approach when an error is correlated to multiple fields instead of adding the same error to multiple fields.

You must be very careful because if you are using a non-standard form layout method (i.e you enumerate the fields) you also need to display the {{ form.errors }} in your template or else you’ll get a rejected form without any errors! This is a very common mistake.

Another thing to notice is that when your clean method raises it will display only the first such error. So if you’ve got multiple checks like:

def clean(self):
    if form_has_error:
        raise ValidationError(u"The form has an error!")
    if form_has_another_error:
        raise ValidationError(u"The form has another error!")

and your form has both errors only the 1st one will be displayed to the user. Then after he fixes it he’ll also see the 2nd one. When you use self.add_error the user will get both at the same time.

Overriding the __init__

You can override the __init__ method of your forms for three main reasons:

1. Override some field attributes on a ModelForm. A Django ModelForm will automatically create a field for each model field. Sometimes you may want to override some of the attributes of the field. For example, you may want to change the label of the field or make a field required. To do that, you can do something like:

def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.fields["my_field"].label = "My custom label" # Change the label
    self.fields["my_field"].help_text = "My custom help text" # Change the help text
    self.fields["my_field"].required = True # Change the required attribute
    self.fields["my_field"].queryset = Model.objects.filter(is_active=True) # Only allow specific objects for the foreign key

Please notice that we need to use self.fields["my_field"] after we call super().__init__(*args, **kwargs).

2. Retrieve parameters (usually the request or user) from the view. A view (either a function-based or a CBV through get_form_kwargs) can pass parameters to the form’s constructor. You need to override __init__ to handle these parameters:

def __init__(self, *args, **kwargs):
    self.request = kwargs.pop("request", None)
    super().__init__(*args, **kwargs)

Please notice that we must pop the request from the kwargs dict before calling super().__init__ or else we’ll get an exception since the Form.__init__ method accepts only specific kwargs.

3. Add functionality related to the current user/request. For example, you may want to add a field that is only editable if the user is superuser:

def __init__(self, *args, **kwargs):
    self.request = kwargs.pop("request", None)
    super().__init__(*args, **kwargs)
    if not self.request.user.is_superuser:
        self.fields["my_field"].widget.attrs['readonly'] = True

or you may want to allow some custom validation logic only for non-superusers:

def clean(self):
    if not self.request.user.is_superuser:
        if not self.cleaned_data['my_field']:
            self.add_error("my_field", "Please fill this field")

Laying out forms

To lay out the forms I recommend using a library like django-crispy-forms. This integrates your forms properly with your front-end engine and helps you have proper styling. I’ve got some more info in my form layout post.

Please notice that the django-crispy-forms supports specific front-end frameworks like bootstrap or tailwind (see its docs for all available options). If you’re using a non-supported front-end framework you can create a custom template pack. This seems like a lot of work but I recommend to do it. Also you don’t need to implement everything, only the functionality you’re going to need, when you need it.

Improve the formset functionality

Beyond simple forms, Django allows you to use a functionality it calls formsets. A formset is a collection of forms that can be used to edit multiple objects at the same time. This is usually used in combination with inlines which are a way to edit models on the same page as a parent model. For example you may have something like this:

class Pizza(models.Model):
    name = models.CharField(max_length=128)
    toppings = models.ManyToManyField('Topping', through='PizzaTopping')

class Topping(models.Model):
    name = models.CharField(max_length=128)

class PizzaTopping(models.Model):
    amount = models.PositiveIntegerField()
    pizza = models.ForeignKey('Pizza', on_delete=models.CASCADE)
    topping = models.ForeignKey('Topping', on_delete=models.CASCADE)

Now we’d like to have a form that allows us to edit a pizza by both changing the pizza name and the toppings of the pizza along with their amounts. The pizza form will be the main form and the topping/amount will be the inline form. Notice that we won’t also create/edit the topping name, we’ll just select it from the existing toppings (we’re gonna have a completely different view for adding/editing individual toppings).

First of all, to create a class based view that includes a formset we can use the django-extra-views package (this isn’t supported by built-in django CBVs unless we implement the functionality ourselves). Then we’d do something like:

from extra_views import CreateWithInlinesView, InlineFormSetFactory


class ToppingInline(InlineFormSetFactory):
    model = PizzaTopping
    fields = ['topping', 'amount']


class CreatePizzaView(CreateWithInlinesView):
    model = Pizza
    inlines = [ToppingInline]
    fields = ['name']

This will create a form that will allow us to create a pizza and add toppings to it. Now, to display the formset we’d modify our template to be similar to:

<form method="post">
...
{{ form }}

{% for formset in inlines %}
    {{ formset }}
{% endfor %}
...
<input type="submit" value="Submit" />
</form>

This works however it will be very ugly. The default behavior is to display the Pizza form and three empty Topping forms. If we want to add more toppings we’ll have to submit that form so it will be saved and then edit it. But once again we’ll get our existing toppings and three more. I am not fond of this behavior.

That’s why my recommendation is to follow the instructions in my better django inlines article that allow you to sprinkle some javascript on your template and get a much better, dynamic behavior. I.e you’ll get an “add more” button to add extra toppings without the need to submit the form every time.

Template guidelines

Stick to the built-in Django template backend

Django has its own built-in template engine but it also allows you to use the Jinja template engine or even use a completely different one! The django template backend is considered “too restrictive” by some people mainly because you can only call functions without parameters from it.

My opinion is to just stick to the builtin Django template backend. Its restriction is actually a strength, enabling you to create re-usable custom template tags (or object methods) instead of calling business logic from the template. Also, using a completely custom backend means that you’ll add dependencies to your project; please see the guideline about selecting external packages. Finally, don’t forget that any packages you’ll use that provide templates will provide them for the Django template backend, so you’ll need to convert/re-write these templates to be used with a different engine.

I would consider the Jinja engine only if I already had a bunch of Jinja templates from a different project and wanted to quickly use them on my project.

Don’t add template tags when you can use a method

Continuing from the discussion on the previous guideline, I recommend you to add methods to your models instead of adding template tags. For example, let’s suppose that we want to get our pizza toppings order by their name. We could add a template tag that would do that like:

def get_pizza_toppings(context, pizza):
    return pizza.toppings.all().order_by('name')

and use it like {% get_pizza_toppings pizza as pizza_toppings %} in our template. Notice that if you didn’t care about the ordering you could instead do {{ pizza.toppings.all }}, but since we need the order_by we have to pass a parameter, so we can’t call the method directly from the template.

Instead of adding the template tag, I recommend adding a method to your Pizza model like:

def get_toppings(self):
    return self.toppings.all().order_by('name')

and then call it like {{ pizza.get_toppings }} in your template. This is much cleaner and easier to understand.

Please notice that this guideline is not a proposal towards the “fat models” approach. You can add 1 line methods to your models that would only call the corresponding service methods if needed.

Re-use templates with partials

When you have a part of a template that will be used in multiple places you can use partials to avoid repeating yourself. For example, let’s suppose you’d like to display your pizza details. These details would be displayed in the list of pizzas, in the cart page, in the receipt page etc. So you can create an html template named _pizza_details.html under a partials folder (or whatever name you want, but I recommend having a way to quickly spot your partials) with contents similar to:

<div class='pizza-details'>
    <h3>{{ pizza.name }}</h3>
    {% if show_photo %}
        <img src='{{ pizza.photo.url }}'>
    {% endif %}
    <p>Toppings: {{ pizza.get_toppings|join:", " }}</p>
</div>

and then include it in your templates like {% include "pizzas/partials/_pizza_details.html" %} to display the info without the photo or {% include "pizzas/partials/_pizza_details.html" with show_photo=True %} to display the photo. Also notice that you can override the {{ pizza }} context variable, so if you want to display two pizzas in a template you’ll write something like

{% include "partials/_pizza_details.html" with show_photo=True pizza=pizza1 %}
{% include "partials/_pizza_details.html" with show_photo=True pizza=pizza2 %}

Settings guidelines

Use a package instead of module

This is a well known guideline but I’d like to mention it here. When you create a new project, Django will create a settings.py file; this file is a python module. I recommend creating a settings folder, moving settings.py inside it renamed as base.py, and adding an __init__.py file so the settings folder becomes a python package. So instead of project\settings.py you’ll have project\settings\base.py and project\settings\__init__.py.

Now, you’ll add an extra module inside settings for each kind of environment you are gonna use your app on. For example, you’ll have something like:

  • project\settings\dev.py for your development environment
  • project\settings\uat.py for the UAT environment
  • project\settings\prod.py for the production environment

Each of these files will import the base.py file and override the settings that are different from the base settings, i.e these files will start like:

from .base import *

# And now all options that are different from the base settings

All these files will be put in your version control. You won’t put any secrets in these files. We’ll see how to handle secrets later.

When Django starts, it will by default look for the project/settings.py module. So, if you try to run python manage.py now it will complain. To fix that, you have to set the DJANGO_SETTINGS_MODULE environment variable to point to the correct settings module you wanna use. For example, in the dev env you’ll do DJANGO_SETTINGS_MODULE=project.settings.dev.

To avoid doing that every time I recommend creating a script that will initiate the project’s virtual environment and set the settings module. For example, in my projects I have a file named dovenv.bat (I use windows) that does exactly that.

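The exact contents depend on the project, but a minimal sketch (assuming the virtualenv lives in a venv folder next to manage.py and the development settings module is wanted) would be something like:

call venv\Scripts\activate.bat
set DJANGO_SETTINGS_MODULE=project.settings.dev
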
Handle secrets properly

You should never put secrets (i.e your database password or API KEYS) on your version control. There are two ways that can be used to handle secrets in Django:

  • Use a settings/local.py file that contains all your secrets for the current environment and is not under version control.
  • Use environment variables.

For the settings/local.py solution, you’ll add the following code at the end of each one of your settings environment modules (i.e you should put it at the end of dev.py, uat.py, prod.py etc):

try:
    from .local import *
except ImportError:
    pass

The above will try to read a module named local.py and if it exists it will import it. If it doesn’t exist it will just ignore it. Because this file is at the end of the corresponding settings module, it will override any settings that are already defined. The above file should be excluded from version control so you’ll add the line local.py to your .gitignore.

Notice that the same solution for storing secrets can be used if you don’t use the settings package approach but have a single settings.py module: create a settings_local.py module and import from that at the end of your settings module instead. However I strongly recommend using the settings package approach.

To catalogue my secrets, I will usually add a local.py.template file that has all the settings that I need to override in my local.py, with empty values. I.e it may be similar to:

API_TOKEN=''
ANOTHER_API_TOKEN=''
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': '',
        'USER': '',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '',
    }
}

Then I’ll copy over local.py.template to local.py when I initialize my project and fill in the values.

Before continuing, it is important to understand the priority of the settings modules. So let’s suppose we are on production. We should have a DJANGO_SETTINGS_MODULE=project.settings.prod. The players will be base.py, prod.py and local.py. The priority will be

  1. local.py
  2. prod.py
  3. base.py

So any settings defined in prod.py will override the settings of base.py. And any settings defined in local.py will override any settings defined either in prod.py or base.py. Please notice that I mention any setting, not just secrets.
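As a quick illustration of this layering, suppose the same setting is defined in all three files (the values are made up):

# base.py
EMAIL_HOST = "localhost"

# prod.py
EMAIL_HOST = "smtp.example.com"

# local.py
EMAIL_HOST = "smtp.internal.example.com"

With DJANGO_SETTINGS_MODULE=project.settings.prod the value the project will actually use is the one from local.py, i.e smtp.internal.example.com.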

To use the environment variables approach, you’ll have to read the values of the secrets from your environment. A simple way to do that is to use the os.getenv function, for example in your prod.py you may have something like:

import os

API_TOKEN = os.getenv('API_TOKEN')

This will set the API_TOKEN setting to None if the API_TOKEN env var is not found. You can use os.environ["API_TOKEN"] instead if you prefer an exception to be thrown when the variable is missing. Also, there are libraries that will help you with this, like python-dotenv; however I can’t really recommend them because I haven’t used them.

Now, which one to use? My recommendation (and what I always do) is to use the first approach (local.py) unless you need to use environment variables to configure your project. For example, if you are using a PaaS like Heroku you’ll have to use environment variables because of the way you deploy, so you can’t really choose. However using the local.py approach is much simpler, does not have any dependencies and lets you quickly see which settings are overridden. Also you can use it to override any setting by putting it in your local.py, not just secrets.

Static and media guidelines

Use ManifestStaticFilesStorage

Django has a STATICFILES_STORAGE setting that can be used to specify the storage engine that will be used to store the static files. By default, Django uses the StaticFilesStorage engine which stores the files in the file system under the STATIC_ROOT folder and with a STATIC_URL url.

For example if you’ve got a STATIC_ROOT=/static_root and a STATIC_URL=/static_url/ and you’ve got a file named styles.css which you include with {% static "styles.css" %}. When you run python manage.py collectstatic the styles.css will be copied to /static_root/styles.css and you’ll be able to access it with /static_url/styles.css.

Please notice that the above should be configured in your web server (i.e nginx). Thus, you need to configure your web server so as to publish the files under /static_root on the /static_url url. This should work without Django, i.e if you have configured the web server properly you’ll be able to visit example.com/static_url/styles.css even if your Django app isn’t running. For more info see how to deploy static files.
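For example, with the (unrealistic) paths above, the relevant nginx configuration would be a location block similar to this sketch inside your server block:

location /static_url/ {
    alias /static_root/;
}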

Now, the problem with the StaticFilesStorage is that if you change the styles.css there won’t be any way for the user’s browser to understand that the file has been changed so it will keep using the cached version.

This is why I recommend using the ManifestStaticFilesStorage instead. This storage appends the md5 hash of each static file’s content to its filename when copying it, so styles.css will be copied to /static_root/styles.fb2be32168f5.css and its url will be /static_url/styles.fb2be32168f5.css. When styles.css is changed its hash will also change, so users are guaranteed to pick up the correct file each time.
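Enabling it is a single setting (this is for Django versions before 4.2; newer versions configure the same thing through the STORAGES setting):

STATICFILES_STORAGE = "django.contrib.staticfiles.storage.ManifestStaticFilesStorage"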

Organize your media files

When you upload a file to your app, Django will store it in the MEDIA_ROOT folder and serve it through MEDIA_URL similar to the static files as I explained before. The problem with this approach is that you’ll end up with a lot of files in the same folder. This is why I recommend creating a folder structure for your media files. To create this structure you should set the upload_to attribute of FileField.

So instead of having file = models.FileField or image = models.ImageField you’d do something like file = models.FileField(upload_to='%Y/%m/files') or image = models.ImageField(upload_to='%Y/%m/images') to upload these files to their corresponding folder organized by year/month.

Notice that instead of a string you can also pass a function to the upload_to attribute. This function will need to return a string that will contain the path of the uploaded file including the filename. For example, an upload_to function can be similar to this:

import os

import anyascii
from django.utils.text import slugify

def custom_upload_path(instance, filename):
    dt_str = instance.created_on.strftime("%Y/%m/%d")
    fname, ext = os.path.splitext(filename)
    # transliterate to ascii and slugify the base filename
    slug_fn = slugify(anyascii.anyascii(fname))
    if ext:
        slug_fn += ext  # splitext keeps the leading dot in ext
    return "protected/{0}/{1}/{2}".format(dt_str, instance.id, slug_fn)

The above code will convert the filename to an ascii slug (i.e a file named δοκιμή.pdf will be converted to dokime.pdf) and will store it in a folder named after the creation date (year/month/day) and the id of the object instance the file belongs to. So if, for example, the file δοκιμή.pdf belongs to the object with id 3242 and created date 2022-09-30, it will be stored at protected/2022/09/30/3242/dokime.pdf.

The above code is just an example. You can use it as a starting point and modify it to fit your needs. Having the media files in separate folders will enable you to easily navigate the folder structure and for example back up only a portion of the files.

Do not serve media through your application server

This is important. The media files of your app have to be served through your web server (i.e nginx) and not your application server (i.e gunicorn). This is because the application server has a limited number of workers and if you serve the media files through them, it will be a bottleneck for your app. Thus you need to configure your web server to serve the media files by publishing the MEDIA_ROOT folder under the MEDIA_URL url similar to the static files as described above.

Notice that Django will serve your media files itself only during development, by using the following at the end of your urls.py file:

from django.conf import settings
from django.conf.urls.static import static

if settings.DEBUG:
    urlpatterns += static(settings.MEDIA_URL, document_root=settings.MEDIA_ROOT)

Under no circumstances should you use this when settings.DEBUG = False (i.e on production).

Secure your media properly

Continuing from the above, if you are not allowed to serve your media files through your application then how are you supposed to secure them? For example you may want to allow a user to upload files to your app but you want only that particular user to be able to download them and not anybody else. So you’ll need to check somehow that the user that tries to download the file is the same user that uploaded it. How can you do that?

The answer is to use a functionality offered by most web servers called X-Sendfile. First of all I’d like to explain how this works:

  1. A user wants to download a file with id 1234 so he clicks the “download” button for that file
  2. The browser of the user will then visit a normal django view for example /download/1234
  3. This view will check if the user is allowed to download the file by doing any permissions checks it needs to do, all in Django code
  4. If the user is not allowed to download, it will return a 403 (forbidden) or 404 (not-found) response
  5. However if the user is allowed to download the Django view will return an http response that will not contain the file but will have a special header with the path of the file to download (which is the path that file 1234 is saved on)
  6. When the web server (i.e nginx) receives the http response it will check if the response has the special header and if it does it will serve the response it got along with the file, directly from the file system without going through the application server (i.e gunicorn)

The above gives us the best of both worlds: We are allowed to do any checks we want in Django and the file is served through nginx.

A library that implements this functionality is django-sendfile2 which is a fork of the no-longer-maintained django-sendfile. To use it you’ll need to follow the provided instructions, which depend on your web server. However, let’s see a quick example for nginx from one production project:

# nginx conf

server {
    # other stuff

    location /media_project/protected/ {
        internal;
        alias /home/files/project/media/protected/;
    }

    location /media_project/ {
        alias /home/files/project/media/;
    }


}

For nginx we add a new location block that will serve the files under the /media_project/protected/ url. The internal; directive will prevent the client from going directly to the URI, so visiting example.com/media_project/protected/file.pdf directly will not work. We also have a /media_project/ location that serves the files under /media that are not protected. Please notice that nginx matches the most specific path first so all files under protected will be matched with the correct, internal location.

# django settings
MEDIA_ROOT = "/home/files/project/media"
SENDFILE_ROOT = "/home/files/project/media/protected"

MEDIA_URL = "/media_project/"
SENDFILE_URL = "/media_project/protected"
SENDFILE_BACKEND = "django_sendfile.backends.nginx"

Notice the difference between MEDIA_ROOT (which contains all our media files, some of which are not protected) and SENDFILE_ROOT, and similarly between MEDIA_URL and SENDFILE_URL.

# django view

def get_document(request, doc_id):
    from django_sendfile import sendfile

    doc = get_object_or_404(Document, pk=doc_id)
    rules_light.require(request.user, "apps.app.read_docs", doc.app)
    return sendfile(request, doc.file.path, attachment=True)

So this view first gets the Document instance from its id and checks to see if the current user can read it. Finally, it returns the sendfile response that will serve the file directly from the file system, passing the path of that file. This function view will have a url like path("get_doc/<int:doc_id>/", login_required(views.get_document), name="get_document").

A final comment is that for your dev environment you probably want to use the SENDFILE_BACKEND = "django_sendfile.backends.development" (please see the settings package guideline on how to override settings per env).

Handle stale media

Django never deletes your media files. For example, if you have an object with a file field and the object is deleted, the file that the file field refers to will not be deleted. The same is true if you upload a new file to that file field: the old file will also be kept around!

This is very problematic in some cases, resulting in GBs of unused files on your disk. To handle that, there are two solutions:

  • Add a signal in your models that checks if they are deleted or a file field is updated and delete the non-used file. This is implemented by the django-cleanup package.
  • Use a management command that will periodically check for stale files and delete them. This is implemented by the django-unused-media package.

I’ve used both packages in various projects and they work great. I’d recommend the django-cleanup on greenfield projects so as to avoid stale files from the beginning.
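For reference, if I recall its setup correctly, django-cleanup needs no code changes; you just add it to your INSTALLED_APPS (its docs recommend putting it last so it runs after your other apps):

INSTALLED_APPS = [
    # ... all your other apps ...
    "django_cleanup.apps.CleanupConfig",
]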

Debugging guidelines

Be careful when using django-debug-toolbar

The django-debug-toolbar is a great and very popular library that can help you debug your Django applications and identify slow views and n+1 query problems. However I have observed that it makes your development app much slower. For some views I am seeing something like a 10x decrease in speed, i.e instead of 500 ms we’ll need more than 5 seconds to display that view! Since Django development (at least for me) is based on a very quick feedback loop, this is a huge problem.

Thus, I recommend to keep it disabled when you are doing normal development and only enable it when you need it, for example to identify n+1 query problems.

Use the Werkzeug debugger

Instead of using the traditional runserver to run your app in development I recommend installing the django-extensions package so as to be able to use the Werkzeug debugger. This will enable you to get a python prompt whenever your code throws an exception or even to add your own breakpoints by throwing exceptions.
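In practice this means installing the django-extensions and werkzeug packages, adding django_extensions to your INSTALLED_APPS and running the dev server with the runserver_plus management command instead of runserver; roughly:

# pip install django-extensions werkzeug

INSTALLED_APPS = [
    # ...
    "django_extensions",
]

# then start the dev server with:
#   python manage.py runserver_plus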

In a nutshell, you’ll add something like aa += 1 (where aa isn’t defined) somewhere in your code (in a view, or a model method etc) and python will throw an exception. You’ll then be able to get a python shell and inspect the state of your app at that particular point, so you can see which variables are available and their values, run code etc. This is a superpower that, after you start using it, you’ll never want to go back to the traditional runserver.

More info on my Django Werkzeug debugger article.

General guidelines

Consider using a cookiecutter project template

If you are working in a Django shop and you frequently need to create new Django projects, I’d recommend considering creating (or using an existing) cookiecutter project template. You can use my own cookiecutter to create your projects or as an inspiration to create your own. It follows all the conventions I mention in this post and it is very simple to use.

Be careful on your selection of packages/addons

Django, because of its popularity, has an abundance of packages/addons that can help you do almost anything. However, my experience has taught me that you should be very careful and do your research before adding a new package to your project. I’ve been left many times with projects that I was not able to upgrade because they heavily relied on functionality from an external package that was abandoned by its creator. I also have lost many hours trying to debug a problem that was caused by a package that was not compatible with the latest version of Django.

So my guidelines before using an external Django addon are:

  • Make sure that it has been upgraded recently. There are no finished Django addons. Django is constantly evolving by releasing new versions and that must be true for the addons. Even if the addons are compatible with the new Django version they need to denote that in their README so as to know that their maintainers care.
  • Avoid using very new packages. I’ve seen many packages that are not yet mature and they are not yet ready for production. If you really need to use such a package make sure that you understand what it does and you can fix problems with the package if needed.
  • Avoid using packages that rely heavily on Javascript; this is usually better to do on your own.
  • Try to understand, at least at a high level, what the package does. If you don’t understand it, you will not be able to debug if it breaks.
  • Make sure that the package is well documented and that it has a good test coverage.
  • Don’t use very simple packages that you can easily implement yourself. Don’t be a left-pad developer.

I already propose some packages in this article but I also like to point you out to my Django essential package list. This list was compiled 5 years ago and I’m happy to still recommend all of these packages with the following minor changes:

  • Nowadays I recommend using wkhtmltopdf for creating PDFs from Django instead of xhtml2pdf. Please see my PDFs in Django like it’s 2022 article for more info. Notice that there’s nothing wrong with the xhtml2pdf package, it still works great and is supported, but my personal preference is to use wkhtmltopdf.
  • The django-sendfile package is no longer supported so you need to use django-sendfile2 instead, which is a drop-in replacement for django-sendfile. See the point about securing your media for more info.
  • django-auth-ldap uses github now (nothing changed, it just uses github instead of bitbucket).

The fact that from a list of ~30 packages only one (django-sendfile) is no longer supported (and the fact that even for that there’s a drop-in replacement) is a testament to the quality of the Django ecosystem (and to my choosing capabilities).

In addition to the packages of my list, this article already contains a bunch of packages that I’ve used in my projects and I am happy with them so I’d also recommend them to you.

Don’t be afraid to use threadlocals

One controversial aspect of Django is that it avoids using the threadlocals functionality. Thread-local data is a way to store data that is specific to the current running thread. This, combined with the fact that each request to your Django app is served by a single thread (worker), gives you a super powerful way to store and then access data that is specific to the current request and that would be very difficult (if at all possible) to access otherwise.

The usual way to work with thread locals in Django is to add a middleware that sets the current request in the thread local data. Then you can access this data from wherever you want in your code, like a global. You can create that middleware yourself but I’d recommend using the django-tools library for adding this functionality. You’ll add 'django_tools.middlewares.ThreadLocal.ThreadLocalMiddleware' to your list of middleware (at the end of the list unless you want to use the current user from another middleware) and then you’ll use it like this:

from django_tools.middlewares import ThreadLocal

# Get the current request object:
request = ThreadLocal.get_current_request()
# You can get the current user directly with:
user = ThreadLocal.get_current_user()

Please notice that Django recommends avoiding this technique because it hides the request/user dependency and makes testing more difficult. However I’d like to respectfully disagree with their rationale.

  • First of all, please notice that this is exactly how Flask works when you access the current request. It stores the request in the thread locals and then you can access it from anywhere in your code.
  • Second, there are things that are very difficult (or even not possible) without using the threadlocals. I’ll give you an example in a little.
  • Third, you can be careful to use the thread locals functionality properly. After all it is a very simple concept. The fact that you are using thread locals can be integrated to your tests.

One example of why thread locals are so useful is this abstract class that I use in almost all my projects and models:

from django.conf import settings
from django.db import models
from django_tools.middlewares import ThreadLocal


class UserDateAbstractModel(models.Model):
    created_on = models.DateTimeField(auto_now_add=True, )
    modified_on = models.DateTimeField(auto_now=True)

    created_by = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.PROTECT,
        related_name="%(class)s_created",
    )
    modified_by = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.PROTECT,
        related_name="%(class)s_modified",
    )

    class Meta:
        abstract = True

    def save(self, *args, **kwargs):
        user = ThreadLocal.get_current_user()
        if user:
            if not self.pk:
                self.created_by = user

            self.modified_by = user
        super(UserDateAbstractModel, self).save(*args, **kwargs)

Models that override this abstract model will automatically set the created_by and modified_by fields to the current user. This works the same no matter if I edit the object from the admin, or from a view. To use that functionality all I need to do is to inherit from that model i.e class MyModel(UserDateAbstractModel) and that’s it.

What would I need to do if I didn’t use the thread locals? I’d need to create a mixin from which all my views (that modify an object) would inherit! This mixin would pick the current user from the request and set it up. Please consider the difference between these two approaches; using the model based approach with the thread locals I can be assured that no matter where I modify an object, the created_by and modified_by will be set properly (unless of course I modify it through the database or the django shell; actually, I could make save throw if the current user hasn’t been set up, so it wouldn’t even be possible to modify objects from the shell). If I use the mixin approach, I need to make sure that all my views inherit from that mixin and that I don’t forget to do it. Also other people that add code to my project will need to remember that too. This is a lot more error prone and difficult to maintain.
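For comparison, such a view mixin would look roughly like this (a sketch for class based create/update views; the mixin name is mine, not from any library):

class SetAuditUserMixin:
    def form_valid(self, form):
        # pick the current user from the request and set the audit fields
        if not form.instance.pk:
            form.instance.created_by = self.request.user
        form.instance.modified_by = self.request.user
        return super().form_valid(form)

and every single view that creates or modifies such objects would then need to inherit from it.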

The created_by/modified_by case is a simple example. I have seen many more cases where without the use of thread locals I’d need to replicate 3-4 classes from an external library (this library was django-allauth for anybody interested) in order to be able to pass the current user through to where I needed to use it. This is a lot of code duplication and a maintenance hell.

One final comment: I’m not recommending to do it like Flask, i.e use thread locals anywhere. For example, in your views and forms it is easy to get the current request, there’s no need to use thread locals there. However, in places where there’s no simple path for accessing the current user then definitely use thread locals and don’t feel bad about it!

The async tasks situation

It is very common for new projects to add support for async tasks, either using celery, or django-rq or various other solutions. I’ve already written a bunch. of. posts. about this topic.

First of all let’s understand why we need async tasks: When you serve Django (or any Python web app) in production, you’ll start a number of worker processes that will be used to serve the users. These are usually normal OS processes. The usual guideline is to start a fixed number of such processes, equal to 2-4 times the number of CPU cores of your server. So with 2 cores you’ll have something like 8 workers. This means your Django app can handle up to 8 concurrent requests at the same time. If we have a view that takes too long to respond (e.g. because it runs a slow query), we’ll have many workers “stuck” on that view, resulting in delays for the other users since the number of workers always stays the same.

To resolve that issue we can use async tasks to offload the work to the “background”. I.e instead of the view waiting for the slow query to finish, it will now add a task in a queue and return immediately. The tasks in the queue will then be run by the async worker one after another. The other way to resolve that is to increase the number of workers, but that is not a good idea since each worker takes a certain amount of memory and resources and we still can’t be positive that we’ll be able to handle all traffic peaks to our slow views.
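To make the offloading idea concrete, here’s roughly what it looks like with django-rq (a sketch; the report-generating function and view are hypothetical):

import django_rq
from django.http import HttpResponse

def generate_report(report_id):
    # the slow work that used to block the view now runs inside the worker
    ...

def report_view(request, report_id):
    # enqueue the task and return immediately instead of waiting for it
    django_rq.enqueue(generate_report, report_id)
    return HttpResponse("We are preparing your report...")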

So, although I believe that async tasks are essential for some situations, my recommendation here is to be very careful and think twice before adding support for async tasks for your project. Because of how python works, the only way to have support for async tasks is to have one or more extra moving parts to your project.

These moving parts will always include a task worker process (which picks the async tasks from the queue and executes them asynchronously) and probably an external process that stores your queue. The queue process may actually be redis, if you already use it for caching, or even the database, but there are also projects that use a separate application for the queue like RabbitMQ.

This may look like a small thing but in a production environment this means that instead of running 1 thing for your django app (a gunicorn or uwsgi app server) you need to add at least one more thing (the worker). This results in the following extra work:

  • Make sure that the worker sees and handles your tasks
  • Monitoring the worker (getting alerts when the worker stops, make sure it runs etc)
  • Start the worker when your server starts (i.e when your server reboots)
  • Re-start the worker when you deploy changes (this is critical and easily missed; your worker won’t pick any changes to your app when you deploy it, if you don’t re-start it it will run stale code)
  • Handle exceptions to your async tasks properly
  • Make sure you have some kind of logging and exception tracking for the worker

All this adds up, especially if you need to do it for every new app.

Taking this into account, I’d recommend thinking twice before adding support for async tasks in your Django app. If you really need it, then of course you’ll need to bite the bullet and add it. But lately my understanding is that people tend to add support for async tasks even though they don’t really need it. Let’s see some examples:

  • I’ve got a view that sends mails and it is too slow. If the view opens a connection to an SMTP server and sends the email then it will probably be slow. However, before going the async task route, consider using a service like sendgrid or mailgun that will send your mails for you and will be much faster.
  • I need to run slow queries. Well… no you don’t. Your queries should not be slow. You should try to optimize your queries to run faster. If you have done all optimizations and still your queries are slow then you should consider de-normalizing your data to increase performance. This (at least in my book) is preferable over adding async tasks.
  • I need to do bulk operations. This probably is a reason to run async tasks. But before biting the bullet, consider: how many of your users are going to run such bulk operations at the same time?

Setting up Postgres on Windows for development

Introduction

To install Postgresql for a production server on Windows you’d usually go to the official website and use the download link. This will give you an executable installer that would install Postgresql on your server and help you configure it.

However, since I only use Windows for development (and never run anything in production on Windows) I’ve found out that there’s a much better and easier way to install postgresql for development on windows, which I’ll describe in this post.

If you want to avoid reading the whole post, you can just follow the steps described in the TL;DR below; however I’d recommend reading the whole thing to understand everything.

Downloading the server

First, you’ll click the zip archives link on the official website and then download the zip archive of the Postgres version you want to install. Right now there are archives for every current version like 14.5, 13.8, 12.12 etc. Let’s get the latest one, 14.5.

This will give me a zip file named postgresql-14.5-1-windows-x64-binaries.zip which contains a single folder named pgsql. I’ll extract that folder, rename it to pgsql145 and move it to c:\progr (I keep stuff there to avoid putting everything on C:). Now you should have a folder named c:\progr\pgsql145 that contains a bunch of folders named bin, doc, include etc.

Setting up the server

Now we are ready to setup Postgresql. Open a command line and move to the pgsql145\bin folder:

cd c:\progr\pgsql145\bin

The bin folder contains all the executables of your server and client, like psql.exe (the command line client), pg_dump.exe (backup), initdb.exe (create a new DB cluster), createdb/dropdb/createuser/dropuser.exe (create/drop databases and users; these can also be done from SQL) and postgres.exe which is the actual server executable.

Our first step is to create a database cluster using initdb. We need to pass it a folder that will contain the data of our cluster. So we’ll run it like:

initdb.exe -D c:\progr\pgsql145\data

(also you could run initdb.exe -D ..\data, since we are in the bin folder). We’ll get output similar to:

The files belonging to this database system will be owned by user "serafeim".
This user must also own the server process.

The database cluster will be initialized with locale "Greek_Greece.1252".
The default database encoding has accordingly been set to "WIN1252".
The default text search configuration will be set to "greek".

Data page checksums are disabled.

fixing permissions on existing directory c:/progr/pgsql145/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... windows
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Europe/Bucharest
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D ^"c^:^\progr^\pgsql145^\data^" -l logfile start

And now we’ll have a folder named c:\progr\pgsql145\data that contains files like pg_hba.conf, pg_ident.conf, postgresql.conf and various folders that will keep our database server data. All these can be configured but we’re going to keep using the default config since it fits our needs!

Notice that:

  • The files of our database belong to the “serafeim” role. This role is automatically created by initdb. This is the same username that I’m using to log in to windows (i.e my home folder is c:\users\serafeim\ folder) so this will be different for you. If you wanted to use a different user name or the classic postgres you could pass it to initdb with the -U parameter, for example: initdb.exe -D c:\progr\pgsql145\data_postgres -U postgres.
  • By default “trust” authentication has been configured. This means, copying from postgres trust authentication page that “[…] PostgreSQL assumes that anyone who can connect to the server is authorized to access the database with whatever database user name they specify (even superuser names)”. So local connections will always be accepted with the username we are passing. We’ll see how this works in a minute.
  • The default database encoding will be WIN1252 (on my system). We’ll talk about that a little more later (hint: it’s better to pass -E utf-8 to set your cluster encoding to utf-8)

Starting the server

We could use the pg_ctl.exe executable, as proposed by initdb, to start the server as a background process. However, for our purposes it’s better to start the server as a foreground process in a dedicated window. So we’ll run postgres.exe directly like:

postgres.exe -D c:\progr\pgsql145\data

or, from the bin directory we could run postgres.exe -D ..\data. The output will be

2022-09-20 09:34:10.184 EEST [10648] LOG:  starting PostgreSQL 14.5, compiled by Visual C++ build 1914, 64-bit
2022-09-20 09:34:10.189 EEST [10648] LOG:  listening on IPv6 address "::1", port 5432
2022-09-20 09:34:10.189 EEST [10648] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2022-09-20 09:34:10.330 EEST [3084] LOG:  database system was shut down at 2022-09-20 09:34:08 EEST
2022-09-20 09:34:10.369 EEST [10648] LOG:  database system is ready to accept connections

Success! Our server is running and listening on 127.0.0.1 port 5432. This means that it accepts connections only from our local machine (which is what we want for our purposes). We can now connect to it using the psql.exe client. Open another cmd, go to C:\progr\pgsql145\bin and run psql.exe: You’ll probably get an error similar to psql: error: connection to server at "localhost" (::1), port 5432 failed: FATAL:  database "serafeim" does not exist (unless your windows username is postgres).

By default psql.exe tries to connect with a role with the username of your Windows user and to a database named after the user you are connecting with. Our database server has a role named serafeim (it is created by default by the initdb as described before) but it doesn’t have a database named serafeim! Let’s connect to the postgres database instead by passing it as a parameter psql postgres:

C:\progr\pgsql145\bin>psql postgres
psql (14.5)
WARNING: Console code page (437) differs from Windows code page (1252)
        8-bit characters might not work correctly. See psql reference
        page "Notes for Windows users" for details.
Type "help" for help.

postgres=# select version();
                          version
------------------------------------------------------------
PostgreSQL 14.5, compiled by Visual C++ build 1914, 64-bit
(1 row)

Success!

Let’s create a sample user and database to make sure that everything’s working fine: createuser.exe koko, createdb kokodb and then connect to kokodb as koko with psql -U koko kokodb.

kokodb=> create table kokotable(foo varchar);
CREATE TABLE
kokodb=> insert into kokotable values('kokoko');
INSERT 0 1
kokodb=> select * from kokotable;
  foo
--------
kokoko
(1 row)

Everything’s working fine! In the meantime, we should get useful output on our dedicated postgres window, like 2022-09-20 09:36:01.899 EEST [9704] FATAL:  database "serafeim" does not exist. To stop it, just press Ctrl+C on that window and you should get output similar to:

2022-09-20 09:46:45.178 EEST [10648] LOG:  background worker "logical replication launcher" (PID 7860) exited with exit code 1
2022-09-20 09:46:45.185 EEST [10048] LOG:  shutting down
2022-09-20 09:46:45.278 EEST [10648] LOG:  database system is shut down

I usually add a pg.bat file in my c:\progr\pgsql145\ folder that will start the database with its data folder. Its contents are only bin\postgres.exe -D data

So let’s create the pg.bat like this:

c:\>cd c:\progr\pgsql145

c:\progr\pgsql145>copy con pg.bat
bin\postgres.exe -D data
^Z
        1 file(s) copied.

c:\progr\pgsql145>pg.bat
2022-09-20 09:49:53.642 EEST [11660] LOG:  starting PostgreSQL 14.5, compiled by Visual C++ build 1914, 64-bit
...

One final thing to notice is that, since we use the trust authentication, there’s no check for the password; so even if we force a password prompt with psql -U koko -W kokodb it will work no matter what password we type.

Encoding stuff

The default encoding situation

You may have noticed before that the default encoding for databases will be WIN1252 (or some other similar 8-bit character set). You never want that (I guess this default is there for compatibility reasons); you want utf-8 encoding. So you should pass the proper encoding to initdb when creating the cluster, like:

initdb -D ..\datautf8 -E utf-8

This will create a new cluster with utf-8 encoding. All databases created on that cluster will be utf-8 by default.

If you’ve already got a non-utf-8 cluster, you should force utf-8 for your new database instead:

createdb -E utf-8 -T template0 dbutf8

Notice that I also passed the -T template0 parameter to use the template0 template database. If I tried to run createdb -E utf-8 dbutf8 (so it would use the template1) I’d get an error similar to:

createdb: error: database creation failed: ERROR:  new encoding (UTF8) is incompatible with the encoding of the template database (WIN1252)
HINT:  Use the same encoding as in the template database, or use template0 as template.

About the psql codepage warning

You may (or may not) have noticed a warning similar to this when starting the server:

WARNING: Console code page (437) differs from Windows code page (1252)
      8-bit characters might not work correctly. See psql reference
      page "Notes for Windows users" for details.

Some more info about this can be found in the psql reference page and this SO issue. To avoid this warning you’ll use chcp 1252 to set the console code page to 1252 before running psql.

I have to warn you though that using psql.exe from the windows console will be problematic anyway because of its poor unicode support. You can use it fine as long as you only write ascii characters, but I’d avoid anything else.

That’s why I’d recommend using a graphical database client like for example dbeaver.

A TL;DR walkthrough

Here are the steps to follow to get a working postgresql server on windows:

  1. Download the postgresql windows binaries of the version you want from the zip archives page and extract it to a folder, let’s name it pgsql.
  2. Go to pgsql\bin folder on a command line
  3. Run initdb.exe -D ..\data -E utf-8 from inside the pgsql\bin folder to create a new database cluster with utf-8 encoding in the data directory
  4. Run postgres.exe -D ..\data to start the database server
  5. Go to pgsql\bin folder on another command line
  6. Run psql postgres to connect to the postgres database with a role similar to your windows username
  7. Profit!

Conclusion

Using the above steps you can easily setup a postgres database server on windows for development. Some advantages of the method proposed here are:

  • Since you configure the data directory you can have as many clusters as you want (run initdb with different data directories and pass them to postgres)
  • Since nothing is installed globally, you can have as many postgresql versions as you want, each one having its own data directory. Then you’ll start the one you want each time! For example I’ve got Postgresql 12, 13 and 14.5.
  • Using the trust authentication makes it easy to connect with whatever user
  • Running the database from postgres.exe so it has a dedicated window makes it easy to know what the database is doing, peeking at the logs and stopping it (using ctrl+c)