/var/

Various programming stuff

Phoenix forms integration with select2 and ajax

During the past months I’ve tried to implement a project using Elixir and the Phoenix framework. Old visitors of my blog will probably remember that I mainly use Django for back-end development but I decided to also give Phoenix a try.

My first impressions are positive but I don’t want to go into detail in this post; I’ll try to add a more extensive post comparing Elixir / Phoenix with Python / Django someday.

The problem that this particular post will try to explain is how to properly integrate a jQuery select2 dropdown with ajax autocomplete search into your Phoenix forms. This seems like a very common problem, however I couldn’t find a proper solution anywhere on the internet. It seems that most people using Phoenix prefer to implement their autocompletes using SPA-like functionality (react etc). I also found this project that seems to be working, however it does not use select2 and I really didn’t want to mess with a different JS library for reasons that should be obvious to most people.

So here we’ll implement a simple solution that allows a foreign key value to be autocompleted through ajax using select2. The specific example is a User that belongs to an Authority, i.e. the user has a field named authority_id which is a foreign key to Authority. We’ll add functionality to the user edit form to select the authority using ajax autocomplete.

Please notice that you can find a working version of this tutorial in my Phoenix Crud template project: https://github.com/spapas/phxcrd. This project contains various other functionality that I need but you should be able to test the user - authority integration by following the instructions there.

The schemas

For this tutorial, we’ll use two schemas: A User and an Authority. Each User belongs to an Authority (thus will have a foreign key to Authority; that’s what we want to set using the ajax select2). Here are the ecto schemas for these entities:

defmodule Phxcrd.Auth.Authority do
  use Ecto.Schema

  import Ecto.Changeset
  alias Phxcrd.Auth.User

  schema "authorities" do
    field :name, :string
    has_many :users, User, on_replace: :nilify
    timestamps()
  end

  @doc false
  def changeset(authority, attrs) do
    authority
    |> cast(attrs, [:name])
    |> validate_required([:name], message: "The field is required")
    |> unique_constraint(:name, message: "The name already exists!")
  end

  use Accessible
end
defmodule Phxcrd.Auth.User do
  use Ecto.Schema

  import Ecto.Changeset
  alias Phxcrd.Auth.Authority

  schema "users" do
    field :email, :string
    field :username, :string
    field :password_hash, :string
    field :password, :string, virtual: true

    belongs_to :authority, Authority

    timestamps()
  end

  @doc false
  def changeset(user, attrs) do
    user
    |> cast(attrs, [:username, :email, :authority_id])
    |> validate_required([:username, :email ])
  end

  use Accessible
end

Notice that both these entities are contained in the Auth context and were created using mix phx.gen.html; I won’t include the migrations here.

The search API

Let’s now take a look at the search API for Authority. I’ve added an ApiController which contains the following function:

def search_authorities(conn, params) do
    q = params["q"]

    authorities =
      from(a in Authority,
        where: ilike(a.name, ^"%#{q}%")
      )
      |> limit(20)
      |> Repo.all()

    render(conn, "authorities.json", authorities: authorities)
end

Notice that this retrieves a q parameter and makes an ilike query to Authority.name. It then passes the results to the view for rendering. Here’s the corresponding function for ApiView:

def render("authorities.json", %{authorities: authorities}) do
    %{results: Enum.map(authorities, &authority_json/1)}
  end

  def authority_json(a) do
    %{
      id: a.id,
      text: a.name
    }
end

Notice that select2 wants its results in a JSON struct with the following form: {results: [{id: 1, text: "Authority 1"}]}.

To add this controller action to my routes I’ve added this to router.ex:

scope "/api", PhxcrdWeb do
    pipe_through :api

    get "/search_authorities", ApiController, :search_authorities
end

Thus if you visit http://127.0.0.1/api/search_authorities?q=A you should retrieve the authorities containing A in their name.
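For example, a query that matches two authorities would return a payload like the following (the authority names here are made up):

{"results": [{"id": 1, "text": "Authority of Athens"}, {"id": 2, "text": "Authority of Piraeus"}]}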

The controller

Concerning the UserController, I’ve added the following methods to it for creating and updating users:

def new(conn, _params) do
  changeset = Auth.change_user(%User{})
  render(conn, "new.html", changeset: changeset)
end

def create(conn, %{"user" => user_params}) do
  case Auth.create_user(user_params) do
    {:ok, user} ->
      conn
      |> put_flash(:info, "#{user.username} created!")
      |> redirect(to: Routes.user_path(conn, :show, user))

    {:error, changeset} ->
      render(conn, "new.html", changeset: changeset)
  end
end

def edit(conn, %{"id" => id}) do
  user = Auth.get_user!(id)
  changeset = Auth.change_user(user)
  render(conn, "edit.html", user: user, changeset: changeset)
end

def update(conn, %{"id" => id, "user" => user_params}) do
  user = Auth.get_user!(id)

  user_params = Map.merge(%{"authority_id" => nil}, user_params)

  case Auth.update_user(user, user_params) do
    {:ok, user} ->
      conn
      |> put_flash(:info, "User updated successfully.")
      |> redirect(to: Routes.user_path(conn, :show, user))

    {:error, %Ecto.Changeset{} = changeset} ->
      render(conn, "edit.html", user: user, changeset: changeset)
  end
end

Most of these are more or less the default things that mix phx.gen.html creates. One thing that may seem strange here is the user_params = Map.merge(%{"authority_id" => nil}, user_params) line of update. What happens here is that I want to be able to clear the authority of a user (I’ll explain how in the next sections). If I do that, the user_params passed to update will not contain an authority_id key at all, so the authority_id won’t be changed (even though I cleared it, it would keep its previous value after saving). To fix that I set a default value of nil for authority_id; if the user has actually selected an authority in the form, this default will be overridden when merging the two maps. So the resulting user_params will always contain an authority_id key, either set to nil or to the selected authority.
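As a quick illustration of how this merge behaves (the values of the second map win over the defaults):

iex> Map.merge(%{"authority_id" => nil}, %{"username" => "foo"})
%{"authority_id" => nil, "username" => "foo"}
iex> Map.merge(%{"authority_id" => nil}, %{"username" => "foo", "authority_id" => "3"})
%{"authority_id" => "3", "username" => "foo"}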

Beyond that I won’t go into detail explaining the above functions, but if something seems strange feel free to ask. I also won’t explain the Auth.* functions; all these are created by Phoenix in the context module.

The view

The UserView module contains a simple but very important function:

def get_select_value(changeset, attr) do
  case changeset.changes[attr] do
    nil -> Map.get(changeset.data, attr)
    z -> z
  end
end

This function takes two parameters: the changeset and the name of the attribute (:authority_id in our case). It first checks if this attribute is contained in changeset.changes; if yes it returns that value. If it isn’t contained in changeset.changes then it returns the value of changeset.data for that attribute.

This is a little complex but let’s try to understand its logic: When you start editing a User you want to display the current authority of that instance. However, when you submit an edited user and get back an errored form (for example because you forgot to fill in the username) you want to display the authority that was submitted in the form. So changeset.changes contains the changes that were just submitted, while changeset.data contains the initial values of the struct.
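Roughly, here’s what the two cases look like in iex (just a sketch, assuming a user whose current authority_id is 3):

iex> get_select_value(Phxcrd.Auth.change_user(user), :authority_id)
3
iex> errored = Phxcrd.Auth.User.changeset(user, %{"username" => "", "authority_id" => "5"})
iex> get_select_value(errored, :authority_id)
5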

The form template

Both the :new and :edit actions include a common form.html.eex template:

<%= form_for @changeset, @action, fn f -> %>
  <%= if @changeset.action do %>
  <div class="alert alert-danger">
    <p><%= gettext("Problems while saving") %></p>
  </div>
  <% end %>
  <div class='row'>
    <div class='column'>
      <%= label f, :username %>
      <%= text_input f, :username %>
      <%= error_tag f, :username %>
    </div>
    <div class='column'>
      <%= label f, :email %>
      <%= text_input f, :email %>
      <%= error_tag f, :email %>
    </div>
  </div>

  <div class='row'>
    <div class='column'>
      <%= label f, :authority %>
      <%= select(f,
        :authority_id, [
          (with sv when not is_nil(sv) <- get_select_value(@changeset, :authority_id),
                                     a <- Phxcrd.Auth.get_authority!(sv), do: {a.name, a.id})
        ],
        style: "width: 100%")
        %>
      <%= error_tag f, :authority_id %>
    </div>

  </div>

  <div>
    <%= submit gettext("Save") %>
  </div>
<% end %>

This is a custom Phoenix form but it has the following addition which is more or less the meat of this article (along with the get_select_value function I explained before):

select(f, :authority_id, [
        (with sv when not is_nil(sv) <- get_select_value(@changeset, :authority_id),
                                   a <- Phxcrd.Auth.get_authority!(sv), do: {a.name, a.id})
      ],
      style: "width: 100%")

So this will create an HTML select element which will contain a single option (the list in the third parameter of select): the authority of that object or the authority that the user had submitted in the form. For this it uses get_select_value to retrieve the :authority_id and, if it’s not nil, passes it to get_authority! to retrieve the actual authority and return a tuple with its name and id.

By default when you create a select element you’d pass a list of all the options in the third parameter, for example:

select(f, :authority_id, Phxcrd.Auth.list_authorities |> Enum.map(&{&1.name, &1.id}))

Of course this defeats the purpose of using ajax since all the options will be rendered.

The final step is to add the required custom javascript to convert that select to select2-with-ajax:

$(function () {
    $('#user_authority_id').select2({
      allowClear: true,
      placeholder: 'Select authority',
      minimumInputLength: 2,
      ajax: {
        url: '<%= Routes.api_path(@conn, :search_authorities) %>',
        dataType: 'json',
        delay: 150
      }
    });
})

The JS is rather simple; the allowClear option will display an x so that you can clear the selected authority, while the ajax url will be that of the :search_authorities route. Notice that minimumInputLength is a top-level select2 option (not part of the ajax settings) and configures how many characters the user must type before a search is triggered.

Conclusion

Although this article may seem a little long, as I’ve already mentioned the most important thing to keep in mind is how to properly set the value that should be displayed in your select2 widget. Beyond that, everything is a walk in the park by following the docs.

How to create a custom filtered adapter in Android

Introduction

Android offers a nice component named AutoCompleteTextView that can be used to auto-fill a text box from a list of values. In its simplest form, you just create an array adapter passing it a list of objects (that have a proper toString() method). Then you type some characters into the textbox and by default it will filter the results by searching at the beginning of the backing objects’ toString() results.

However, there are times when you don’t want to match at the beginning of the string (because you want to search in the middle of the string), or you don’t want to search only in the toString() method of the object, or you want to do something fancier with the object output. For this you must override the ArrayAdapter and add a custom Filter.

Unfortunately this isn’t as straightforward as I’d like and I couldn’t find a quick and easy tutorial on how it can be done.

So here goes nothing: In the following I’ll show you a very simple Android application that has a minimum viable custom filtered adapter implementation. You can find the whole project on GitHub: https://github.com/spapas/CustomFilteredAdapeter but I am going to discuss everything here as well.

The application

Just create a new project with an empty activity from Android Studio. Use Kotlin as the language.

The layout

I’ll keep it as simple as possible:

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout
        xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:tools="http://schemas.android.com/tools"
        android:orientation="vertical"
        android:layout_width="match_parent" android:layout_height="match_parent"
        tools:context=".MainActivity">
    <TextView
            android:layout_width="match_parent" android:layout_height="wrap_content"
            android:text="Hello World!"
            android:textSize="32sp"
            android:textAlignment="center"/>
    <AutoCompleteTextView
            android:layout_marginTop="32dip"
            android:layout_width="match_parent" android:layout_height="wrap_content"
            android:id="@+id/autoCompleteTextView"/>
</LinearLayout>

You should just care about the AutoCompleteTextView with an id of autoCompleteTextView.

The backing data object

I’ll use a simple PoiDao Kotlin data class for this:

data class PoiDao(
    val id: Int,
    val name: String,
    val city: String,
    val category_name: String
)

I’d like to be able to search in the name, city and category_name of each object. To create the list of pois to be used by the adapter I can do something like:

val poisArray = listOf(
    PoiDao(1, "Taco Bell", "Athens", "Restaurant"),
    PoiDao(2, "McDonalds", "Athens","Restaurant"),
    PoiDao(3, "KFC", "Piraeus", "Restaurant"),
    PoiDao(4, "Shell", "Lamia","Gas Station"),
    PoiDao(5, "BP", "Thessaloniki", "Gas Station")
)

The custom adapter

This will be an ArrayAdapter<PoiDao> implementing also the Filterable interface:

inner class PoiAdapter(context: Context, @LayoutRes private val layoutResource: Int, private val allPois: List<PoiDao>):
    ArrayAdapter<PoiDao>(context, layoutResource, allPois),
    Filterable {
    private var mPois: List<PoiDao> = allPois

    override fun getCount(): Int {
        return mPois.size
    }

    override fun getItem(p0: Int): PoiDao? {
        return mPois.get(p0)
    }

    override fun getItemId(p0: Int): Long {
        // Or just return p0
        return mPois.get(p0).id.toLong()
    }

    override fun getView(position: Int, convertView: View?, parent: ViewGroup?): View {
        val view: TextView = convertView as TextView? ?: LayoutInflater.from(context).inflate(layoutResource, parent, false) as TextView
        view.text = "${mPois[position].name} ${mPois[position].city} (${mPois[position].category_name})"
        return view
    }

    override fun getFilter(): Filter {
        // See next section
    }
}

You’ll see that we add an instance variable named mPois that is initialized at the start with allPois (the initial list of all pois that is passed to the adapter). The mPois list will contain the filtered results. Then, for getCount and getItem we return the corresponding values from mPois; getItemId is used when you have an SQLite-backed adapter but I’m including it here for completeness.

The getView will create the actual line for each item in the dropdown. As you’ll see, the layout that is passed must inflate to a TextView whose text is set based on some of the attributes of the corresponding poi for each position. Notice that we can use whatever view layout we want for our dropdown result line (this is the layoutResource parameter) but we need to configure it (i.e. bind it with the values of the backing object) here properly.

Finally we create a custom instance of the Filter, explained in the next section.

The custom filter

The getFilter creates an object instance of a Filter and returns it:

override fun getFilter(): Filter {
    return object : Filter() {
        override fun publishResults(charSequence: CharSequence?, filterResults: Filter.FilterResults) {
            mPois = filterResults.values as List<PoiDao>
            notifyDataSetChanged()
        }

        override fun performFiltering(charSequence: CharSequence?): Filter.FilterResults {
            val queryString = charSequence?.toString()?.toLowerCase()

            val filterResults = Filter.FilterResults()
            filterResults.values = if (queryString==null || queryString.isEmpty())
                allPois
            else
                allPois.filter {
                    it.name.toLowerCase().contains(queryString) ||
                    it.city.toLowerCase().contains(queryString) ||
                    it.category_name.toLowerCase().contains(queryString)
                }
            return filterResults
        }
    }
}

This object instance overrides two methods of Filter: performFiltering and publishResults. The performFiltering is where the actual filtering is done; it should return a FilterResults object containing a values attribute with the filtered values. In this method we retrieve the charSequence parameter and convert it to lowercase. Then, if this parameter is not empty, we filter the corresponding elements of allPois (i.e. name, city and category_name in our case) using contains. If the query parameter is empty then we just return all pois. A warning for Java developers: here the if is used as an expression (i.e. its result will be assigned to filterResults.values).

After performFiltering has finished, the publishResults method is called. This method receives the filtered results in its filterResults parameter. It sets the mPois of the custom adapter to the result of the filter operation and calls notifyDataSetChanged to display the results.

Using the custom adapter

To use the custom adapter you can do something like this in your activity’s onCreate:

override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    setContentView(R.layout.activity_main)

    val poisArray = listOf(
        // See previous sections
    )
    val adapter = PoiAdapter(this, android.R.layout.simple_list_item_1, poisArray)
    autoCompleteTextView.setAdapter(adapter)
    autoCompleteTextView.threshold = 3

    autoCompleteTextView.setOnItemClickListener() { parent, _, position, id ->
        val selectedPoi = parent.adapter.getItem(position) as PoiDao?
        autoCompleteTextView.setText(selectedPoi?.name)
    }
}

We create the PoiAdapter passing it the poisArray and android.R.layout.simple_list_item_1 as the layout. That layout just contains a single textview. As we’ve already discussed, you can pass something more complex here. The threshold defines the number of characters that the user needs to enter for the filtering to kick in (the default is 2).

Please notice that when the user clicks (selects) an item of the dropdown we set the contents of the autocomplete textview to the poi’s name ourselves (or else it would just use the object’s toString() method to set it).

Fixing your Django async job - database integration

I’ve already written two articles about django-rq and implementing asynchronous tasks in Django. However I’ve found out that there’s a very important thing missing from them: How to properly integrate your asynchronous tasks with your Django database. This is very important because if it is not done right you will start experiencing strange errors about missing database objects or duplicate keys. The most troublesome thing about these errors is that they are not consistent. Your app may work fine but for some reason you’ll see some of your asynchronous tasks fail with these errors. When you re-queue the async jobs everything will be ok.

Of course this behavior (code that only works some of the time) smells of a race condition, but it’s not easy to debug if you don’t know the full story.

In the following I will describe the cause of this error and how you can fix it. As a companion to this article I’ve implemented a small project that can be used to test the error and the fix: https://github.com/spapas/async-job-db-fix.

Notice that although this article is written for django-rq it should also help people that have the same problems with other async job systems (like celery or django-q).

Description of the project

The project is very simple: you can just add a url and it will retrieve its content asynchronously and report its length. For the models, it just has a Task model which is used to provide information about what we want the asynchronous task to do and to retrieve the result:

from django.db import models

class Task(models.Model):
    created_on = models.DateTimeField(auto_now_add=True)
    url = models.CharField(max_length=128)
    url_length = models.PositiveIntegerField(blank=True, null=True)
    job_id = models.CharField(max_length=128, blank=True, null=True)
    result = models.CharField(max_length=128, blank=True, null=True)

It also has a home view that can be used to start new asynchronous tasks by creating a Task object with the url we got and passing it to the asynchronous task:

from django.views.generic.edit import FormView
from .forms import TaskForm
from .tasks import get_url_length
from .models import Task

import time
from django.db import transaction

class TasksHomeFormView(FormView):
    form_class = TaskForm
    template_name = 'tasks_home.html'
    success_url = '/'

    def form_valid(self, form):
        task = Task.objects.create(url=form.cleaned_data['url'])
        get_url_length.delay(task.id)
        return super(TasksHomeFormView, self).form_valid(form)

    def get_context_data(self, **kwargs):
        ctx = super(TasksHomeFormView, self).get_context_data(**kwargs)
        ctx['tasks'] = Task.objects.all().order_by('-created_on')
        return ctx

And finally the asynchronous job itself that retrieves the task from the database, requests its url and saves its length:

import requests
from .models import Task
from rq import get_current_job
from django_rq import job

@job
def get_url_length(task_id):
    jb = get_current_job()
    task = Task.objects.get(
        id=task_id
    )
    response = requests.get(task.url)
    task.url_length = len(response.text)
    task.job_id = jb.get_id()
    task.result = 'OK'
    task.save()

The above should be fairly obvious: The user visits the homepage and enters a url in the input. When he presses submit, the view will create a new Task object with the url that the user entered and fire off the get_url_length asynchronous job, passing it the id of the task that was just created. It will then return immediately without waiting for the asynchronous job to complete. The user will need to refresh to see the result of his job; this is the usual behavior with async jobs.

The asynchronous job on the other hand will retrieve from the database the task whose id it got as a parameter, do the work it needs to do and update the result when it is finished.

Unfortunately, the above simple setup will probably behave erratically by randomly throwing database related errors!

Cause of the problem

In the previous section I said probably because the erratic behavior is caused by a specific setting of your Django project: ATOMIC_REQUESTS. This setting is defined on your database connection and if it is True then each request will be atomic. This means that each request will be tied to a database transaction, i.e. a transaction will be started when your request starts and committed only when your request finishes; if for some reason your request throws an error then the transaction will be rolled back. An example of this setting is:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
        'ATOMIC_REQUESTS': True,
    }
}

Now, in my opinion, ATOMIC_REQUESTS is a great thing to have because it makes everything much easier. I always set it to True in my projects because I don’t need to actually think about transactions and requests; I know that if there’s a problem in a request the whole transaction will be rolled back and no garbage will be left in the database. If on the other hand a specific request does not need to be tied to a transaction, I just turn it off for that view (using transaction.non_atomic_requests). Please notice that by default ATOMIC_REQUESTS has a False value, which means that the database will be in autocommit mode, meaning that every command will be committed immediately.

So although the ATOMIC_REQUESTS is great, it is actually the reason that there are problems with asynchronous tasks. Why? Let’s take a closer look at what the form_valid of the view does:

def form_valid(self, form):
    task = Task.objects.create(url=form.cleaned_data['url']) #1
    get_url_length.delay(task.id) #2
    return super(TasksHomeFormView, self).form_valid(form) #3

It creates the task in #1, fires off the asynchronous task in #2 and continues the execution of the view processing in #3. The important thing to understand here is that the transaction will be committed only after #3 is finished. This means that there’s a possibility that the asynchronous task will be started before #3 is finished, thus it won’t find the task because the task will not have been committed yet(!) This is a little counter-intuitive but you must remember that the async task is run by a worker which is a different process than your application server; the worker may be able to start before the transaction is committed.

If you want to actually see the problem every time, you can add a small delay between the start of the async task and the end of form_valid, something like this:

def form_valid(self, form):
    task = Task.objects.create(url=form.cleaned_data['url'])
    get_url_length.delay(task.id)
    time.sleep(1)
    return super(TasksHomeFormView, self).form_valid(form)

This will make the view slower, so the asynchronous worker will always have time to start executing the task (and hit the not-found error). Also notice that if you had ATOMIC_REQUESTS: False the above code would work fine because the task would be created immediately (auto-committed) and the async job would be able to find it.

The solution

So how is this problem solved? Well it’s not that difficult now that you know what’s causing it!

One solution would be to set ATOMIC_REQUESTS to False but that would make all database commands auto-commit, so you’d lose the tying of requests to transactions. Another solution would be to keep ATOMIC_REQUESTS set to True and disable atomic requests for the specific view that starts the asynchronous job using transaction.non_atomic_requests. This is a viable solution, however I don’t like it because I’d lose the comfort of a transaction per request for this specific request and I would need to add my own transaction handling.
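Here’s a minimal sketch of that second option; since the decorator applies to view functions, the view is written here as a function-based equivalent of TasksHomeFormView (the tasks_home name is made up):

from django.db import transaction
from django.shortcuts import redirect, render

from .forms import TaskForm
from .models import Task
from .tasks import get_url_length

@transaction.non_atomic_requests
def tasks_home(request):
    # No transaction wraps this view, so the Task row is committed
    # immediately and the worker will always be able to find it.
    if request.method == 'POST':
        form = TaskForm(request.POST)
        if form.is_valid():
            task = Task.objects.create(url=form.cleaned_data['url'])
            get_url_length.delay(task.id)
            return redirect('/')
    else:
        form = TaskForm()
    tasks = Task.objects.all().order_by('-created_on')
    return render(request, 'tasks_home.html', {'form': form, 'tasks': tasks})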

A third solution is to avoid messing with the database in your view and create the task object in the async job itself. Any parameters you want to pass to the async job would be passed directly to the async function. This may work fine in some cases but I find it safer to create the task in the database before starting the async job so that I have better control and error handling. This way, even if there’s an error in my worker and for some reason the async job never starts or it breaks before being able to touch the database, I will still have the task object in the database because it was created in the view.
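To make that third option concrete, here’s a rough sketch (the get_url_length_for name is made up; the view would then call get_url_length_for.delay(form.cleaned_data['url']) instead of creating a Task first):

import requests
from django_rq import job
from rq import get_current_job

from .models import Task

@job
def get_url_length_for(url):
    # The Task row is only created here, inside the worker,
    # so the view never touches the database.
    jb = get_current_job()
    response = requests.get(url)
    Task.objects.create(
        url=url,
        url_length=len(response.text),
        job_id=jb.get_id(),
        result='OK',
    )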

Is there anything better? Isn’t there a way to start executing the async job after the transaction of the view is committed? Actually yes, there is! For this, transaction.on_commit comes to the rescue! This function receives a callback that will be called after the transaction is committed! Thus, to properly fix your project, you should change the form_valid method like this:

def form_valid(self, form):
    task = Task.objects.create(url=form.cleaned_data['url'])
    transaction.on_commit(lambda: get_url_length.delay(task.id))
    time.sleep(1)
    return super(TasksHomeFormView, self).form_valid(form)

Notice that I need to use a lambda to create a callback function that will call get_url_length.delay(task.id) when the transaction is committed. Now, even though the time.sleep delay is still there, the async job will only start after the transaction is committed, i.e. after the view handler is finished (after the 1-second delay).

Conclusion

From the above you should be able to understand why you sometimes have problems when your async jobs use the database. To fix it you have various options but, at least for me, the best solution is to start your async jobs after the transaction is committed using transaction.on_commit. Just change each async.job.delay(parameters) call to transaction.on_commit(lambda: async.job.delay(parameters)) and you will be fine!

Use du to find out the disk usage of each directory in unix

One usual problem I have when dealing with production servers is that their disks get filled. This results in various warnings and errors and should be fixed immediately. The first step to resolve this issue is to actually find out where that hard disk space is being used!

For this you can use the du unix tool with some parameters. The problem is that du has various parameters (not needed for the task at hand) and the various pages I searched contain other info not related to this specific task.

Thus I’ve decided to write this small blog post to help people struggling with this, and also to save myself from googling for it in pages that contain other du recipes and from the trial and error that this would require.

So to print out the disk usage summary for a directory go to that directory and run du -h -s *; you need to have access to the child subdirectories so probably it’s better to try this as root (unless you go to your home dir for example).

Here’s a sample usage:

[root@server1 /]# cd /
[root@server1 /]# du -h -s *
7.2M    bin
55M     boot
164K    dev
35M     etc
41G     home
236M    lib
25M     lib64
20K     lost+found
8.0K    media
155G    mnt
0       proc
1.6G    root
12M     sbin
8.0K    srv
427M    tmp
3.2G    usr
8.9G    var

The parameters are -h to print human readable sizes (G, M etc) and -s to print a usage summary for each argument. Since this outputs a summary per argument, I finally pass *, which the shell expands to all files/dirs in that directory. If I used du -h -s /tmp instead I’d get the total usage only for the /tmp directory.

Another trick that may help you quickly find the offending directories is to append the | grep G pipe (i.e. run du -h -s * | grep G) which will keep only the entries containing a G (i.e. only print the folders larger than 1 GB). Yeah, I know that this will also print entries that merely have a G in their name, but since there aren’t many directories with a G in their name you should be ok.
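Using the sample listing above, this would give something like:

[root@server1 /]# du -h -s * | grep G
41G     home
155G    mnt
1.6G    root
3.2G    usr
8.9G    var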

If you run the above from / so that /proc is included, you may get a bunch of du: cannot access 'proc/nnn/task/nnn/fd/4': No such file or directory errors; just add the 2> /dev/null redirect to send the stderr output to /dev/null, i.e. run du -h -s * 2> /dev/null.

Finally, please notice that if there are lots of files in your directory you’ll get a lot of output entries (since the * will match both files and directories). In this case you can use echo */ or ls -d */ to list only the directories; wrap that command in backticks or $() (command substitution) and use it instead of the *, i.e. run du -h -s $(echo */) or du -h -s `echo */`, to only get the sizes of the directories.

One thing that you must be aware of is that this command may take a long time, especially if you have lots of small files somewhere. Just let it run and it should finish after some time. If it takes too long, try to exclude any mounted network directories (either SMB or NFS) since these will take extra time.

Also, if you want a nice interactive output using ncurses you can download and compile the ncdu tool (NCurses Disk Usage).

Adding a delay to Django HTTP responses

Sometimes you’d like to make your Django views slower by adding a fake delay. This may sound controversial (why would somebody want to make their views slower?) however it is a real requirement, at least while developing an application.

For example, you may be using a REST API and you want to implement a spinner while your form is loading. However, when developing, your responses will usually load so quickly that you won’t be able to admire your spinner in all its glory! Also, when you submit a POST form (i.e. a form that changes your data), it is advisable to disable the submit button so that the form won’t be submitted twice when users double-click it (it may seem strange to some people but this is a very common error that has bitten me many times; many users think that they need to double-click buttons, so I always disable my submit buttons after somebody clicks them); in this case you also need to make your response a little slower to verify that the button is actually disabled!

I will propose two methods for adding this delay to your responses: one that will affect all (or most of) your views using a middleware and another that you can add to any CBV you want using a mixin; please see my previous CBV guide for more on Django CBVs and mixins. For the middleware solution we’ll also take a quick look at what the Django middleware mechanism is and how it can be used to add functionality.

Using middleware

The Django middleware is a mechanism for adding your own code to the Django request / response cycle. I’ll try to explain this a bit: Django waits for an HTTP request (i.e. GET a url with certain headers and query parameters), parses it and prepares an HTTP response (i.e. some headers and a payload). Your view will be the main actor for receiving the HTTP request and returning the HTTP response. However, using this middleware mechanism Django allows you to enable other actors (the middleware) that will universally modify the HTTP request before passing it to your view and will also modify the view’s HTTP response before sending it back to the client.

Actually, a list of middleware called … MIDDLEWARE is defined by default in the settings.py of all new Django projects; these are used to add various capabilities that are universally needed, for example session support, various security enablers, django message support and others. You can easily attach your own middleware to that list to add extra functionality. Notice that the order of the middleware in the MIDDLEWARE list actually matters. Middleware later in the list will be executed after the ones previous in the list; we’ll see some consequences of this later.

Now the time has come to take a quick look at how to implement a middleware, taken from the Django docs:

class SimpleMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        # One-time configuration and initialization.

    def __call__(self, request):
        # Code to be executed for each request before
        # the view (and later middleware) are called.

        response = self.get_response(request)

        # Code to be executed for each request/response after
        # the view is called.

        return response

Actually you can implement the middleware as a nested function, however I prefer the class-based version. The comments should be really enlightening: when your project is started the constructor (__init__) will be called once; for example, if you want to read a configuration setting from the database then you should do it in __init__ to avoid hitting the database every time your middleware is executed (i.e. for every request). The __call__ is a special method that is invoked when you call an instance of this class as a function, i.e. if you do something like:

sm = SimpleMiddleware(get_response)
sm(request)

Then sm(request) will execute __call__; there are various similar Python special methods, for example __len__, __eq__ etc.

Now, as you can see the __call__ special method has four parts:

  • Code that is executed before the self.get_response() method is called; here you should modify the request object. Middleware will reach this point in the order they are listed.
  • The actual call to self.get_response()
  • Code that is executed after the self.get_response() method is called; here you should modify the response object. Middleware will reach this point in the reverse order they are listed.
  • Returning the response to be used by the next middleware

Notice that get_response will call the next middleware in the list, while the get_response of the last middleware will actually call the view. Then the view will return a response, which can be modified (if needed) by the middleware in the opposite order of their definition.

As an example, let’s define two simple middlewares:

class M1:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        print("M1 before response")
        response = self.get_response(request)
        print("M1 after response")
        return response

class M2:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        print("M2 before response")
        response = self.get_response(request)
        print("M2 after response")
        return response

When you define MIDDLEWARE = ['M1', 'M2'] you’ll see the following:

# Got the request
M1 before response
M2 before response
# The view is rendered to the response now
M2 after response
M1 after response
# Return the response

Please notice that a middleware may skip calling self.get_response to continue the chain and instead return a response directly (for example a 403 Forbidden response).
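For example, a short-circuiting middleware could look like this (just a sketch; the class name and the path check are made up):

from django.http import HttpResponseForbidden

class BlockApiMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Never call get_response for /api/ urls; return a 403 directly instead
        if request.path.startswith('/api/'):
            return HttpResponseForbidden()
        return self.get_response(request)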

After this quick introduction to how middleware works, let’s take a look at a skeleton for the time-delay middleware:

import time

class TimeDelayMiddleware(object):

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        time.sleep(1)
        response = self.get_response(request)
        return response

This is really simple: I’ve just added an extra line to the previous middleware. This line adds a one-second delay to all responses. I’ve added it before self.get_response; because this delay does not depend on anything, I could have added it after self.get_response without any change in behavior. Also, the order of this middleware in the MIDDLEWARE list doesn’t matter since it doesn’t depend on other middleware (it just needs to run to add the delay).

This middleware may have a little more functionality, for example to configure the delay from the settings or add the delay only for specific urls (by checking the request.path). Here’s how these extra features could be implemented:

import time
from django.conf import settings

class TimeDelayMiddleware(object):

    def __init__(self, get_response):
        self.get_response = get_response
        self.delay = settings.REQUEST_TIME_DELAY


    def __call__(self, request):
        if '/api/' in request.path:
            time.sleep(self.delay)
        response = self.get_response(request)
        return response

The above will add the delay only to requests whose path contains '/api/'. Another case is if you want to add the delay only for POST requests by checking that request.method == 'POST'.
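For example, the POST-only variant could look something like this (a sketch; the class name is made up):

import time
from django.conf import settings

class PostDelayMiddleware(object):

    def __init__(self, get_response):
        self.get_response = get_response
        self.delay = settings.REQUEST_TIME_DELAY

    def __call__(self, request):
        # Only delay requests that change data (form submissions etc)
        if request.method == 'POST':
            time.sleep(self.delay)
        return self.get_response(request)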

Now, to install this middleware, you can configure your MIDDLEWARE like this in your settings.py (let’s say that you have an application named core containing a module named middleware):

MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',

    'core.middleware.TimeDelayMiddleware',
]

The other middleware are the default ones in Django. One more thing to consider is that if you have a single settings.py this middleware will always be enabled; one way to work around this is to check settings.DEBUG and only call time.sleep when DEBUG == True. However, the proper way to do it is to have different settings for your development and production environments and add the TimeDelayMiddleware only to your development MIDDLEWARE list. Having different settings for each environment is a common practice in Django and I totally recommend using it.
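Here’s a rough sketch of that approach, assuming you split your settings into a base module and a development module (the module layout here is hypothetical):

# settings/dev.py - hypothetical development-only settings
from .base import *  # noqa - start from the common settings

DEBUG = True
REQUEST_TIME_DELAY = 1

# Enable the delay middleware only for development
MIDDLEWARE = MIDDLEWARE + ['core.middleware.TimeDelayMiddleware']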

Using CBVs

Another method to add a delay to the execution of a view is to implement a TimeDelayMixin and inherit your Class Based View from it. As we’ve seen in the CBV guide, the dispatch method is the one that is always called when your CBV is rendered, thus your TimeDelayMixin could be implemented like this:

import time

class TimeDelayMixin(object):

    def dispatch(self, request, *args, **kwargs):
        time.sleep(1)
        return super().dispatch(request, *args, **kwargs)

This is very simple (and you can use similar techniques as described for the middleware above to configure the delay time or add the delay only when settings.DEBUG == True etc). To actually use it, just inherit your view from this mixin, e.g.:

class DelayedSampleListView(TimeDelayMixin, ListView):
    model = Sample

Now whenever you call your DelayedSampleListView you’ll see its response after the configured delay!

What is really interesting is that the dispatch method also exists (and has the same functionality) in Django Rest Framework CBVs, thus using the same mixin you can add the delay not only to your normal CBVs but also to your DRF API views!
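For example (a sketch assuming Django Rest Framework and an existing Sample model and SampleSerializer):

from rest_framework import generics

class DelayedSampleListAPIView(TimeDelayMixin, generics.ListAPIView):
    # TimeDelayMixin.dispatch runs first and adds the delay before DRF handles the request
    queryset = Sample.objects.all()
    serializer_class = SampleSerializer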