/var/

Various programming stuff

Splitting a query into individual fields in Django

As you may have already seen in previous articles, I really like using django-filter since it covers (nearly) all my queryset filtering needs. With django-filter, you define a bunch of fields and it automatically creates an input for each one of them, so that you can filter by each field individually or by a combination of them.

However, one thing that django-filter (and Django in general) lacks is the ability to filter multiple fields using a single input. This functionality may be familiar to some readers from the DataTables jQuery plugin. If you take a look at the example on the DataTables homepage, you’ll see a single “Search” field. What is really great is that you can enter multiple values (separated by spaces) into that field and it will filter the table rows by each one of them. For example, if you enter “2011 Engineer” you’ll see all engineering positions that started in 2011. If you append “Singapore” (so you’ll have “2011 Engineer Singapore”) the results will be narrowed down further to the matching rows!

This functionality is really useful, and it is very important to have if you use single-input fields to query your data. One such example is autocompletes, for example with django-autocomplete-light: you have a single input, but you may need to filter on more than one field to find your selection.

In the following post I’ll show you how to implement this functionality using Django and django-filter (actually django-filter will only be used to provide the form) - to see it in action you may use the https://github.com/spapas/django_table_filtering repository (check out the /filter_ex/ view).

I won’t go into detail on how the code is structured (it’s really simple) and I’ll go directly to the filter I am using. Instead of using a filter you can of course query directly in your view. What you actually need is:

  • a queryset with the instances you want to search
  • a text value with the query (that may contain spaces)
  • a list of the names of the fields you want to search

In my case, I am using a Book model that has the following fields: id, title, author, category. I have created a filter with a single field named ex that will filter on all these fields. So you should be able to enter “King It” and find “It by Stephen King”. Let’s see how the filter is implemented:

import itertools

import django_filters
from django.db.models import Q

import books.models

class BookFilterEx(django_filters.FilterSet):
    ex = django_filters.MethodFilter()
    search_fields = ['title', 'author', 'category', 'id', ]

    def filter_ex(self, qs, value):
        if value:
            q_parts = value.split()

            # Permutation code copied from http://stackoverflow.com/a/12935562/119071

            list1=self.search_fields
            list2=q_parts
            perms = [zip(x,list2) for x in itertools.permutations(list1,len(list2))]

            q_totals = Q()
            for perm in perms:
                q_part = Q()
                for p in perm:
                    q_part = q_part & Q(**{p[0]+'__icontains': p[1]})
                q_totals = q_totals | q_part

            qs = qs.filter(q_totals)
        return qs

    class Meta:
        model = books.models.Book
        fields = ['ex']

The meat of this code is in the filter_ex method, so let’s analyze it line by line. First of all, we split the value into its parts, using whitespace to separate it into individual tokens. For example, if the user has entered King It, q_parts will be equal to ['King', 'It']. As you can see, the search_fields attribute contains the names of the fields we want to search. The first thing I like to do is to generate all possible combinations between q_parts and search_fields. I’ve copied the list combination code from http://stackoverflow.com/a/12935562/119071 and it is the line perms = [zip(x,list2) for x in itertools.permutations(list1,len(list2))].

The itertools.permutations(list1,len(list2)) call will generate all permutations of list1 that have length equal to the length of list2. I.e. if list2 is ['King', 'It'] (len=2) then it will generate all ordered selections of two search_fields, i.e. the following list of tuples:

[
    ('title', 'author'), ('title', 'category'), ('title', 'id'), ('author', 'title'),
    ('author', 'category'), ('author', 'id'), ('category', 'title'), ('category', 'author'),
    ('category', 'id'), ('id', 'title'), ('id', 'author'), ('id', 'category')
]

Now, the zip will pair the elements of each one of these tuples with the elements of list2, so, in our example (list2=['King', 'It']) perms will be the following list (in Python 3 each zip is a lazy iterator; it is shown expanded here):

[
    [('title', 'King'), ('author', 'It')],
    [('title', 'King'), ('category', 'It')],
    [('title', 'King'), ('id', 'It')],
    [('author', 'King'), ('title', 'It')],
    [('author', 'King'), ('category', 'It')],
    [('author', 'King'), ('id', 'It')],
    [('category', 'King'), ('title', 'It')],
    [('category', 'King'), ('author', 'It')],
    [('category', 'King'), ('id', 'It')],
    [('id', 'King'), ('title', 'It')],
    [('id', 'King'), ('author', 'It')],
    [('id', 'King'), ('category', 'It')]
]

Notice that itertools.permutations(list1,len(list2)) will yield nothing if len(list2) > len(list1). This is actually what we want, since it means that the user entered more query parts than the available fields, i.e. we can’t match each one of the values we got after splitting the input with a distinct search field, so we should return nothing. Keep in mind, though, that an empty Q() adds no condition when passed to filter(), so to really return nothing in that case you’d have to check for it and return qs.none().

Now, what I want is to create a single query that will combine the tuples in each of these combinations using AND (i.e. title contains King AND author contains It) and then combine all these subqueries using OR, i.e. “(title contains King AND author contains It) OR (title contains King AND category contains It) OR (title contains King AND id contains It) OR …”.

This could of course be implemented with a raw SQL query, however we can use some interesting Django tricks for this. I’ve already done something similar in a previous article so I won’t go into much detail explaining the code that creates the q_totals Q object. What it does is create a big Django Q object that combines all individual q_part objects using OR (|). Each q_part object in turn combines the field name and value pairs of a single combination using AND (&); I’ve used __icontains to create each condition. So the result will be something like this:
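The pairing step can be tried out on its own, without Django. The following is only a minimal sketch: field_token_pairings is a hypothetical helper name that does not exist in the repository, and the zip results are wrapped in list() so they can be printed.

```python
import itertools

search_fields = ['title', 'author', 'category', 'id']


def field_token_pairings(fields, tokens):
    """Return every way to assign each token to a distinct field."""
    return [
        list(zip(chosen_fields, tokens))
        for chosen_fields in itertools.permutations(fields, len(tokens))
    ]


pairings = field_token_pairings(search_fields, ['King', 'It'])
print(len(pairings))  # 12
print(pairings[0])    # [('title', 'King'), ('author', 'It')]

# More tokens than fields: no assignment is possible, so the list is empty
too_many = field_token_pairings(search_fields, ['a', 'b', 'c', 'd', 'e'])
print(too_many)       # []
```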

q_totals =
    Q(title__icontains='King') & Q(author__icontains='It')
    |
    Q(title__icontains='King') & Q(category__icontains='It')
    |
    Q(title__icontains='King') & Q(id__icontains='It')
    |
    Q(author__icontains='King') & Q(title__icontains='It')
    ...

Filtering by this q_totals will return the correct values!
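To make the OR-of-ANDs logic easier to follow, here is an in-memory analogue that works on plain dictionaries instead of a queryset. It is just a sketch: the matches function and the sample books are made up for illustration, and the case-insensitive substring test only approximates what __icontains does in the database.

```python
import itertools

SEARCH_FIELDS = ['title', 'author', 'category', 'id']


def matches(record, tokens, fields=SEARCH_FIELDS):
    """True if some assignment of the tokens to distinct fields matches."""
    return any(                       # OR over all field/token assignments
        all(                          # AND over the pairs of one assignment
            token.lower() in str(record[field]).lower()
            for field, token in zip(chosen, tokens)
        )
        for chosen in itertools.permutations(fields, len(tokens))
    )


# Hypothetical sample data, only for this sketch
books = [
    {'id': 1, 'title': 'It', 'author': 'Stephen King', 'category': 'Horror'},
    {'id': 2, 'title': 'Dune', 'author': 'Frank Herbert', 'category': 'Science fiction'},
]

found = [b['title'] for b in books if matches(b, ['King', 'It'])]
print(found)  # ['It']
```

The any() plays the role of the | that builds q_totals and the all() plays the role of the & that builds each q_part.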

One extra complication we should be aware of is what happens if the user needs to also search for books with multiple words in their titles. For example, if the user enters “Under the Dome King” or “It Stephen King” or even “The Stand Stephen King” we won’t get any results :(

To fix this, we need to get all possible combinations of sequential substrings, i.e. for “Under the Dome King”, after we split it to ['Under', 'the', 'Dome', 'King'] we’ll need the following combinations:

[
    ['Under', 'the', 'Dome', 'King'],
    ['Under', 'the', 'Dome King'],
    ['Under', 'the Dome', 'King'],
    ['Under', 'the Dome King'],
    ['Under the', 'Dome', 'King'],
    ['Under the', 'Dome King'],
    ['Under the Dome', 'King'],
    ['Under the Dome King']
]

A possible solution for that problem can be found on this SO answer: http://stackoverflow.com/a/27263616/119071.
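The idea is that between every two adjacent words we either break or join, which gives 2^(n-1) segmentations for n words. A standalone sketch of this could look like the following; segmentations is a hypothetical name used only here.

```python
import itertools


def segmentations(tokens):
    """Yield every way to group adjacent tokens into space-joined phrases."""
    if not tokens:
        return
    # One True/False decision per gap between adjacent tokens
    for breaks in itertools.product([True, False], repeat=len(tokens) - 1):
        parts = [tokens[0]]
        for token, is_break in zip(tokens[1:], breaks):
            if is_break:
                parts.append(token)        # start a new phrase
            else:
                parts[-1] += ' ' + token   # glue to the previous phrase
        yield parts


result = list(segmentations(['Under', 'the', 'Dome', 'King']))
print(len(result))  # 8
print(result[0])    # ['Under', 'the', 'Dome', 'King']
print(result[-1])   # ['Under the Dome King']
```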

Now, to extend our solution to include this, we’d need to actually search for each one of the above possibilities and again combine the results with OR, something like this:

def filter_ex(self, qs, value):
    if value:
        q_parts = value.split()

        # Use a global q_totals
        q_totals = Q()

        # This part will get us all possible segmentations of the query parts and put them in the possibilities list
        combinatorics = itertools.product([True, False], repeat=len(q_parts) - 1)
        possibilities = []
        for combination in combinatorics:
            i = 0
            one_such_combination = [q_parts[i]]
            for slab in combination:
                i += 1
                if not slab: # there is a join
                    one_such_combination[-1] += ' ' + q_parts[i]
                else:
                    one_such_combination += [q_parts[i]]
            possibilities.append(one_such_combination)

        # Now, for all possibilities we'll append all the Q objects using OR
        for possibility in possibilities:
            list1=self.search_fields
            list2=possibility
            perms = [zip(x,list2) for x in itertools.permutations(list1,len(list2))]

            for perm in perms:
                q_part = Q()
                for p in perm:
                    q_part = q_part & Q(**{p[0]+'__icontains': p[1]})
                q_totals = q_totals | q_part

        qs = qs.filter(q_totals)
    return qs

The previous filtering code works fine with queries like “The Stand” or “Under the Dome Stephen King”!

One thing that you must be careful about is that this code will create very big and complicated queries. For example, searching for “Under the Dome Stephen King” will result in q_totals getting this monster value:
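Before using this on real data it is worth estimating how many AND clauses end up OR-ed together. Assuming n words and f search fields, there are C(n-1, k-1) ways to segment the words into k phrases and f!/(f-k)! ways to assign k phrases to distinct fields, so the total can be computed as in this sketch (count_and_clauses is a hypothetical helper; math.comb and math.perm need Python 3.8 or later):

```python
from math import comb, perm


def count_and_clauses(n_tokens, n_fields):
    """Number of non-empty AND clauses generated for a query of n_tokens words."""
    return sum(
        # segmentations into k phrases * assignments of k phrases to fields
        comb(n_tokens - 1, k - 1) * perm(n_fields, k)
        for k in range(1, n_tokens + 1)
    )


print(count_and_clauses(2, 4))  # 16
print(count_and_clauses(4, 4))  # 136
print(count_and_clauses(5, 4))  # 292
```

So with the four search fields used here, a five-word query already produces 292 clauses.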

(OR:
(AND: ),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the'), ('category__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the'), ('id__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the'), ('author__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the'), ('id__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the'), ('author__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the'), ('category__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the'), ('category__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the'), ('id__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the'), ('title__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the'), ('id__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the'), ('title__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the'), ('category__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the'), ('author__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the'), ('id__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the'), ('title__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the'), ('id__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the'), ('title__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the'), ('author__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the'), ('author__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the'), ('category__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the'), ('title__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the'), ('category__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the'), ('title__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the'), ('author__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the'), ('category__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the'), ('id__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the'), ('author__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the'), ('id__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the'), ('author__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the'), ('category__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the'), ('category__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the'), ('id__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the'), ('title__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the'), ('id__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the'), ('title__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the'), ('category__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the'), ('author__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the'), ('id__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the'), ('title__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the'), ('id__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the'), ('title__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the'), ('author__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the'), ('author__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the'), ('category__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the'), ('title__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the'), ('category__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the'), ('title__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the'), ('author__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the Dome'), ('category__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the Dome'), ('id__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the Dome'), ('author__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the Dome'), ('id__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the Dome'), ('author__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the Dome'), ('category__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the Dome'), ('category__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the Dome'), ('id__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the Dome'), ('title__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the Dome'), ('id__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the Dome'), ('title__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the Dome'), ('category__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the Dome'), ('author__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the Dome'), ('id__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the Dome'), ('title__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the Dome'), ('id__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the Dome'), ('title__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the Dome'), ('author__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the Dome'), ('author__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the Dome'), ('category__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the Dome'), ('title__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the Dome'), ('category__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the Dome'), ('title__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the Dome'), ('author__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the Dome Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the Dome Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the Dome Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the Dome Stephen'), ('category__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the Dome Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the Dome Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under'), ('author__icontains', u'the Dome Stephen King')),
(AND: ('title__icontains', u'Under'), ('category__icontains', u'the Dome Stephen King')),
(AND: ('title__icontains', u'Under'), ('id__icontains', u'the Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('title__icontains', u'the Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('category__icontains', u'the Dome Stephen King')),
(AND: ('author__icontains', u'Under'), ('id__icontains', u'the Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('title__icontains', u'the Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('author__icontains', u'the Dome Stephen King')),
(AND: ('category__icontains', u'Under'), ('id__icontains', u'the Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('title__icontains', u'the Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('author__icontains', u'the Dome Stephen King')),
(AND: ('id__icontains', u'Under'), ('category__icontains', u'the Dome Stephen King')),
(AND: ('title__icontains', u'Under the'), ('author__icontains', u'Dome'), ('category__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('author__icontains', u'Dome'), ('id__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('category__icontains', u'Dome'), ('author__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('category__icontains', u'Dome'), ('id__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('id__icontains', u'Dome'), ('author__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('id__icontains', u'Dome'), ('category__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('title__icontains', u'Dome'), ('category__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('title__icontains', u'Dome'), ('id__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('category__icontains', u'Dome'), ('title__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('category__icontains', u'Dome'), ('id__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('id__icontains', u'Dome'), ('title__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('id__icontains', u'Dome'), ('category__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('title__icontains', u'Dome'), ('author__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('title__icontains', u'Dome'), ('id__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('author__icontains', u'Dome'), ('title__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('author__icontains', u'Dome'), ('id__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('id__icontains', u'Dome'), ('title__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('id__icontains', u'Dome'), ('author__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('title__icontains', u'Dome'), ('author__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('title__icontains', u'Dome'), ('category__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('author__icontains', u'Dome'), ('title__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('author__icontains', u'Dome'), ('category__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('category__icontains', u'Dome'), ('title__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('category__icontains', u'Dome'), ('author__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('author__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the'), ('author__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the'), ('category__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the'), ('category__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the'), ('id__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the'), ('id__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the'), ('title__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the'), ('title__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the'), ('category__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the'), ('category__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the'), ('id__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the'), ('id__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the'), ('title__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the'), ('title__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the'), ('author__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the'), ('author__icontains', u'Dome'), ('id__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the'), ('id__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the'), ('id__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the'), ('title__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the'), ('title__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the'), ('author__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the'), ('author__icontains', u'Dome'), ('category__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the'), ('category__icontains', u'Dome'), ('title__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the'), ('category__icontains', u'Dome'), ('author__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the'), ('author__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('author__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('category__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('category__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('id__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('id__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('title__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('title__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('category__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('category__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('id__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under the'), ('id__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('title__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('title__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('author__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('author__icontains', u'Dome Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('id__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under the'), ('id__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('title__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('title__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('author__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('author__icontains', u'Dome Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('category__icontains', u'Dome Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under the'), ('category__icontains', u'Dome Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('author__icontains', u'Under the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('category__icontains', u'Under the'), ('id__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under the'), ('title__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under the'), ('author__icontains', u'Dome Stephen King')),
(AND: ('id__icontains', u'Under the'), ('category__icontains', u'Dome Stephen King')),
(AND: ('title__icontains', u'Under the Dome'), ('author__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under the Dome'), ('author__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under the Dome'), ('category__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under the Dome'), ('category__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('title__icontains', u'Under the Dome'), ('id__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under the Dome'), ('id__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome'), ('title__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome'), ('title__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome'), ('category__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome'), ('category__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome'), ('id__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome'), ('id__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome'), ('title__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome'), ('title__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome'), ('author__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome'), ('author__icontains', u'Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome'), ('id__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome'), ('id__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome'), ('title__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome'), ('title__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome'), ('author__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome'), ('author__icontains', u'Stephen'), ('category__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome'), ('category__icontains', u'Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome'), ('category__icontains', u'Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('author__icontains', u'Under the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('category__icontains', u'Under the Dome'), ('id__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the Dome'), ('title__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the Dome'), ('author__icontains', u'Stephen King')),
(AND: ('id__icontains', u'Under the Dome'), ('category__icontains', u'Stephen King')),
(AND: ('title__icontains', u'Under the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('title__icontains', u'Under the Dome Stephen'), ('category__icontains', u'King')),
(AND: ('title__icontains', u'Under the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome Stephen'), ('category__icontains', u'King')),
(AND: ('author__icontains', u'Under the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('category__icontains', u'Under the Dome Stephen'), ('id__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome Stephen'), ('title__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome Stephen'), ('author__icontains', u'King')),
(AND: ('id__icontains', u'Under the Dome Stephen'), ('category__icontains', u'King')),
('title__icontains', u'Under the Dome Stephen King'),
('author__icontains', u'Under the Dome Stephen King'),
('category__icontains', u'Under the Dome Stephen King'),
('id__icontains', u'Under the Dome Stephen King')
)

This query has around 200 different OR parts!!! So please be careful about the number of search fields you enable with this method, or your database will really struggle!

How to download all images of an imgur album

Recently I stumbled upon a great imgur album that contained 379 movie stills that could be used as desktop backgrounds. I really liked the idea and wanted to download all the images in order to put them in a folder and use them as a slideshow for my Windows desktop background.

Downloading them one by one would be considered penal labour so I tried to find an automatic way to get them all. With some research on Google, I found an old post with the hint that by appending /zip to the URL you could get a zip with all the images - this didn’t work for me. I also tried various browser tools for scraping or downloading all images from a page but they didn’t work either (they could only download a small number of the images and not all).

This seemed strange to me until I understood how imgur loads its images by “inspecting” an image and taking a look at the page’s DOM structure through the console:

How imgur loads images

As we can see, the imgur client-side code has a component with a post-images class that contains the visible images (and those that are just above/below the visible images). When the user scrolls up/down the contents of post-images will be changed accordingly (notice how the component with id=EKMGEPc moves down when I scroll up). What this means is that at any time there are only 3-4 images (this actually depends on your window size) under post-images, and they change when you scroll - that’s why downloaders / scrapers are not working (since these tools just inspect the DOM they only see these 3-4 images to download).

Another interesting observation is that if you take a look at the network tab when you scroll up/down you won’t see any ajax calls (the only network calls are the images that are downloaded when they are appended to the DOM). So this means that somewhere there’s an array that is loaded when the page is loaded and contains all the images of the album. If we can access this array then we’d be able to get all the URLs of the images…

From a quick look at the DOM structure we can understand that this is a React application (components have a data-reactid attribute). So I tried the React Developer Tools extension to see if I could find anything interesting. Here’s the output:

Imgur - react dev tools

As you can see, there seem to be 4 top-level react elements - the interesting one is GalleryPost. If you take a look at its props (in the right hand side of the react-devtools) you’ll see that it has an album_image_store property which also seems interesting (it should be the image store for this album). After digging a bit through its attributes you’ll see that it has a _ child attribute, which has a posts child attribute, which has an aoi3T attribute (notice that this is similar to the URL id of the album) and, finally, this has an images attribute with objects describing all the images of that album \o/!

Now we need to get our hands on the contents of that images array. Unfortunately, right clicking doesn’t seem to do anything in react-dev-tools and there doesn’t seem to be a way to copy data from that panel… However, in the upper right position of that window you’ll see the hint ($r in the console) which means that the selected react component is available as $r in the normal javascript console - so by entering

copy($r.props.album_image_store._.posts.aoi3T.images)

I was able to copy the images of the album to my clipboard (please notice that $r will have the value of the selected react component so, before trying it you must select the GalleryPost component in the react-dev-tools tab)!

I dumped this to a file to take a look at it - it is really easy to interpret:

[
  {
    "hash": "MQplfkV",
    "title": "2001: A Space Odyssey",
    "description": "Cinematographer: Geoffrey Unsworth\n\nsource:\nhttp://www.filmcaptures.com/2001-a-space-odyssey/",
    "width": 1920,
    "height": 864,
    "size": 2262862,
    "ext": ".png",
    "animated": false,
    "prefer_video": false,
    "looping": false,
    "datetime": "2014-10-25 04:02:58",
    "thumbsize": "g",
    "minHeight": 306,
    "shown": true,
    "containerHeight": 501
  },
..

The imgur images have a URL of http://i.imgur.com/{hash}{ext} so we can use the following small python 2 program to download all images from that album:

import requests
import json
from slugify import slugify

# Modified from http://stackoverflow.com/a/16696317/119071
def download_file(url, local_filename):
    r = requests.get(url, stream=True)
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)
                #f.flush() commented by recommendation from J.F.Sebastian
    return local_filename


if __name__ == '__main__':
    for i, jo in enumerate(json.loads(open("album.txt").read())):
        filename = '{0}-{1}{2}'.format(slugify(jo['title']), i+1, jo['ext'])
        url = 'http://i.imgur.com/{0}{1}'.format(jo['hash'].strip(), jo['ext'])
        print filename, url
        download_file(url, filename)

Notice that the above uses the requests library to retrieve the files and the python-slugify library to generate a filename using the image title so these libraries must be installed by using pip install requests python-slugify. This will read a file named album.txt that should contain the copied imgur album images in the same directory and download all the images.
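If you are on Python 3, here is a minimal sketch of just the filename/URL construction step, without any network access. It assumes the same image object layout shown above; the slug and image_targets helpers are made-up names for this illustration, and the regex-based slug is only a rough stand-in for python-slugify (it also assumes every image has a title).

```python
import json
import re


def slug(text):
    """Lowercase the title and collapse every non-alphanumeric run into '-'."""
    return re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-')


def image_targets(album_json):
    """Return (filename, url) pairs for every image object in the album dump."""
    targets = []
    for i, image in enumerate(json.loads(album_json), start=1):
        filename = '{0}-{1}{2}'.format(slug(image['title']), i, image['ext'])
        url = 'http://i.imgur.com/{0}{1}'.format(image['hash'].strip(), image['ext'])
        targets.append((filename, url))
    return targets


# One entry in the same shape as the copied album data (extra keys omitted).
sample = '[{"hash": "MQplfkV", "title": "2001: A Space Odyssey", "ext": ".png"}]'
for filename, url in image_targets(sample):
    print(filename, url)
```

Each pair could then be passed to a download function like the download_file one above.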

Disclaimer: The above methodology works today (27-06-2016) - it will probably stop working sometime in the future, when imgur changes its image loading algorithm or its image object representation. Also, I haven’t been able to find a way to quickly access the GalleryPost react component from the javascript console - you need to install the react dev tools and select that component from there so that you’ll have the $r reference to it in the javascript console. Finally, don’t forget to change the copy($r.props.album_image_store._.posts.aoi3T.images) depending on your album id (also, if the id is not a valid identifier, for example if it starts with a number, use copy($r.props.album_image_store._.posts['aoi3T'].images)).

Using Werkzeug debugger with Django

Introduction

Werkzeug is a WSGI utility library for Python. Among other things, it includes an interactive debugger - what this means is that when your python application throws an exception, Werkzeug will display the exception stacktrace in the browser (that’s not a big deal) and allow you to write python commands interactively wherever you want in that stacktrace (that’s the important stuff).

Now, the even more important stuff is that you can abuse the above feature by adding code that will throw an exception in various parts of your application and, as a result get an interactive python prompt at specific parts of your application (for example, before validating your form, or when a method in your model is executed). All this, without the need to use a specific IDE to add breakpoints!

This is an old trick; however, some people don’t use it and make their work more difficult. Actually, this is one of the first things I learned when starting with django and I have used it all the time since then - I am writing this post mainly to emphasize its usefulness and to urge more people to use it. If you don’t already use it please try it (and thank me later).

Configuration

There are two components you need to install in your django project to use the above technique:

  • django-extensions: a swiss army knife toolset for django - among other useful tools it includes a management command (runserver_plus) to start the Werkzeug interactive debugger with your project
  • werkzeug: the werkzeug utility library

Both of these can just be installed with pip (even on windows). After installing them, add django_extensions to your INSTALLED_APPS setting to enable the management command.

After that, you can just run python manage.py runserver_plus - if everything was installed successfully you should see something like this (in windows at least):

(venv) C:\progr\py\werkzeug\testdebug>python manage.py runserver_plus
 * Restarting with stat
Performing system checks...

System check identified no issues (0 silenced).

Django version 1.9.7, using settings 'testdebug.settings'
Development server is running at http://127.0.0.1:8000/
Using the Werkzeug debugger (http://werkzeug.pocoo.org/)
Quit the server with CTRL-BREAK.
 * Debugger is active!
 * Debugger pin code: 143-738-172
 * Debugger is active!
 * Debugger pin code: 174-740-467
 * Running on http://127.0.0.1:8000/ (Press CTRL+C to quit)

Now, the “debugger pin” you see is a way to protect your interactive debugger (i.e. it asks for the pin before allowing you to enter the interactive prompt). Since this feature should only be used in your local development system I recommend just disabling it by setting the WERKZEUG_DEBUG_PIN environment variable to off (i.e. set WERKZEUG_DEBUG_PIN=off in windows). After that you should see the message “ * Debugger pin disabled. DEBUGGER UNSECURED!“. Please be careful with the interactive debugger and never, ever use it in a production deployment, even with the debug pin enabled. I also recommend using it only on a local development server (i.e. the server must be run on 127.0.0.1/local IP and not allow remote connections).

Usage

Now it’s time for the magic: Let’s add a django view that throws an exception, like this:

def test(request):
    a+=1

Add it to your urls.py ( url(r'^test/', test) ) and, after you visit /test/, you should see something like this:

Werkzeug debugger

Since the a variable was not defined you’ll get an exception when you try to increase it. Now, notice the console icon in the lower right corner - when you click it you’ll get the interactive debugger! Now you can enter python commands exactly where the a+=1 code was. For example, you can see the attributes of the request object you receive (for example, just enter request.GET to output the GET dictionary to the interactive console).

Notice that you can get interactive consoles wherever you want in the stacktrace, i.e. I could get a console at line 147 of the django.core.handlers.base module, in the get_response method - this is needed sometimes, especially when you want to see how your code is called by other modules.

Conclusion

As you can see, using the presented technique you can really quickly start an interactive console wherever you want and start entering commands. I use it whenever I need to write anything non-trivial (or even trivial stuff - I sometimes prefer opening an interactive debugger to find out by trial and error how I should write a django ORM query rather than opening models.py) and really miss it in other environments (Java).

The above technique should also work, with a few modifications, with other python web frameworks so it’s not django-only.

Finally, please notice that both Werkzeug and django-extensions offer many more tools beyond the interactive debugger presented here - I encourage you to research them since - if you follow my advice - you’ll integrate these into all your django projects!

Understanding nested list comprehension syntax in Python

List comprehensions are one of the really nice and powerful features of Python. They are actually a smart way to introduce new users to functional programming concepts (after all a list comprehension is just a combination of map and filter) and compact statements.

However, one thing that always troubled me when using list comprehensions is their non-intuitive syntax when nesting is needed. For example, let’s say that we just want to flatten a list of lists using a nested list comprehension:

non_flat = [ [1,2,3], [4,5,6], [7,8] ]

To write that, somebody would think: For a simple list comprehension I need to write [ x for x in non_flat ] to get all its items - however I want to retrieve each element of the x list so I’ll write something like this:

>>> [y for y in x for x in non_flat]
[7, 7, 7, 8, 8, 8]

Well duh! At this point I’d need to search Google for a working list comprehension syntax and adjust it to my needs (or give up and write it as a double for loop).

Here’s the correct nested list comprehension for people wondering:
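Why did the wrong version return anything at all? In Python 2 the loop variable of a list comprehension leaks into the enclosing scope, so the output above presumably came from a session where x had been left over from an earlier comprehension (that part is my assumption). The sketch below reproduces the behaviour under Python 3, where a fresh session raises NameError instead:

```python
non_flat = [[1, 2, 3], [4, 5, 6], [7, 8]]

# The first `for` clause is the outermost loop, so `x` is looked up
# before the second clause ever gets a chance to bind it.
try:
    [y for y in x for x in non_flat]
    outcome = 'no error'
except NameError:
    outcome = 'NameError'
print(outcome)

# Simulate a leftover `x` (what an earlier [x for x in non_flat] leaves
# behind in Python 2): every element of x is repeated once per sublist.
x = [7, 8]
leaked_style = [y for y in x for x in non_flat]
print(leaked_style)
```

With x set to [7, 8] the second comprehension yields [7, 7, 7, 8, 8, 8], matching the puzzling result above.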

>>> [y for x in non_flat for y in x]
[1, 2, 3, 4, 5, 6, 7, 8]

What if I wanted to add a third level of nesting or an if? Well I’d just bite the bullet and use for loops!

However, if you take a look at the document describing list comprehensions in python (PEP 202) you’ll see the following phrase:

It is proposed to allow conditional construction of list literals using for and if clauses. They would nest in the same way for loops and if statements nest now.

This statement explains everything! Just think in for-loop syntax. So, if I used for loops for the previous flattening, I’d do something like:

for x in non_flat:
    for y in x:
        y

which, if y is moved to the front and joined in one line would be the correct nested list comprehension!
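To make the correspondence concrete, here is a small sketch that builds the flattened list both ways and compares the results (the names flat_loop and flat_comp are only for this illustration):

```python
non_flat = [[1, 2, 3], [4, 5, 6], [7, 8]]

# Explicit loops: the clauses appear in nesting order, outermost first.
flat_loop = []
for x in non_flat:
    for y in x:
        flat_loop.append(y)

# Same clauses in the same order; only the produced value moves to the front.
flat_comp = [y for x in non_flat for y in x]

print(flat_loop == flat_comp)
```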

So that’s the way… What if I wanted to include only lists with more than 2 elements in the flattening (so [7,8] should not be included)? I’ll write it with for loops first:

for x in non_flat:
    if len(x) > 2:
        for y in x:
            y

so by converting this to a list comprehension we get:

>>> [ y for x in non_flat if len(x) > 2 for y in x ]
[1, 2, 3, 4, 5, 6]

Success!

One final, more complex example: Let’s say that we have a list of lists of words and we want to get a list of all the letters of these words along with the index of the list they belong to but only for words with more than two characters. Using the same for-loop syntax for the nested list comprehensions we’ll get:

>>> strings = [ ['foo', 'bar'], ['baz', 'taz'], ['w', 'koko'] ]
>>> [ (letter, idx) for idx, lst in enumerate(strings) for word in lst if len(word)>2 for letter in word]
[('f', 0), ('o', 0), ('o', 0), ('b', 0), ('a', 0), ('r', 0), ('b', 1), ('a', 1), ('z', 1), ('t', 1), ('a', 1), ('z', 1), ('k', 2), ('o', 2), ('k', 2), ('o', 2)]
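For comparison, this is a sketch of the explicit for-loop version that the comprehension above was derived from; reading the clauses top to bottom gives exactly the order in which they appear in the comprehension:

```python
strings = [['foo', 'bar'], ['baz', 'taz'], ['w', 'koko']]

pairs = []
for idx, lst in enumerate(strings):        # 1st clause
    for word in lst:                       # 2nd clause
        if len(word) > 2:                  # 3rd clause (filter)
            for letter in word:            # 4th clause
                pairs.append((letter, idx))

comprehension = [(letter, idx) for idx, lst in enumerate(strings)
                 for word in lst if len(word) > 2 for letter in word]

print(pairs == comprehension)
```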

Configuring Spring Boot

Introduction

The Spring Boot project is a great way of building Java applications using Spring. Instead of trying to integrate everything by hand (and usually ending up in configuration hell) you use spring-boot to help you bootstrap your application: Just include its dependencies in your pom.xml and Spring Boot will try its best to auto-configure all these components!

Of course, no matter how hard Spring Boot tries to auto-configure everything, you’ll still need to pass some configuration to configure your databases, caches, email sending, security etc. Thankfully, Spring Boot can be configured without any xml (actually, it’s a bad practice to use xml-based configuration with it), using plain Java .properties files or (if you prefer that syntax) YAML .yml files!

In this guide, along with a simple introduction to the way Spring Boot configuration works, we’ll talk about a specific way of structuring your settings configuration files in order to have:

  • A global configuration file that will contain all your settings
  • Different settings for each of your environments (development, UAT, staging, production and test)
  • A way to configure your passwords and other sensitive data (that you don’t want to put to your VCS)
  • Being able to override any setting in any environment
  • Deploying your Spring Boot app in Linux using init.d

To quickly test the proposed settings configuration I’ve created a simple Spring Boot project @ https://github.com/spapas/spring-boot-config. Just clone it, optionally change the packaged settings (more on this later), package it (mvn package), optionally change the config settings (more on this also later) and run it (using something like java -jar spring-boot-config-0.0.1-SNAPSHOT.jar) optionally passing it command line settings (more on this also later). You’ll then be able to visit http://127.0.0.1:8080 and check the current settings!

properties vs yml files

You can use two kinds of files to configure your settings: Normal Java .properties files or YAML .yml files. The .properties files have the form:

config.value.a=1
config.value.b=2
config.value.c=3

while the .yml files are like:

config:
    value:
        a: 1
        b: 2
        c: 3

You may use whatever you wish - in the examples I’ll use normal Java .properties files because they are more compact (you don’t need to use multiple lines to represent a single setting like in YAML).
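The two formats describe the same keys: each level of YAML nesting becomes one dot-separated part of the property name. The following Python sketch only illustrates that mapping (it is not how Spring Boot itself parses the files, and flatten is a made-up helper name):

```python
def flatten(settings, prefix=''):
    """Turn a nested mapping (YAML style) into dotted keys (.properties style)."""
    flat = {}
    for key, value in settings.items():
        name = prefix + key
        if isinstance(value, dict):
            # Descend one level and extend the dotted prefix.
            flat.update(flatten(value, name + '.'))
        else:
            flat[name] = value
    return flat


yaml_like = {'config': {'value': {'a': 1, 'b': 2, 'c': 3}}}
for name, value in sorted(flatten(yaml_like).items()):
    print('{0}={1}'.format(name, value))
```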

Structuring your configuration files

Spring Boot reads its configuration from various places, however in this article we’ll talk about four of them which should be enough for most cases. Starting from the most global to the most specific ones (i.e the latter ones will override the previous ones) these are:

  • Main (global) application settings
  • Profile settings
  • Local (/config) settings
  • Command line arguments
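A rough mental model of this ordering, sketched in Python: each source is a dictionary of settings and later sources simply overwrite the keys of earlier ones. This is only a conceptual illustration, not Spring Boot’s actual implementation; the resolve helper and all keys and values are invented for the example.

```python
def resolve(*sources):
    """Merge setting sources; later (more specific) sources win."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


main_settings = {'config.value': 'Hello', 'db.url': 'localhost', 'db.password': ''}
profile_settings = {'config.value': 'Hello prod!', 'db.url': 'prod-db'}
local_settings = {'db.password': 'secret'}     # kept out of version control
command_line = {'config.value': 'Hello cli!'}

effective = resolve(main_settings, profile_settings, local_settings, command_line)
for name in sorted(effective):
    print(name, '=', effective[name])
```

Here config.value comes from the command line, db.url from the profile and db.password from the local settings, while any key defined only in the main settings would keep its default.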

The first two are settings files that will be contained inside the artifact (jar or war) that will be created, and they should be committed to your version control system. I’ll call them jar-packaged settings. The other two won’t be committed to version control but will be created directly on the server you deploy to. Let’s see a little more about them:

Main application settings

These are kept in a file named application.properties (or .yml - from now on I’ll just use .properties but keep in mind that you may use .yml). This file should reside inside the src\main\resources folder of your project and ideally contain all the settings your Spring Boot application uses. Some of these settings will be overridden by settings kept in the next sources, so they may have a default value or even be empty if they will always be overridden (or contain sensitive data like passwords); however, I still prefer to list them all in this file, even as placeholders, to have a central source of all the settings that your Spring Boot application uses.

Profiles

A profile is a set of settings that can be configured to override settings from application.properties. Each profile is contained in a file named application-profilename.properties where profilename is the name of the profile. Now, a profile could configure anything you want, however for most projects I propose to have the following profiles:

  • dev for your local development settings
  • uat for your UAT server settings
  • staging for your staging server settings
  • prod for your production settings
  • test for running your tests

(depending of course on your requirements, some projects may not need uat or staging but all projects should have a dev, a prod and a test profile). The configuration for these environments needs to be different for obvious reasons. For example, when developing you may want to use a local database, when running tests an ephemeral in-memory database, and your production database when deploying to production. These profile configuration files will be stored inside your src\main\resources folder, right next to application.properties, i.e. you’ll have application-dev.properties, application-prod.properties, application-test.properties etc - and all these files will be kept in your VCS (and will also be jar-packaged since they will be contained in the resulting artifact).

How do you select which profile is active each time (i.e pick it when running the Spring Boot application under its corresponding environment)?

For tests, since they can be run by a different Main than the normal application, you should use the @ActiveProfiles annotation (for example @ActiveProfiles("test")) to make sure that the tests will run with the correct settings. So if the contents of your application-test.properties are config.value=Hello test! running this test should produce no errors:

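import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.test.SpringApplicationConfiguration;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@SpringApplicationConfiguration(classes = SpringBootConfigApplication.class)
@ActiveProfiles("test")
public class SpringBootConfigApplicationTests {

    // Resolved from application-test.properties because the test profile is active
    @Value("${config.value}")
    private String value;

    @Value("${spring.profiles.active}")
    private String profile;

    @Test
    public void contextLoads() {
        assertThat(value, is("Hello test!"));
        assertThat(profile, is("test"));
    }
}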

To activate a different profile when running your Spring Boot application you’ll need to use the spring.profiles.active setting. So if you set spring.profiles.active=prod in your application.properties and create the packaged jar (or war), you’ll have the production settings when you run your application (i.e. the contents of application-prod.properties will be used to override your application.properties). Of course, to deploy it to UAT you’ll need to change spring.profiles.active to uat and re-create the packaged artifact. See some repetition and penal labour here? You definitely don’t want to re-create your artifacts for each of the environments you may want to deploy to; we’ll see in the next sections how to improve this flow by overriding jar-packaged settings!

Some more advanced profile usage

You may have noticed in the previous section that the name of the annotation is @ActiveProfiles and the name of the setting is spring.profiles.active - both in plural. This of course is on purpose: you may have more than one active profile!

This, along with the fact that you can make @Components or @Configurations available only on certain profiles (using the @Profile annotation), is a really powerful tool!

Here are some examples:

  • Configure two spring-security @Configurations: use in-memory security for your dev environment, while using LDAP for your production.
  • If you want to support more than one database you can configure multiple profiles and use them along with the dev/uat/prod I mentioned before.
  • Create verbose and non-verbose logging profiles and quickly change between them
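
Since more than one profile can be active, spring.profiles.active accepts a comma-separated list (for example prod,ldap,verbose). The following is only a plain-Java sketch of that idea and not Spring’s own code: the class ActiveProfilesSketch and its methods are hypothetical names, and the ldap and verbose profile names are made up for illustration. It shows which profile-specific files would come into play for a given value of the setting:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Spring: it only models how a
// comma-separated spring.profiles.active value maps to profile files.
final class ActiveProfilesSketch {

    // Split "prod, ldap" into ["prod", "ldap"], ignoring blank entries.
    static List<String> parse(String activeProfiles) {
        List<String> profiles = new ArrayList<>();
        for (String name : activeProfiles.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                profiles.add(trimmed);
            }
        }
        return profiles;
    }

    // Each active profile contributes one application-<profile>.properties file.
    static List<String> profileFiles(String activeProfiles) {
        List<String> files = new ArrayList<>();
        for (String profile : parse(activeProfiles)) {
            files.add("application-" + profile + ".properties");
        }
        return files;
    }

    public static void main(String[] args) {
        // Assumed example value: one environment profile plus two feature profiles.
        System.out.println(profileFiles("prod,ldap,verbose"));
    }
}
```

In a real application you would not write this yourself; Spring does the equivalent work when it reads spring.profiles.active, and beans restricted to a profile are only created when that profile is in the active list.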

Overriding settings

All the above settings we’ve defined should be safely kept inside your VCS - however we wouldn’t like to store passwords or other sensitive data in a VCS! Sensitive settings should be empty (or have a default value) when saved to VCS and be overridden by “local” settings.

Also, all the previous settings are jar-packaged and we definitely need a way to override them without messing with the artifacts (for example, we need to select the correct profile for running the application by overriding spring.profiles.active).

There are two methods of overriding settings, and these are the last two methods of the four we discussed above:

Using a config/application.properties

You can put files in a directory named config that is at the same level as the location from which you try to run your jar. These files should be named either application.properties or application-profilename.properties and will be used to override your jar-packaged settings.

What happens is that Spring will at first try to load a file named config/application.properties that will override your jar-packaged application.properties (so here you can set your current profile). Then, it will also try to load a file named config/application-profilename.properties that will override your jar-packaged application-profilename.properties (so here you may override any profile related properties).

The priority of the files, from lowest to highest, is:

  • jar-packaged application.properties
  • local config/application.properties
  • jar-packaged application-profilename.properties
  • local config/application-profilename.properties

So (repeating for emphasis) the settings in your jar-packaged application-profilename.properties will only be overridden by config/application-profilename.properties (and not by config/application.properties, which will only override settings of the jar-packaged application.properties).
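
To make the priority list above more concrete, here is a small plain-Java model of it. This is just a sketch under the assumption that the four files are applied in the listed order with later ones winning; it is not how Spring implements the lookup, and the class ConfigPrecedenceSketch and its helper methods are hypothetical names:

```java
import java.util.Properties;

// Plain-Java model of the override order described in the text; NOT Spring's code.
final class ConfigPrecedenceSketch {

    // Merge layers from lowest to highest priority: later layers win.
    static Properties merge(Properties... layersLowestFirst) {
        Properties effective = new Properties();
        for (Properties layer : layersLowestFirst) {
            effective.putAll(layer);
        }
        return effective;
    }

    // Small helper to build a layer that holds a single key.
    static Properties layer(String key, String value) {
        Properties p = new Properties();
        p.setProperty(key, value);
        return p;
    }

    public static void main(String[] args) {
        Properties effective = merge(
                layer("config.value", "jar application.properties"),
                layer("config.value", "config/application.properties"),
                layer("config.value", "jar application-uat.properties"),
                new Properties()); // no config/application-uat.properties present
        // The jar-packaged profile file beats config/application.properties.
        System.out.println(effective.getProperty("config.value"));
    }
}
```

The point of the example is the third layer: as long as the jar-packaged profile file defines a key, only a local config/application-profilename.properties can change it.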

Also, to make everything clear about where the config directory should be kept:

If the current directory from which you’ll run your jar is /home/serafeim and you want to execute /opt/spring/my-spring-app.jar (so you’ll run something like /home/serafeim$ java -jar /opt/spring/my-spring-app.jar) then the config directory should be at /home/serafeim/config (i.e. in the same directory from which you execute the jar). Normally however, to avoid confusion, the best approach would be to just put it at /opt/spring/config and cd /opt/spring before running your jar (so config will be right next to your jar and you’ll run the jar from that directory).
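
The rule above can be sketched in a few lines of plain Java. Again this is only an illustration and not Spring’s lookup code: ConfigDirSketch and configFile are hypothetical names, and the only assumption is the one from the text, namely that config is resolved against the directory you run the jar from and not against the jar’s own location:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of where the local config directory is expected.
final class ConfigDirSketch {

    // The config directory is resolved against the working directory,
    // not against the location of the jar.
    static Path configFile(Path workingDir, String fileName) {
        return workingDir.resolve("config").resolve(fileName);
    }

    public static void main(String[] args) {
        // user.dir is the directory the JVM was started from.
        Path workingDir = Paths.get(System.getProperty("user.dir"));
        System.out.println(configFile(workingDir, "application.properties"));
    }
}
```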

Finally, my recommendation is to keep these config/*.properties files off version control (after all they should be different for each of your environments; common settings should go to the jar-packaged files) and to put only the profile selection setting and sensitive settings there. That means that the config/application.properties file should only contain a spring.profiles.active=profilename setting to set the correct profile for this instance of your app, and the config/application-profilename.properties will contain all sensitive information that you’ll need to run that profile.

For example, in your UAT server you’ll have spring.profiles.active=uat in your config/application.properties and your UAT server passwords in your config/application-uat.properties.

Passing command line arguments

The most specific way of overriding parameters (including the active profile of course) is by directly passing these parameters as arguments when running your jar. For example, if you run java -Dconfig.value=foo -jar my-spring-app.jar then the config.value will always have a value of foo no matter what you have in your other config files.

That’s a different way to set your active profile (by passing -Dspring.profiles.active=profilename) or to quickly set sensitive settings. However, I prefer to keep the settings in properties files (and not to put them in scripts, where they will definitely be missed and will be more difficult to manage), so I’ll recommend the previous way of using a local config/application.properties that is not committed to version control. Use command line arguments only for quick tests (run something with a specific setting to test how it works).
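
A -Dname=value option is a JVM system property, so it has to appear before -jar on the command line. The following plain-Java sketch models the “command line wins” behaviour described above; it is not Spring’s property resolution, and SystemPropertyOverrideSketch and resolve are hypothetical names used only for illustration:

```java
import java.util.Properties;

// Hypothetical model: a JVM system property overrides a value read from files.
final class SystemPropertyOverrideSketch {

    // Return the -Dkey=value system property if present, else the file value.
    static String resolve(String key, Properties fromFiles) {
        String fromCommandLine = System.getProperty(key);
        return fromCommandLine != null ? fromCommandLine : fromFiles.getProperty(key);
    }

    public static void main(String[] args) {
        Properties fromFiles = new Properties();
        fromFiles.setProperty("config.value", "Hello from a properties file");
        // Run with: java -Dconfig.value=foo SystemPropertyOverrideSketch
        System.out.println(resolve("config.value", fromFiles));
    }
}
```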

Deploying Spring Boot applications

If you check the deployment documentation of Spring Boot you’ll see that it has various hints on deploying Spring Boot applications. I won’t go into much detail about these; however, I’ll present my recommendation on deploying Spring Boot apps on Linux as an init.d script:

What is really interesting about Spring Boot is that it allows you to make your packaged jars executable as an init.d script, so that you will be able to manage them using something like service springbootapp start/stop/restart etc. To do that, you’ll just need to add the <executable>true</executable> configuration to your pom’s spring-boot-maven-plugin. This will add some things at the start of your resulting jar file that will make it behave as a unix init.d script. If you take a look at your packaged artifact you’ll see something like this:

#!/bin/bash
#
#    .   ____          _            __ _ _
#   /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
#  ( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
#   \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
#    '  |____| .__|_| |_|_| |_\__, | / / / /
#   =========|_|==============|___/=/_/_/_/
#   :: Spring Boot Startup Script ::
#

### BEGIN INIT INFO
# Provides:          spring-boot-config
# Required-Start:    $remote_fs $syslog $network
# Required-Stop:     $remote_fs $syslog $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: spring-boot-config
# Description:       Demo project for Spring Boot configuration
# chkconfig:         2345 99 01
### END INIT INFO

[[ -n "$DEBUG" ]] && set -x

# Initialize variables that cannot be provided by a .conf file
WORKING_DIR="$(pwd)"
# shellcheck disable=SC2153
[[ -n "$JARFILE" ]] && jarfile="$JARFILE"
[[ -n "$APP_NAME" ]] && identity="$APP_NAME"

...

One thing that may seem puzzling at first is that if you make this jar executable and try to run it, you’ll see that instead of offering you the well-known options of init scripts (Usage … start/stop/restart etc.) it will immediately run the application! This is because the embedded script only behaves as an init script when it is executed through a link from /etc/init.d; otherwise it will immediately run the application.

If you want to quickly test that behavior, you may override the MODE parameter, which forces the mode of operation of the jar. If you want to run it as a script (without using a link from /etc/init.d) then just set MODE=service. So, try running:

> MODE=service ./springapplication.jar
Usage: ./hsk9eea.jar {start|stop|restart|force-reload|status|run}

Success! Of course, this is just for testing purposes; to actually deploy your application, please create a link to it from /etc/init.d as proposed by the Spring Boot docs.

If you want to customize the init.d script you can use a file named springbootapp.conf in the same directory as your springbootapp.jar (i.e. it should have the same name as your jar with an extension of .conf). The options from it will be sourced before running your application. For example, you could set the active profile using RUN_ARGS; however, as I already recommended, explicitly setting it in a file named config/application.properties is preferable.

Conclusion

Using the described file structure you should be able to fully configure Spring Boot and have all the goodies you’d expect from a modern framework: global settings, profiles, and settings kept out of version control! Also, using the advanced profile techniques (multiple profiles, profile-enabled @Components and @Configurations) you’ll be able to implement some really complex configurations! Finally, you’ll be able to really quickly deploy the resulting jar as an init.d system service!