In this post, I summarize the lessons I recently learned about scaling the Django admin site.
The motivation for scaling the Django admin site is that admin views load slowly when the backing table has millions of records. For very large tables, a request may even hit the nginx proxy timeout and fail with a 502 Bad Gateway error. As we know, raising the timeout indefinitely is not good practice in a real deployment: failures take longer to surface, and users are left waiting on pages that take a long time to respond. The better solution is to make the Django admin site scale.
Silk
The first thing to introduce is silk, a profiling tool for Django applications. Once it is properly configured in Django as a middleware, we can open the silk interface at localhost:8000/silk (we normally profile locally during development). This page shows a profile of each request made to localhost:8000. It includes key information such as the total time of a specific request, the SQL statements the request issued, and the execution time and number of joins of each SQL statement. With this profiling tool we can easily spot the bottleneck of each request and find improper implementations in the admin site.
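For reference, a typical silk setup looks roughly like the fragments below. This is a sketch based on the django-silk README as I understand it; the silk/ URL prefix is chosen to match the localhost:8000/silk address above, and the elided entries stand for whatever the project already has.

```python
# settings.py (fragment)
INSTALLED_APPS = [
    # ... existing apps ...
    "silk",
]

MIDDLEWARE = [
    # ... existing middleware ...
    "silk.middleware.SilkyMiddleware",
]

# urls.py (fragment)
from django.urls import include, path

urlpatterns = [
    # ... existing routes ...
    path("silk/", include("silk.urls", namespace="silk")),
]
```

After running python manage.py migrate to create silk's own tables, requests made to the development server should start showing up in the silk interface.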
Paginator
All the data you see on a website comes from the storage system. With that in mind, it is easy to see that the paginator of each admin view needs the total number of records, which means a count query over the whole table. For a table with millions of records, the count query takes a while, so for admin views over large tables we can try to skip the count step. In Django, the count is computed by the paginator object. We can override the paginator attribute of the admin object with the following block of code.
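from django.contrib import admin
from django.core.paginator import Paginator


class BigTablePaginator(Paginator):
    @property
    def count(self):
        # Return a fixed large number instead of running COUNT(*) on the table.
        return 9999999


class BigTableAdmin(admin.ModelAdmin):
    paginator = BigTablePaginator
    # Skip the extra unfiltered count shown on filtered changelist pages.
    show_full_result_count = False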
Every admin for a large table can inherit from this BigTableAdmin. In this way, we get rid of the time spent on the count query.
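One side effect of a hard-coded count is worth keeping in mind: the number of pages is derived from it. The sketch below mirrors, as far as I understand it, how Django's Paginator computes the page count (ignoring the orphans option), and assumes the admin default of 100 rows per page. The num_pages helper here is a standalone function for illustration, not Django's API.

```python
import math


def num_pages(count: int, per_page: int) -> int:
    """Number of pages a paginator reports for `count` rows (at least one page)."""
    return math.ceil(max(1, count) / per_page)


FAKE_COUNT = 9999999  # value returned by BigTablePaginator.count
PER_PAGE = 100        # assumed: the default ModelAdmin.list_per_page

print(num_pages(FAKE_COUNT, PER_PAGE))  # 100000
print(num_pages(2_500_000, PER_PAGE))   # 25000, what a real count of 2.5M rows would give
```

So the admin will offer 100000 pages regardless of the real table size, and pages past the actual data should simply come back empty. That is usually an acceptable trade for skipping the count.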
N + 1 Query
The N + 1 query problem is an issue people easily overlook. For a table with a foreign key to another table, if we don't select the related objects in advance (from the ORM's perspective), a request that displays the table runs one query for the rows themselves plus one extra query per row to fetch its related object. In other words, the related records are fetched one by one instead of all at once. This is called the N + 1 query problem. To avoid it, we can select the related objects in advance by overriding the get_queryset method of the admin object as follows,
def get_queryset(self, request):
    queryset = super().get_queryset(request)
    # Fetch the related object in the same query via a SQL JOIN.
    return queryset.select_related("foreign_object")
This fetches the related objects with a single join query instead of N separate queries, which noticeably speeds up page loading.
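To make the difference in query count concrete, here is a small sketch outside Django, using only the standard library's sqlite3 module. The author and book tables and the CountingDB wrapper are hypothetical names made up for illustration; the joined version only approximates what select_related asks the database to do.

```python
import sqlite3


class CountingDB:
    """Thin wrapper that counts how many SQL statements are executed."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def run(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def make_db(n_rows=5):
    """Build an in-memory database with hypothetical book/author tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (
            id INTEGER PRIMARY KEY,
            title TEXT,
            author_id INTEGER REFERENCES author(id)
        );
        """
    )
    ids = range(1, n_rows + 1)
    conn.executemany("INSERT INTO author VALUES (?, ?)", [(i, f"author {i}") for i in ids])
    conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [(i, f"book {i}", i) for i in ids])
    return CountingDB(conn)


def list_books_n_plus_one(db):
    """One query for the books, then one more per book for its author."""
    rows = []
    for title, author_id in db.run("SELECT title, author_id FROM book ORDER BY id"):
        (name,) = db.run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
        rows.append((title, name))
    return rows


def list_books_joined(db):
    """Roughly what select_related does: a single query with a JOIN."""
    return db.run(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    )


lazy_db, joined_db = make_db(), make_db()
assert list_books_n_plus_one(lazy_db) == list_books_joined(joined_db)
print(lazy_db.queries, joined_db.queries)  # 6 1
```

Both functions return the same rows, but the first one needs N + 1 statements for N books while the second needs one. With an admin page of 100 rows, that is 101 round trips to the database versus a single one.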
Use raw_id_fields or display id instead of objects
Even after we solve the N + 1 query problem, the all-in-one query may still join too many times across multiple tables. To avoid the joins altogether, we can display the foreign object's id directly in the admin view, or put the foreign key field in the raw_id_fields attribute of the admin. In this way, we get rid of the high cost of the join query.
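As a rough sketch of both options, assuming the same placeholder foreign key name foreign_object used in the get_queryset snippet above (RecordAdmin is a hypothetical admin class, and the field names should be adapted to the real model):

```python
from django.contrib import admin


class RecordAdmin(admin.ModelAdmin):
    # "<field>_id" is the foreign key column stored on the row itself,
    # so listing it needs neither a join nor an extra query per row.
    list_display = ("id", "foreign_object_id")

    # On the change form, render the foreign key as a plain id input
    # instead of a select box populated from the whole related table.
    raw_id_fields = ("foreign_object",)
```

The trade-off is readability: editors see bare ids rather than object names, so this fits best on tables where the speed matters more than the label.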
To sum up, these are the angles and measures we can take to scale the Django admin site. They should be enough for most cases. If the admin site still has functional issues because it does not scale, we may need to consider partitioning the existing tables and other distributed approaches.