Efficient pagination and database querying in Django

Question 1

mymodel.objects.all() yields a queryset, not a list. Querysets are lazy: no query is issued and nothing is done until you actually try to use them. Slicing a queryset also does not load the whole table into memory just to return a subset; it adds LIMIT and OFFSET to the SQL query before hitting the database.
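To make the lazy-slicing idea concrete without needing a Django project, here is a toy sketch using only the standard library's sqlite3. LazyQuery, its sql() method and the item table are hypothetical stand-ins made up for illustration; this is not how Django implements querysets, it only mimics the behaviour described above for a single slice.

```python
import sqlite3


class LazyQuery:
    """Toy stand-in for a lazy queryset (hypothetical, not Django's code)."""

    def __init__(self, conn, table, offset=0, limit=None):
        self.conn = conn
        self.table = table
        self.offset = offset
        self.limit = limit

    def __getitem__(self, s):
        # Slicing only records the bounds; no SQL runs here.
        # Supports a single slice with non-negative bounds and no step.
        start = s.start or 0
        limit = None if s.stop is None else s.stop - start
        return LazyQuery(self.conn, self.table, start, limit)

    def sql(self):
        # SQLite treats LIMIT -1 as "no limit".
        limit = -1 if self.limit is None else self.limit
        return (f"SELECT id FROM {self.table} ORDER BY id "
                f"LIMIT {limit} OFFSET {self.offset}")

    def __iter__(self):
        # The database is only hit when the rows are actually consumed.
        return iter(self.conn.execute(self.sql()).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO item (id) VALUES (?)",
                 [(i,) for i in range(1, 101)])

page = LazyQuery(conn, "item")[20:30]  # nothing executed yet
rows = list(page)                      # one query, ten rows fetched
```

The table holds 100 rows, but only the ten selected by LIMIT and OFFSET ever cross from the database into Python.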

Question 2

There is nothing memory-inefficient about using Paginator. Querysets are evaluated lazily. In your call Paginator(models, 7), models is a queryset that has not been evaluated yet, so the database has not been hit at this point and no list containing all the instances of the model is in memory.

When you ask for a page, i.e. at paginatedPage = paginator.page(pageNumber), the paginator slices this queryset. To validate the page number it also needs the total number of rows, which it gets from a COUNT query rather than by loading the rows. The slice is translated into LIMIT and OFFSET, so the database returns only the rows that belong on that page, not every instance of the model. Only those objects end up in memory. Say you want to show 10 objects per page: only those 10 objects are loaded.
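The arithmetic that turns a page number into a slice can be sketched roughly like this. page_bounds is a hypothetical helper written for illustration, not part of Django; it assumes 1-based page numbers as Paginator uses, and it ignores Paginator's orphans option.

```python
def page_bounds(number, per_page, count):
    """Return (offset, limit) for a 1-based page number."""
    if number < 1:
        raise ValueError("page numbers start at 1")
    bottom = (number - 1) * per_page
    # Page 1 is allowed even when there are no rows at all.
    if bottom >= count and number != 1:
        raise ValueError("page is out of range")
    # The last page may be shorter than per_page.
    top = min(bottom + per_page, count)
    return bottom, max(top - bottom, 0)


first = page_bounds(1, 7, 20)  # first page of 7 out of 20 rows
last = page_bounds(3, 7, 20)   # last, partially filled page
```

The (offset, limit) pair is what ends up as OFFSET and LIMIT in the query, which is why the size of the table does not affect how many objects are loaded per page.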

When someone does:

Model.objects.all()[:40]

Slicing a queryset is not like slicing a list that already sits in memory. The slice gives you a new, still unevaluated queryset with LIMIT 40 attached. When it is evaluated, the database returns at most 40 rows, so only those 40 instances are created and held in memory. At no point is there a list containing all the instances of Model.

Question 3

Using the above information I came up with a view function decorator. The json_list_objects helper converts Django objects into JSON-ready Python dicts of their known relationship fields and returns the jsonified list as {count: results: }.

Others may find it useful.

from functools import wraps

def with_paging(fn):
  """
  Decorator providing paging behavior.  It is for decorating a function that 
  takes a request and other arguments and returns the appropriate query
  doing select and filter operations.  The decorator adds paging by examining
  the QueryParams of the request for page_size (default 2000) and 
  page_num (default 0).  The query supplied is used to return the appropriate
  slice. 
  """
  @wraps(fn)
  def inner(request, *args, **kwargs):
    page_size = int(request.GET.get('page_size', 2000))
    page_num = int(request.GET.get('page_num', 0))
    query = fn(request, *args, **kwargs)
    start = page_num * page_size
    end = start + page_size
    data = query[start:end]
    total_size = query.count()
    return json_list_objects(data, overall_count=total_size)
  return inner
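To try the decorator outside a Django project, something like the following sketch can be used. Everything except with_paging is a stand-in assumed for illustration: json_list_objects here is a simplified replacement for the helper described above (which is not shown), FakeQuery mimics only the slicing and count() that the decorator relies on, and the request is a bare object with a GET dict. with_paging itself is a condensed copy so the block runs on its own.

```python
from functools import wraps
from types import SimpleNamespace


def json_list_objects(data, overall_count):
    # Simplified stand-in for the helper described above.
    return {"count": overall_count, "results": list(data)}


def with_paging(fn):
    # Condensed copy of the decorator so this block is self-contained.
    @wraps(fn)
    def inner(request, *args, **kwargs):
        page_size = int(request.GET.get("page_size", 2000))
        page_num = int(request.GET.get("page_num", 0))
        query = fn(request, *args, **kwargs)
        start = page_num * page_size
        return json_list_objects(query[start:start + page_size],
                                 overall_count=query.count())
    return inner


class FakeQuery(list):
    """List that mimics the two queryset features the decorator uses."""

    def count(self):
        return len(self)


@with_paging
def list_items(request):
    # A real view would return a queryset here, e.g. a filtered model query.
    return FakeQuery(range(25))


# Query parameters arrive as strings, as they would in request.GET.
request = SimpleNamespace(GET={"page_size": "10", "page_num": "2"})
response = list_items(request)
```

With 25 items, a page size of 10 and page_num 2 (zero-based), the response holds the last five items plus the overall count. With a real queryset the slice and the count() call each become their own SQL query, so the full result set is never loaded.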