Entity scale issue on local development server for GAE Python NDB

https://stackoverflow.com/questions/22784343

25-06-2023

Question

I have three models of the same structure (each string property stores ~20 characters):

from google.appengine.ext import ndb

class A(ndb.Model):
    # Disable NDB's in-context cache and memcache for this model.
    _use_cache = False
    _use_memcache = False
    p1 = ndb.KeyProperty(indexed=False)
    p2 = ndb.IntegerProperty(default=0)
    p3 = ndb.IntegerProperty(default=6)
    # Indexed string properties, each holding about 20 characters.
    p4 = ndb.StringProperty(indexed=True)
    p5 = ndb.StringProperty(indexed=True)
    p6 = ndb.StringProperty(indexed=True)
    p7 = ndb.StringProperty(indexed=True)
    p8 = ndb.StringProperty(indexed=True)

I want to create 200,000 entities to test performance at scale.

I planned to feed the data in over multiple runs of my app (I stopped the app from the App Engine Launcher and started it again). I chose this approach because I noticed that memory usage (as shown in Task Manager) does not come down even after all the data has been put. It actually increases gradually as the data is put in sets of 10,000. On stopping the app, however, the memory is released. I suspect a memory leak but am not sure.

I ran a for loop with a total entity count of 17,550, calling ndb.put_multi() for sets of 1,000 entities. According to A.query().count(), I got 7,051 entities.

Next, I tried adding 12,870 entities by the same method (in sets of 1,000), but I got a count of 7,109.

My computer is a Lenovo T430 with 8 GB RAM (Windows 7 Enterprise), so resources should not be an issue, and I do not run anything else apart from the App Engine Launcher and Chrome. I am using version 1.9.1 of the GAE Python SDK.

Has anyone else faced similar entity scaling issues on the development server? What is the maximum you were able to achieve?

CONCERN:

Although the GAE docs say that the response time of a query (with filter conditions) and of fetch_page(count) depends only on the total size of the matched entities fetched, I have seen response time degrade as the total entity count (matched and unmatched) increased while the fetched count stayed the same.

Answer

Don't even bother trying to do this on the dev server.

What you experience on the dev server is meaningless as a performance measure and cannot in any way be compared to how production will run. The only value (assuming you can get all 200,000 records stored in a timeframe you can handle) is confirming that the application runs as expected.

Secondly, doing query.count() will not immediately give you the expected results on either the dev server or production, as you will encounter the effects of eventual consistency.
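One way to check how many entities were written without relying on an eventually consistent count query is to tally the keys returned by each batched put. The following is only a minimal sketch: chunked and put_in_batches are hypothetical helper names, and the put function is passed in as a parameter so the code runs without the SDK. Inside an App Engine app it is assumed you would pass ndb.put_multi, which returns a list of keys.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def put_in_batches(entities, put_multi, batch_size=1000):
    """Write entities batch by batch and return how many keys came back.

    `put_multi` is injected (assumption: ndb.put_multi in a real app), so the
    tally comes from the write calls themselves rather than from a query.
    """
    written = 0
    for batch in chunked(entities, batch_size):
        keys = put_multi(batch)  # one key per stored entity
        written += len(keys)
    return written
```

In an app this might be called as put_in_batches(entities, ndb.put_multi). The returned total reflects what the put calls reported, which sidesteps the lag a global A.query().count() can show.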

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow