Question

I'm using the Freebase-Python module to iterate through hundreds of results. Using:

results = freebase.mqlreaditer(query, extended=True) 

I get a Python generator that I can iterate through like so:

for r in results:
    # do stuff, like create a new object and save it to the datastore

mqlreaditer() fetches JSON results 100 at a time. Each entry in that batch of 100 is a short record like:

result:: {u'type': u'/games/game', u'mid': u'/m/0dgf58f', u'key': 
          {u'namespace': u'/user/pak21/', u'value': u'42617'}}

I'm running into an error locally:

"WARNING  2011-01-29 15:59:48,383 recording.py:365] 
 Full proto too large to save, cleared variables."

I'm not sure what is happening, but I suspect it's just too much too fast, so I want to slow down the iteration OR break it into chunks. I'm not sure how generators work or what my options are. Note this is running on Google App Engine, so Python dependencies and the quirks of the local App Engine Launcher apply.


Solution

A generator is just a function that behaves like a sequence but retrieves the items one at a time, instead of holding the whole list of data up front, which would often require a lot more memory. It's a "just-in-time" iterable, if you like. But you have no guarantees about how much data it reads or caches internally to do that. Sometimes it may well already have the entire data set - you just don't know without looking at the docs or the code.
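If you do want to process the results in chunks rather than one at a time, one generic option (not specific to mqlreaditer) is to wrap the generator with a small helper based on itertools.islice. A minimal sketch, where in_chunks is a hypothetical helper and 20 is an arbitrary batch size:

import itertools

def in_chunks(iterable, size):
    # Yield successive lists of up to `size` items from any iterable or generator.
    iterator = iter(iterable)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            break
        yield chunk

# results = freebase.mqlreaditer(query, extended=True)
# for batch in in_chunks(results, 20):
#     for r in batch:
#         pass  # create objects and save them to the datastore in smaller batches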

If it really is a question of speed, then adding import time and a call such as time.sleep(1.0) inside the loop will delay it by a second on each iteration; but I suspect that is not actually the problem, nor what the solution should be. Perhaps your query is retrieving too much data, or the objects are too large?
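For illustration only, the sleep approach would look something like this (assuming results is the mqlreaditer generator from the question):

import time

for r in results:
    # do stuff with each result, then pause for a second before the next one
    time.sleep(1.0)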

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow