Question

On my local machine the script runs fine, but in the cloud it returns a 500 error every time. This is a cron task, so I don't really mind if it takes 5 minutes...

<class 'google.appengine.runtime.DeadlineExceededError'>:

Any idea whether it's possible to increase the timeout?

Thanks, rui


Solution

You cannot go beyond 30 seconds, but you can work around the timeout by employing task queues: write a task that gradually iterates through your data set and processes it. Each such task run must, of course, itself fit within the timeout limit.

EDIT

To be more specific, you can use datastore query cursors to resume processing at the same point:

http://code.google.com/intl/pl/appengine/docs/python/datastore/queriesandindexes.html#Query_Cursors

first introduced in SDK 1.3.1:

http://googleappengine.blogspot.com/2010/02/app-engine-sdk-131-including-major.html
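
Putting the two ideas together, the usual pattern is a task that processes one batch of entities and then re-enqueues itself, passing along the cursor where it stopped. Below is a minimal sketch using the SDK's deferred library; MyModel and handle_entity are hypothetical placeholders for your own model and per-entity logic:

from google.appengine.ext import deferred

BATCH_SIZE = 100

def process_in_batches(cursor=None):
  # MyModel and handle_entity are placeholders for your own code.
  query = MyModel.all()
  if cursor:
    # Resume exactly where the previous task run stopped.
    query.with_cursor(cursor)
  batch = query.fetch(BATCH_SIZE)
  for entity in batch:
    handle_entity(entity)
  if len(batch) == BATCH_SIZE:
    # A full batch suggests there may be more work: enqueue a new
    # task that continues from the current cursor.
    deferred.defer(process_in_batches, query.cursor())

Each task run touches only BATCH_SIZE entities, so it stays comfortably within the request deadline, while the cursor hand-off lets the chain of tasks cover the whole data set.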

OTHER TIPS

The exact rules for datastore query timeouts are complicated, but it seems that a query cannot live longer than about two minutes and a single batch cannot live longer than about 30 seconds. Here is some code that breaks a job into multiple queries, using cursors to avoid those timeouts.

def make_query(start_cursor):
  # Build a query over the Foo model, resuming from the cursor if given.
  query = Foo.all()

  if start_cursor:
    query.with_cursor(start_cursor)

  return query

batch_size = 1000
start_cursor = None

while True:
  query = make_query(start_cursor)
  results_fetched = 0

  for resource in query.run(limit=batch_size):
    results_fetched += 1

    # Do something

    if results_fetched == batch_size:
      # A full batch was consumed: remember where we stopped and
      # start a fresh query from that cursor on the next pass.
      start_cursor = query.cursor()
      break
  else:
    # The for loop completed without hitting the batch limit, so the
    # query is exhausted and we are done.
    break
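
Note the for/else construct above: the else branch runs only when the for loop finishes without hitting break, i.e. when the query returned fewer than batch_size results. That means the data set is exhausted and the outer while loop can end.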

Below is the code I use to solve this problem, breaking a single large query into multiple smaller ones. It relies on the google.appengine.ext.ndb library: the fetch_page() call below is an ndb query method, so ndb is required for this code to work.

(If you are not using ndb, consider switching to it. It is an improved version of the db library and migrating to it is easy. For more information, see https://developers.google.com/appengine/docs/python/ndb.)
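
As a minimal illustration of how closely the two APIs mirror each other (the Greeting model here is a hypothetical example, not part of the code below):

from google.appengine.ext import db
from google.appengine.ext import ndb

# Old-style db model...
class GreetingOld(db.Model):
  content = db.StringProperty()
  date = db.DateTimeProperty(auto_now_add=True)

# ...and its ndb equivalent: the property declarations map
# almost one-to-one.
class GreetingNew(ndb.Model):
  content = ndb.StringProperty()
  date = ndb.DateTimeProperty(auto_now_add=True)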

from google.appengine.datastore.datastore_query import Cursor

def ProcessAll():
  # MyEntity is assumed to be an ndb.Model subclass defined elsewhere.
  curs = Cursor()
  while True:
    # fetch_page returns a page of results, a cursor pointing just past
    # that page, and a flag indicating whether more results may exist.
    records, curs, more = MyEntity.query().fetch_page(5000, start_cursor=curs)
    for record in records:
      # Run your custom business logic on record.
      RunMyBusinessLogic(record)
    if more and curs:
      # There are more records; do nothing here so we enter the
      # loop again above and run the query one more time.
      pass
    else:
      # No more records to fetch; break out of the loop and finish.
      break
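
A natural place to call ProcessAll() is from a task queue task (for example via deferred.defer(ProcessAll)), since task queue requests are given a much longer deadline than user-facing requests.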
Licensed under: CC-BY-SA with attribution