Question

I built a database that I would like to be able to search using the Search API in GAE. I clicked through the tutorials that Google has on the API, but the one thing I'm missing is how to actually turn a datastore kind into a "document". Is there a good tutorial for that somewhere? Thank you

Solution

You cannot automatically convert a db.Model or ndb.Model entity into a search.Document.

Why? Because an automatic conversion would not have much value.

Here is an example. Suppose you have the string 'this-is-black-tag' in a db.StringProperty(). How should it be converted?

  1. You can index it as an atom (search.AtomField), so it matches only on an exact match.
  2. You can index it as text (search.TextField), so it is broken into the words 'this', 'is', 'black', 'tag', which can then be tokenized further into 't', 'th', 'thi', 'this', ...
  3. You can decide not to index it at all, because it does not help the search and only produces false hits.
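To make the difference between the first two options concrete, here is a small plain-Python illustration. It only mimics the behaviour described above and is not the Search API's real tokenizer; the helper names `words` and `prefixes` are made up for this sketch.

```python
import re


def words(value):
    """Split a string into word tokens on any non-alphanumeric character."""
    return [w for w in re.split(r'[^0-9A-Za-z]+', value) if w]


def prefixes(word):
    """Return every leading prefix of a word, shortest first."""
    return [word[:i] for i in range(1, len(word) + 1)]


tag = 'this-is-black-tag'

# Option 1: treated as an atom, the whole value is a single token.
atom_tokens = [tag]

# Option 2: treated as text, the value is split into words ...
word_tokens = words(tag)          # ['this', 'is', 'black', 'tag']

# ... and each word may additionally be expanded into prefixes.
prefix_tokens = prefixes('this')  # ['t', 'th', 'thi', 'this']
```

With the atom form, a query for 'black' finds nothing. With the text form it matches, but so does every other entity whose tag contains the word 'black'. Which behaviour is right depends on the application, which is why the mapping has to be designed by hand.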

You need to design the search feature yourself. Deciding by hand how each property is indexed is the answer.

You just need to:

  1. create a search.Document
  2. add fields to it
  3. add the document to an index

Read the reference: https://developers.google.com/appengine/docs/python/search/documentclass
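The three steps above could look roughly like the sketch below. It assumes a hypothetical Article kind with title, text and tag properties; the helper names and the index name 'articles' are invented for illustration. The import is guarded only so that the field-mapping part can be read and run outside the App Engine runtime.

```python
try:
    # Only available inside the App Engine Python runtime / SDK.
    from google.appengine.api import search
except ImportError:
    search = None


def article_to_field_specs(title, text, tag):
    """Decide, per property, how it is indexed (hypothetical helper).

    Returns (kind, name, value) tuples, where kind is 'text' or 'atom'.
    """
    return [
        ('text', 'title', title),  # full-text searchable
        ('text', 'text', text),    # full-text searchable
        ('atom', 'tag', tag),      # exact match only
    ]


def build_document(doc_id, specs):
    """Steps 1 and 2: create the document and add its fields."""
    field_classes = {'text': search.TextField, 'atom': search.AtomField}
    fields = [field_classes[kind](name=name, value=value)
              for kind, name, value in specs]
    return search.Document(doc_id=doc_id, fields=fields)


def index_article(doc_id, title, text, tag, index_name='articles'):
    """Step 3: add the document to an index."""
    doc = build_document(doc_id, article_to_field_specs(title, text, tag))
    search.Index(name=index_name).put(doc)
```

Keeping the "which property becomes which field type" decision in one small function makes the manual design explicit and easy to change later.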

OTHER TIPS

Unfortunately this is not possible.

Looking at the Index constructor (Python), we can see that there were some early attempts at implementing this, but it never actually worked. Specifying the source of the index has been deprecated for a while now and no longer works.

Here is the constructor's pydoc:

class Index(object):
  [...]

  def __init__(self, name, namespace=None, source=SEARCH):
  """Initializer.

  Args:
    name: The name of the index. An index name must be a visible printable
      ASCII string not starting with '!'. Whitespace characters are excluded.
    namespace: The namespace of the index name. If not set, then the current
      namespace is used.
    source: Deprecated as of 1.7.6. The source of
      the index:
        SEARCH - The Index was created by adding documents throught this
          search API.
        DATASTORE - The Index was created as a side-effect of putting entities
          into Datastore.
        CLOUD_STORAGE - The Index was created as a side-effect of adding
          objects into a Cloud Storage bucket.
  [...]
  """

So, at least for now (?), the only solution, as Tim Hoffman mentioned, is to manage your documents and index separately from your datastore data.

You can still file a feature request at https://code.google.com/p/googleappengine/issues/list and see where that goes.

I had thousands of entities that I wanted to index, so I used mapreduce to build a Search API index from them, which made them searchable. The mapreduce job definition was:

- name: buildindex
  mapper:
    input_reader: mapreduce.input_readers.DatastoreInputReader
    handler: main.buildindex
    params:
    - name: entity_kind
      default: main.Article

The mapper function:

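import logging

from google.appengine.api import search


def buildindex(entity):
    """Mapper handler: index one Article entity as a search document."""
    try:
        # Reuse the datastore key as the document id, so re-running the job
        # replaces existing documents instead of duplicating them.
        doc = search.Document(doc_id=str(entity.key()), fields=[
            search.TextField(name='title', value=entity.title),
            search.TextField(name='text', value=entity.text),
        ])
        search.Index(name='myIndex').put(doc)
    except Exception as e:
        logging.exception('Mapreduce has exception:%s' % str(e))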
Licensed under: CC-BY-SA with attribution