Question

I'm well aware that this isn't described in the spec, but if I can obtain a keyword array from my back end for, say, an 'article' object, how would I search against it with a given search phrase?

I'm coming from Ruby on Rails, if that helps to explain.


A prominent example floating around the web is a to-do list, and there is another one (see: http://www.html5rocks.com/en/tutorials/indexeddb/uidatabinding/ NOTE: broken for Chrome). Let's imagine the primary model in each case (a to-do task and an employee) has an associated array of keywords in addition to what's described in the articles.

Now say I wanted to get the task that mentions 'roman history' based on the keywords (or 'clerical work' for the employee example) stored in that array.

How in the h*** would you do this?


Note: I'd be using this to build a Google Chrome packaged app and a PhoneGap app with either Sencha Touch 2 or Backbone.js (if one or the other would make the above easier, do tell).


Solution

I am planning to implement a very simple full-text search in YDN-DB. I am thinking as follows:

  1. Annotate the schema with full-text for an index of an object store, which effectively creates a keyword reference table. The keyword reference table is a one-to-many relationship table pointing to the keyPath of the parent object store. The relationship data type is an array; I think that is more efficient than multiple reference entries.
  2. When putting an object, split the value of the index into keywords and update the keyword reference table, referencing the keyPath of the object store.
  3. To run a full-text search, open a key cursor on the keyword reference table with a range built from the search terms. The range pulls out all keywords that start with the given search terms. From the result, the corresponding full records are taken from the parent object store. The search is performed in a single transaction.

Later, the key cursor can be expanded to include related phrases.

The implementation is very straightforward and retrieval should be very fast.
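
The three steps above can be sketched without any database at all. The following is a minimal in-memory illustration, not YDN-DB code: the names store, keywordTable, tokenize, put and search are hypothetical, two Map objects stand in for the parent object store and the keyword reference table, and the tokenizer is deliberately naive (ASCII only).

```javascript
// In-memory sketch of a keyword reference table (all names are hypothetical).
var store = new Map();        // stands in for the parent object store
var keywordTable = new Map(); // keyword -> array of primary keys

function tokenize(text) {
  // Naive split on anything that is not a-z or 0-9.
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function put(key, record, keyPath) {
  // Step 2: store the record, then reference it from each of its keywords.
  store.set(key, record);
  tokenize(record[keyPath]).forEach(function (word) {
    var refs = keywordTable.get(word) || [];
    if (refs.indexOf(key) === -1) refs.push(key);
    keywordTable.set(word, refs);
  });
}

function search(term) {
  // Step 3: walk the keywords in sorted order and keep those starting with
  // the term. In IndexedDB a key cursor over the range
  // [term, term + '\uffff'] would play this role.
  var prefix = term.toLowerCase();
  var keys = [];
  Array.from(keywordTable.keys()).sort().forEach(function (word) {
    if (word.indexOf(prefix) === 0) {
      keywordTable.get(word).forEach(function (k) {
        if (keys.indexOf(k) === -1) keys.push(k);
      });
    }
  });
  // Resolve the collected primary keys against the parent store.
  return keys.map(function (k) { return store.get(k); });
}

put(1, {title: 'Roman history essay'}, 'title');
put(2, {title: 'Clerical work rota'}, 'title');
put(3, {title: 'History of Rome'}, 'title');
console.log(search('hist'));
```

Here search('hist') returns the two records whose title contains a word starting with "hist", and search('cler') returns only the clerical work record.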

EDIT:

Implemented in YDN-DB-FULLTEXT repo.

Features

  • Unicode-based tokenization supporting the full language spectrum.
  • Stemming and phonetic normalization for the English language.
  • Free-text, query-based ranking with logical and, or and near.
  • Supports exact match and prefix match.
  • Being based on YDN-DB, the storage mechanism can be IndexedDB, WebSQL or localStorage.
  • Easy and flexible configuration using a full-text catalog.

API Reference

Use the search method to run a full-text query.

db.search(catalog, query)

Documents are indexed when they are stored in the database with the add or put methods.

The query format is free text, in which implicit and/or/near logical operators apply to each token. Use double quotes for an exact match, - to subtract from the result and * for a prefix search.
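
To make the query syntax concrete, here is a rough sketch of how such a query string could be broken into tokens. This is not the library's parser: parseQuery and the exact, prefix and free labels are hypothetical names chosen for illustration, and the sketch ignores the and/or/near logic entirely.

```javascript
// Hypothetical parser for the query syntax described above; not YDN-DB's own code.
function parseQuery(query) {
  var tokens = [];
  // First alternative: an optional "-" followed by a "quoted phrase".
  // Second alternative: an optional "-", a bare word, and an optional trailing "*".
  var re = /(-?)"([^"]+)"|(-?)([^\s"*]+)(\*?)/g;
  var m;
  while ((m = re.exec(query)) !== null) {
    if (m[2] !== undefined) {
      tokens.push({value: m[2], exclude: m[1] === '-', type: 'exact'});
    } else {
      tokens.push({
        value: m[4],
        exclude: m[3] === '-',
        type: m[5] === '*' ? 'prefix' : 'free'
      });
    }
  }
  return tokens;
}

console.log(parseQuery('roman "ancient history" -greek emp*'));
```

For that input the sketch yields four tokens: a free token roman, an exact token ancient history, an excluded free token greek and a prefix token emp.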

Parameters:

  • {string} catalog Full text search catalog name, as defined in schema.
  • {string} query Free text query string.

Returns:

{!ydn.db.Request} Returns a request object.

done: {Array} Returns a list of inverted index entries. Each entry has the following attributes: storeName, primaryKey, score and tokens, representing the store name of the original document, the primary key of the original document, the match quality score and an array of token objects. Each token object has the following attributes: keyPath, value and loc, representing the key path of the index in the original document, the original word from the original document and an array of positions of the word in the document.

fail: {Error} If the request fails, the fail callback is invoked with the resulting error.

progress: {Array} During index retrieval, raw inverted index entries are dispatched.
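
As an illustration of the result shape described above, the following sketch works on a hand-written array rather than a real search result. The values are made up, summarize is a hypothetical helper, and since the text does not say whether the array arrives sorted, the sketch sorts by score explicitly.

```javascript
// Sample data shaped like the entries described above; values are invented.
var results = [
  {storeName: 'contact', primaryKey: 2, score: 0.4,
   tokens: [{keyPath: 'first', value: 'Collin', loc: [0]}]},
  {storeName: 'contact', primaryKey: 1, score: 0.9,
   tokens: [{keyPath: 'first', value: 'Jhon', loc: [0]}]}
];

// Hypothetical helper: order entries best-first and describe each match.
function summarize(entries) {
  return entries.slice().sort(function (a, b) {
    return b.score - a.score; // highest score first
  }).map(function (entry) {
    var words = entry.tokens.map(function (t) { return t.value; });
    return entry.storeName + '#' + entry.primaryKey + ' matched ' + words.join(', ');
  });
}

console.log(summarize(results));
```

The storeName and primaryKey of each entry are what you would pass on to db.get to load the full record.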

Example

var schema = {
  // One catalog named 'name' that indexes the 'first' field of 'contact'.
  fullTextCatalogs: [{
    name: 'name',
    lang: 'en',
    sources: [{
      storeName: 'contact',
      keyPath: 'first'
    }]
  }],
  stores: [{
    name: 'contact',
    autoIncrement: true
  }]
};
var db = new ydn.db.Storage('db name', schema);
// Records are indexed as they are stored.
db.put('contact', [{first: 'Jhon'}, {first: 'Collin'}]);
db.search('name', 'jon').done(function(x) {
  console.log(x);
  // Load the full record behind the first result.
  db.get(x[0].storeName, x[0].primaryKey).done(function(top) {
    console.log(top);
  });
});

Full text catalog

A full text catalog is a logical grouping of one or more full-text indexes. It is defined in the database schema at database initialization.

Fields:

  • {string} name Full text catalog name.
  • {string=} lang Language. Stemming, word segmentation and phonetic normalization are language dependent. lang must be defined to index properly. Currently only en is well supported. For more languages, check the natural project repo.
  • {Array} sources Full text indexes. Each index has a source reference to the original document by storeName and keyPath. The value at keyPath is the text to be indexed. The weight factor is applied when ranking search results. This value is not stored in the database and can be changed after indexing as well.

The following full text catalog indexes the author name on the first and last fields of the record value, with more weight on first.

var catalog = {
  name: 'author-name',
  lang: 'en',
  sources: [{
    storeName: 'author',
    keyPath: 'first',
    weight: 1.0
  }, {
    storeName: 'author',
    keyPath: 'last',
    weight: 0.8
  }]
};
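
To give a feel for what a per-source weight does at ranking time, here is a toy sketch. The library's actual scoring formula is not described above, so this is purely an assumption: each field that contains the search term contributes its weight to the score. The scoreRecord function is a hypothetical name.

```javascript
var sources = [
  {storeName: 'author', keyPath: 'first', weight: 1.0},
  {storeName: 'author', keyPath: 'last', weight: 0.8}
];

// Hypothetical scoring: each field containing the term adds its weight.
// This is an assumed formula, not the one YDN-DB uses.
function scoreRecord(record, term) {
  var needle = term.toLowerCase();
  return sources.reduce(function (total, src) {
    var text = String(record[src.keyPath] || '').toLowerCase();
    return total + (text.indexOf(needle) !== -1 ? src.weight : 0);
  }, 0);
}

console.log(scoreRecord({first: 'John', last: 'Smith'}, 'john'));
console.log(scoreRecord({first: 'Adam', last: 'Johnson'}, 'john'));
```

Under this assumption a hit on first outranks a hit on last, which matches the intent of weighting first more heavily.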

Demo applications

Licensed under: CC-BY-SA with attribution