Question

Straight up: I have read http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-views-writing-bestpractice.html and various other pages on the Couchbase site, but the following question still bothers me, so I want to double-check before rolling this out.

If I have a document about a product, say:

"DocumentType": "productDoc",
"Name": "product",
"Price": 150,
"SellerID": 10,
"DeliverableAUWide": true,
"Colour": "red",
"Size": "large"

Say I want products priced between 100 and 200; then:

if(doc.DocumentType == "productDoc" && doc.Price)
{
emit([1, doc.Price], null)
}

would get me what I want with a startkey and an endkey.
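
For illustration, the range part of such a request might be built as follows. This is only a sketch: the helper name priceRangeQuery is hypothetical, and it assumes nothing beyond the startkey and endkey view parameters taking JSON-encoded keys.

```javascript
// Hypothetical helper: builds the query-string part of a view request for
// rows emitted as [1, doc.Price], with Price between min and max.
function priceRangeQuery(min, max) {
  // View keys are JSON values, so each array key is JSON-encoded
  // and then URL-encoded before going into the query string.
  var startkey = encodeURIComponent(JSON.stringify([1, min]));
  var endkey = encodeURIComponent(JSON.stringify([1, max]));
  return "startkey=" + startkey + "&endkey=" + endkey;
}

// Query string for products priced between 100 and 200.
console.log(priceRangeQuery(100, 200));
```

The leading 1 keeps the range inside the rows of the first emit, so keys from the other emits in the same view are not picked up.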

Say I also want to search by size; then:

if(doc.DocumentType == "productDoc" && doc.Size)
{
emit([2, doc.Size], null)
}

would get that, again with the correct key for the search.

Say I want to search by both at the same time; then:

if(doc.DocumentType == "productDoc" && doc.Size && doc.Price)
{
emit([3, doc.Size, doc.Price], null)
}

would get that.

Now say I want to search by Price, SellerID, AU-deliverable or not, Colour and Size:

if(doc.DocumentType == "productDoc" 
&& doc.Price
&& doc.SellerID
&& doc.DeliverableAUWide
&& doc.Colour
&& doc.Size 
)
{
emit([4, doc.Price, doc.SellerID, doc.DeliverableAUWide, doc.Colour, doc.Size], null)
}

would get that.

But say I also want to be able to search by all of those except Price. I can't use the above, because Price would have to be left open (null), and then the rest of that key becomes 'useless' because everything will be matched.

So would I need a new view query? E.g.:

if(doc.DocumentType == "productDoc"

&& doc.SellerID
&& doc.DeliverableAUWide
&& doc.Colour
&& doc.Size 
)
{
emit([5, doc.SellerID, doc.DeliverableAUWide, doc.Colour, doc.Size], null)
}

The Question

Is this approach correct? It seems that I would need a 'new' emit call for each type of search. In the .NET code I would then check which search inputs the user supplied and query the right 'emit' (note: that is why I have the number at the front of each emitted key, so I can tell them apart later, for sanity).

Not only am I concerned about the length of the view I would have to write, but say I later add a field to the documents, like 'Discount Amount', and then change the view: would the re-index be massive? Is this a concern?

Possible alternative to the above structure?

Or is it better to have, say, only:

if(doc.DocumentType == "productDoc" && doc.Price)
{
emit([1, doc.Price], null)
}
if(doc.DocumentType == "productDoc" && doc.Size)
{
emit([2, doc.Size], null)
}

and then, when I want products by price and size, call both, get what are effectively 'lists' of document IDs, 'intersect' these lists to see which IDs are in both, and then make calls to get the documents. This approach, however, makes significantly more calls to the CB server, and I would not be able to use the built-in skip, limit and startkey_docid for paging. It also seems more performance-intensive on the code side. I presume this is the 'wrong' mindset for Couchbase, but coming from an environment where the mentality is "fewer calls to the DB = better", I might not be embracing the CB philosophy correctly.
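
To make the intersection step concrete, here is a minimal sketch of what the client-side code would have to do. The function name intersectIds and the sample IDs are made up for illustration; fetching the ID lists from the two view queries is not shown.

```javascript
// Intersect two lists of document IDs, e.g. the IDs returned by a
// price query and by a size query against the same view.
function intersectIds(idsA, idsB) {
  var seen = new Set(idsB); // constant-time membership checks
  return idsA.filter(function (id) {
    return seen.has(id);
  });
}

// Made-up sample data standing in for two query results.
var byPrice = ["p1", "p2", "p3", "p7"];
var bySize = ["p2", "p7", "p9"];

// Logs the IDs present in both lists: p2 and p7.
console.log(intersectIds(byPrice, bySize));
```

Note that both ID lists have to be fetched in full before intersecting, which is why skip and limit on the individual queries no longer give correct paging.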

If someone can please:

  1. Confirm the first approach is correct...
  2. the second is wrong

that would be great.

Please let me know if something does not make sense...

Thanks in advance,

Cheers

Robin

Other note: this document structure is the only one that would be in the bucket. I would have only 1 view and roughly 10k documents.


Solution 2

In Couchbase 3.x (where N1QL is a developer preview; it is built in from 4.0) you can use the N1QL query language to specify filtering conditions and select your JSON documents without having any views in place.

For example, you should be able to issue a query like this:

SELECT *
  FROM your_bucket_name
    WHERE yourID1 = 'value1' AND yourID2 = 'value2' etc...

Try out the N1QL tutorial.

Another way is to use Couchbase's integration with Elasticsearch and run the search query in the Elasticsearch engine, which returns all the keys it found for your search criteria. The Elasticsearch index is kept in sync with your bucket via XDCR streaming.

Other tips

There is an elegant solution to this, but be warned: the number of rows emitted per document grows exponentially with the number of filter fields.

You were on the right track with composite keys.

if(doc.DocumentType == "productDoc" && doc.SellerID && doc.DeliverableAUWide && doc.Colour && doc.Size){
    emit([doc.SellerID, doc.DeliverableAUWide, doc.Colour, doc.Size], null)
}

This gives you the ability to filter on all of these fields. Say you wanted all of the documents with a SellerID of 123, a DeliverableAUWide of "true", a colour of red, and a size of large; just suffix your query like so (if DeliverableAUWide is stored as a JSON boolean, as in the question's document, pass an unquoted true instead of "true"):

&startkey=[123,"true","red","large"]&endkey=[123,"true","red","large",""]

This returns everything that matches those four conditions, but the issue is that with this view you must pass a value for every field.

The solution comes from the ability of a map function to emit several rows per document, each with a different key (this works in Couchbase views just as in CouchDB). Say you want to leave colour as a wildcard; add a new emit line to your map function:

if(doc.DocumentType == "productDoc" && doc.SellerID && doc.DeliverableAUWide && doc.Colour && doc.Size){
    emit([doc.SellerID, doc.DeliverableAUWide, doc.Colour, doc.Size], null)
    emit([doc.SellerID, doc.DeliverableAUWide, -1, doc.Size], null)
}

you can now query like so

&startkey=[123,"true",-1,"large"]&endkey=[123,"true",-1,"large",""]

(Note: I chose -1 because I assume it will never be a valid value for any of these fields. Any value works, as long as no document actually has that value in one of the key fields.)

Rows for documents of all colours will then be returned to you. Notice that you can still use the previous query to return all red documents from this same view.

Say you want all of your filters to have the ability to be wildcards. You can use the following map function to recursively generate every emit you're looking for.

function (doc, meta) {
  // Only index product documents.
  if (doc.DocumentType != "productDoc") { return; }
  // Add any new filter fields here; each one doubles the rows emitted per document.
  var dataSet = [doc.SellerID, doc.DeliverableAUWide, doc.Colour, doc.Size];
  // Build every combination of "real value" or "-1 wildcard", position by position.
  var emitCombos = function (current) {
    current = current || [];
    if (current.length === dataSet.length) { return [current]; }
    return emitCombos(current.concat([dataSet[current.length]]))
      .concat(emitCombos(current.concat([-1])));
  };
  var allCombos = emitCombos();
  // The last combination is all -1, which is not really filtering, hence length - 1.
  for (var combo = 0; combo < allCombos.length - 1; combo++) {
    emit(allCombos[combo], null);
  }
}

Using this map, each document will emit rows with these keys

[ 123, 'TRUE', 'RED', 'LARGE' ]
[ 123, 'TRUE', 'RED', -1 ]
[ 123, 'TRUE', -1, 'LARGE' ]
[ 123, 'TRUE', -1, -1 ]
[ 123, -1, 'RED', 'LARGE' ]
[ 123, -1, 'RED', -1 ]
[ 123, -1, -1, 'LARGE' ]
[ 123, -1, -1, -1 ]
[ -1, 'TRUE', 'RED', 'LARGE' ]
[ -1, 'TRUE', 'RED', -1 ]
[ -1, 'TRUE', -1, 'LARGE' ]
[ -1, 'TRUE', -1, -1 ]
[ -1, -1, 'RED', 'LARGE' ]
[ -1, -1, 'RED', -1 ]
[ -1, -1, -1, 'LARGE' ]
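
On the query side, the client then has to put -1 in every position the user left open. A rough sketch of that, assuming the field order and the -1 placeholder used in the map function above (the helper names wildcardKey and wildcardQuery are hypothetical):

```javascript
var WILDCARD = -1; // must match the placeholder used in the map function
var FIELD_ORDER = ["SellerID", "DeliverableAUWide", "Colour", "Size"];

// Turn the filters the user actually supplied into a full view key,
// using the wildcard for every field that was left out.
function wildcardKey(filters) {
  return FIELD_ORDER.map(function (name) {
    return Object.prototype.hasOwnProperty.call(filters, name)
      ? filters[name]
      : WILDCARD;
  });
}

// Build the startkey/endkey pair in the same style as the queries above.
function wildcardQuery(filters) {
  var key = wildcardKey(filters);
  var allOpen = key.every(function (part) { return part === WILDCARD; });
  if (allOpen) {
    // The all-wildcard key is never emitted by the map function.
    throw new Error("at least one filter is required");
  }
  return "startkey=" + encodeURIComponent(JSON.stringify(key)) +
    "&endkey=" + encodeURIComponent(JSON.stringify(key.concat("")));
}

console.log(wildcardQuery({ SellerID: 123, Size: "large" }));
```

The values passed in have to match what the documents actually store, including type and letter case, for the key to line up with an emitted row.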

As stated earlier, the more filters you use, the more rows you'll emit, thereby bloating your view. So please, emit responsibly.
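
To put numbers on that growth: each field is either its real value or the wildcard, and the all-wildcard key is skipped, so n fields give 2^n - 1 rows per document. A quick sketch (the function name rowsPerDocument is made up for illustration):

```javascript
// Rows emitted per document when each of fieldCount filter fields is
// either its real value or the wildcard, minus the all-wildcard key.
function rowsPerDocument(fieldCount) {
  return Math.pow(2, fieldCount) - 1;
}

[4, 5, 8, 10].forEach(function (n) {
  console.log(n + " fields: " + rowsPerDocument(n) + " rows per document");
});
```

With the four fields above that is 15 rows per document, so the roughly 10k documents mentioned in the question would produce about 150,000 rows in the view.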

1- The first approach, with the compound key, is probably not a good fit for your requirement, because a compound key can only be queried from left to right (see http://blog.couchbase.com/understanding-grouplevel-view-queries-compound-keys ).
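
A small simulation may help to show what "left to right" means in practice. This is a simplified stand-in for how a view orders keys (real view collation has more rules; numbers and strings are enough here), with made-up [Price, Colour] keys:

```javascript
// Simplified key ordering: compare arrays element by element, left to right.
function compareKeys(a, b) {
  for (var i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Made-up keys of the form [Price, Colour], sorted the way an index would be.
var keys = [
  [150, "red"], [120, "blue"], [300, "red"], [180, "blue"], [90, "red"]
].sort(compareKeys);

// Range scan: every key between start and end, like startkey/endkey.
function range(start, end) {
  return keys.filter(function (k) {
    return compareKeys(start, k) <= 0 && compareKeys(k, end) <= 0;
  });
}

// Intended as "price 100 to 200 AND colour red", but the blue rows come back too,
// because the price (leftmost element) decides the position in the index.
console.log(range([100, "red"], [200, "red"]));
```

The second element only matters between rows whose first element is equal, which is why a range on the first field cannot be combined with an exact match on a later one.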

2- The second approach, where you do multiple emits, is one possible approach, but you have to be careful how you query in order to get the proper range and type of data. And, as you said, if you want to add a new attribute you will have to reindex all the documents.

Why not create multiple views, take the same approach, and do the intersection in your application?

Another approach is to use the Elasticsearch plugin and delegate the indexing to Elasticsearch for more complex and full-text search queries: http://www.couchbase.com/docs/couchbase-elasticsearch/

PS: The size of the view in itself is not an issue most of the time, so do not worry about it.

Licensed under: CC-BY-SA with attribution