Question

I've had a hard time explaining and finding what I need, so please put yourself in my shoes for a moment.

My requirement comes from a relational database background. I may be using Solr to do something it wasn't designed to do, or maybe it can do what I need; I still need to confirm that. Hopefully you can assist me.

After indexing numerous documents into Solr, I need to retrieve distinct documents based on a filter. Think of it as retrieving distinct rows while also applying a WHERE condition.

For example, in a relational database, I may have the following columns

(Country)  (City)     (Whatever)
 Egypt      Cairo      Hospitals
 Egypt      Alex       Schools
 Egypt      Mansoura   Hospitals
 Egypt      Cairo      Schools

If I perform this query: SELECT DISTINCT Country, City FROM mytable

I should get the following rows

(Country)  (City)
 Egypt      Alex
 Egypt      Mansoura
 Egypt      Cairo

Now, after indexing the original table (SELECT * FROM mytable), how can I get the SAME output from Solr? How can I retrieve documents that are distinct based on some fields? I will also need to apply a not-null filter on a specific field.

I don't need statistics of any kind, I only need to get the documents.

I hope I was clear enough. Thank you for your time.


Solution

This would be achievable with field collapsing by grouping on multiple fields, but unfortunately only one field is supported right now. There is an open issue for this; check it out.
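As a rough illustration of single-field collapsing, here is a minimal Python sketch that builds a grouped query URL. The host and the helper name `grouped_query_url` are assumptions for illustration; `group`, `group.field`, and `group.limit` are Solr's result-grouping parameters, and the `fq` range query `City:[* TO *]` is one common way to express a not-null filter.

```python
from urllib.parse import urlencode


def grouped_query_url(base_url, group_field, not_null_field):
    """Build a Solr select URL that collapses results on a single field.

    base_url is an assumed endpoint; adjust it to your own core or collection.
    """
    params = [
        ("q", "*:*"),
        # Keep only documents where the field has some value (not-null filter).
        ("fq", f"{not_null_field}:[* TO *]"),
        ("group", "true"),
        ("group.field", group_field),
        # Return one representative document per group.
        ("group.limit", "1"),
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


url = grouped_query_url("http://localhost:8983/solr/select", "City", "City")
print(url)
```

This yields one document per distinct City value only. It does not give distinct (Country, City) pairs, because grouping works on a single field.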

OTHER TIPS

Did you try faceting? You could do something like this:

http://localhost:8983/solr/select/?q=*:*&facet=on&facet.field=city&facet.field=country

It will return all the distinct cities (and countries), each with its count. See the Solr wiki if you want to learn more about it.
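A minimal Python sketch of the same request, assuming the endpoint above and two hypothetical helper names. It adds `rows=0` (no documents, facets only), `facet.mincount=1` (skip values with no matches), and an `fq` not-null filter. Solr's default JSON output lists each facet field as a flat `[value, count, value, count, ...]` array, which the second helper unpacks.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, facet_fields, not_null_field):
    """Build a facet-only Solr query URL (hypothetical helper)."""
    params = [
        ("q", "*:*"),
        # Not-null filter: match any document with a value in the field.
        ("fq", f"{not_null_field}:[* TO *]"),
        ("rows", "0"),
        ("facet", "on"),
        ("facet.mincount", "1"),
        ("wt", "json"),
    ]
    # One facet.field parameter per field, as in the URL above.
    params += [("facet.field", name) for name in facet_fields]
    return base_url + "?" + urlencode(params)


def distinct_values(flat_facet_list):
    """Take the values out of a flat [value, count, value, count, ...] list."""
    return flat_facet_list[0::2]


url = facet_query_url("http://localhost:8983/solr/select/", ["city", "country"], "city")
values = distinct_values(["Cairo", 2, "Alex", 1, "Mansoura", 1])
print(url)
print(values)
```

Note that separate field facets give the distinct values of each field on its own, not the distinct combinations of the two fields.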

I hope this helps.

Another good solution available from Solr 4 is based on Pivot (Decision Tree) Faceting.

Try with:

/solr/collection1/select?q=*:*&facet=true&facet.pivot=Country,City

This should return:

  "facet_counts" : {
        "facet_queries" : {},
        "facet_fields" : {},
        "facet_dates" : {},
        "facet_ranges" : {},
        "facet_pivot" : {
           "Country,City" : [ {
                 "field" : "Country",
                 "value" : "Egypt",
                 "count" : 4,
                 "pivot" : [ {
                       "field" : "City",
                       "value" : "Cairo",
                       "count" : 2
                 }, {
                       "field" : "City",
                       "value" : "Alex",
                       "count" : 1
                 }, {
                       "field" : "City",
                       "value" : "Mansoura",
                       "count" : 1
              } ]
           } ]
        }
  }
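To turn that pivot response into the equivalent of the SELECT DISTINCT rows, the nested structure can be flattened on the client side. The following is a minimal Python sketch; `distinct_pairs` is a hypothetical helper, and `sample` is a trimmed copy of the `facet_pivot` section shown above.

```python
def distinct_pairs(facet_pivot, pivot_key="Country,City"):
    """Flatten a two-level facet_pivot section into (country, city) tuples."""
    pairs = []
    for country in facet_pivot.get(pivot_key, []):
        # Each top-level entry carries its child values under "pivot".
        for city in country.get("pivot", []):
            pairs.append((country["value"], city["value"]))
    return pairs


# Trimmed copy of the "facet_pivot" section from the response above.
sample = {
    "Country,City": [
        {
            "field": "Country",
            "value": "Egypt",
            "count": 4,
            "pivot": [
                {"field": "City", "value": "Cairo", "count": 2},
                {"field": "City", "value": "Alex", "count": 1},
                {"field": "City", "value": "Mansoura", "count": 1},
            ],
        }
    ]
}

pairs = distinct_pairs(sample)
for country, city in pairs:
    print(country, city)
```

The counts are simply ignored here, since only the distinct combinations are needed.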
Licensed under: CC-BY-SA with attribution