Why do dynamic fields not act like normal fields (specifically when querying and displaying in Hue) in Solr?

StackOverflow https://stackoverflow.com/questions/19689626

  •  01-07-2022

Question

Dynamic fields cause problems in both Hue and Solr. In Hue, data stored in a dynamic field does not appear in the default Solr search, which displays all available data from the imported collection. Searching for this data also fails, even though it is of the text_general type and is indexed and stored. In Solr, the dynamic fields do not seem to be indexed, even though the schema settings are as follows:

   <dynamicField name="*_t"  type="text_general"    indexed="true"  stored="true"/>
   <dynamicField name="*_txt" type="text_general"   indexed="true"  stored="true" multiValued="true"/>

These settings are exactly the same as those of the normal field "name", which is token searchable and appears in the Solr Search in Hue.

   <field name="name" type="text_general" indexed="true" stored="true"/>

My goal was to use the tag and attribute names as the field names in the index, with their values being indexed. This works in Solr, and I can see the results when I use the basic query "*:*":

"docs": [
      {
        "id": "5CCD1D4D-2D7D-4F6A-BD2B-FC9D8577493F",
        "name": "43-02 43 AVENUE",
        "borough_t": "QUEENS",
        "community_board_t": "02 QUEENS",
        "police_precinct_t": "Precinct 108",
        "city_council_district_t": "26",
        "created_date_t": "1305702000",
        "status_t": "Closed",
        "resolution_action_t": "Cleaning crew dispatched.  Property cleaned.",
        "closed_date_t": "1309244400",
        "x_coordinate_t": "1006146",
        "y_coordinate_t": "210783",
        "_version_": 1450333101007831000
      },

Every field ending in _t is a dynamic field, and the part before the suffix is the name of the source tag. The only field that is even partially searchable, though, is the name field. If I search for "43", I get this and other documents that have 43 in "name". But if I search for the word "Precinct" from the police_precinct_t field, my search returns nothing. All of this is in the Solr administration window reached by going to http://HOST:8983/solr.

In Hue, I get even sparser information. Going to the Solr Search panel and doing the default blank search returns the first page of all the data in the Solr index:

1450333101007831000 5CCD1D4D-2D7D-4F6A-BD2B-FC9D8577493F 43-02 43 AVENUE

1450333092014194700 7606462C-8657-4113-9427-5CEF30FB5483 Engine 53/Ladder 43

1450333092021534700 EEB939BD-DE52-467E-8EA3-91C7AF8E162A Engine 43/Ladder 59

1450333095903363000 0BEDA34C-ECCE-4405-A0DD-6D9994C51CE3 102-18 43 AVENUE

1450333095906508800 F7B6F181-C289-4F42-9ADC-36971ABE813A 102-28 43 AVENUE

1450333095907557400 C0F5286F-3216-4A0A-A4D0-F6038020122C 102-28 43 AVENUE

1450333095908606000 1C94DAFF-AB59-452B-A569-6CE4472867C7 102-36 43 AVENUE

1450333096052260900 9C6AF32C-06FA-46B2-8266-2BC8CF23CE79 104-20 43 AVENUE

The first value is the version, the second is the id, and the third is the name value, which I was using to test whether only explicitly defined fields appear in Hue. This seems to be the case. The dynamic field data doesn't show up in Hue the way it does in the Solr administration panel, but I can search the data the same way I can in the Solr Query window. If I type in "borough_t:QUEENS", I get all the results where that exact field and value match, in both Hue and Solr Query. This does not allow for token matching and doesn't match the expected behavior of the declared dynamic field or its attributes given above. I'm using the example schema.xml that ships with Solr in collection1. I've checked that there are no other dynamic fields with the glob "*_t" that could conflict with or override the field definition.

Solr is very simple to use and, with some reading, to understand, but I can't seem to find an answer for why Solr refuses to work as expected. I'm using SolrJ to index my files into Solr, and then I'm using both the Solr Admin Panel and Solr Search in Hue (2.5) to interact with the indexed data.

An example of my XML data, pulled from the NYC Open Data Site, will not format properly here, but if you're really interested you can look up the Firehouse Location and Graffiti Location data and export the XML. That is the data I'm putting into Solr. The only solutions I can see are writing hardcoded interpretation code that assigns each tag value to an explicit field with the text_general settings, or creating the fields on the fly, which doesn't seem much different from dynamic fields. I would love to figure out how to get dynamic fields to work as I expect them to.

Thank you in advance.


Solution

Your problem is that the field "name" is copied to the field "text", while the "*_t" fields are not. "text" is declared as the default search field, which is used when a query does not specify a field name. So a query such as q=what to search runs against the "text" field (which includes the contents of "name").

In the default schema.xml (Solr 4.4 and 4.5) you will find the following declarations.

stored="false" is the reason why you don't see the value of this field in Solr responses.

<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

And here "name" is copied to "text":

<copyField source="name" dest="text"/>

As the following commented-out lines in schema.xml indicate, "text" is actually declared as the default search field in solrconfig.xml:

<!-- Note: Un-commenting defaultSearchField will be insufficient
if your request handler in solrconfig.xml defines "df", which takes precedence.
That would need to be removed.
 <defaultSearchField>text</defaultSearchField> -->

So let's go to solrconfig.xml:

<requestHandler name="/select" class="solr.SearchHandler">
  <!-- default values for query parameters can be specified, these
       will be overridden by parameters in the request
    -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <!--******TAKE A LOOK HERE *******-->
    <str name="df">text</str>
    <!--*****************************-->
  </lst>
  <!-- ... more stuff -->
</requestHandler>

How to solve your problem?

Uncomment this line in schema.xml to copy all "*_t" fields to "text" as well:

 <!-- <copyField source="*_t" dest="text" maxChars="3000"/> -->

Note: you will need to re-index after this change.
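As a rough sketch of the change described above, the relevant part of schema.xml might look like this once the line is uncommented. The first two declarations are the ones quoted earlier from the example schema. The last copyField, for "*_txt", is not part of the original answer; it is an assumption that only applies if you also want the multi-valued "*_txt" dynamic fields to be found by unqualified queries.

```xml
<!-- Catch-all field that unqualified queries hit through df=text.
     It is indexed but not stored, so it never appears in responses. -->
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Already active in the example schema: "name" is searchable without a field prefix. -->
<copyField source="name" dest="text"/>

<!-- Uncommented: every *_t dynamic field is now copied into "text" as well.
     maxChars limits how many characters of each value are copied. -->
<copyField source="*_t" dest="text" maxChars="3000"/>

<!-- Assumption, not in the original answer: the same treatment for *_txt fields. -->
<copyField source="*_txt" dest="text"/>
```

copyField is applied when documents are indexed, which is why existing documents have to be re-indexed before a plain query such as Precinct can match values from police_precinct_t.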

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow