Question

I'm trying to index a JSON file in Solr and it works, but I don't understand why Solr is indexing some elements as arrays instead of single values.

When I index the example JSON file "books.json" it works fine, but when I index another file, "items.json", the output looks different.

Both are shown below:

Books.json

 [{
    "id" : "978-0641723445",
    "cat" : ["book","hardcover"],
    "name" : "The Lightning Thief",
    "author" : "Rick Riordan",
    "series_t" : "Percy Jackson and the Olympians",
    "sequence_i" : 1,
    "genre_s" : "fantasy",
    "inStock" : true,
    "price" : 12.50,
    "pages_i" : 384
  }]

 OUTPUT

{
    "id": "978-0641723445",
    "cat": [
      "book",
      "hardcover"
     ],
    "name": "The Lightning Thief",
    "author": "Rick Riordan",
    "author_s": "Rick Riordan",
    "series_t": "Percy Jackson and the Olympians",
    "sequence_i": 1,
    "genre_s": "fantasy",
    "inStock": true,
    "price": 12.5,
    "price_c": "12.5,USD",
    "pages_i": 384,
    "_version_": 1457847842153431000
},

Items.json

[{ 
    "title" : "Pruebas Carlos",
    "id" : 14,
     "desc" : "Probando como funciona el campo de descripciones"
}]

OUTPUT

{
    "title": [
       "Pruebas Carlos"
    ],
    "id": "10",
    "desc": [
      "Probando como funciona el campo de descripciones"
    ],
    "_version_": 1457849881416695800
},

My schema, where I only added the new fields that I need.

Can someone explain what I have to do to index the elements without []?

Thanks


Solution

In short, your schema configures these fields as multi-valued, which is why they are written as JSON arrays in the response, even if they hold only one value in your samples.

If they only ever hold a single value, you need to configure them with multiValued="false".


The fields you are worried about, title and desc, are configured as multiValued="true", as you can see in this excerpt from your schema:

<field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="desc" type="text_general" indexed="true" stored="true" multiValued="true"/>

If you scroll up a little (to line 82) in your schema, you can read what this attribute stands for:

multiValued: true if this field may contain multiple values per document
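
To see quickly which fields a schema declares as multi-valued, a small script along these lines may help. It is only a sketch: the function name and the sample schema are made up for illustration, and it only reports fields that set the attribute explicitly (a field that omits it inherits the setting from its field type).

```python
import xml.etree.ElementTree as ET


def multivalued_fields(schema_xml):
    """Return the names of <field> elements that explicitly set multiValued="true"."""
    root = ET.fromstring(schema_xml)
    names = []
    for field in root.iter("field"):
        # Fields without the attribute are skipped; they inherit from their field type.
        if field.get("multiValued", "").strip().lower() == "true":
            names.append(field.get("name"))
    return names


# Hypothetical, trimmed-down schema used only for this example.
SAMPLE = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" multiValued="false"/>
    <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
    <field name="desc" type="text_general" indexed="true" stored="true" multiValued="true"/>
  </fields>
</schema>
"""

print(multivalued_fields(SAMPLE))
```

Any field this reports will come back as a JSON array in query responses, even when a document holds a single value for it.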

You can read what this is good for and what the consequences are in several sources

OTHER TIPS

You have set both fields (title, desc) as multivalued, which is why they come back as arrays. If they only hold a single value, declare them like this:

<field name="desc" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="title" type="text_general" indexed="true" stored="true" multiValued="false"/>

It looks like you have a problem related to nested JSON. You can use:

(i) /solr/update/json?commit=true&split=/&f=txt:/**

(ii) Using Index handlers - https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
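
A URL's query string starts with a single ? and separates further parameters with &, so it can be safer to build the update URL with a library than by hand. The sketch below does that for the parameters from tip (i); the host, port and helper name are assumptions, so adjust them to your installation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_update_url(base, params):
    """Append URL-encoded query parameters to a base URL."""
    return base + "?" + urlencode(params)


# Assumed local Solr address; the path is taken from tip (i).
BASE = "http://localhost:8983/solr/update/json"
PARAMS = {"commit": "true", "split": "/", "f": "txt:/**"}

url = build_update_url(BASE, PARAMS)
print(url)

# Decoding the query string again gives back the original parameters.
print(parse_qs(urlsplit(url).query))
```

The special characters in split and f are percent-encoded in the printed URL; the server decodes them back to the original values.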

Licensed under: CC-BY-SA with attribution