Will clustering work only on newly indexed documents, or on old
documents as well?
It will work on old documents as well.
How can I specify which fields to look at for clustering?
Here's an example using the Shakespeare dataset. The query asks which of Shakespeare's plays are about war.
$ curl -XPOST 'http://localhost:9200/shakespeare/_search_with_clusters?pretty' -d '
{
  "search_request": {
    "query": {"match" : { "_all": "war" }},
    "size": 100
  },
  "max_hits": 0,
  "query_hint": "war",
  "field_mapping": {
    "title": ["_source.play_name"],
    "content": ["_source.text_entry"]
  },
  "algorithm": "lingo"
}'
Running this, you'll get back plays like Richard, Henry... The field mapped to title (play_name) is what Carrot2 uses to develop the cluster names, and the field mapped to content (text_entry) is what it uses to make the clusters.
The curl command works and returns some results. How can I build a
curl command that takes JSON as input for a REST API URL of the form
localhost:9200/article-index/article/_search_with_clusters?.....
Typically, you would use the Elasticsearch client library for your language of choice.
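If you just want to see what such a call looks like from code, here is a minimal sketch in Python that sends the same JSON body as the curl example over plain HTTP. It uses only the standard library rather than an official Elasticsearch client. The helper names (build_cluster_request, fetch_clusters) are made up for illustration, and the response fields read at the end ("clusters", "label", "documents") are an assumption about what the plugin returns, so check them against your own output.

```python
import json
import urllib.request


def build_cluster_request(base_url, index_path, query_text, size=100):
    """Build (but do not send) a POST request for _search_with_clusters.

    Hypothetical helper: the body mirrors the curl example above.
    """
    body = {
        "search_request": {
            "query": {"match": {"_all": query_text}},
            "size": size,
        },
        "max_hits": 0,
        "query_hint": query_text,
        "field_mapping": {
            "title": ["_source.play_name"],
            "content": ["_source.text_entry"],
        },
        "algorithm": "lingo",
    }
    url = f"{base_url}/{index_path}/_search_with_clusters"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_clusters(request):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_cluster_request("http://localhost:9200", "shakespeare", "war")
    result = fetch_clusters(req)
    # Assumed response shape: a top-level "clusters" list whose items
    # carry a "label" and a list of "documents".
    for cluster in result.get("clusters", []):
        print(cluster.get("label"), len(cluster.get("documents", [])))
```

For an index and type such as the one in the question, you would pass "article-index/article" as index_path and change the field_mapping entries to fields that exist in your own documents.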