Question

I am using Nutch 1.4 and Solr 3.3.0 to crawl and index my website. On the front end I use the PHP library Solarium to query Solr. I search the following fields by default:

content -> of type Text

title -> of type Text

ur -> of type url

I want to search for a keyword and, at the same time, exclude some of the results based on a URL pattern, without affecting the total number of results returned. (For example, I always want to show 20 results.)

If anyone knows a way of doing this with Solarium, that would be great. If not, I am curious how it can be done in Solr directly.

I have already looked at faceted search, but I couldn't wrap my head around it. If someone can explain it in detail, I would really appreciate it.


Solution

I can't help you with Solarium, but your Solr query should be relatively straightforward:

q=+keyword -ur:exclude&rows=20
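
One detail to watch when sending this query over HTTP: a literal + in a URL decodes as a space, so the query string has to be URL-encoded. Because the exclusion is part of the query, Solr applies it before paging, so rows=20 still returns 20 documents as long as at least 20 match. Below is a minimal PHP sketch of building such a request URL. The helper name build_exclusion_url and the endpoint http://localhost:8983/solr/select are assumptions for illustration; the field name ur is taken from the question.

```php
<?php
// Hypothetical helper: builds a select URL that requires $keyword and
// excludes documents whose ur field matches $excluded.
function build_exclusion_url(string $endpoint, string $keyword, string $excluded, int $rows = 20): string
{
    $params = [
        'q'    => "+{$keyword} -ur:{$excluded}",
        'rows' => $rows,
        'wt'   => 'json',
    ];
    // http_build_query URL-encodes each value, so "+" becomes "%2B".
    return $endpoint . '?' . http_build_query($params);
}

// The endpoint below is an assumed local Solr instance, not one from the question.
echo build_exclusion_url('http://localhost:8983/solr/select', 'keyword', 'exclude'), "\n";
```

An alternative worth considering is to move the exclusion into a filter query (fq=-ur:exclude) and leave q as the keyword alone; the filter then restricts the result set without taking part in relevance scoring.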

OTHER TIPS

http://{url_endpoint}/?wt=json&rows=20&start=0&q=content:contentText OR title:titleText OR ur:url

  • wt=json: return the response in JSON format
  • rows=20: return at most 20 results per page
  • start=0: zero-based offset of the first result to return (0 for the first page, 20 for the second, and so on)
  • q=: the search query (make sure to escape user input properly; the * wildcard matches any characters before or after the term)
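
The escaping mentioned in the last bullet can be sketched as follows. This is a minimal example, assuming a hypothetical helper named escape_solr_term that backslash-escapes the characters the Lucene query syntax treats as special. It is an illustration, not a function from Solr or any client library.

```php
<?php
// Hypothetical helper: backslash-escape characters that have special
// meaning in the Lucene/Solr query syntax so user input is matched literally.
function escape_solr_term(string $input): string
{
    // Characters escaped: + - & | ! ( ) { } [ ] ^ " ~ * ? : \
    return preg_replace('/([+\-&|!(){}\[\]^"~*?:\\\\])/', '\\\\$1', $input);
}

echo escape_solr_term('foo:bar (baz)'), "\n";
```

Since the helper also escapes *, any wildcards would be added after escaping, for example "content:*" . escape_solr_term($search_term) . "*".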

In PHP, using cURL:

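$solr_end_point = '';   // Solr host and path, without the scheme
$search_term = '';      // keyword to look for in content and title
$url_type = '';         // URL fragment to match against the ur field
$start = 0;             // zero-based offset of the first result
$rows = 20;             // results per page

// URL-encode the query so spaces and wildcards survive the request.
$query = urlencode("content:*{$search_term}* OR title:*{$search_term}* OR ur:*{$url_type}*");

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://{$solr_end_point}/?wt=json&rows={$rows}&start={$start}&q={$query}");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 2);             // give up after 2 seconds
$result = curl_exec($ch);
if ($result === false) {
    // curl_exec returns false on failure; report the reason and stop.
    $error = curl_error($ch);
    curl_close($ch);
    exit("Solr request failed: {$error}");
}
curl_close($ch);

print_r($result);   // raw JSON response
$json_result = json_decode($result, true);
print_r($json_result);  // response decoded into an associative array
exit();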
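
Once the response is decoded, the total hit count and the documents sit under the response key (numFound and docs in Solr's JSON output). The sketch below shows one way to read them. The sample JSON is hand-written for illustration and is not real Solr output, and summarize_response is a hypothetical helper name.

```php
<?php
// Hand-written sample in the shape of a Solr JSON response (not real output).
$sample = '{"response":{"numFound":45,"start":0,"docs":[{"title":"Example page","ur":"http://example.com/a"}]}}';

// Hypothetical helper: pull the total hit count and documents out of a decoded response.
function summarize_response(array $decoded, int $rows): array
{
    $total = $decoded['response']['numFound'] ?? 0;
    return [
        'total' => $total,
        'pages' => (int) ceil($total / $rows),   // number of pages at $rows per page
        'docs'  => $decoded['response']['docs'] ?? [],
    ];
}

$summary = summarize_response(json_decode($sample, true), 20);
echo "{$summary['total']} hits on {$summary['pages']} pages\n";
foreach ($summary['docs'] as $doc) {
    echo $doc['title'], ' - ', $doc['ur'], "\n";
}
```

The page count makes it easy to compute the start value for any page as ($page - 1) * $rows.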
Licensed under: CC-BY-SA with attribution