Question

I am trying to add time constraints to my search method, which means searching by date. I know Lucene can only handle strings, so I convert the dates to strings first. It still doesn't work, and because of the complexity of the code base I'm not sure why. Here's a simplified version:

@Indexed
public class SomeDocument extends Document{
}

public abstract class Document extends BaseEntity{
}

@Indexed
public class BaseEntity {

@IndexedEmbedded
private Date lastUpdatedDate;
}

//snippet of search method
BooleanQuery bq = new BooleanQuery();    
long oneDay = 1000L * 60 * 60 * 24;
long currentTime = System.currentTimeMillis();
Date dateOne = new Date(currentTime);
Date dateTwo = new Date(currentTime - (oneDay * 7)); // 1 week ago
TermRangeQuery dateQuery = new TermRangeQuery("lastUpdatedDate", dateTwo.toString(), dateOne.toString(), true, true);
bq.add(new BooleanClause(dateQuery, BooleanClause.Occur.MUST));

//more is added to boolean query, I create a full text query, and use the list() method

Does anyone see a place that is implemented incorrectly? Thank you!


Solution

Instead of using Date.toString() to generate the date string, you should use Lucene's DateTools.dateToString. Date.toString produces a string in the "dow mon dd hh:mm:ss zzz yyyy" format, which does not sort chronologically when compared as text. Lucene's DateTools formats dates as "yyyyMMddHHmmssSSS" (at millisecond resolution), which does sort chronologically and is much better suited to range queries. Something like:

String dateOneString = DateTools.dateToString(dateOne, DateTools.Resolution.MILLISECOND);
String dateTwoString = DateTools.dateToString(dateTwo, DateTools.Resolution.MILLISECOND);
TermRangeQuery dateQuery = new TermRangeQuery("lastUpdatedDate", dateTwoString, dateOneString, true, true);

I believe the default date resolution is Resolution.MILLISECOND. This can be changed with a @DateBridge annotation.
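To see why the string format matters for a range query, here is a minimal sketch that needs only the JDK. The class DateStringOrdering and its helpers sortableString and sameOrder are hypothetical names made up for this illustration. sortableString only mimics the "yyyyMMddHHmmssSSS" layout with SimpleDateFormat (assuming GMT) so the demo runs without Lucene; in real code you would call DateTools.dateToString as shown above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class DateStringOrdering {

    // Hypothetical helper: imitates the "yyyyMMddHHmmssSSS" layout (GMT assumed)
    // so the example does not need Lucene on the classpath.
    public static String sortableString(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmssSSS");
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    // True when comparing the two strings as text gives the same ordering
    // as comparing the two dates chronologically.
    public static boolean sameOrder(Date a, Date b, String aText, String bText) {
        return Integer.signum(a.compareTo(b)) == Integer.signum(aText.compareTo(bText));
    }

    public static void main(String[] args) {
        // Fixed instants one week apart that straddle a month boundary:
        // 2014-01-27 12:00 GMT and 2014-02-03 12:00 GMT.
        Date weekAgo = new Date(1390824000000L);
        Date now = new Date(1391428800000L);

        // Date.toString(): the text starts with the day and month names,
        // so "Jan" vs "Feb" decides the comparison, not the actual time.
        System.out.println(weekAgo + " / " + now);
        System.out.println("Date.toString keeps order: "
                + sameOrder(weekAgo, now, weekAgo.toString(), now.toString()));

        // Sortable layout: most significant unit first, fixed width.
        System.out.println(sortableString(weekAgo) + " / " + sortableString(now));
        System.out.println("Sortable format keeps order: "
                + sameOrder(weekAgo, now, sortableString(weekAgo), sortableString(now)));
    }
}
```

A term range query compares terms as text, so a lower bound that sorts after the upper bound describes an empty range. That can happen with Date.toString() even when the dates themselves are in the right order.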

OTHER TIPS

From the code snippet you have given, I'd guess the problem is that dateOne.toString() and dateTwo.toString() are both in the "dow mon dd hh:mm:ss zzz yyyy" format, which does not sort chronologically as text and so probably makes no sense in a Lucene range query.

As for the numbers, Lucene can index them just fine. See LongField and NumericRangeQuery.

Licensed under: CC-BY-SA with attribution