Question

If I have millions of User records with some text fields that get indexed to Solr on create and on update, how do I go back and re-index the few records that never made it to Solr?

For example, what if Solr goes down for a few minutes during the day and about 300 records out of millions never get indexed?

I don't want to re-index millions of records, just the 300.


Solution

A good way to manage this is to insert the record IDs into a queue table on create and update, and have a separate process run later to index the queued records. That way, if Solr goes down, you don't have to work out which records were missed: they simply stay in the queue until they are processed. Another advantage is that your database transaction doesn't have to wait for the Solr update to complete. The downside is that Solr won't be perfectly in sync with what's in the database. You can adjust how often the queue-reading process runs to keep that lag within what you need.
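As a rough illustration of the queue-table idea, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for whatever database you actually run. The table name index_queue, the functions save_user and drain_queue, and the index_docs callback are all hypothetical names chosen for the example; index_docs stands for whatever call your Solr client uses to send documents, and is assumed to raise an exception when Solr is unreachable.

```python
import sqlite3


def create_tables(conn):
    # "users" stands in for the real User table; "index_queue" is the
    # hypothetical queue table holding IDs that still need indexing.
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, bio TEXT)")
    conn.execute(
        "CREATE TABLE index_queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, user_id INTEGER NOT NULL)"
    )


def save_user(conn, user_id, bio):
    # Write the record and enqueue its ID in one transaction, so a saved
    # record can never be missing from the queue.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO users (id, bio) VALUES (?, ?)",
            (user_id, bio),
        )
        conn.execute("INSERT INTO index_queue (user_id) VALUES (?)", (user_id,))


def drain_queue(conn, index_docs, batch_size=100):
    # index_docs is an assumed callback that sends a list of dicts to Solr
    # and raises if Solr is down. Returns the number of queue rows cleared.
    cleared = 0
    while True:
        rows = conn.execute(
            "SELECT id, user_id FROM index_queue ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return cleared
        last_queue_id = rows[-1][0]
        # A record saved several times only needs to be indexed once.
        user_ids = sorted({user_id for _, user_id in rows})
        placeholders = ",".join("?" * len(user_ids))
        docs = [
            {"id": row[0], "bio": row[1]}
            for row in conn.execute(
                f"SELECT id, bio FROM users WHERE id IN ({placeholders})",
                user_ids,
            )
        ]
        try:
            index_docs(docs)
        except Exception:
            # Solr is unavailable: leave the rows queued for the next run.
            return cleared
        # Remove only the rows read above; anything enqueued while we were
        # indexing has a higher queue id and stays for the next pass.
        with conn:
            conn.execute("DELETE FROM index_queue WHERE id <= ?", (last_queue_id,))
        cleared += len(rows)
```

You would call drain_queue from a scheduled job (cron or similar). If Solr is down when it runs, nothing is removed from the queue, so the next run picks up exactly the records that were missed, which covers the "300 out of millions" case without a full re-index. This sketch does not handle deleted records or concurrent workers; those would need extra care in a real setup.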

Licensed under: CC-BY-SA with attribution