Question

I am new to Solr and have a fairly basic question about delta-imports. Several new records arrive per second in my MySQL DB. So when I start an import at second x, it is quite possible that some new records land in the DB during that same second, after the import has started. The next time I start a delta-import, it checks the "last_index_time" in dataimport.properties and imports all records changed after second x. So I will lose all records that were changed in second x after the last import started. And if I am right, the same issue would remain even if the timestamp could be changed from seconds to, e.g., milliseconds: the time window would be smaller and fewer records would be lost, but the problem itself would still be there.

I have not found any mention of this issue in the tutorials or anywhere else, for that matter. Am I the first one dealing with several records per second, or am I missing something?

Many thanks in advance!


Solution

To handle records that fall in the exact same second, check for records whose last-modified time is equal to or greater than the last index time (that is, use >= instead of >).
If a record already exists (identified by its unique key), it will simply be overwritten. Solr updates the existing document by default, so no duplicates are created.
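
The effect of the inclusive comparison can be illustrated with a small simulation. This is only a sketch, not Solr code: the `delta_import` function, the row layout, and the timestamps are hypothetical, and a plain dict keyed by `id` stands in for an index with a unique key. In a real setup the equivalent change would presumably be made in the delta query that compares against the "last_index_time" value.

```python
def delta_import(rows, index, last_index_time, inclusive):
    """Copy rows changed since last_index_time into index, keyed by unique id."""
    for row in rows:
        if inclusive:
            changed = row["modified"] >= last_index_time
        else:
            changed = row["modified"] > last_index_time
        if changed:
            # Writing under the same id overwrites the old entry: no duplicates.
            index[row["id"]] = row
    return index


# Hypothetical table; "modified" is a timestamp in whole seconds.
rows = [{"id": 1, "modified": 100, "title": "a"}]

# First import runs at second 100 and sees only row 1.
index = delta_import(rows, {}, 0, inclusive=True)
last_index_time = 100

# Row 2 is written later within the same second 100, after the import read the table.
rows.append({"id": 2, "modified": 100, "title": "b"})

# Next delta import, once with a strict and once with an inclusive comparison.
strict_index = delta_import(rows, dict(index), last_index_time, inclusive=False)
inclusive_index = delta_import(rows, dict(index), last_index_time, inclusive=True)

print(sorted(strict_index))     # [1]    -> row 2 is missed
print(sorted(inclusive_index))  # [1, 2] -> row 2 is picked up, row 1 is just overwritten
```

The inclusive variant re-imports row 1 as well, which costs a little extra work but is harmless because the entry is replaced rather than duplicated.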

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow