Question

I run a delta data import. I use a delete_item table to look up the records that should be deleted from the Solr index.

How can I execute the query

TRUNCATE TABLE delete_item

after the delta import has finished?

Can this be done with Solr, or should I do it with a cron job?

Solution

There is no out-of-the-box, configure-it-in-XML solution for this. From Solr's perspective this makes sense: Solr wants to manage itself, not other data sources. But there are several things you can do.

Personally I would recommend (2), as it does not require writing custom code that has to be deployed to your Solr instance. That makes the solution transferable to SolrCloud.

1. A Custom EventListener

As mentioned in this answer https://stackoverflow.com/a/9100844/2160152 to "Solr - How can I receive notifications of failed imports from my DataImportHandler?", you can write a custom EventListener. That listener can connect to your database and execute the truncate.

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.EventListener;

public class ImportEndListener implements EventListener {

    @Override
    public void onEvent(Context aContext) {
        // try-with-resources closes the statement and the connection,
        // even if the TRUNCATE fails
        try (Connection connection = getConnection();
             Statement statement = connection.createStatement()) {
            statement.executeUpdate("TRUNCATE TABLE delete_item");
        } catch (SQLException e) {
            // TODO think of something better
            e.printStackTrace();
        }
    }

    private Connection getConnection() throws SQLException {
        // TODO get a connection to your database, somehow
        // (fails loudly instead of returning null, which would cause a NullPointerException)
        throw new SQLException("getConnection() is not implemented yet");
    }

}

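That listener needs to be compiled and bundled in a jar file. Then you need to make your jar and all its dependencies available to Solr as described in the wiki (the article is about plugins, but it holds true for any custom code). Finally, register the listener by setting the onImportEnd attribute of the document element in your data-config.xml to the fully qualified class name.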
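The getConnection() TODO in the listener above could be filled in with plain JDBC. The following is only a sketch under assumptions: the class name ListenerConnections and the three property keys are made up for illustration, and the matching JDBC driver jar is assumed to be on Solr's classpath alongside the listener.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

class ListenerConnections {

    // Hypothetical property names; choose whatever fits your deployment
    static final String URL_KEY = "deleteitem.jdbc.url";
    static final String USER_KEY = "deleteitem.jdbc.user";
    static final String PASSWORD_KEY = "deleteitem.jdbc.password";

    // Reads a required setting and fails fast if it is missing or empty
    static String require(Properties settings, String key) {
        String value = settings.getProperty(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing setting: " + key);
        }
        return value;
    }

    // Opens a plain JDBC connection from the given settings
    static Connection open(Properties settings) throws SQLException {
        return DriverManager.getConnection(
                require(settings, URL_KEY),
                require(settings, USER_KEY),
                require(settings, PASSWORD_KEY));
    }
}
```

The listener's getConnection() could then delegate to ListenerConnections.open(...) with settings loaded from wherever you keep them, for example System.getProperties(). Failing fast on a missing setting gives a clearer error than a null connection would.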

2. Redesign the 'delete_item' Table

As shown in the blog entry "Data Import Handler – removing data from index", you could extend your table with a timestamp column deleted_at. You would then need to extend your onDelete trigger to insert the current time into that column.

With that in place, you could reformulate the deletedPkQuery attribute of your entity as follows:

deletedPkQuery="SELECT id FROM delete_item WHERE deleted_at > '${dataimporter.last_index_time}'"

That way there is no need to truncate the table, unless you want to reclaim the disk space.
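To see why old rows can stay in the table, here is a small in-memory sketch of what the deleted_at comparison selects. It is an illustration only: the class DeletedSinceDemo and the sample rows are invented, and it assumes the last index time is formatted as yyyy-MM-dd HH:mm:ss.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DeletedSinceDemo {

    // Assumed timestamp format of the last index time
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Mirrors the query condition: deleted_at > last_index_time
    static List<Integer> idsDeletedSince(Map<Integer, String> deletedAtById,
                                         String lastIndexTime) {
        LocalDateTime lastIndex = LocalDateTime.parse(lastIndexTime, FORMAT);
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, String> row : deletedAtById.entrySet()) {
            LocalDateTime deletedAt = LocalDateTime.parse(row.getValue(), FORMAT);
            if (deletedAt.isAfter(lastIndex)) {
                ids.add(row.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // id -> deleted_at, as it might look in the deletion table
        Map<Integer, String> table = new LinkedHashMap<>();
        table.put(1, "2013-05-01 09:00:00"); // deleted before the last import
        table.put(2, "2013-05-02 12:30:00");
        table.put(3, "2013-05-03 08:15:00");

        // Last import ran at midnight on 2013-05-02
        System.out.println(idsDeletedSince(table, "2013-05-02 00:00:00")); // prints [2, 3]
    }
}
```

Row 1 was already handled by an earlier import, so the comparison skips it on every later run without the row having to be removed.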

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow