Question

I have an application which consumes RSS feeds and makes them searchable by performing the following steps:

  1. pulling articles from the feed URL
  2. storing that data in a relational DB
  3. indexing the data in Elasticsearch

I'd like to reverse this process so that I can use the RSS River Elasticsearch plugin to pull data from feeds. However, this plugin integrates directly with Elasticsearch, bypassing my relational DB (which is a problem for other parts of the application which rely on each article having a record in the DB).

How can I have Elasticsearch notify the DB when a new article has been indexed (and de-indexed)?

Edit

Currently I'm using Ruby on Rails 4 with a PostgreSQL DB. RSS feeds are fetched in the background using Sidekiq to manage jobs. They go directly into PG and are then indexed by Elasticsearch. I'm using Chewy to provide an interface to the ES index. It doesn't support callbacks like I'm looking for (no Ruby library does afaik?).

A search queries ES for matches, then loads the corresponding records from PG to display the results.


Solution

It sounds like you are looking for the sort of notification/trigger functionality described in this feature request. In the absence of that feature, I think the approach suggested in that thread by the user "cravergara" is your best bet: alter the RSS River Elasticsearch plugin so that it also updates your DB whenever an article is indexed.

That would handle the indexing requirement. To sync the de-indexing, you should make sure that any code that deletes your Elasticsearch documents also deletes the corresponding DB records.
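
One way to keep deletions in sync is to route every removal through a single helper, so nothing deletes from Elasticsearch without also deleting from the DB. The following is only a minimal sketch of that idea: `ArticleRemover`, `InMemoryStore`, and the `delete(id)` interface are hypothetical names invented for illustration, not part of Chewy, the RSS River plugin, or any Elasticsearch client.

```ruby
# Hypothetical helper: removes an article from the search index and the
# database together, so the two stores do not drift apart.
class ArticleRemover
  # search_index:  any object responding to delete(id) (assumed wrapper around ES)
  # article_store: any object responding to delete(id) (assumed wrapper around the DB)
  def initialize(search_index:, article_store:)
    @search_index = search_index
    @article_store = article_store
  end

  # Delete from the index first, then from the DB. If the index delete
  # raises, the DB row is left untouched so the caller can retry.
  def remove(article_id)
    @search_index.delete(article_id)
    @article_store.delete(article_id)
    article_id
  end
end

# In-memory stand-in, used only to make the sketch runnable on its own.
class InMemoryStore
  def initialize(ids)
    @ids = ids.dup
  end

  def delete(id)
    @ids.delete(id)
  end

  def include?(id)
    @ids.include?(id)
  end
end

index = InMemoryStore.new([1, 2, 3])
db = InMemoryStore.new([1, 2, 3])
ArticleRemover.new(search_index: index, article_store: db).remove(2)
```

In a real Rails app the two collaborators would presumably be thin wrappers around whatever you use to talk to ES and to PG; which calls they make is up to your setup and is not shown here.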

Licensed under: CC-BY-SA with attribution