Question

I need to write some indexing jobs that run once per day, query our Oracle database tables, and index the results into Elasticsearch. Some tables must be indexed before others because of table dependencies. Around that indexing process, I also need to enhance the fields going into the ES index and log job statuses to an Oracle database table, perhaps even the individual records that succeeded or failed indexing.

Can I use the Elasticsearch JDBC river plugin for this?

Solution

My concern was logging back to the RDBMS with an insert statement after the query that extracts from the DB. I got in touch with the creator of the jdbc-river, which was really helpful. He said this is how I should configure things:

curl -XDELETE '0:9200/_river/my_jdbc_river/'

curl -XPUT '0:9200/_river/my_jdbc_river/_meta' -d '
    {
        "type": "jdbc",
        "jdbc": {
            "url": "jdbc:mysql://localhost:3306/test",
            "user": "",
            "password": "",
            "schedule": "0 0-59 0-23 ? * *",
            "sql": [
                {
                    "statement": "select *, created as _id, \"myjdbc\" as _index, \"mytype\" as _type from orders"
                },
                {
                    "statement": "insert into ack(n,t,c) values(?,?,?)",
                    "parameter": [
                        "$job",
                        "$now",
                        "$count"
                    ]
                }
            ]
        }
    }'
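
The nested quoting in the curl command above (escaped double quotes inside a single-quoted shell string) is easy to get wrong. As a minimal sketch, the same river definition can be built as a Python dict and serialized with `json.dumps`, which handles the escaping. The helper name `build_river` is hypothetical and not part of the plugin; the field names and the `$job`, `$now`, `$count` parameters are copied from the configuration above.

```python
import json


def build_river(url, user, password, schedule, select_sql, ack_sql):
    """Return a JDBC river definition as a plain dict (hypothetical helper)."""
    return {
        "type": "jdbc",
        "jdbc": {
            "url": url,
            "user": user,
            "password": password,
            "schedule": schedule,
            "sql": [
                # First statement: extract the rows to index.
                {"statement": select_sql},
                # Second statement: write an acknowledgement row back to the DB.
                {
                    "statement": ack_sql,
                    "parameter": ["$job", "$now", "$count"],
                },
            ],
        },
    }


river = build_river(
    url="jdbc:mysql://localhost:3306/test",
    user="",
    password="",
    schedule="0 0-59 0-23 ? * *",
    select_sql='select *, created as _id, "myjdbc" as _index, '
               '"mytype" as _type from orders',
    ack_sql="insert into ack(n,t,c) values(?,?,?)",
)

# json.dumps escapes the embedded double quotes in the select statement.
body = json.dumps(river, indent=4)
print(body)
```

The printed JSON can be saved to a file and sent with `curl -XPUT ... -d @file`. The schedule string is passed through unchanged; the value shown appears to fire every minute of every hour, so for the once-per-day run in the question it would need to be adjusted.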

OTHER TIPS

Yes, you can do this by using the poll parameter of the JDBC river. In detail:

Polling

JDBC river runs are repeated at a given interval. This method is also known as polling. You can specify the polling interval with the poll parameter, which takes an Elasticsearch time value. The default value is 1h.

Example:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "driver" : "com.mysql.jdbc.Driver",
        "url" : "jdbc:mysql://localhost:3306/test",
        "user" : "",
        "password" : "",
        "sql" : "select * from orders",
        "poll" : "1h" 
    },
    "index" : {
        "index" : "jdbc",
        "type" : "jdbc",
        "bulk_size" : 100,
        "max_bulk_requests" : 30,
        "bulk_timeout" : "60s"
    }
}'
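
If the river is registered from a script rather than by hand, the same PUT can be prepared in Python. This is only a sketch under stated assumptions: the helper name `river_put_request` is hypothetical, the host and river name are taken from the curl example above, and the `Content-Type` header is an addition that the curl command does not send. The request is built but not sent, so the snippet runs without an Elasticsearch node.

```python
import json
import urllib.request


def river_put_request(host, river_name, definition):
    """Build (but do not send) the PUT request that registers a river."""
    url = f"http://{host}/_river/{river_name}/_meta"
    data = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},  # assumption, not in the curl call
    )


# Same polling configuration as in the curl example.
definition = {
    "type": "jdbc",
    "jdbc": {
        "driver": "com.mysql.jdbc.Driver",
        "url": "jdbc:mysql://localhost:3306/test",
        "user": "",
        "password": "",
        "sql": "select * from orders",
        "poll": "1h",
    },
    "index": {
        "index": "jdbc",
        "type": "jdbc",
        "bulk_size": 100,
        "max_bulk_requests": 30,
        "bulk_timeout": "60s",
    },
}

req = river_put_request("localhost:9200", "my_jdbc_river", definition)
print(req.get_method(), req.full_url)

# To actually register the river (requires a running node with the plugin):
# urllib.request.urlopen(req)
```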

For reference: https://github.com/jprante/elasticsearch-river-jdbc/issues/92

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow