Question

I am looking to hack together a Kafka consumer in Python or R (preferably R). Using the Kafka console consumer I can grep for a string and retrieve the relevant data, but I am at a loss when it comes to parsing it suitably in R.

There are Kafka clients available in other languages (for example, PHP and C++), but one in R would be helpful from a data analytics point of view.

It would be great if the expert R developers on this forum could hint at/suggest resources that would allow me to make headway in this direction.

Apache Kafka : incubator.apache.org/kafka/

Kafka Consumer Client(s) : https://github.com/kafka-dev/kafka/tree/master/clients


Solution

As there is a C++ API for Kafka, you could use Rcpp to bring it to R.

Edit in response to comment on R-only solution: I do not know Kafka well enough to answer, but generally speaking, middleware runs fast and connects multiple clients, streams, etc. So you would have to simplify something somewhere to get R (single-threaded as it is) to play with it.
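
One way to simplify along these lines, staying with the console consumer the question already uses, is to pipe its output into a small script that filters and parses it line by line. Below is a minimal Python sketch, not a full Kafka client. It assumes each message arrives as one JSON object per line, which the question does not state; the function name `parse_matching` is hypothetical and chosen only for illustration.

```python
import json
from typing import Iterable, Iterator


def parse_matching(lines: Iterable[str], needle: str) -> Iterator[dict]:
    """Yield parsed records from the lines that contain `needle`.

    Assumption: each message is a single JSON object on its own line.
    Lines that do not match, or that are not valid JSON objects, are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line or needle not in line:
            continue  # same effect as grep: drop non-matching lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # matched the string but is not valid JSON
        if isinstance(record, dict):
            yield record


# When the console consumer's output is piped into this script,
# the records could be read from standard input like this:
#
#     import sys
#     for record in parse_matching(sys.stdin, "some string"):
#         print(record)
```

Because the function only needs an iterable of lines, it processes messages one at a time as they arrive and never holds the whole stream in memory. The parsed records could then be written out in a tabular format such as CSV and loaded into R with `read.csv` for analysis.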

OTHER TIPS

[2015 Update] There is now an R package, rkafka, that allows you to connect to Kafka:

http://cran.r-project.org/web/packages/rkafka/rkafka.pdf

Licensed under: CC-BY-SA with attribution