Question

I want to connect to a Shark server from Python or Scala, but I haven't found any tools to do this. Are there any libraries for it (Python or Scala/Java)? Thanks in advance.


Solution 2

It is not clear what you mean by "connect", but both Shark and Spark can be driven from Scala:

$ ./bin/shark-shell
scala> val youngUsers = sql2rdd("SELECT * FROM users WHERE age < 20")
scala> println(youngUsers.count)
...
scala> val featureMatrix = youngUsers.map(extractFeatures(_))
scala> kmeans(featureMatrix)

In addition, Spark provides a Python API (PySpark) as well.

OTHER TIPS

If you want to run SQL queries using Shark, Shark's sharkserver behaves like a regular Hive Thrift server, so you should be able to re-use existing Python methods for connecting to Hive, such as

Shark Server also supports Hive's JDBC interface, so you can use that to run queries from Scala or Java; just use the Shark Server's address in place of the Hive Server address.
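For the Python route, here is a minimal sketch of what reusing a Hive Thrift client might look like. It assumes the Shark server is compatible with the original Hive Thrift service (HiveServer1), that the `thrift` package and Hive's generated `hive_service` Python bindings are on your path, and that result rows come back as tab-delimited strings. The function names `fetch_rows` and `split_columns`, and the host and port in the usage note, are illustrative only and not part of any library.

```python
def fetch_rows(host, port, query):
    """Run one query against a Hive-compatible Thrift server; return raw rows.

    Assumption: the server speaks the HiveServer1 Thrift protocol.
    """
    # Imports are deferred so this module loads even where the Hive
    # bindings are not installed.
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hive_service import ThriftHive

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    client = ThriftHive.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()
    try:
        client.execute(query)
        return client.fetchAll()
    finally:
        # Always release the connection, even if the query fails.
        transport.close()


def split_columns(rows, delimiter="\t"):
    """Split each raw row string into a list of column values.

    Assumption: columns are separated by tabs in the returned strings.
    """
    return [row.split(delimiter) for row in rows]
```

Usage would look something like `split_columns(fetch_rows("localhost", 10000, "SELECT * FROM users WHERE age < 20"))`, with the host and port replaced by wherever your Shark server is actually listening.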

Licensed under: CC-BY-SA with attribution