Question

I'm building a client that pushes data into my HDFS. Because the HDFS sits inside a cluster behind a firewall, I use HttpFS as a proxy to access it. The client exits with an IOException when I try to read from or write to the HDFS. The message is "No FileSystem for scheme: webhdfs". The code is very simple:

String hdfsURI = "webhdfs://myhttpfshost:14000/";
Configuration configuration = new Configuration();
FileSystem hdfs = FileSystem.get(new URI(hdfsURI), configuration);

It crashes on the last line. I'm building with Maven 3.0.4 and added the Hadoop-Client dependency, version 2.2.0, to my project. Accessing the same endpoint via curl on the command line works fine.

Any ideas why this could be failing?


Solution

As in a similar question on SO, I had to add the following line before doing any FileSystem operations:

configuration.set("fs.webhdfs.impl", org.apache.hadoop.hdfs.web.WebHdfsFileSystem.class.getName());

I don't know why this is necessary, but something seems to go wrong in the Maven build process. For now, it works.
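One way to narrow down where the build goes wrong is to check whether the implementation class is visible at runtime at all. The following is only a minimal sketch using plain Java reflection; the class name `WebHdfsClasspathCheck` and the helper `isOnClasspath` are hypothetical names made up for illustration and are not part of Hadoop.

```java
// Hypothetical diagnostic helper, not part of Hadoop.
class WebHdfsClasspathCheck {

    // Returns true if the named class can be loaded by this class's class loader.
    // The class is loaded without running its static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, WebHdfsClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or something it depends on is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the fix above.
        String impl = "org.apache.hadoop.hdfs.web.WebHdfsFileSystem";
        System.out.println(impl + " on classpath: " + isOnClasspath(impl));
    }
}
```

If this reports false when run with the same classpath as the client, the jar providing the class is probably missing from the packaged application. If it reports true and the exception still occurs without the explicit setting, the class is present but the mapping from the webhdfs scheme to it is not being picked up, which is the gap the `fs.webhdfs.impl` setting fills.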

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow