Question

I'm building a client that pushes data into my HDFS. Because the HDFS is inside a cluster behind a firewall, I use HttpFS as a proxy to access it. The client exits with an IOException when I try to read from or write to HDFS; the message is "No FileSystem for scheme: webhdfs". The code is very simple:

String hdfsURI = "webhdfs://myhttpfshost:14000/";
Configuration configuration = new Configuration();
FileSystem hdfs = FileSystem.get(new URI(hdfsURI), configuration);

It crashes on the last line. I'm building with Maven 3.0.4 and added the hadoop-client 2.2.0 dependency to my project. Accessing HttpFS via curl on the command line works fine.

Any ideas why this could be failing?


Solution

Similar to this question on SO, I had to add the following line before doing any FileSystem operations:

configuration.set("fs.webhdfs.impl", org.apache.hadoop.hdfs.web.WebHdfsFileSystem.class.getName());

I don't know why, but there seems to be something wrong with the Maven build process; for now this works.
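
For completeness, here is a minimal sketch of the workaround applied to the snippet from the question, assuming the same HttpFS host and port; the surrounding class, the main method, and the exists() sanity check are illustrative additions, not part of the original code:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WebHdfsClient {
    public static void main(String[] args) throws Exception {
        String hdfsURI = "webhdfs://myhttpfshost:14000/";
        Configuration configuration = new Configuration();
        // Explicitly map the webhdfs scheme to its FileSystem implementation
        // before FileSystem.get() is called.
        configuration.set("fs.webhdfs.impl",
                org.apache.hadoop.hdfs.web.WebHdfsFileSystem.class.getName());
        FileSystem hdfs = FileSystem.get(new URI(hdfsURI), configuration);
        // Simple sanity check that the connection works (hypothetical usage).
        System.out.println(hdfs.exists(new Path("/")));
        hdfs.close();
    }
}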

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow