Question

I'm building a client which pushes some data into my HDFS. Because the HDFS is inside a cluster behind a firewall I use HttpFS as a proxy to access it. The client exits with an IOException when I try to read/write to the HDFS. The message is No FileSystem for scheme: webhdfs. The code is very simple

String hdfsURI = "webhdfs://myhttpfshost:14000/";
Configuration configuration = new Configuration();
FileSystem hdfs = FileSystem.get(new URI(hdfsURI), configuration);

It crashes in the last line. I'm building with Maven 3.0.4 and added the Hadoop-Client dependency 2.2.0 to my project. Accessing via curl on the command line works fine.

Any ideas why this could be failing?

Was it helpful?

Solution

Similar to this question on SO I had to add the following code prior doing any FS activities:

configuration.set("fs.webhdfs.impl", org.apache.hadoop.hdfs.web.WebHdfsFileSystem.class.getName());

I don't know why, but there seems to be something wrong with the Maven build process... for now it works.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top