Question

I have a requirement to copy files from HDFS to the local file system. I have two options: 1) use the Hadoop native Java API (FileSystem), or 2) use WebHDFS (I don't have any issues with enabling it on my cluster).

Can someone let me know which option is preferred, and why?


Solution

If you are using Java, I recommend the native Java API, as it is more flexible and gives you more control.

However, WebHDFS is the better choice if you don't want to pull in the dozens of libraries that the Hadoop client requires. It decouples your application from Hadoop. Of course, you pay a small performance cost because the transfer goes over HTTP.
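
To illustrate the decoupling, here is a minimal sketch of a WebHDFS download that uses only the JDK (Java 11+ `java.net.http`), with no Hadoop jars on the classpath. It relies on the WebHDFS `OPEN` operation, which redirects the client from the NameNode to a DataNode that serves the file content. The class name `WebHdfsDownload`, its method names, and the host, path and user values are made up for the example. It also assumes an unsecured cluster where the `user.name` query parameter is accepted; a Kerberos-secured cluster would need a different authentication step.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class for illustration; not part of Hadoop.
class WebHdfsDownload {

    // Builds the WebHDFS OPEN URL for an absolute HDFS path.
    // Assumes simple (non-Kerberos) authentication via user.name.
    static URI openUrl(String host, int port, String hdfsPath, String user) {
        try {
            // The multi-argument URI constructor quotes illegal characters in the path.
            return new URI("http", null, host, port,
                    "/webhdfs/v1" + hdfsPath,
                    "op=OPEN&user.name=" + user, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid WebHDFS URL parts", e);
        }
    }

    // Streams the remote file to a local path, following the
    // NameNode -> DataNode redirect that OPEN issues.
    static void download(URI url, Path target) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(url).GET().build();
        HttpResponse<InputStream> response =
                client.send(request, HttpResponse.BodyHandlers.ofInputStream());
        try (InputStream body = response.body()) {
            if (response.statusCode() != 200) {
                throw new IOException("WebHDFS returned HTTP " + response.statusCode());
            }
            Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 5) {
            System.err.println("usage: WebHdfsDownload <host> <port> <hdfsPath> <user> <localFile>");
            return;
        }
        URI url = openUrl(args[0], Integer.parseInt(args[1]), args[2], args[3]);
        download(url, Path.of(args[4]));
    }
}
```

The NameNode HTTP port is 9870 by default on Hadoop 3.x (50070 on 2.x). With the native API, the equivalent copy is roughly a single `FileSystem.copyToLocalFile` call, but that requires the Hadoop client libraries and cluster configuration on the classpath.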

Licensed under: CC-BY-SA with attribution