Question

Is it possible to use the Hadoop SDK, especially LINQ to Hive, with a local installation of HDInsight Server? Note that I am not referring to the HDInsight Service hosted on Azure.

I tried to use LINQ to Hive from the Microsoft.Hadoop.Hive NuGet package, but could not get it working, because LINQ to Hive seems to require that results are stored in Azure Blob Storage rather than on my hosted instance.

var hiveConnection = new HiveConnection(new Uri("http://hadoop-poc.cloudapp.net:50111"), "hadoop", "hgfhdfgh", "hadoop", "hadooppartner", "StorageKey");
var metaData = hiveConnection.GetMetaData().Result;
var result = hiveConnection.ExecuteQuery(@"select * from customer limit 1");

Even with a storage key, I cannot get this to work, because the MapReduce job fails with:

AzureException: org.apache.hadoop.fs.azure.AzureException: Container a7e3aa39-75ba-4cc2-a8aa-301257018146 in account hadooppartner not found, and we can't create  it using anoynomous credentials.

I also added the credentials once more to the core-site.xml file, as follows:

<property>
   <name>fs.azure.account.key.hadooppartner.blob.core.windows.net</name>
   <value>Credentials</value>
</property>

However, I would rather avoid storing results in Azure Storage altogether, if possible.

Thank you for your help!

Solution

You can use the HiveConnection constructor overload without the storage account options to connect to a local install. This works against a default install of the HDInsight developer preview on a local box:

var db = new HiveConnection(
            webHCatUri: new Uri("http://localhost:50111"),
            userName: (string) "hadoop", password: (string) null);
var result = db.ExecuteHiveQuery("select * from w3c");

Of course you can then use that connection for any LINQ queries as well.

Other tips

It turned out that in the HiveConnection constructor you have to specify the full storage account name, i.e. hadooppartner.blob.core.windows.net.
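
As a rough sketch of that fix, here is the constructor call from the question with only the storage account argument changed. The argument order is copied from the question as-is. That the fifth and sixth arguments are the storage account and its key is an assumption based on that call, and "StorageKey" remains a placeholder.

```csharp
using System;
using Microsoft.Hadoop.Hive;

class Program
{
    static void Main()
    {
        // Same call as in the question; only the fifth argument differs.
        // Assumption: arguments five and six are the storage account and its key.
        var hiveConnection = new HiveConnection(
            new Uri("http://hadoop-poc.cloudapp.net:50111"),
            "hadoop", "hgfhdfgh", "hadoop",
            "hadooppartner.blob.core.windows.net", // full account name, not just "hadooppartner"
            "StorageKey");                          // placeholder for the real storage key
    }
}
```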

I am still interested in using the .NET LINQ API without needing a storage account. Furthermore, is it possible to use the .NET API with other Hadoop distributions?

License: CC-BY-SA with attribution
Not affiliated with StackOverflow