Question

I am new to Hive and am trying to write a simple HiveQL join query. I have found surprisingly few resources on HiveQL online, and I am stuck constructing a query that would take seconds to write in regular SQL.

Scenario:

I have 4 tables, and every table has a "playerid" column. I want to join all four tables and print the result, as in the following SQL:

select A.column1, B.column2, C.column3, D.column4
from Table1 A, Table2 B, Table3 C, Table4 D
where A.playerid = B.playerid
  and A.playerid = C.playerid
  and A.playerid = D.playerid

I want an equivalent query in HiveQL. I know how to join two tables in HiveQL using JOIN ... ON, but I am not sure how to extend that to several tables.


Solution

SELECT a.column1, b.column2, c.column3, d.column4
  FROM Table1 a
  JOIN Table2 b ON (a.playerid = b.playerid)
  JOIN Table3 c ON (c.playerid = b.playerid)
  JOIN Table4 d ON (d.playerid = c.playerid)

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins

OTHER TIPS

    select A.column1, B.column2, C.column3, D.column4
      from Table1 A
      join Table2 B on (A.playerid = B.playerid)
      join Table3 C on (A.playerid = C.playerid)
      join Table4 D on (A.playerid = D.playerid)
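
The accepted answer chains its ON conditions (b to a, c to b, d to c), while the query above compares every table against A. For inner joins on the same key these should give the same rows, as should the comma-and-WHERE form from the question. Here is a minimal sketch to check that, using Python's built-in sqlite3 module as a stand-in for Hive (an assumption made only so it runs anywhere; Hive itself may differ in other respects). The sample rows are made up for illustration, and the table and column names follow the question.

```python
import sqlite3

# SQLite stands in for Hive here; this only checks the join logic.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Table1..Table4, each with a playerid column plus column1..column4.
for n in range(1, 5):
    cur.execute(f"CREATE TABLE Table{n} (playerid INTEGER, column{n} TEXT)")

# Hypothetical sample data: player 3 has no row in Table4.
cur.executemany("INSERT INTO Table1 VALUES (?, ?)", [(1, "a1"), (2, "a2"), (3, "a3")])
cur.executemany("INSERT INTO Table2 VALUES (?, ?)", [(1, "b1"), (2, "b2"), (3, "b3")])
cur.executemany("INSERT INTO Table3 VALUES (?, ?)", [(1, "c1"), (2, "c2"), (3, "c3")])
cur.executemany("INSERT INTO Table4 VALUES (?, ?)", [(1, "d1"), (2, "d2")])

SELECT = "SELECT A.column1, B.column2, C.column3, D.column4 "

# Form 1: comma-separated tables with WHERE, as in the question.
implicit_sql = SELECT + """
FROM Table1 A, Table2 B, Table3 C, Table4 D
WHERE A.playerid = B.playerid
  AND A.playerid = C.playerid
  AND A.playerid = D.playerid
ORDER BY A.playerid
"""

# Form 2: explicit joins, every table compared against A.
star_sql = SELECT + """
FROM Table1 A
JOIN Table2 B ON (A.playerid = B.playerid)
JOIN Table3 C ON (A.playerid = C.playerid)
JOIN Table4 D ON (A.playerid = D.playerid)
ORDER BY A.playerid
"""

# Form 3: explicit joins, each table compared against the previous one.
chained_sql = SELECT + """
FROM Table1 A
JOIN Table2 B ON (A.playerid = B.playerid)
JOIN Table3 C ON (C.playerid = B.playerid)
JOIN Table4 D ON (D.playerid = C.playerid)
ORDER BY A.playerid
"""

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return cur.execute(sql).fetchall()

rows_implicit = run(implicit_sql)
rows_star = run(star_sql)
rows_chained = run(chained_sql)

# Player 3 is dropped by the inner join because Table4 has no match.
print(rows_implicit)
print(rows_implicit == rows_star == rows_chained)
```

Note that this equivalence relies on all joins being inner joins. With LEFT or RIGHT OUTER joins, which table each ON clause compares against can change the result.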
Licensed under: CC-BY-SA with attribution