Question

I have a table, more or less in the following format:

col1 col2 col3 ... col100
val1 val2 val3 ... val100

where the val* are doubles. Is there a way in Hive to find, for each row, which column holds the highest value in that row?

For example, for a table like

col1 col2 col3
2     4   5
8     1   2

I would get

col3
col1

Solution

I can't test this in Hive, but one possible SQL query is the following (greatest returns the largest value in its argument list):

select
  case
    when col1 = greatest(col1, col2, col3) then 'col1'
    when col2 = greatest(col1, col2, col3) then 'col2'
    when col3 = greatest(col1, col2, col3) then 'col3'
  end as c1
from test;

Additional note: you should decide how ties are to be handled. In my solution I simply take the first column that matches the maximum.
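
With 100 columns, typing the CASE expression by hand gets tedious, so it may be easier to generate the query text. Below is a minimal sketch in Python, assuming the columns really are named col1 ... col100 and the table is called test as in the query above. The helper name build_argmax_query and the aliases m, t and max_col are made up for illustration. It computes greatest once in a subquery and keeps the same tie rule (first matching column wins). Like the query above, the generated SQL has not been run against Hive.

```python
def build_argmax_query(columns, table):
    """Return Hive SQL text labelling each row with the name of its largest column."""
    if len(columns) < 2:
        raise ValueError("need at least two columns to compare")
    col_list = ", ".join(columns)
    # One WHEN branch per column. CASE stops at the first branch that matches,
    # so a tie resolves to whichever column comes first in `columns`.
    branches = "\n".join(f"    when {c} = m then '{c}'" for c in columns)
    return (
        "select\n"
        "  case\n"
        f"{branches}\n"
        "  end as max_col\n"
        "from (\n"
        f"  select {col_list}, greatest({col_list}) as m\n"
        f"  from {table}\n"
        ") t;"
    )


if __name__ == "__main__":
    # Assumed naming scheme: col1 ... col100 in a table called test.
    cols = [f"col{i}" for i in range(1, 101)]
    print(build_argmax_query(cols, "test"))
```

Paste the printed text into the Hive client. To prefer a different column on ties, reorder the list passed to the function.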

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange