Question

I need to calculate the mean sales for Sunday. Values in the column salesdate (timestamp) are:

2012-01-01 09:00:00
2012-01-01 09:00:00
2012-01-01 09:00:00
...........

I have extracted the date part using to_date(). Now, how do I get the weekday (like Sunday) from this date in Hive? Please guide.


Solution

You can use a combination of the unix_timestamp and from_unixtime UDFs.

from_unixtime(unix_timestamp(col), 'EEEE')

If you check the documentation for SimpleDateFormat, which from_unixtime uses, you can see that "EEEE" is the code for the full name of the day of the week. "EEE" gets you the abbreviated version, e.g. "Sun" or "Mon".
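
For the original question, this expression can go straight into a WHERE clause. A minimal sketch, assuming a sales table with a salesdate timestamp column and an amount column (the table name, the amount column, and the sample rows are made up for illustration). The day name may depend on the locale Hive runs with, so 'Sunday' assumes English:

```sql
-- Hypothetical sample data standing in for the real sales table.
WITH sales AS (
  SELECT CAST('2012-01-01 09:00:00' AS TIMESTAMP) AS salesdate, 100.0 AS amount
  UNION ALL
  SELECT CAST('2012-01-02 09:00:00' AS TIMESTAMP) AS salesdate, 50.0 AS amount
)
-- Keep only rows whose weekday name is Sunday, then average them.
SELECT AVG(amount) AS mean_sunday_sales
FROM sales
WHERE from_unixtime(unix_timestamp(salesdate), 'EEEE') = 'Sunday';
```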

OTHER TIPS

There is no OOTB feature to achieve this as of now. A ticket is open though.

You need to write a UDF for this. Alternatively, you could try the patch available with the above-mentioned ticket.

HTH

In Hive you can also use the method below, which solves the problem elegantly and performs well.

from_unixtime takes its first argument as an integer number of seconds since the epoch, so col below must hold the timestamp in seconds:

date_format(from_unixtime(col, 'yyyy-MM-dd'), 'EEEE')

You can also test it like this:

select date_format(from_unixtime(1531372789,'yyyy-MM-dd'),'EEEE');

Output:

Thursday

I hope it serves your purpose.
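
If salesdate is already a timestamp column, as in the question, date_format can be applied to it directly without going through from_unixtime. A sketch, assuming Hive 1.2.0 or later (where date_format is available) and a hypothetical sales table with salesdate and amount columns. It returns the mean for every weekday rather than only Sunday:

```sql
-- Hypothetical sample data standing in for the real sales table.
WITH sales AS (
  SELECT CAST('2012-01-01 09:00:00' AS TIMESTAMP) AS salesdate, 100.0 AS amount
  UNION ALL
  SELECT CAST('2012-01-02 09:00:00' AS TIMESTAMP) AS salesdate, 50.0 AS amount
)
-- One output row per weekday name, with the average amount for that day.
SELECT date_format(salesdate, 'EEEE') AS weekday,
       AVG(amount) AS mean_sales
FROM sales
GROUP BY date_format(salesdate, 'EEEE');
```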

Just a suggestion: pick a low date (lower than the minimum date in your data) that is a Sunday, in 'yyyy-MM-dd' format. Use the DATEDIFF() function to find the difference between the date value in your data (in 'yyyy-MM-dd' format) and this low date, then calculate modulo 7 of the DATEDIFF output. The result will be 0 for Sunday, 1 for Monday, and so on.
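
A sketch of this approach, assuming 1900-01-07 as the reference Sunday (January 1, 1900 was a Monday) and a hypothetical sales table with salesdate and amount columns. pmod is used instead of % so the result stays non-negative even if a date falls before the reference:

```sql
-- Hypothetical sample data standing in for the real sales table.
WITH sales AS (
  SELECT CAST('2012-01-01 09:00:00' AS TIMESTAMP) AS salesdate, 100.0 AS amount
  UNION ALL
  SELECT CAST('2012-01-02 09:00:00' AS TIMESTAMP) AS salesdate, 50.0 AS amount
)
-- Days since a known Sunday, modulo 7: 0 = Sunday, 1 = Monday, ...
SELECT AVG(amount) AS mean_sunday_sales
FROM sales
WHERE pmod(datediff(to_date(salesdate), '1900-01-07'), 7) = 0;
```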

select extract(dayofweek from from_unixtime(unix_timestamp()));

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow