Question

Below is my shell script, from which I invoke a few Hive SQL queries. It works fine.

#!/bin/bash

DATE_YEST_FORMAT1=`perl -e 'use POSIX qw(strftime); print strftime "%Y-%m-%d",localtime(time()- 3600*504);'`
echo $DATE_YEST_FORMAT1

hive -e "
        SELECT t1 [0] AS buyer_id
            ,t1 [1] AS item_id
            ,created_time
        FROM (
            SELECT split(ckey, '\\\\|') AS t1
                ,created_time
            FROM (
                SELECT CONCAT (
                        buyer_id
                        ,'|'
                        ,item_id
                        ) AS ckey
                    ,created_time
                FROM dw_checkout_trans
                WHERE to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time) AS BIGINT))) = '$DATE_YEST_FORMAT1' distribute BY ckey sort BY ckey
                    ,created_time DESC
                ) a
            WHERE rank(ckey) < 1
            ) X
        ORDER BY buyer_id
            ,created_time DESC;"

sleep 120

QUERY1=`hive -e "
set mapred.job.queue.name=hdmi-technology;
SELECT SUM(total_items_purchased), SUM(total_items_missingormismatch) from lip_data_quality where dt='$DATE_YEST_FORMAT2';"`

Problem Statement:-

Look at the first hive -e block, just after echo $DATE_YEST_FORMAT1. That query sometimes fails for various reasons. Currently, when the first Hive SQL query fails, the script still sleeps for 120 seconds and then runs the second Hive SQL query, which is what I don't want. Is there a way to stop the script at that point if the first query fails for any reason, and then have it start again from the beginning after a few minutes (the delay should be configurable)?

Update:-

As suggested by Stephen, I tried something like this:

#!/bin/bash

hive -e " blaah blaah;"

RET_VAL=$?
echo $RET_VAL
if [ $RET_VAL -ne 0]; then
echo "HiveQL failed due to certain reason" | mailx -s "LIP Query Failed" -r rj@host.com rj@host.com
exit(1)

I got the error below, and I did not receive any email either. Is anything wrong with my syntax or approach?

syntax error at line 152: `exit' unexpected

Note:-

Zero is success here if the Hive Query is executed successfully.

Another update, after adding the missing space:- After making the changes below

#!/bin/bash

hive -e " blaah blaah;"

RET_VAL=$?
echo $RET_VAL
if [ $RET_VAL -ne 0 ]; then
echo "HiveQL failed due to certain reason for LIP" | mailx -s "LIP Query Failed" -r rj@host.com rj@host.com
fi
exit

hive -e 'Another SQL Query;'

I got the output below:

RET_VAL=0
+ echo 0
0
+ [ 0 -ne 0 ]
+ exit

The status code was zero because my first query succeeded, but the script exited after that and never ran my second query. Why? I am surely missing something here again.


Solution

Unless I'm misunderstanding the situation, it's very simple:

#!/bin/bash

DATE_YEST_FORMAT1=`perl -e 'use POSIX qw(strftime); print strftime "%Y-%m-%d",localtime(time()- 3600*504);'`
echo $DATE_YEST_FORMAT1

QUERY0="
        SELECT t1 [0] AS buyer_id
            ,t1 [1] AS item_id
            ,created_time
        FROM (
            SELECT split(ckey, '\\\\|') AS t1
                ,created_time
            FROM (
                SELECT CONCAT (
                        buyer_id
                        ,'|'
                        ,item_id
                        ) AS ckey
                    ,created_time
                FROM dw_checkout_trans
                WHERE to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time) AS BIGINT))) = '$DATE_YEST_FORMAT1' distribute BY ckey sort BY ckey
                    ,created_time DESC
                ) a
            WHERE rank(ckey) < 1
            ) X
        ORDER BY buyer_id
            ,created_time DESC;"

if hive -e "$QUERY0"
then
    sleep 120
    QUERY1=`hive -e "
    set mapred.job.queue.name=hdmi-technology;
    SELECT SUM(total_items_purchased), SUM(total_items_missingormismatch) from lip_data_quality where dt='$DATE_YEST_FORMAT2';"`
    # ...and whatever you do with $QUERY1...
fi

The variable QUERY0 is for convenience, not necessity. The key point is that the if statement itself tests whether a command succeeded (returned exit status 0). The test command (better known as [) is just another command that returns 0 when the tested condition is met, and 1 (non-zero) when it is not met.
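To make the exit-status point concrete, here is a minimal sketch of both forms: testing the command directly with if, and capturing $? before comparing it with [. The function run_query is a hypothetical stand-in for hive -e (so the sketch runs without Hive), and check_direct and check_status are illustrative names, not part of the original script.

```shell
#!/bin/bash

# Hypothetical stand-in for `hive -e "$1"`:
# succeeds (status 0) unless the query text is "bad".
run_query() {
    [ "$1" != "bad" ]
}

# Form 1: let `if` test the command's exit status directly.
check_direct() {
    if run_query "$1"; then
        echo "ok"
    else
        echo "failed"
    fi
}

# Form 2: capture $? immediately, then compare it with [ ... ].
# [ needs a space before the closing ], and every `if` needs a closing `fi`.
# In a script body you would write `exit 1` (not `exit(1)`) where this
# function uses `return 1`.
check_status() {
    run_query "$1"
    local ret=$?
    if [ "$ret" -ne 0 ]; then
        echo "failed"
        return 1
    fi
    echo "ok"
}

check_direct "good"
check_direct "bad"
check_status "good"
check_status "bad" || true   # keep going so the demo prints all four results
```

Both forms make the same decision; the first is shorter, while the second is handy when the status value itself needs to be logged or mailed.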

So, the if statement runs the first hive query; if it passes (exit status 0), then (and only then) does it move on to the actions in the then clause.
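The question also asks for the job to start over after a configurable delay when the first query fails. One way that might be sketched, building on the same exit-status test, is a small retry wrapper. The names retry, MAX_ATTEMPTS and RETRY_DELAY are hypothetical, and the default values are assumptions to adjust for your job.

```shell
#!/bin/bash

# Hypothetical settings; override them in the environment as needed.
MAX_ATTEMPTS=${MAX_ATTEMPTS:-3}    # give up after this many tries
RETRY_DELAY=${RETRY_DELAY:-300}    # seconds to wait between tries

# Run the given command until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on success, 1 if every attempt failed.
retry() {
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${RETRY_DELAY}s" >&2
        sleep "$RETRY_DELAY"
        attempt=$((attempt + 1))
    done
}

# Usage (commented out because it needs a Hive installation):
# if retry hive -e "$QUERY0"; then
#     sleep 120
#     # run the second query here
# fi
```

Capping the number of attempts is a design choice: without it, a query that can never succeed would keep the script looping indefinitely.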

I've resisted the temptation to reformat your SQL; suffice it to say, it is not the layout I would use in my own code.

OTHER TIPS

You may also find it useful to set the exit-immediately option:

     set  -e      Exit immediately if a simple command (see SHELL  GRAMMAR
                  above) exits with a non-zero status.  The shell does not
                  exit if the command that fails is part  of  the  command
                  list  immediately  following  a  while or until keyword,
                  part of the test in an if statement, part of a && or  ||
                  list, or if the command's return value is being inverted
                  via !.  A trap on ERR, if set, is  executed  before  the
                  shell exits.

as in this example

#!/bin/bash

set -e
false
echo "Never reached"
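
The exceptions listed in the manual excerpt matter in practice: a failing command that is tested by if, or that is part of a || list, does not end the script. A small sketch of the three cases follows; each one runs in a child bash (via bash -c) purely so the demonstration script itself is not terminated, and the variable names are illustrative.

```shell
#!/bin/bash

# Each case runs in a child bash so that `set -e` cannot terminate this script.

# A plain failing command stops the child before the echo runs.
out1=$(bash -c 'set -e; false; echo "reached"' || true)

# A failing command tested by `if` does not trigger the exit.
out2=$(bash -c 'set -e; if false; then :; fi; echo "reached"')

# Neither does a failing command that is part of a || list.
out3=$(bash -c 'set -e; false || echo "handled"; echo "reached"')

printf 'plain:   [%s]\n' "$out1"
printf 'if:      [%s]\n' "$out2"
printf 'or-list: [%s]\n' "$out3"
```

So with set -e in effect, a bare hive -e call that fails would stop the script, while the same call used as the condition of an if statement would not.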
Licensed under: CC-BY-SA with attribution