Question

I am using the following query to parse dot-separated ('.') data. The query works fine when run from the Hive console, where I get the expected values:

v=WijimLM4Khb5YUVrh7kl4bOWx YtIOtZwTRJ 1397755516 1397755721 1397755739 1 9

But when I run the same query with hive -e "...", the data is not parsed and the parsed columns come out as null:

null null null null null null null 

Query :

   select split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[0],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[1],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[2],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[3],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[4],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[5],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[6]  from tmp1 where cookie is not null

Data :

 v=oijim124Khb5YUVrh7kl4bOWx.tyIOt6wTRJ.1397755516.1397755721.1397755739.1.9

Solution

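Use [.] instead of '\\.' as the split pattern. When the query is passed through hive -e "...", the shell processes the double-quoted string first and collapses \\ to a single \, so Hive no longer receives the same escaped dot that it gets in the console. The split pattern then most likely ends up matching every character rather than a literal dot, which would explain the null columns. A character class such as [.] contains no backslash, so it reaches Hive unchanged and the data is parsed correctly: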

hive -e "select split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[0],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[1],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[2],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[3],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[4],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[5],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[6]  from tmp1 where cookie is not null"
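
To see what the shell actually hands to hive, here is a minimal sketch that only prints the strings and does not call hive at all. The variable names (escaped, bracket) are illustrative and not part of the original query:

```shell
#!/bin/sh
# Inside double quotes the shell collapses \\ into a single backslash
# before the program being called (hive in the original) ever sees it.
escaped="split(cookie,'\\.')"

# A character class has no backslash, so the shell leaves it untouched.
bracket="split(cookie,'[.]')"

# printf '%s\n' prints its arguments verbatim, without interpreting backslashes.
printf '%s\n' "$escaped"    # prints: split(cookie,'\.')
printf '%s\n' "$bracket"    # prints: split(cookie,'[.]')
```

The first line of output shows only one backslash left in the pattern, which is not what the console query contained. Another way to avoid the shell quoting altogether is to save the query in a file and run it with hive -f.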