Splitting on a dot '.' doesn't work with hive -e " " even after escaping it with a double backslash

StackOverflow https://stackoverflow.com/questions/23537818

  •  17-07-2023

Question

I am using the following query to parse data that is separated by dots ('.'). The query works fine when it is run from the Hive console, and I get the expected values:

v=WijimLM4Khb5YUVrh7kl4bOWx YtIOtZwTRJ 1397755516 1397755721 1397755739 1 9

But when I run the same query with hive -e " ", the data is not parsed and the parsed columns come out as null:

null null null null null null null 

Query :

   select split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[0],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[1],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[2],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[3],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[4],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[5],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'\\.')[6]  from tmp1 where cookie is not null

Data :

 v=oijim124Khb5YUVrh7kl4bOWx.tyIOt6wTRJ.1397755516.1397755721.1397755739.1.9

Solution

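Use [.] instead of '\\.' as the regex passed to split when the query runs under hive -e. Inside the double quotes the shell consumes one level of backslashes, so Hive never receives the escape as it was typed; the character class [.] matches a literal dot without any backslashes. With that change the data is parsed correctly: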

hive -e "select split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[0],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[1],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[2],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[3],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[4],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[5],
    split(regexp_extract(cookie,'v=[^&\n\;\" ]*',0),'[.]')[6]  from tmp1 where cookie is not null"
Licensed under: CC-BY-SA with attribution