I have some data in a Hive table, e.g.:

id,params  
123,utm_content=doit|utm_source=direct|   
234,utm_content=polo|utm_source=AndroidNew|

Desired output, using regexp_extract:

id,channel,content
123,direct,doit
234,AndroidNew,polo

Query used:

SELECT id,
       regexp_extract(lower(params), '(.*utm_source=)([^\|]*)(\|*)', 2) AS channel,
       regexp_extract(lower(params), '(.*utm_content=)([^\|]*)(\|*)', 2) AS content
FROM table;

It fails with the error '* dangling meta character' and returns error code 2.

Can someone help, please?

  • What is your regex supposed to match? Note that in hive, you need to double the backslashes, and your regex should look like (.*utm_content=)([^|]*)(\\|*). I believe you are looking for ([0-9]*),utm_content=([^|]*)\\|utm_source=([^|]*) Commented Jan 17, 2016 at 11:03
  • \\ worked. Thanks a lot for your help!! Commented Jan 17, 2016 at 11:20
  • I posted since that helped. Commented Jan 17, 2016 at 13:16

1 Answer

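Note that in Hive you need to double the backslashes. The backslash is also the escape character inside a Hive string literal, so a single \| reaches the regex engine as a bare |, and the * that follows it has nothing to repeat. That is what produces the "dangling meta character" error.

Your regex should look like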

(.*utm_content=)([^|]*)(\\|*)
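
Applied to the query from the question, that could look like the sketch below. It assumes a hypothetical table name, utm_params, in place of the asker's table. It also drops lower(), on the assumption that the original casing (e.g. AndroidNew) should be kept, as in the desired output.

```sql
-- Sketch only: utm_params is a hypothetical table name, not from the question.
SELECT id,
       -- '\\|' in the Hive string literal reaches the regex engine as \| (a literal pipe).
       -- Inside a character class such as [^|], the pipe needs no escaping.
       regexp_extract(params, '(.*utm_source=)([^|]*)(\\|*)', 2)  AS channel,
       regexp_extract(params, '(.*utm_content=)([^|]*)(\\|*)', 2) AS content
FROM utm_params;
```

Since only the second group is used, a pattern with no backslash at all, such as 'utm_source=([^|]*)' with group index 1, should work just as well and avoids the escaping issue.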