how to use regexp_extract in hive

Question

I am trying to extract a portion of the below string using regexp_extract but am not having any success:

CUST_NEW_ACCOUNTS_LINES_2019-03-03.dat.gz

I want to just get the date portion. On the regex101.com website this seemed to work, but hive is giving me an error message.

regexp_extract(meta_source_filename,'^(?:[^_]+_){4}([^_]+)') file_date

Can someone help me understand what is incorrect here? I am not at all familiar with regexp_extract syntax so have been using another function as a starting point.
Thank you!

leftjoin · Accepted Answer · 2019-03-05 07:26:41Z

2

with your_data as (
select 'CUST_NEW_ACCOUNTS_LINES_2019-03-03.dat.gz' str
)

select regexp_extract(str,'_(\\d{4}(-\\d{2}){2})\\.',1)
from your_data;

Result:

OK
2019-03-03
Time taken: 0.062 seconds, Fetched: 1 row(s)

Expression '_(\\d{4}(-\\d{2}){2})\\.' means:

underscore _ four digits \\d{4} repeat (hyphen and two digits) two times (-\\d{2}){2} dot\\.

Capture group number one (date only): (\\d{4}(-\\d{2}){2}) . In Hive you need to use \\ for shielding.

edited Mar 5, 2019 at 7:26

answered Mar 5, 2019 at 7:15

leftjoin

38.2k8 gold badges62 silver badges123 bronze badges

Add a comment |

Wiktor Stribiżew · Accepted Answer · 2019-03-04 17:22:56Z

You have captured the substring you need into a capturing group. You should use the number, ID of the group as the third argument:

regexp_extract(meta_source_filename,'^(?:[^_]+_){4}([^_]+)', 1) file_date
                                                             ^

See the regexp_extract(string subject, string pattern, int index) docs:

The 'index' parameter is the Java regex Matcher group() method index. See docs/api/java/util/regex/Matcher.html for more information on the 'index' or Java regex group() method.

Collectives™ on Stack Overflow

how to use regexp_extract in hive

2 Answers 2

Your Answer

Not the answer you're looking for? Browse other questions tagged
sql
regex
hive
hiveql
or ask your own question.

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged sqlregexhivehiveql or ask your own question.

Related

Not the answer you're looking for? Browse other questions tagged
sql
regex
hive
hiveql
or ask your own question.