
FilterEvaluator->mungeRegexp() may unexpectedly remove backslashes in rules' regex patterns
Open, Needs Triage, Public

Description

This ticket is for an issue I discovered while working on T374170:

I have just discovered another bug. Per regex101.com:

A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data

Therefore, the (\\\\\\\\)* group (which matches 2 backslashes; PHP-escaped and regex-escaped, that is 8 backslashes to type) matches all pairs of backslashes (i.e. escaped backslashes) that precede the slash (escaped or not), which is what we want, BUT it captures, and therefore puts back into the result, only the last pair. Every other pair is silently dropped.

The existing FilterEvaluator->mungeRegexp() is affected, as is the one I'm adding in this PR.
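To illustrate the capture behaviour described above, here is a minimal sketch in Python rather than PHP. The patterns and the escape_slash() helper are hypothetical, modelled only on the description in this ticket; they are not copied from mungeRegexp(), and the "fixed" variant simply follows the regex101 advice quoted above, so it may differ from the actual patch. Python's re module treats a repeated capturing group the same way for this construct.

```python
import re

# Hypothetical patterns (assumptions, not the AbuseFilter source):
# pairs of backslashes, an optional escaping backslash, then a slash.
BUGGY = re.compile(r'(\\\\)*(\\)?/')       # repeated capturing group
FIXED = re.compile(r'((?:\\\\)*)(\\)?/')   # capturing group around the repetition


def escape_slash(pattern, text):
    """Put group 1 back and emit the slash in escaped form."""
    # Group 1 is None when the repeated group matched zero times.
    return pattern.sub(lambda m: (m.group(1) or '') + '\\/', text)


# Four literal backslashes (two escaped pairs) followed by an unescaped slash.
sample = '\\' * 4 + '/'

buggy = escape_slash(BUGGY, sample)
fixed = escape_slash(FIXED, sample)

print(buggy.count('\\'))  # 3: only the last pair survived, plus the new escape
print(fixed.count('\\'))  # 5: both pairs kept, plus the new escape
```

With the repeated group, one of the two backslash pairs disappears from the result; wrapping the repetition in an outer capturing group keeps all of them.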

Event Timeline

Change #1071284 had a related patch set uploaded (by Gerrit Patch Uploader; author: Anne Haunime):

[mediawiki/extensions/AbuseFilter@master] Fix unexpected removal of backslashes when more than one occurrence is found

https://gerrit.wikimedia.org/r/1071284