
Add user field to mediawiki/api/request
Closed, Resolved · Public

Description

api.log on mwlog1002 records which logged-in user made each API request, but the structured version, mediawiki/api/request, does not, even though it is much easier to process than a 100GB+ text file :).

I think we should add a user field to the event stream as well, since I don't see a reason why the two data sources should provide different information.
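
To illustrate why a user field in the structured stream would help, here is a minimal Python sketch that tallies API requests per user from JSON-serialized events. The `performer.user_text` key mirrors the performer field that the patches for this task add; the sample events, the `count_requests_by_user` helper, and the assumption that some events may carry no performer at all are hypothetical and not taken from production data.

```python
import json
from collections import Counter

# Hypothetical sample events, one JSON document per line. Only the shape of
# the `performer` sub-object is modeled on the real schema; the values are
# made up for illustration.
SAMPLE_EVENTS = """\
{"database": "cswiki", "performer": {"user_id": 1, "user_text": "Alice"}}
{"database": "cswiki", "performer": {"user_id": 2, "user_text": "Bob"}}
{"database": "cswiki", "performer": {"user_id": 1, "user_text": "Alice"}}
{"database": "cswiki"}
"""


def count_requests_by_user(lines):
    """Count events per performer.user_text.

    Events without a performer (an assumption of this sketch) are
    grouped under the key None.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        performer = event.get("performer") or {}
        counts[performer.get("user_text")] += 1
    return counts


counts = count_requests_by_user(SAMPLE_EVENTS.splitlines())
```

Doing the same against the plain-text api.log would require a hand-written line parser, which is the gap this task is about.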

Event Timeline

Restricted Application added a subscriber: Aklapper.

Change 700253 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[schemas/event/primary@master] mediawiki/api/request: Add performer field

https://gerrit.wikimedia.org/r/700253

Change 700254 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/core@master] api-request: Add performer to $logCtx

https://gerrit.wikimedia.org/r/700254

Change 700253 merged by jenkins-bot:

[schemas/event/primary@master] mediawiki/api/request: Add performer field and bump to 1.0.0

https://gerrit.wikimedia.org/r/700253

Change 700254 merged by jenkins-bot:

[mediawiki/core@master] api-request: Add performer to $logCtx

https://gerrit.wikimedia.org/r/700254

@Ottomata Is there anything else to do before resolving this task? :)

I don't think so. Once your MW core change is deployed, the new events should start coming in. Maybe wait until that happens so you can verify before closing?

Thanks!

spark-sql (default)> select performer from event.mediawiki_api_request where year=2021 and month=10 and day=11 and hour=0 and `database`='cswiki' and performer.user_text='Martin Urbanec' limit 1;
[...]
performer
{"user_id":275298,"user_text":"Martin Urbanec","user_groups":null,"user_is_bot":null,"user_registration_dt":null,"user_edit_count":null}
Time taken: 22.293 seconds, Fetched 1 row(s)
spark-sql (default)>

Confirming this indeed works as intended. Closing.
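
For anyone repeating this check, a small Python sketch can show which performer sub-fields are actually populated in the row returned above. The JSON string is copied verbatim from the query output; the `populated_fields` helper is a hypothetical name used only for this illustration.

```python
import json

# The performer value returned by the spark-sql query above, verbatim.
ROW = (
    '{"user_id":275298,"user_text":"Martin Urbanec","user_groups":null,'
    '"user_is_bot":null,"user_registration_dt":null,"user_edit_count":null}'
)


def populated_fields(performer_json):
    """Return the sorted names of performer sub-fields that are not null."""
    performer = json.loads(performer_json)
    return sorted(key for key, value in performer.items() if value is not None)


fields = populated_fields(ROW)
```

For this row, only `user_id` and `user_text` carry values; the remaining sub-fields are null in the sampled event.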