Page MenuHomePhabricator

Remove PageTriageUtil::getTopTriagers() and ApiPageTriageStats topreviewers=1
Closed, ResolvedPublic

Description

This appears to be dead code. I believe this used to be used by a feature that displayed the top 5 patrollers in the footer of Special:NewPagesFeed, but this feature appears to have been removed.

image.png (228×1 px, 191 KB)

https://en.wikipedia.org/wiki/Wikipedia:Database_reports/Top_new_article_reviewers does not appear to use this. The code for that is at https://github.com/MusikAnimal/MusikBot/blob/master/tasks/top_article_reviewers.rb

There may be an opportunity here to also remove the entire pagetriage_log SQL table. The other thing this SQL table is used for is getting a count of unreviewed articles, but this could probably be replaced with an SQL query targeting the pagetriage_page table.

Or conversely, we could look into restoring the top 5 reviewers feature. Here's the patch where the feature was added: https://phabricator.wikimedia.org/rSVN114684. I haven't found the ticket and patch that removed it yet.

@Kudpung, do you remember the top 5 reviewers feature, and do you recall why it was removed?

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

We should first verify that the API isn't being used. @kostajh, can we see if the API (action=pagetriagestats&topreviewers=yes) is hit and by whom?

We should first verify that the API isn't being used. @kostajh, can we see if the API (action=pagetriagestats&topreviewers=yes) is hit and by whom?

We have a log of all API calls made in production (mwlog1002.eqiad.wmnet:/srv/mw-log/api.log), but those are so massive to make proper searches a pain. A better option is to mark the parameter as deprecated which will get those calls to Logstash (via api-feature-usage).

Change 869172 had a related patch set uploaded (by Majavah; author: Majavah):

[mediawiki/extensions/PageTriage@master] ApiPageTriageStats: deprecate topreviewers

https://gerrit.wikimedia.org/r/869172

Sounds good to me, but let's wait to see if anyone wants to keep this functionality around.

Seems like it was removed way back in 2012, not long after it was added: rEPTRc2ebc74dc50c: Modifying page stats. Per global search a couple user scripts use it, which in itself shouldn't be an obstacle for removal.

We have a log of all API calls made in production (mwlog1002.eqiad.wmnet:/srv/mw-log/api.log), but those are so massive to make proper searches a pain.

They are logged to the data lake.

hive (default)> select count(*) from event.mediawiki_api_request where datacenter = 'eqiad' and year = 2022 and month = 12 and day > 9 and params['action'] = 'pagetriagestats' and params['topreviewers'] = 'yes';
...
OK
_c0
1
Time taken: 335.905 seconds, Fetched: 1 row(s)

So, not used.

Yes, I remember it. It was a very early UI of the feed. It wasn't a feature I asked for during development. Much as I am flattered to see my name in lights I don't recall it being deprecated.

It serves no use other purpose than to turn NPP into another MMORPG - a function that has been replace by countless backlog drives and their barnstars
Just my opinion.

Change 869172 merged by jenkins-bot:

[mediawiki/extensions/PageTriage@master] ApiPageTriageStats: deprecate topreviewers

https://gerrit.wikimedia.org/r/869172

The other thing this SQL table is used for is getting a count of unreviewed articles, but this could probably be replaced with an SQL query targeting the pagetriage_page table

@Novem_Linguae could you point to where you see it used in that way, please?

I came across T165910: Revive top reviewers stats from 2017. @MusikAnimal do you have any input for this task?

My proposal would be to remove this feature entirely if no one is using it:

  • remove topreviewers parameter
  • delete PageTriageUtil::getTopTriagers()
  • remove logUserTriageAction()
  • remove the pagetriage_log table

The table has 859,634 rows currently.

@Novem_Linguae could you point to where you see it used in that way, please?

Hmm. It used to be in Special:NewPagesFeed's footer, but I don't see it there now. We must have removed it recently.

Looks like there isn't much interest in adding the top 5 reviewers feature back, so should be OK to remove this.

@kostajh I already mentioned above that I don't recall it being deprecated but that probably because it was not in the final version that was rolled out. I can confirm that the image is extremely old and certainly dates back to October 2011. If you really need to know, perhaps @Scottywong 's memory is better than mine.

If you are looking for a consensus today, I do not believe it serves any useful purpose.

Change 870742 had a related patch set uploaded (by Novem Linguae; author: NovemLinguae):

[mediawiki/extensions/PageTriage@master] Remove all code related to pagetriage_log SQL table

https://gerrit.wikimedia.org/r/870742

Change 878938 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/PageTriage@master] DB: Remove pagetriage_log table references

https://gerrit.wikimedia.org/r/878938

Change 870742 merged by jenkins-bot:

[mediawiki/extensions/PageTriage@master] Remove all code related to pagetriage_log SQL table

https://gerrit.wikimedia.org/r/870742

The other thing this SQL table is used for is getting a count of unreviewed articles, but this could probably be replaced with an SQL query targeting the pagetriage_page table

@Novem_Linguae could you point to where you see it used in that way, please?

I came across T165910: Revive top reviewers stats from 2017. @MusikAnimal do you have any input for this task?

My proposal would be to remove this feature entirely if no one is using it:

  • remove topreviewers parameter
  • delete PageTriageUtil::getTopTriagers()
  • remove logUserTriageAction()
  • remove the pagetriage_log table

The table has 859,634 rows currently.

Apologies for the delayed reply!

I only vaguely recall this. I think T165910 was a quick step to revive the stats. A volunteer made a user script to expose the data, then we created https://en.wikipedia.org/wiki/Wikipedia:Database_reports/Top_new_article_reviewers which made both obsolete.

Please also update https://www.mediawiki.org/wiki/Extension:PageTriage accordingly whenever the time is appropriate.

kostajh renamed this task from Look into removing PageTriageUtil::getTopTriagers() and ApiPageTriageStats topreviewers=1 to Remove PageTriageUtil::getTopTriagers() and ApiPageTriageStats topreviewers=1.Jan 13 2023, 9:19 AM

Change 878938 merged by jenkins-bot:

[mediawiki/extensions/PageTriage@master] DB: Remove pagetriage_log table references

https://gerrit.wikimedia.org/r/878938

I recommand creating a new ticket and setting the needed information for the alter tables for the DBAs. See T322618 as an example.

I recommand creating a new ticket and setting the needed information for the alter tables for the DBAs. See T322618 as an example.

+1 to avoid confusion yeah.

Change 884454 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] wikireplicas: drop views for pagetriage_log

https://gerrit.wikimedia.org/r/884454

I recommand creating a new ticket and setting the needed information for the alter tables for the DBAs. See T322618 as an example.

+1 to avoid confusion yeah.

I created T328309: Remove pagetriage_log table for that. Subscribers to this task, please have a look at that one.

bump PageTriage version? (since database schema is changing)

Regarding that, we removed the version in rEPTRbdd49702ab90: extension.json: Remove version specifier, so I think we can delete that from the checklist.

@BTullis is away at the moment, but hopefully we can get this moving on June 5th.

@Marostegui @Gehel - I'll take care of the maintain-views run now. Thanks for the ping.

Change 884454 merged by Btullis:

[operations/puppet@production] wikireplicas: drop views for pagetriage_log

https://gerrit.wikimedia.org/r/884454

I believe that the maintain-views run is actually a noop. I haven't seen evidence of this view existing on any database.

I did a sudo maintain-views --dry-run --all-databases --replace-all on clouddb1013 and it didn't find anything to replace,

According to here: T328309#8574202 the pagetriage_log table was only ever created on enwiki,testwiki, and test2wiki and they were subsequently deleted in T328309#8686359 by @Marostegui

Is it possible that the maintain-views script was never run to create these views in the first place?

I'm happy to do a full run of maintain-views anyway, but as this stand I think it will not make any changes. Happy to be corrected if I've misunderstood anything.

Novem_Linguae claimed this task.

Sounds like we're done here. Thanks for helping to finish up this ticket.