
Cite error category rapidly populating with phantom entries
Closed, ResolvedPublic

Description

See [[Wikipedia:Village pump (technical)#false negatives in category:Pages with missing references list]]. [[Category:Pages with missing references list]] on enwiki keeps filling up with phantom entries.

These pages disappear upon doing a forcelinkupdate purge through the API, but in the time it took me to type part of a reply to the village pump thread, nearly another 75 pages were added to the category, all of which disappeared when another API purge was performed.
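For readers unfamiliar with this kind of purge, here is a minimal sketch of how such a request could be assembled. The thread does not say how the purge was actually issued, so treat this as an assumption: it uses the MediaWiki API's `action=purge` with `forcelinkupdate` and the `categorymembers` generator, and the helper name `build_purge_request` is made up for illustration. It only builds the request and does not send it; a real purge would have to be sent as a POST.

```python
from urllib.parse import urlencode

# Standard enwiki API endpoint.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_purge_request(category, limit=50):
    """Return (url, body) for a POST purging the members of a category.

    Hypothetical helper: it only assembles the request, nothing is sent.
    """
    params = {
        "action": "purge",
        "forcelinkupdate": "1",          # also refresh the links tables
        "generator": "categorymembers",  # operate on the category's members
        "gcmtitle": category,
        "gcmlimit": str(limit),
        "format": "json",
    }
    return API_URL, urlencode(params)


url, body = build_purge_request("Category:Pages with missing references list")
print(url)
print(body)
```

Each request only covers up to `gcmlimit` pages, so emptying a large category this way would take repeated requests.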

This may have more to do with the citation extension than with categories.


Version: master
Severity: normal
URL: https://en.wikipedia.org/wiki/Category:Pages_with_missing_references_list

Details

Reference
bz46978

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:19 AM
bzimport added a project: Cite.
bzimport set Reference to bz46978.

That's odd.

To take a total stab in the dark, maybe Cite's clearState hook isn't being called properly somehow, and then its state gets carried over between pages when the job queue is run.

Only recent change in
https://gerrit.wikimedia.org/r/#/q/message:+Categories++project:mediawiki/core+is:merged,n,z

It probably wouldn't be a "category" change. More likely a "parser" or "job queue" change.

Last time we had a bug kind of like this (bug 31576/bug 33409) it was super-obscure [Note, I am *not* suggesting the two are related, just that this may have a non-obvious cause].

For how long have you seen this problem? I've also asked this in
https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#false_negatives_in_category:Pages_with_missing_references_list

According to User:Johnmperry it began on the Thursday before Easter, 28th March.

http://en.wikipedia.org/w/index.php?title=Wikipedia:Village_pump_%28technical%29&diff=549411580&oldid=549410577

Posting his full comment here for posterity:

:It was "always" a minor irritant, but then the Thursday (my time) before Easter it suddenly escalated - overnight the entries on the list jumped from less than 20 to 300 or more. (usually there are perhaps 40 true bad 'uns). This actually took more than a week to clear, as each entry had to be inspected and nulledited at a minimum. Then it suddenly, last Friday I think, jumped back from clear to in excess of 200, which is when I started this particular ball rolling. I was monitoring yesterday: it is not a steady trickle but a whole batch at once. There seemed to be three waves. Fortunately I am now able to purge the set through the sandbox. I ran this once in the(my) morning, which removed more than 50, then again later in the afternoon which removed a similar number. Finally in the evening the list jumped from around 10 to more than 50 and I ran it again. I still have the log for that, but it doesn't tell you much, doesn't have a timestamp or anything. Hope this helps. --Johnmperry


Which is interesting. So perhaps this always happened, and recently a lot more pages are having their links refreshed (due to Wikidata maybe?), which is pushing the problem up to more noticeable levels.

Guess I should go test cite locally with runJobs.php to see if my pet theory about cite-being-broken-when-parsing-is-run-via-a-job is right...

john37perry wrote:

Well the two features which I think counter your theory are:
. it happens in discernible 'batches' rather than a steady trickle
. it happens to pages that haven't been edited for weeks. Unfortunately I can't tell you most of them, because making a null edit doesn't show up on my contributions list. Maybe you have access to something else

Actually, whatever it was, seems to have stopped. I think I ran the api purger yesterday morning, and not since. Not only that, but I think even the previously usual content of perhaps 50% duds seems to have dried up too. Something has changed somewhere.

~~~~

john37perry wrote:

maybe spoke too soon - just ran purge again and it removed 20-30 (i.e. all except permanent)

john37perry wrote:

What is happening now (18:45 UTC+8) - I purge the page and more or less everything goes. Then they come back a few minutes later.

(In reply to comment #6)

Well the two features which I think counter your theory are:
. it happens in discernible 'batches' rather than a steady trickle

That's actually somewhat consistent with the job queue. Someone edits a big template and a big batch of jobs runs all at once.

. it happens to pages that haven't been edited for weeks.

But did they use a template that got edited recently, or did the langlinks at wikidata get edited recently?

Unfortunately I can't tell you most of them, because making a null edit doesn't show up on my contributions list. Maybe you have access to something else

I don't have access to anything you don't have (I'm just a volunteer). And I should note that my guess is just a guess, so there's quite a possibility it is wrong.

A list of example pages wouldn't be that useful (to me anyhow). The parser output of the pages after going through a refreshlinks job might be mildly interesting, but there's not an easy way to get that.

Actually, whatever it was, seems to have stopped. I think I ran the api purger yesterday morning, and not since. Not only that, but I think even the previously usual content of perhaps 50% duds seems to have dried up too. Something has changed somewhere.

~~~~

Guess I should go test cite locally with runJobs.php to see if my pet theory about cite-being-broken-when-parsing-is-run-via-a-job is right...

Ok, I tried messing around with the job queue locally to see if I could reproduce this, and I could not. So either that theory isn't right or, more likely, my config is different from WMF's. (Additionally, the volume of activity on Wikipedia seems to suggest that this wouldn't happen all the time, but only occasionally.)

john37perry wrote:

Another idea which occurs to me is that we can't discount the notion of "not good faith". If I am able to run an API process which pops pages off the list, then I imagine it must be possible to run something to do the opposite - pop an item onto the list.

It got particularly bad early evening here yesterday (around 18:00 local, which would be 10:00 UTC) when I would purge the list down to a handful, then after I'd worked on one the list would be back up to 30-40, sometimes with the same pages as before. Went round that cycle several times, then after a couple of hours it went quiet again.

I don't think this is malicious. You would have a lot more things in the category, and there are probably more effective and easier methods to make an attack.

john37perry wrote:

There is a [[WP:VPT#Articles randomly showing up in categories|new bleat]] which suggests the same problem with other categories.

Jason_quinn wrote:

I hope it has nothing to do with anything but I have been making lots of edits to fix "citation needed" template usage errors (about 1000 since March 20th, or 50 or so a day). I have also recently changed Template:Fix and a bunch of inline templates. I don't think it could (should?) be responsible for any of this but the timing, number of edits, and "batch" style of this problem, make me wonder.

See [[Wikipedia:Village_pump_(technical)#false_negatives_in_category:Pages_with_missing_references_list]], for more details.

On the flip side, the first "phantom" entry I noticed, which ultimately led me to this discussion, was one I had never edited at all. And it was in [[Category:Pages containing citation needed template with deprecated parameters]], not a cite category.

Might be a broader issue, see bug 31577

Assigning to Aaron for initial debugging per today's MW Core meeting.

Created attachment 12113
a debugging patch

It might be interesting to put some live debugging/logging here.

The category is showing up on pages which have no <ref>'s, which would seem to indicate that clearState hook isn't being run. It might be interesting to see what the content of mRefs is on such a page, since it should be empty.


I found a bug (bug 47291) where the parser clone used by MessageCache would not be able to be cleared with Cite::clearState.

In some testing, it seems that if this bug occurs during a job queue run, then all subsequent pages processed during the run will wind up in the error category if they include any parsed/transformed messages.
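To make the failure mode concrete, here is a deliberately simplified toy model in Python. It is not the Cite extension's code (which is PHP), and every name in it (`ToyCite`, `run_job_queue`, `clear_between_pages`) is hypothetical. It only illustrates the general pattern described above: if per-page state is not cleared between parses in one long-running process, a single leftover reference is enough to flag every later page, including pages with no refs at all.

```python
class ToyCite:
    """Hypothetical stand-in for an extension that keeps per-parse state."""

    def __init__(self):
        # Loosely analogous to the mRefs state mentioned earlier in the thread.
        self.refs = []

    def clear_state(self):
        # What a clearState-style hook is expected to do before each parse.
        self.refs.clear()

    def on_ref(self, text):
        self.refs.append(text)

    def finish(self, has_references_list):
        """Return True if the page should land in the error category."""
        if has_references_list:
            self.refs.clear()  # a references list consumes the pending refs
            return False
        return bool(self.refs)  # pending refs but no list: flag the page


def run_job_queue(pages, clear_between_pages):
    """Parse pages in one process; pages are (title, refs, has_list) tuples."""
    cite = ToyCite()
    flagged = []
    for title, refs, has_list in pages:
        if clear_between_pages:
            cite.clear_state()
        for ref in refs:
            cite.on_ref(ref)
        if cite.finish(has_list):
            flagged.append(title)
    return flagged


pages = [
    ("A", ["some ref"], False),  # genuinely missing its references list
    ("B", [], False),            # no refs at all
    ("C", [], False),            # no refs at all
]
print(run_job_queue(pages, clear_between_pages=True))
print(run_job_queue(pages, clear_between_pages=False))
```

With clearing, only the genuinely broken page is flagged. Without it, the stale ref from page A is still pending when B and C are processed, so they are flagged too, which resembles the "phantom" entries arriving in batches.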

john37perry wrote:

I don't want to say it too loudly in case it's listening, but the problem has not appeared, no false-negatives, for more than 24 hours. Maybe 48, I forget.

Jason_quinn wrote:

The [[Nicholas Purcell of Loughmoe]] article has been in [[:Category:Pages containing citation needed template with deprecated parameters]] for about 4 or 5 days. It just appeared there, despite the article history not showing any edits since 14 October 2012.

During the last week, a couple other false entries suddenly appeared in that category despite no recent edit in the history. But within a day of each appearance the history updated to show that there *was* an edit underlying its inclusion. For some reason there was just a long lag before it got processed.

I've been assuming the same thing would happen to [[Nicholas Purcell of Loughmoe]], but no edit history update has occurred for almost a week now. This may be a useful lead about these category issues.

(In reply to comment #21)

The [[Nicholas Purcell of Loughmoe]] article has been in [[:Category:Pages containing citation needed template with deprecated parameters]] for about 4 or 5 days. It just appeared there, despite the article history not showing any edits since 14 October 2012.

During the last week, a couple other false entries suddenly appeared in that category despite no recent edit in the history. But within a day of each appearance the history updated to show that there *was* an edit underlying its inclusion. For some reason there was just a long lag before it got processed.

I've been assuming the same thing would happen to [[Nicholas Purcell of Loughmoe]], but no edit history update has occurred for almost a week now. This may be a useful lead about these category issues.

I suspect the issue will go away once Gerrit change Id3e91c41dc33a703b5326961fd57e1 goes live on Wikipedia.

Jason_quinn wrote:

@Bawolff and John Perry

I finished emptying [[:Category:Pages containing citation needed template with deprecated parameters]] in the last couple of days. It seems highly suspicious that the problem also stopped around then. Again, the timing of the onset and now the offset of this bug makes it seem like the two may have been related, although I don't know why.

Aaron tells me that Brad's fix should fix this one as soon as it's backported (or once 1.22wmf3 gets deployed). Assigning to Brad, but leaving open to track the backporting.

Backported to 1.22wmf2, which is already deployed to most wikis and will be to the rest tomorrow.