Page MenuHomePhabricator

Wikibase\DataModel\Services\Lookup\EntityLookupException: Call to a member function getTargetId() on a non-object (null)
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error

Request URL: https://commons.wikimedia.org/wiki/File:Susana_Pavon_Instagram.jpg
Request ID: [XT9auwpAIC4AAL1toesAAACE] 2019-07-29 20:44:43: Fatal exception of type "Wikibase\DataModel\Services\Lookup\EntityLookupException"

Updated Example URLs (T229280#5463723):

message
Wikibase\DataModel\Services\Lookup\EntityLookupException from line 76 of /srv/mediawiki/php-1.34.0-wmf.15/extensions/Wikibase/lib/includes/Store/RevisionBasedEntityLookup.php: Call to a member function getTargetId() on a non-object (null)
trace
#0 /srv/mediawiki/php-1.34.0-wmf.15/vendor/wikibase/data-model-services/src/Lookup/RedirectResolvingEntityLookup.php(70): Wikibase\Lib\Store\RevisionBasedEntityLookup->hasEntity(Wikibase\MediaInfo\DataModel\MediaInfoId)
#1 /srv/mediawiki/php-1.34.0-wmf.15/extensions/Wikibase/repo/RepoHooks.php(945): Wikibase\DataModel\Services\Lookup\RedirectResolvingEntityLookup->hasEntity(Wikibase\MediaInfo\DataModel\MediaInfoId)
#2 /srv/mediawiki/php-1.34.0-wmf.15/includes/Hooks.php(174): Wikibase\RepoHooks::onSkinTemplateBuildNavUrlsNavUrlsAfterPermalink(SkinVector, array, integer, integer)
#3 /srv/mediawiki/php-1.34.0-wmf.15/includes/Hooks.php(202): Hooks::callHook(string, array, array, NULL)
#4 /srv/mediawiki/php-1.34.0-wmf.15/includes/skins/SkinTemplate.php(1308): Hooks::run(string, array)
#5 /srv/mediawiki/php-1.34.0-wmf.15/includes/skins/SkinTemplate.php(454): SkinTemplate->buildNavUrls()
#6 /srv/mediawiki/php-1.34.0-wmf.15/includes/skins/SkinTemplate.php(217): SkinTemplate->prepareQuickTemplate()
#7 /srv/mediawiki/php-1.34.0-wmf.15/includes/OutputPage.php(2580): SkinTemplate->outputPage()
#8 /srv/mediawiki/php-1.34.0-wmf.15/includes/MediaWiki.php(899): OutputPage->output(boolean)
#9 /srv/mediawiki/php-1.34.0-wmf.15/includes/MediaWiki.php(911): Closure$MediaWiki::main()
#10 /srv/mediawiki/php-1.34.0-wmf.15/includes/MediaWiki.php(523): MediaWiki->main()
#11 /srv/mediawiki/php-1.34.0-wmf.15/index.php(42): MediaWiki->run()
#12 /srv/mediawiki/w/index.php(3): include(string)
#13 {main}

Impact

Unclear whether this happens for any other files; appears in Logstash > 2000 times this week.

Notes

Got a bit turned around figuring out where this actually happens, figured others would know much more quickly than I would.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Eric to investigate with a timebox of 2 hours

Timebox of "today" at any rate. I'll post updates based on what I can discover.

I'm still trying to follow the exact code-path here. This is what I can say after digging around so far:

Wikibase has a hook method called onSkinTemplateBuildNavUrlsNavUrlsAfterPermalink that is defined in Wikibase/repo/RepoHooks.php. I think that this method was originally written to support some kind of Wikibase-specific Entity page; the comments here mention that the method "allows us to set up navigation URLs to later be used in the toolbox."

It looks like this hook method also gets called on regular File pages when the MediaInfo extension is enabled, presumably because entity lookup is happening on those pages now.

I have not been able to reproduce this error locally yet. On a normal File page (whether or not Structured Data is already present), this method will be called and the appropriate MediaInfo entity ("M-item") will be looked up. Since our file already has an EntityId (even if no structured data is present yet), the method runs to completion and some navigation URLs are set up. I don't think these are actually used anywhere in the File page.

The method is also called when a user navigates to the page of a recently-deleted file (which I think is part of what was happening here). I deleted a file locally and stepped through the code again in a debugger in an attempt to reproduce this error. onSkinTemplateBuildNavUrlsNavUrlsAfterPermalink gets called as before, but now the EntityId is null. This causes the method to return early and no errors get thrown.

The "path not taken" in this case might be the one that led to the error in question. onSkinTemplateBuildNavUrlsNavUrlsAfterPermalink contains a guard clause that looks like this:

$entityId = $entityIdLookup->getEntityIdForTitle( $title );
if ( $entityId === null || !$entityLookup->hasEntity( $entityId ) ) {
    return;
}

When I walked through the code on a recently-deleted File page, I hit the first condition here, since the $entityId of the page in question was equal to null.

It looks like the second condition is the source of the error. Somehow, the $entityId is non-null but the attempt to actually look it up throws an error.

Could this be caused by some kind of caching issue, maybe? A file has been deleted, but the file page remains accessible to some users with stale caches? Perhaps calls to getEntityIdForTitle for those users are still returning an ID, but attempts to access more information are blowing up because the actual entity the ID used to correspond to has been deleted?

@alaa_wmde any thoughts on this? We seem to be stuck (see Eric's comment above)

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:05 PM

@Ramsey-WMF apologies for late reply.

We'll look into it asap and report back what we find

Krinkle triaged this task as Unbreak Now! priority.EditedSep 4 2019, 1:22 AM
Krinkle subscribed.
From T231795
Error message
[XWznKQpAIDAAAKjiRwoAAABY] /wiki/File:Rockstar_San_Diego_Logo.svg.png   Wikibase\DataModel\Services\Lookup\EntityLookupException from line 76 of /srv/mediawiki/php-1.34.0-wmf.20/extensions/Wikibase/lib/includes/Store/RevisionBasedEntityLookup.php: Call to a member function getTargetId() on a non-object (null)
Notes

7423 hits in the logs for wmf.20 (last week).

First occurrence of the error in wmf.11 (19k). Also noticed in wmf.15 (12k), wmf.16 (59), wmf.17 (5k), wmf.19 (2) and wmf.20 (7k).

The WikibaseRepo hook in the skin system is causing a fatal exception, leading to a generic system error with HTTP 500 status code, which means it cannot be cached (potential traffic risk).

Users are consistently unable to open these files on Commons:

capture.png (302×1 px, 26 KB)

This does not appear to be intermittent or user/cache related.

Can there be a short-term solution even if there are perhaps some complications here for the long-term, so that the user is still able to see the regular file description page. Even if it means just disabling the structured data feature on this page, or handling the exception in some way?

@Cparle did you maybe have luck finding a reproduction path for this bug? Having stared at the code where the exception was thrown there was little hint on in which situations that might actually happen. Admittedly, this staring was rather brief.
Looking at the three example URLs, merging/redirecting seems to be involved.

* https://commons.wikimedia.org/wiki/File:Rockstar_San_Diego_Logo.svg.png
* https://commons.wikimedia.org/w/index.php?title=File:Rockstar_San_Diego_Logo.svg.png&redirect=no (different error).
* also at https://commons.wikimedia.org/w/index.php?title=File:Buddha_with_Naga_(snake).jpg&redirect=no

For instance, https://commons.wikimedia.org/w/index.php?title=File:Buddha_with_Naga_(snake).jpg&redirect=no does throw an exception, whereas omitting the redirect=no does result in being redirected to (correctly presented) file page.

@egardner let's know whether you find something new, esp. regarding the redirect bit to it.. or if you are not working on it actively at the moment, so that we switch to it.

egardner subscribed.

@alaa_wmde I'm not working actively on this at the moment, so I've un-assigned myself.

From the various reports it sounds like there are a couple of probably-related errors here, but I think this is the code path for at least one of them. TL;DR There seems to be an expectation of some bookkeeping happening in the entity content blob when a linked page becomes a redirect page, and that isn't actually happening, which breaks subsequent assumptions.

File:Rockstar_San_Diego_Logo.svg.png has page ID 76630136.

RevisionBasedEntityLookup::hasEntity() calls getLatestRevisionId() on its member EntityRevisionLookup, which is a WikiPageEntityRevisionLookup. WikiPageEntityRevisionLookup loads page and revision info from the DB, and sees that the associated page is a redirect page:

mysql:[email protected] [commonswiki]> select page_is_redirect from page where page_id = 76630136;
+------------------+
| page_is_redirect |
+------------------+
|                1 |
+------------------+
1 row in set (0.00 sec)

Seeing that the page is a redirect, it attempts to load an EntityRedirect object in its private loadEntity method, but instead finds a still-existing, valid MediaInfo entity:

>>> $slot = MediaWiki\MediaWikiServices::getInstance()->getRevisionStore()->getRevisionByTitle( Title::newFromText( 'File:Rockstar_San_Diego_Logo.svg.png' ) )->getSlot( 'mediainfo' );
=> MediaWiki\Revision\SlotRecord {#1459}
>>> $blob = Mediawiki\MediawikiServices::getInstance()->getBlobStore()->getBlob( $slot->getAddress() );
=> "{"type":"mediainfo","id":"M76630136","labels":{"ja":{"language":"ja","value":"\u30ed\u30b4"}},"descriptions":[],"statements":[]}"
>>> $entity = Wikibase\Repo\WikibaseRepo::getDefaultInstance()->getEntityContentDataCodec()->decodeEntity( $blob, $slot->getFormat() );
=> Wikibase\MediaInfo\DataModel\MediaInfo {#6194}

So it returns that and a null EntityRedirect instead of the EntityRedirect that the caller was expecting, and then errors out when calling getTargetId() on the null EntityRedirect.

From looking at the decodeEntity and decodeRedirect methods of EntityContentDataCodec, It looks like the expectation is that the entity content would have a redirect key, causing decodeEntity to return null and signal to the caller to attempt to decode the content as an EntityRedirect, but we as we can see from the content blob, there is no redirect key present:

=> "{"type":"mediainfo","id":"M76630136","labels":{"ja":{"language":"ja","value":"\u30ed\u30b4"}},"descriptions":[],"statements":[]}"

This can be reproduced locally by uploading a file, adding mediainfo content, then editing the page wikitext to hand-create a redirect (e.g., change the page content to #REDIRECT [[File:Foo.jpg]]).

Change 534536 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/extensions/Wikibase@master] Allow for entity content existing on redirect pages

https://gerrit.wikimedia.org/r/534536

Propsed a possible fix, but needs review by someone with deeper Wikibase knowledge than I have in case there are negative implications I'm not seeing.

This approach means that if you have MediaInfo content in a slot on the page, and someone edits the wikitext to make the page into a redirect, then the MediaInfo content becomes inaccessible.

Is this what we want?

The only alternative I can think of off the top of my head would be to prevent a redirect if there was a MediaInfo slot on a page, not sure if this is actually possible though

Yeah, this is something of a stopgap. I think part of the issue here is that just creating a redirect from a duplicate file page seems like the wrong thing to do on an editor's part, though I'd need to check on-wiki for what it has to say about this.

There is the capability to delete a file and its page on grounds that it's a duplicate (though only admins and the like have this right), which seems like what should have happened here, but it doesn't appear to do any reconciliation of content. It would be good if Commons had something like Phabricator's "close as duplicate" functionality that would at least attempt to merge the mediainfo content for the two.

@brennen a fix for T231276 deployed to production yesterday may have (mostly) fixed this, can you verify?

@Cparle @Mholloway while the review of https://gerrit.wikimedia.org/r/534536 is still to happen/on-going, I am wondering about the priority of this task. It's been an Unbreak Now! task for 2+ weeks, if WMDE is blocking resolution of such an important issue, we probably should stop doing so rather quick.

Pinging @brennen again to see if our fix for T231276 has indeed fixed this (if it has we can abandon the other patch @WMDE-leszek )

Looks like it's still in wmf.23.

[XYi8uQpAAEYAAFwX@0kAAABW] /w/index.php?title=File:Buddha_with_Naga_(snake).jpg&redirect=no   Wikibase\DataModel\Services\Lookup\EntityLookupException from line 82 of /srv/mediawiki/php-1.34.0-wmf.23/extensions/Wikibase/lib/includes/Store/RevisionBasedEntityLookup.php: Call to a member function getTargetId() on a non-object (null)
Krinkle set Request URL to https://commons.wikimedia.org/w/index.php?title=File:Buddha_with_Naga_(snake).jpg&redirect=no.Sep 30 2019, 10:00 PM
Krinkle updated the task description. (Show Details)

Still reproducible at the example URL as well.

@Cparle @Mholloway @WMDE-leszek Looks like next step is CR for https://gerrit.wikimedia.org/r/534536. Is this something SDC can/should review internally, or is it blocked on WMDE? Looks like the expectation was for SDC to review internally but delayed due to offsites. Perhaps WMDE can help?

These production URLs have been fatal for over 2 months. Would be good to see this resolved soon one way or another – both as user, and as Logstash crawler. Thanks :)

We'll provide some (hopefully helpful) input for SDC folks in the next 24 hours or so

I checked in the Wikidata team what the status was and we hope to have some additional code for review soon (hopefully today). Sorry about the delay.

Change 543860 had a related patch set uploaded (by WMDE-leszek; owner: WMDE-leszek):
[mediawiki/extensions/Wikibase@master] Added dedicated exception to distinct issue when revision is inconsistently marked as redirect

https://gerrit.wikimedia.org/r/543860

Change 543868 had a related patch set uploaded (by WMDE-leszek; owner: WMDE-leszek):
[mediawiki/extensions/Wikibase@master] Extracted logic of load "entity data" from wiki page to own class

https://gerrit.wikimedia.org/r/543868

Change 543872 had a related patch set uploaded (by WMDE-leszek; owner: WMDE-leszek):
[mediawiki/extensions/WikibaseMediaInfo@master] Added Entity Revision Lookup that can handle inconsistent redirects on file pages

https://gerrit.wikimedia.org/r/543872

I've drafted out the version of https://gerrit.wikimedia.org/r/534536 which I like better as https://gerrit.wikimedia.org/r/543872 and https://gerrit.wikimedia.org/r/543868.
These patches are not fully ready yet (as this was draft, I did pretty sloppy work in the test department), but should be rather easy to get to mergeable state.
I intended to create a nice regression test for the issue visible to users, but failed badly (reason for the delay). I wholeheartedly invite others to give it a try!

Change 534536 abandoned by Mholloway:
Allow for entity content existing on redirect pages

Reason:
Abandoning in favor of Leszek's more thought-out approach (see https://gerrit.wikimedia.org/r/#/q/topic:mediainfo-redirect-kaboom)

https://gerrit.wikimedia.org/r/534536

I abandoned my patch in favor of those from @WMDE-leszek, which look good to me. I'll add my +1s but I think it's probably more appropriate for full-time Wikibase and WikibaseMediaInfo devs to merge.

Change 543860 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Added dedicated exception to distinct issue when revision is inconsistently marked as redirect

https://gerrit.wikimedia.org/r/543860

Change 543868 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Extracted logic of load "entity data" from wiki page to own class

https://gerrit.wikimedia.org/r/543868

Change 545297 had a related patch set uploaded (by Alaa Sarhan; owner: WMDE-leszek):
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.3] Added dedicated exception to distinct issue when revision is inconsistently marked as redirect

https://gerrit.wikimedia.org/r/545297

Change 545301 had a related patch set uploaded (by Alaa Sarhan; owner: WMDE-leszek):
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.3] Extracted logic of load "entity data" from wiki page to own class

https://gerrit.wikimedia.org/r/545301

I see some patches on the branch, I guess the plan is backport these, but I don't see anything scheduled yet.
Will check in our daily which is in just over 1 hour.

Change 545297 abandoned by Addshore:
Added dedicated exception to distinct issue when revision is inconsistently marked as redirect

Reason:
not going to backport now

https://gerrit.wikimedia.org/r/545297

Change 545301 abandoned by Addshore:
Extracted logic of load "entity data" from wiki page to own class

Reason:
not going to backport now

https://gerrit.wikimedia.org/r/545301

Change 543872 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Added Entity Revision Lookup that can handle inconsistent redirects on file pages

https://gerrit.wikimedia.org/r/543872

alaa_wmde claimed this task.