Page MenuHomePhabricator

High frequency production warning: Couldn't find a global ID for user {user}
Closed, ResolvedPublicBUG REPORT

Description

GlobalPreferences frequently fails to determine a user's global user ID and logs a warning, currently over a hundred per minute.
This seems to happen in bursts of up to ten warnings per user over a couple of seconds, often spanning two or more wikis and two app servers.

An obvious candidate cause is replication lag, but the issue seems to occur for old user accounts as well as for newly created users.

Example reqIds: e48af272-371f-4a3e-89e5-22e4772099bd, e48af272-371f-4a3e-89e5-22e4772099bd


Error
normalized_message
Couldn't find a global ID for user {user}

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
BPirkle triaged this task as Medium priority.Jun 22 2021, 9:18 PM
BPirkle moved this task from Inbox to Later on the Platform Team Workboards (Clinic Duty Team) board.
Func changed the subtype of this task from "Bug Report" to "Production Error".Mar 9 2022, 8:13 AM
Krinkle subscribed.

This appears to be about a diagnostic message logged in the GlobalPreferences channel by GlobalPreferencesFactory.php. This is not a production error, untagging as such. This is not a statement of severity, just of category. There are plenty of tasks that represent important bugs or tech debt that aren't prod errors.

Hi, does this still happen frequently these days?
It's weird that an error that can cause a great difference in the appearance of UI is suppressed to a warning in advance. T302627 can be caused by this.

Hi, does this still happen frequently these days?

It does, I see about 1.4k entries in the last 15 minutes.

As far as I can tell the CentralIdLookup API offered by CentralAuth works fine:

taavi@mwmaint1002 ~ $ mwscript shell.php enwiki
Psy Shell v0.11.1 (PHP 7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf5 — cli) by Justin Hileman
>>> $lookup = \MediaWiki\MediaWikiServices::getInstance()->getCentralIdLookupFactory()->getLookup();
=> MediaWiki\Extension\CentralAuth\User\CentralAuthIdLookup {#3323}
>>> $lookup->centralIdFromLocalUser( User::newFromName( 'Majavah' ), CentralIdLookup::AUDIENCE_RAW );
=> 57461233

Hmm, all of the requests I just checked on logstash with this warning were new account creations. Maybe some race condition somewhere?

In T285216#7806550, @Majavah wrote:

Hmm, all of the requests I just checked on logstash with this warning were new account creations. Maybe some race condition somewhere?

Just in the same request of new account creations? Maybe we can filter out them by handling the LocalUserCreated hook, since looking for new users' preference don't make much sense. Then we can see if there are any remaining warnings for old accounts mentioned in the task description.

Change 773904 had a related patch set uploaded (by Func; author: Func):

[mediawiki/extensions/GlobalPreferences@master] Avoid unnecessary calls to fetching global preferences

https://gerrit.wikimedia.org/r/773904

Change 773904 merged by jenkins-bot:

[mediawiki/extensions/GlobalPreferences@master] Avoid unnecessary calls to fetching global preferences

https://gerrit.wikimedia.org/r/773904

Noting that the frequency of this warning dropped dramatically around June 30, once https://gerrit.wikimedia.org/r/773904 got to group2. However I still see thousands of entries each day, but they could have different causes: https://logstash.wikimedia.org/goto/b51177280bfcdd35ce6a2e95dbb038fe (this takes a while to load, FYI)

I was able to reproduce this locally by navigating to Special:Preferences with a central session but no local account. Early in the request, I got this stack trace:

  • CentralAuthUser.php:2493, MediaWiki\Extension\CentralAuth\User\CentralAuthUser->loadAttached()
  • CentralAuthUser.php:3177, MediaWiki\Extension\CentralAuth\User\CentralAuthUser->attachedOn()
  • CentralAuthIdLookup.php:122, MediaWiki\Extension\CentralAuth\User\CentralAuthIdLookup->isAttached()
  • CentralAuthIdLookup.php:129, MediaWiki\Extension\CentralAuth\User\CentralAuthIdLookup->centralIdFromLocalUser()
  • GlobalPreferencesFactory.php:508, {closure:GlobalPreferencesFactory.php:506}()
  • MapCacheLRU.php:271, MapCacheLRU->getWithSetCallback()
  • GlobalPreferencesFactory.php:509, GlobalPreferences\GlobalPreferencesFactory->getUserID()
  • GlobalPreferencesFactory.php:494, GlobalPreferences\GlobalPreferencesFactory->isUserGlobalized()
  • Hooks.php:86, GlobalPreferences\Hooks->onLoadUserOptions()
  • HookContainer.php:159, MediaWiki\HookContainer\HookContainer->run()
  • HookRunner.php:4313, MediaWiki\HookContainer\HookRunner->onLoadUserOptions()
  • UserOptionsManager.php:676, MediaWiki\User\Options\UserOptionsManager->loadOriginalOptions()
  • UserOptionsManager.php:452, MediaWiki\User\Options\UserOptionsManager->saveOptionsInternal()
  • UserOptionsManager.php:421, MediaWiki\User\Options\UserOptionsManager->saveOptions()
  • User.php:2757, MediaWiki\User\User->addToDatabase()
  • AuthManager.php:1911, MediaWiki\Auth\AuthManager->autoCreateUser()
  • Setup.php:585, require_once()
  • WebStart.php:85, require()
  • index.php:50, {main}()

Attachment is being queried while User::addToDatabase() is still in the stack. CentralAuth has not yet had a chance to do attachment. A global ID of zero is saved into GlobalPreferencesFactory::$centralIds and that cache is not cleared once the user is attached. So, later in the same request, the global ID is still zero and the warning is logged.

Func changed the subtype of this task from "Production Error" to "Bug Report".Feb 16 2024, 6:30 AM
Func removed a project: Wikimedia-production-error.

Change #1016078 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/extensions/GlobalPreferences@master] Remove central ID cache

https://gerrit.wikimedia.org/r/1016078

Change #1016078 merged by jenkins-bot:

[mediawiki/extensions/GlobalPreferences@master] Remove central ID cache

https://gerrit.wikimedia.org/r/1016078