Page MenuHomePhabricator

[IP Masking] purgeTemporaryAccounts.php tries to revoke access for non-temporary accounts
Closed, ResolvedPublic4 Estimated Story PointsBUG REPORT

Description

Steps to replicate the issue (include links if applicable):

Run extensions/CentralAuth/maintenance/expireTemporaryAccounts.php --verbose --frequency 1 on a wiki with IP Masking enabled (such as beta commonswiki).

What happens?:

Output from beta commonswiki:

commonswiki:  Revoking access for 108.161.128.185
commonswiki:  Revoking access for 98.170.57.231
commonswiki:  Revoking access for 79.175.106.146
commonswiki:  Revoking access for 194.190.170.151
commonswiki:  Revoking access for 93.93.206.230
commonswiki:  Revoking access for 103.130.113.146
commonswiki:  Revoking access for 139.162.241.108
commonswiki:  Revoking access for 107.175.86.228
commonswiki:  Revoking access for 196.247.18.242
commonswiki:  Revoking access for 192.210.230.251
commonswiki:  Revoking access for 96.8.117.122
commonswiki:  Revoking access for 23.95.219.198
commonswiki:  Revoking access for 96.8.120.86

What should have happened instead?:

Only temporary accounts are affected by the purging script.

Event Timeline

Urbanecm_WMF triaged this task as Medium priority.
Urbanecm_WMF set the point value for this task to 4.

There are two sub-bugs here.

  1. expireTemporaryAccounts.php happily revokes access for non-temporary accounts, even though it is trivial to double-check that is not being done, since it is outside of the script's remit.
  2. Calling iterator_to_array(\MediaWiki\MediaWikiServices::getInstance()->get('CentralAuth.GlobalUserSelectQueryBuilderFactory')->newGlobalUserSelectQueryBuilder()->temp()->limit(2)->fetchLocalUserIdentitites()) in beta commonslabs (a) shows much more than 2 accounts (b) includes non-temporary accounts.

(1) is clear and easy to fix. For (2), this is more interesting. Running shell.php with verbose logging (-d 2) shows the following:

> $r = iterator_to_array(\MediaWiki\MediaWikiServices::getInstance()->get('CentralAuth.GlobalUserSelectQueryBuilderFactory')->newGlobalUserSelectQueryBuilder()->temp()->limit(2)->fetchLocalUserIdentitites())                                                             
[debug] [rdbms] Wikimedia\Rdbms\DatabaseMySQL::doSelectDomain [0.001s] deployment-db12: USE `centralauth`                                                                                                                                                                   
[debug] [rdbms] Wikimedia\Rdbms\SelectQueryBuilder [0.002s] deployment-db12: SELECT  gu_id,gu_name,lu_wiki,gu_salt,gu_password,gu_auth_token,gu_locked,gu_hidden_level,gu_registration,gu_email,gu_email_authenticated,gu_home_db,gu_cas_token,lu_local_id  FROM `globaluser
` LEFT OUTER JOIN `localuser` ON ((gu_name=lu_name) AND lu_wiki = 'commonswiki')   WHERE (gu_name  LIKE '*%' ESCAPE '`' )  LIMIT 2                                                                                                                                          
[debug] [rdbms] Wikimedia\Rdbms\DatabaseMySQL::doSelectDomain [0s] deployment-db12: USE `commonswiki`                                                                                                                                                                       
[debug] [rdbms] Wikimedia\Rdbms\LoadBalancer::reuseOrOpenConnectionForNewRef: reusing connection for 1/commonswiki                                                                                                                                                          
[debug] [rdbms] Wikimedia\Rdbms\SelectQueryBuilder [0.012s] deployment-db12: SELECT  actor_id,actor_name,actor_user  FROM `actor`    WHERE (actor_user = '22824' OR actor_user IS NULL)                                                                                     

[...]

> count($r)
= 2574

> $r[2]->getName()
= "95.103.99.3"

>

The code makes two queries: first it runs a query against centralauth.globaluser and centralauth.localuser to get local user IDs, and then it makes a query against actor to get usernames. The second query is where it breaks: the query includes actor_user IS NULL, which is not intended and it means the query includes all anonymous actors. The second query is made via UserSelectQueryBuilder. However, when running that service directly:

> iterator_to_array(\MediaWiki\MediaWikiServices::getInstance()->getUserIdentityLookup()->newSelectQueryBuilder()->whereUserIds(['22825', '22824'])->fetchUserIdentities())
[debug] [rdbms] Wikimedia\Rdbms\LoadBalancer::reuseOrOpenConnectionForNewRef: reusing connection for 2/commonswiki
[debug] [rdbms] Wikimedia\Rdbms\SelectQueryBuilder [0.001s] deployment-db13: SELECT  actor_id,actor_name,actor_user  FROM `actor`    WHERE actor_user IN ('22825','22824')   
= [
    MediaWiki\User\UserIdentityValue {#8956},
    MediaWiki\User\UserIdentityValue {#6397},
  ]

>

The correct result is obtained, and no actor_user IS NULL is present in the query.

This leads me to believe the CentralAuth part is broken. And indeed that is the case:

> $builder = \MediaWiki\MediaWikiServices::getInstance()->get('CentralAuth.GlobalUserSelectQueryBuilderFactory')->newGlobalUserSelectQueryBuilder()->temp()->limit(2)                                                                                                       
[debug] [rdbms] Wikimedia\Rdbms\DatabaseMySQL::doSelectDomain [0.001s] deployment-db13: USE `centralauth`                                                                                                                                                                   
= MediaWiki\Extension\CentralAuth\User\GlobalUserSelectQueryBuilder {#11579}                                                                                                                                                                                                
                                                                                                                                                                                                                                                                            
> sudo $builder->init()                                                                                                                                                                                                                                                     
= null                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                            
> sudo $builder->field( 'lu_local_id' );    
                                                                                                                                                                                                                                
> sudo $rs = $builder->fetchResultSet()
[debug] [rdbms] Wikimedia\Rdbms\SelectQueryBuilder [0.002s] deployment-db13: SELECT  gu_id,gu_name,lu_wiki,gu_salt,gu_password,gu_auth_token,gu_locked,gu_hidden_level,gu_registration,gu_email,gu_email_authenticated,gu_home_db,gu_cas_token,lu_local_id  FROM `globaluse$` LEFT OUTER JOIN `localuser` ON ((gu_name=lu_name) AND lu_wiki = 'commonswiki')   WHERE (gu_name  LIKE '*%' ESCAPE '`' )  LIMIT 2

> $rows = iterator_to_array($rs)

> foreach ( $rows as $row ) {                                                                                                                                                                                                                                              
$localUserIds[] = $row->lu_local_id;
}
> $localUserIds
= [
    "22824",
    null,
  ]

>

So, we're calling the user query builder with null as the ID. This should likely throw, but at the very least, we need to deal with null values in CentralAuth's query builder somehow.

Change 975092 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/core@master] expireTemporaryAccounts: Only process temporary accounts

https://gerrit.wikimedia.org/r/975092

Change 975092 merged by jenkins-bot:

[mediawiki/core@master] expireTemporaryAccounts: Only process temporary accounts

https://gerrit.wikimedia.org/r/975092

Etonkovidova subscribed.

Checked on beta - it works:

$ mwscript extensions/CentralAuth/maintenance/expireTemporaryAccounts.php --wiki=commonswiki --verbose --frequency 1  
Revoking access for *Unregistered 86161
Revoked access for 1 temporary users.