Page MenuHomePhabricator

Flaky test MessageCacheTest::testLoadFromDB_fetchLatestRevision
Closed, ResolvedPublic

Description

Sometimes, the flaky test MessageCacheTest::testLoadFromDB_fetchLatestRevision blocks merge, see: https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-postgres-php70-docker/2636/console

00:09:16 1) MessageCacheTest::testLoadFromDB_fetchLatestRevision
00:09:16 === Logs generated by test case
00:09:16 [caches] [info] parser: HashBagOStuff {"private":false}
00:09:16 [StashEdit] [debug] Empty cache for key 'postgres-mediawiki-unittest_:stashed-edit-info:8db213b807f4e71df8c09cd5017c785b:5d9ed11feeb6ebe90f34bd8a5086ab44206045d0:90f42dc8c99187877261a42fd0a194b3' ('MediaWiki:Log'); user '127.0.0.1'. []
00:09:16 [MessageCache] [info] MessageCache::load: Loading de... local cache is empty, global cache is empty, loading from database {"private":false}
00:09:16 [MessageCache] [info] MessageCache::load: Loading en... local cache is empty, global cache is empty, loading from database {"private":false}
00:09:16 [wfDebug] [debug] IP: 127.0.0.1 {"private":false}
....
00:09:16 Failed asserting that two strings are identical.
00:09:16 --- Expected
00:09:16 +++ Actual
00:09:16 @@ @@
00:09:16 -' Test drei'
00:09:16 +' IMPORTED OLD TEST'
00:09:16 
00:09:16 /workspace/src/tests/phpunit/includes/cache/MessageCacheTest.php:258
00:09:16 /workspace/src/tests/phpunit/MediaWikiTestCase.php:425
00:09:16 /workspace/src/maintenance/doMaintenance.php:94
00:09:16 
00:09:16 FAILURES!
00:09:16 Tests: 7863, Assertions: 60892, Failures: 1, Skipped: 118.

Patches affected;

Event Timeline

Change 498692 had a related patch set uploaded (by MaxSem; owner: MaxSem):
[mediawiki/core@master] Disable flapping MessageCacheTest::testLoadFromDB_fetchLatestRevision()

https://gerrit.wikimedia.org/r/498692

Change 498692 merged by jenkins-bot:
[mediawiki/core@master] Disable flapping MessageCacheTest::testLoadFromDB_fetchLatestRevision()

https://gerrit.wikimedia.org/r/498692

Hmmm.... I don't see a reason for this to be flaky. If it fails, something is actually wrong. Disabling the test is likely to hide serious errors.

daniel raised the priority of this task from Low to High.Mar 24 2019, 10:29 PM

bumped to high - if this test is "sometimes" failing, it's likely that T218918 is still happening in production "sometimes".

Hmmm.... I don't see a reason for this to be flaky. If it fails, something is actually wrong. Disabling the test is likely to hide serious errors.

@daniel, the reason I mentioned it's flaky is because sometimes it doesn't happen and sometimes it does. Take a look at the patches affected and their history and you'll noticed failure wasn't always. Also, sometimes it blocks merges, sometimes it doesn't. Thanks

Do you suggest we re-enable the test so you can see it happening on patches? Maybe it will help you?

Change 498804 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] DNM: Fix MessagecacheTest::testLoadFromDB_fetchLatestRevision

https://gerrit.wikimedia.org/r/498804

From the occurrences linked in the description it seems that the patch sometimes fails on mediawiki-quibble-composer-mysql-php70-docker while passing on mediawiki-quibble-vendor-mysql-php70-docker/15406/console on the same patch, and vice versa. Also, a simple recheck apparently made it pass again.

It's unclear to me how that kind of randomness can creep into this test, since loadFromDB hits the database directly. Also, the revision that contains the offending content should never become the page's current revision. So either the import goes wrong somehow (in ImportableOldRevisionImporter::import) and WikiPage::updateIfNewerOn() ends up making the imported revision the latest revision, even though it has an ancient timestamp. Or loadFromDB() is not in fact using the fixed conditional that ensures that rev_id=page_latest. Both seem odd and disturbing.

Found the bug, all my fault.

The reason was an invalid timestamp in the test data, which caused the timestamp to default to "now", which was "sometimes" newer than the timestamp of the revision immediately created before, instead of way in the past. Fix is up. Sorry for the mess.

👏👏👏👏👏👏👏👏👏, thank you @daniel, you rock!

zeljkofilipin raised the priority of this task from High to Unbreak Now!.Mar 25 2019, 2:52 PM
zeljkofilipin subscribed.

Train blockers are UBN!

Change 498804 merged by jenkins-bot:
[mediawiki/core@master] Fix MessagecacheTest::testLoadFromDB_fetchLatestRevision

https://gerrit.wikimedia.org/r/498804

Nirmos added a subscriber: Daniel-Barrows.
Nirmos removed a subscriber: Daniel-Barrows.