Page MenuHomePhabricator

Flow integration tests block GrowthExperiments merges with an unexpected ContainerDisabledException
Closed, ResolvedPublic

Description

It seems Flow tests are consistently failing, example build 138397 and it is blocking GrowthExperiments merging.

1) Flow\Tests\Api\ApiFlowEditTopicSummaryTest::testEditTopicSummary
13:19:25 Wikimedia\Services\ContainerDisabledException: Container disabled!
13:19:25 
13:19:25 /workspace/src/vendor/wikimedia/services/src/ServiceContainer.php:403
13:19:25 /workspace/src/includes/MediaWikiServices.php:349
13:19:25 /workspace/src/includes/MediaWikiServices.php:1602
13:19:25 /workspace/src/includes/ServiceWiring.php:2324
13:19:25 /workspace/src/includes/user/UserGroupManager.php:903
13:19:25 /workspace/src/extensions/Flow/includes/TalkpageManager.php:209
13:19:25 /workspace/src/extensions/Flow/includes/TalkpageManager.php:83
13:19:25 /workspace/src/extensions/Flow/includes/SubmissionHandler.php:158
13:19:25 /workspace/src/extensions/Flow/includes/WorkflowLoader.php:66
13:19:25 /workspace/src/extensions/Flow/includes/Api/ApiFlowBasePost.php:35
13:19:25 /workspace/src/extensions/Flow/includes/Api/ApiFlow.php:100
13:19:25 /workspace/src/includes/api/ApiMain.php:1931
13:19:25 /workspace/src/includes/api/ApiMain.php:877
13:19:25 /workspace/src/tests/phpunit/includes/api/ApiTestCase.php:169
13:19:25 /workspace/src/extensions/Flow/tests/phpunit/Api/ApiTestCase.php:71
13:19:25 /workspace/src/tests/phpunit/includes/api/ApiTestCase.php:209
13:19:25 /workspace/src/extensions/Flow/tests/phpunit/Api/ApiTestCase.php:86
13:19:25 /workspace/src/extensions/Flow/tests/phpunit/Api/ApiFlowEditTopicSummaryTest.php:20
13:19:25 /workspace/src/tests/phpunit/includes/api/ApiTestCase.php:322
13:19:25 phpvfscomposer:///workspace/src/vendor/phpunit/phpunit/phpunit:97

There are 5 errors in total yielding a ContainerDisabledException

Event Timeline

I can reproduce the issue locally, but only when I run all Flow tests together (composer phpunit:entrypoint extensions/Flow/tests/phpunit/). When I run only composer phpunit:entrypoint extensions/Flow/tests/phpunit/Api/ApiFlowEditTopicSummaryTest.php, everything passes. This makes me think the issue is caused by another Flow test not being perfectly self-contained and breaking the service container by somehow calling MediaWikiServices::destroy and not re-initing the container afterwards, which appears to be the only way how this exception can be triggered. The question is why this happens.

Since the issue is reproducible by running only Flow's API tests, the issue has to be somewhere within the API tests. Checking calls to MediaWikiServices::destroy in test code shows:

  • MediaWikiIntegrationTestCase::overrideMwServices
  • MediaWikiIntegrationTestCase::restoreMwServices

I individually commented out both destroy calls to find out if disabling any of them would get the tests passing. When I commented only the second one, the Flow tests passed.

Urbanecm_WMF renamed this task from Exception while running phpunit tests to Flow integration tests block GrowthExperiments merges with an unexpected ContainerDisabledException.Nov 24 2023, 9:20 AM

The issue is caused by 946538: Replace MediaWikiIntegrationTestCase::$tablesUsed with automatic query tracking. I don't currently understand how exactly, but reverting that patch locally makes the issue go away. Pinging @Daimona @tstarling as the patch author/reviewer – can you help with fixing this problem, please?

FWIW, this currently breaks all merges to GrowthExperiments.

I can reproduce the issue locally, but only when I run all Flow tests together (composer phpunit:entrypoint extensions/Flow/tests/phpunit/).

Yup, I too can reproduce this locally, by running just ApiFlowEditTitleTest and ApiFlowEditTopicSummaryTest.

The issue is caused by 946538: Replace MediaWikiIntegrationTestCase::$tablesUsed with automatic query tracking. I don't currently understand how exactly, but reverting that patch locally makes the issue go away.

Reverting that patch does not fix the issue for me. In fact, the test still fails if I check out REL1_41 of both core and Flow. BUT, reverting the tablesUsed patch does fix the issue if I run all the tests in Flow/tests/phpunit/Api, instead of just the two tests above. Which presumably means the code has "always" been broken, but some mysterious side effect (caused by a combination of running multiple tests AND using tablesUsed) was hiding the failure. The tablesUsed patch removed one of the two causes of that side effect, and the bug came up. Mostly, I'm happy that things have (finally) broken visibly.

I still haven't determined what the side effect in question is.

Also, it looks like the failure is caused by the setCurrentUser call in Flow's ApiTestCase::setUp, and ::getContainer in particular. I'm really looking forward to discovering the new, fun way in which Flow's custom service container is setting MediaWiki on fire.

Change 977290 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/extensions/Flow@master] tests: Remove ApiTestCase::setCurrentUser

https://gerrit.wikimedia.org/r/977290

Just removing this Container call in the test shows other places with failure from AbuseFilter/CheckUser involved, clearing the hooks works in that situation, not sure if that is the best fix.

Flow's container logic is not the best as there are now services (T170330). The global user is in the container and in the FlowUser service, two places are not good to setup tests correctly.

Change 977290 merged by jenkins-bot:

[mediawiki/extensions/Flow@master] tests: Remove ApiTestCase::setCurrentUser

https://gerrit.wikimedia.org/r/977290

Urbanecm_WMF assigned this task to Umherirrender.

Just removing this Container call in the test shows other places with failure from AbuseFilter/CheckUser involved, clearing the hooks works in that situation, not sure if that is the best fix.

Thank you so much for the fix! I've merged it, since the patch does its job and it's preferable having a working CI over a perfect solution. Since the tests pass now, I'm also resolving the task – feel free to log a task for future test improvements if needed.

Flow's container logic is not the best as there are now services (T170330). The global user is in the container and in the FlowUser service, two places are not good to setup tests correctly.

Agreed. FWIW, the Growth-Team is currently exploring ways to potentially deprecate Flow in the future (see T346108 and the project page).