
mwdebug: people in the "deployment" group should be able to launch 'experimental' instances for testing purposes
Open, Medium, Public

Description

Right now the deployer experience for doing any test "in a production environment" is the following:

  • pick a debug server no one is supposedly using
  • patch mediawiki's code or add a new file in /srv/mediawiki/w
  • reach the server via the wikimedia-debug extension

We want developers to have a similar if not better experience on kubernetes. I think something like the following would work:

  • The developer gives a CLI utility either a file to add or a patch file that can be *cleanly* applied
  • The CLI utility deploys their patch to Kubernetes and wires it into the ingress as <devname>.mwdebug.discovery.wmnet
  • This address can be reached via the debug extension by providing the developer's username
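
As a minimal sketch of the naming step only: the function below derives the per-developer ingress host from a username. The domain suffix comes from the description above; the function name `debug_hostname` and the decision to lowercase and validate the username as a DNS label are assumptions for illustration, not part of any existing tool.

```python
import re

# Suffix taken from the task description.
DEBUG_DOMAIN = "mwdebug.discovery.wmnet"

# A DNS label: letters, digits and hyphens, 1 to 63 characters,
# not starting or ending with a hyphen.
_DNS_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def debug_hostname(devname: str) -> str:
    """Return <devname>.mwdebug.discovery.wmnet (hypothetical helper)."""
    label = devname.strip().lower()
    if not _DNS_LABEL.fullmatch(label):
        raise ValueError(f"{devname!r} cannot be used as a DNS label")
    return f"{label}.{DEBUG_DOMAIN}"
```

Rejecting unusable usernames up front keeps the utility from creating an ingress rule that could never resolve.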

To allow the above we'd need:

  • Add a stanza to allow the use of ingress in the mediawiki chart
  • Allow applying a patch to the running code, either by building a specialized image or by allowing the code to be patched live. For adding files we already have a solution: debug.php.contents provides the additional files, which will be reachable at /w/debug/<key>.php
  • Destroy the additional release once done
  • Write a CLI utility to manage the above in a semi-automated way
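
To illustrate the file-adding path, here is a rough sketch of how a utility might turn a set of extra PHP files into chart values. It assumes that debug.php.contents is a nested values key mapping each <key> to the file body; that layout, the key validation rule, and both function names are assumptions rather than the chart's documented interface.

```python
import re

# Hypothetical rule for what is allowed as <key> in /w/debug/<key>.php.
_KEY = re.compile(r"[A-Za-z0-9_-]+")


def debug_file_values(files: dict[str, str]) -> dict:
    """Build a values fragment under debug.php.contents (assumed layout)."""
    for key in files:
        if not _KEY.fullmatch(key):
            raise ValueError(f"unsupported debug file key: {key!r}")
    return {"debug": {"php": {"contents": dict(files)}}}


def debug_url_path(key: str) -> str:
    """Path under which the file would be served, per the description."""
    return f"/w/debug/{key}.php"
```

For example, `debug_file_values({"probe": "<?php echo 'ok';"})` yields a fragment whose file would be reachable at `debug_url_path("probe")`.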

I think we can probably create a simplified version of the above (basically without the patching part) to unblock the migration.

Event Timeline

Joe triaged this task as Medium priority. Nov 29 2022, 12:13 PM
Joe created this task.
Joe moved this task from Incoming 🐫 to Backlog FY24-25 🚜 on the serviceops board.

An alternative idea from @akosiaris which is also very interesting:

Instead of being dynamically created, deployments for individual developers who previously used mwdebug are properly defined in helmfile, and will be limited to run on a specific k8s node by taints. Each of these deployments will load its code from a hostPath where developers will have a checkout of the mediawiki code.

@Krinkle @taavi @Legoktm and @Ladsgroup also expressed the need to get a shell inside the instance to run modified maintenance scripts - this would need some modifications to our permissions model, but I think that we might have a way to solve that independently once we've worked on porting over the maintenance scripts.

So the steps to do what Alex proposed would be:

  • Add support for ingress to the mediawiki chart
  • Add new releases to the mw-debug deployment, one per developer who requests it
  • Add support to load /srv/mediawiki from a hostPath to mediawiki
  • Find a way to constantly update the code in those hostPaths via scap deployments (I would say the best way is to unpack the contents of a mediawiki container there, but it also seems like quite a bespoke solution and I'm not sure I love it.)
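
A sketch of what the hostPath and taint parts of such a release could look like, written as a Python dict mirroring the Kubernetes pod spec fields `tolerations`, `nodeSelector`, `volumes` and `volumeMounts`. The taint key and value, the node label, the volume name and the `/srv/mw-debug` directory on the node are all hypothetical; only the `/srv/mediawiki` mount point comes from the steps above. Note that a taint only keeps other pods off the node, so pinning the developer pod to that node also needs a node selector or affinity.

```python
def hostpath_pod_fragment(
    devname: str,
    taint_key: str = "dedicated",      # hypothetical taint key / node label
    taint_value: str = "mw-debug",     # hypothetical taint value
    host_base: str = "/srv/mw-debug",  # hypothetical directory on the node
) -> dict:
    """Return pod-spec fields for one developer release (illustrative only)."""
    volume_name = f"mediawiki-code-{devname}"
    return {
        # Lets the pod be scheduled on the tainted node.
        "tolerations": [{
            "key": taint_key,
            "operator": "Equal",
            "value": taint_value,
            "effect": "NoSchedule",
        }],
        # Forces the pod onto nodes carrying the matching label.
        "nodeSelector": {taint_key: taint_value},
        # The developer's checkout on the node.
        "volumes": [{
            "name": volume_name,
            "hostPath": {"path": f"{host_base}/{devname}", "type": "Directory"},
        }],
        # Belongs inside the container definition in a real manifest.
        "volumeMounts": [{"name": volume_name, "mountPath": "/srv/mediawiki"}],
    }
```

Using the `Directory` hostPath type means the pod fails to start if the checkout does not exist yet, which may or may not be the behaviour wanted here.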

Regarding the last point - I think the best way to do this is to leave those hostPaths empty until someone needs to do something with the code, then let people run a command that will:

  • Copy the code out of the latest mediawiki container to a path on the mw-debug kubernetes node
  • Re-deploy the individual developer session, mounting the code from the hostPath
  • Tell the developer where they can find their code.
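
The three steps above could be modelled as a dry-run plan before any real tooling exists. This is only a sketch: `Step`, `plan_debug_session`, the `/srv/mw-debug` base directory and the assumption that the code lives at /srv/mediawiki inside the image are illustrative, and nothing here touches a cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str
    detail: str


def plan_debug_session(devname: str, image: str,
                       host_base: str = "/srv/mw-debug") -> list[Step]:
    """Describe, without executing, what the command would do."""
    code_path = f"{host_base}/{devname}"  # hypothetical hostPath on the node
    return [
        Step("copy", f"extract /srv/mediawiki from {image} to {code_path}"),
        Step("redeploy",
             f"redeploy the release for {devname} with {code_path} "
             "mounted at /srv/mediawiki"),
        Step("notify", f"tell {devname} their code is at {code_path}"),
    ]
```

Returning a plan rather than running commands makes it easy to review what would happen, and leaves the choice of how to copy the code and redeploy open.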

It would be quite amazing to be able to give the Gerrit ID of a not-yet-merged patch (a backport to a release branch, or an mw-config change) and launch a "patchdemo but production" instance for it.