I've temporarily upgraded mwdebug2001 to an HHVM build with an ICU 57 backport and run the updateCollation script in dry-run mode to check which wikis with any form of UCA collation are affected. It turns out all of them need the migration. In fact, of the ten biggest Wikipedias, all bar cebwiki and dewiki need to be migrated.
Here's the rundown, sorted by shard:
s1:
[] enwiki
s2:
[] cswiki
[] dewikisource
[] fiwiki
[] itwiki
[] nlwiki
[] nowiki
[] plwiki
[] ptwiki
[] svwiki
[] thwiki
s3:
[x] be_x_oldwiki
[] bewiki
[] bewikisource
[] bswiki
[] ckbwiki
[] cswiktionary
[] cywiki
[] cywikibooks
[] cywikiquote
[] cywikisource
[] cywiktionary
[] eswikiversity
[] etwiki
[] etwikibooks
[] etwikimedia
[] etwikiquote
[] etwikisource
[] etwiktionary
[] fawikibooks
[] fawikinews
[] fawikiquote
[] fawikisource
[] fawiktionary
[] fiwikibooks
[] fiwikimedia
[] fiwikinews
[] fiwikiquote
[] fiwikisource
[] fiwikiversity
[] fiwikivoyage
[] frwikibooks
[] frwikinews
[] frwikiversity
[] gdwiki
[] glwiki
[] hrwiki
[] hsbwiki
[] ilowiki
[] iswiki
[] ltwiki
[] lvwiki
[x] mediawikiwiki
[] mkwiki
[] nowikimedia
[] olowiki
[] plwikisource
[] plwikivoyage
[] plwiktionary
[] ptwikibooks
[] rowikibooks
[] rowikinews
[] rowikiquote
[] rowikisource
[] rowikivoyage
[] rowiktionary
[] rswikimedia
[] ruwikibooks
[] ruwikinews
[] ruwikiquote
[] ruwikisource
[] ruwikiversity
[] ruwikivoyage
[] ruwiktionary
[] shwiki
[] skwiki
[] srwiki
[] srwikibooks
[] srwikinews
[] srwikiquote
[] srwikisource
[] srwiktionary
[] svwikisource
[] tawiki
[] tawikibooks
[] tawikinews
[] tawikiquote
[] tawikisource
[] tawiktionary
[] testwiki
[] thwikibooks
[] thwikinews
[] thwikiquote
[] thwikisource
[] thwiktionary
[] uawikimedia
[] ukwikibooks
[] ukwikinews
[] ukwikiquote
[] ukwikisource
[] ukwikivoyage
[] ukwiktionary
[] viwikibooks
[] viwikiquote
[] viwikisource
[] viwikivoyage
[] viwiktionary
s6:
[] frwiki
[] ruwiki
s7:
[] eswiki
[] fawiki
[] frwiktionary
[] huwiki
[] rowiki
[] ukwiki
[] viwiki
The migration causes some unavoidable user-visible impact: The sorting of some category pages will be distorted; all pages which have been updated with the new HHVM build using ICU 57 will use the new sorting while untouched pages use the old sorting. As such, this change needs to be coordinated with Community Liaisons.
These sorting problems will only be fixed once the updateCollation maintenance script [1] has finished running for an affected wiki. For past collation changes this took, e.g., four hours in 2016 for Swedish Wikipedia and six days for English Wikipedia [2].
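To illustrate why category pages look distorted in the interim, here is a toy sketch in Python. It assumes that each category entry is ordered by a sort key stored when the page was last updated. The two key functions are hypothetical stand-ins for "old ICU" and "new ICU" sort keys, not real ICU output, and the page titles are made up.

```python
# Toy model only: old_key/new_key are hypothetical stand-ins for sort keys
# generated by two different ICU versions. Real ICU keys differ in other ways.
def old_key(title):
    return b"\x10" + title.lower().encode("utf-8")


def new_key(title):
    return b"\x20" + title.lower().encode("utf-8")


def category_order(stored_keys):
    """Order titles by their stored sort key, as a category listing would."""
    return sorted(stored_keys, key=lambda title: stored_keys[title])


titles = ["Apple", "Banana", "Cherry", "Date"]

# Before the upgrade: every entry carries a key from the old build.
stored = {t: old_key(t) for t in titles}
before = category_order(stored)

# After the upgrade, only "Apple" is updated and gets a new-style key.
stored["Apple"] = new_key("Apple")
mixed = category_order(stored)

# Once all keys have been regenerated, the order is consistent again.
stored = {t: new_key(t) for t in titles}
after = category_order(stored)

print(before)
print(mixed)
print(after)
```

In this toy model the updated page jumps out of place while the keys are mixed, and the listing is consistent again once every key has been regenerated.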
Current open questions:
- Discuss the possible concurrency of updateCollation runs per shard with DBAs, in particular for wikis in s2, to minimise user-visible impact
- Check how much of a head start Community Liaisons need to prepare user notifications
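As input for the concurrency discussion, one possible scheduling policy is sketched below in Python: wikis on the same shard run one after another, while different shards proceed in parallel. This is only an assumption about what might be acceptable, not an agreed plan. `run_sequentially_per_shard` and `fake_run` are hypothetical names, and `fake_run` merely stands in for whatever actually launches updateCollation for a wiki.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sequentially_per_shard(wikis_by_shard, run_one):
    """Call run_one(wiki) for every wiki.

    Wikis sharing a shard are processed one at a time; separate shards
    are processed in parallel, one worker thread per shard.
    """
    def run_shard(wikis):
        return [(wiki, run_one(wiki)) for wiki in wikis]

    workers = max(1, len(wikis_by_shard))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_shard = list(pool.map(run_shard, wikis_by_shard.values()))
    return {wiki: result for shard in per_shard for wiki, result in shard}


def fake_run(wiki):
    # Hypothetical placeholder: a real runner would start the
    # maintenance script for this wiki and wait for it to finish.
    return "done:" + wiki


sample = {"s1": ["enwiki"], "s6": ["frwiki", "ruwiki"]}
results = run_sequentially_per_shard(sample, fake_run)
print(results)
```

Allowing more than one concurrent run per shard would need a different structure, e.g. a semaphore per shard. Whether that is safe is exactly the question for the DBAs.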
The plan for the actual migration would look like this:
- Merge a patch to enable component/icu57 for our jessie-based mediawiki app servers
- Migrate the canaries to the new HHVM build and keep an eye on logs/metrics for an hour (the new packages have been running in beta for about a month already, but nothing beats production traffic)
- If all is well, upgrade HHVM on the remaining app servers in eqiad and terbium
- Initiate updateCollation runs on terbium
- Upgrade app servers in codfw
Once we've migrated to ICU 57, this unblocks our migration of the app servers to Debian stretch (T174431; ICU 57 is backported from stretch) and allows us to use a more recent version of Unicode in MediaWiki (T188480).
Footnotes:
[1] https://www.mediawiki.org/wiki/Manual:UpdateCollation.php
[2] https://phabricator.wikimedia.org/T146675#2668367