wikistats xml sometimes contains str instead of unicode on Python 2
Closed, Resolved, Public

Description

Split off from T128990 (English Wikisource has more good pages than French Wikisource, breaking the WikiStats tests for the largest wikisource). In wikistats_tests, test_xml is failing because the XML data values contain str instead of unicode on Python 2, which makes them slightly incompatible with the default csv implementation.

Originally identified by @Xqt and solution proposed as https://gerrit.wikimedia.org/r/269730

Additional asserts were added in 0170860dd to confirm the problem.

The problem has not been reproducible on Travis Unix or Appveyor Windows CI builds.

Event Timeline

Change 275780 had a related patch set uploaded (by John Vandenberg):
[bugfix] force wikistats fields to unicode when using xml format

https://gerrit.wikimedia.org/r/275780

Change 275780 merged by jenkins-bot:
[bugfix] force wikistats fields to unicode when using xml format

https://gerrit.wikimedia.org/r/275780
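
For readers unfamiliar with the underlying issue: on Python 2, xml.etree.ElementTree can return plain str for ASCII-only text and unicode otherwise, so parsed values end up with mixed types. The sketch below is not the merged patch; it only illustrates the general idea of forcing every parsed field to unicode. The XML layout and the names to_unicode and parse_rows are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET

text_type = type(u'')  # unicode on Python 2, str on Python 3


def to_unicode(value, encoding='utf-8'):
    """Return value as text, decoding byte strings if needed."""
    # bytes is an alias of str on Python 2, so this catches the str case.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def parse_rows(xml_text):
    """Parse hypothetical <row> elements into dicts whose keys and values are text."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.iter('row'):
        rows.append({to_unicode(field.tag): to_unicode(field.text or u'')
                     for field in row})
    return rows


# Hypothetical sample data; the real wikistats XML layout may differ.
SAMPLE = b'<data><row><prefix>en</prefix><good>12345</good></row></data>'
rows = parse_rows(SAMPLE)
```

With this normalisation, every key and value in rows is an instance of text_type on both Python 2 and Python 3, so later comparisons against unicode test data do not depend on what the parser happened to return.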

jayvdb triaged this task as Low priority. Mar 8 2016, 7:40 PM
jayvdb removed a project: Patch-For-Review.

To reproduce this bug now, revert d65a6e1c841 and run pwb.py tests/wikistats_tests -v using Python 2.

Xqt claimed this task.
Xqt reassigned this task from Xqt to jayvdb.