config does differ between unicode and str
Closed, Declined (Public)

Description

As described in T95671, pywikibot.config2 issues a warning that the types have changed. The reason for the transliteration_target warning looks straightforward: it is probably set manually in the user-config as a str (aka bytes), while the original value in the config2 module is unicode.

But I'm not sure why the console_encoding warning appears. Its default is read from sys.stdout.encoding, which is str (even with unicode_literals), so that explains why it was str. But if it changed type and the warning appears, that would mean @Wesalius has set the encoding using u'…', which seems unlikely.

To fix that, we could do something similar to the int/float handling, so that if the original type was str or unicode, the other type is also accepted.
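The idea above could be sketched roughly as follows. This is only an illustration and not the actual config2 code: the names NUMERIC_TYPES, TEXT_TYPES and types_compatible are made up for this example, and since the snippet is written for Python 3, bytes stands in for Python 2 str and str for Python 2 unicode.

```python
# Hypothetical sketch; not the actual pywikibot.config2 implementation.
# Python 3 stand-ins: bytes ~ Python 2 str, str ~ Python 2 unicode.
NUMERIC_TYPES = (int, float)
TEXT_TYPES = (bytes, str)


def types_compatible(old_value, new_value):
    """Return True if replacing old_value with new_value should not warn."""
    old_type, new_type = type(old_value), type(new_value)
    if old_type is new_type:
        return True
    # Accept a change within the same group of interchangeable types.
    # Exact type checks are used so that bool is not treated as an int.
    for group in (NUMERIC_TYPES, TEXT_TYPES):
        if old_type in group and new_type in group:
            return True
    return False
```

With such a check, a byte-string default replaced by a text value (or the reverse) would no longer trigger the warning, while a change from text to a number still would.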

Event Timeline

XZise raised the priority of this task to Low.
XZise updated the task description.
XZise added a project: Pywikibot.
XZise added subscribers: gerritbot, Ricordisamoa, jayvdb and 4 others.

I have used the u' prefix, my user-config contains "usernames['wikipedia']['cs'] = u'HypoBOT'"

My console encoding is set to 'utf-8' and transliteration_target to console_encoding.

That your username is in unicode doesn't have anything to do with the other warnings. And if you use 'utf-8' it should be str, but maybe unicode_literals is inherited from the config2 module. But if you set transliteration_target to the value of console_encoding it should be the same type. I'll try some tests on my own. Could you maybe show the lines where you define both?

Okay I've checked and “user-config.py” is read with unicode_literals so it doesn't matter if you define anything using 'foo' or u'foo' because both are now unicode. But I couldn't get the transliteration_target warning.
This is a snippet from my “user-config.py”:

password_file = '.passwd'
print(type(password_file))
console_encoding = 'utf-8'
transliteration_target = console_encoding

And it's not importing anything but this is the result:

>>> import pywikibot
<type 'unicode'>
WARNING: Type of 'console_encoding' changed
         Was: <type 'str'>
         Now: <type 'unicode'>

My user-config.py

password_file= 'passfile.txt'

transliteration_target = console_encoding
cosmetic_changes = True
console_encoding = 'utf-8'

Ah, you set transliteration_target before console_encoding. Then it makes sense :) By default console_encoding is str and transliteration_target is set to 'not set', which is unicode. You then set transliteration_target to the value of console_encoding, so to a str, and afterwards you change console_encoding to 'utf-8', which is unicode (because user-config.py is read with unicode_literals). So the result is as the warning presents:

  • transliteration_target changed from unicode ('not set') to str (the value of console_encoding, i.e. sys.stdout.encoding)
  • console_encoding changed from str (sys.stdout.encoding) to unicode ('utf-8')

Good, so this mystery is solved :)
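The effect of the assignment order can be reproduced with a small simulation. This is a sketch under assumptions, not how config2 is actually implemented: type_changes, DEFAULTS, TARGET_FIRST and ENCODING_FIRST are names invented for this example, the default b'ascii' is just a placeholder for whatever sys.stdout.encoding returned, and bytes/str in Python 3 stand in for str/unicode in Python 2.

```python
# Hypothetical simulation; names and default values are assumptions.
# Python 3 stand-ins: bytes ~ Python 2 str, str ~ Python 2 unicode.
DEFAULTS = {
    'console_encoding': b'ascii',          # placeholder for sys.stdout.encoding
    'transliteration_target': 'not set',   # unicode default
}

# The two orderings that appear in the user-config.py snippets above.
TARGET_FIRST = (
    "transliteration_target = console_encoding\n"
    "console_encoding = 'utf-8'\n"
)
ENCODING_FIRST = (
    "console_encoding = 'utf-8'\n"
    "transliteration_target = console_encoding\n"
)


def type_changes(defaults, user_config_source):
    """Run the user config on top of the defaults and report type changes."""
    namespace = dict(defaults)
    exec(user_config_source, namespace)
    return {
        name: (type(old).__name__, type(namespace[name]).__name__)
        for name, old in defaults.items()
        if type(namespace[name]) is not type(old)
    }


print(type_changes(DEFAULTS, TARGET_FIRST))
print(type_changes(DEFAULTS, ENCODING_FIRST))
```

With the target assigned first, both settings end up with a different type than their default; with the encoding assigned first, only console_encoding does, which matches the single warning in the earlier test.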

OK, I changed the transliteration target back to none. But I can't test whether the WARNING disappeared, since replace.py doesn't proceed with non-ASCII chars (as in T95803).

Is there anything left to do here? Should config2 automatically upgrade any str from user-config.py to unicode?

With gerrit 219589 we might be able to do that separately and not just update all strings into unicodes.

That has been merged.. ;-)

Xqt subscribed.

Python 2 has been dropped