python -m pdb <utf-8 encoded file.py> raises UnicodeDecodeError on Windows 8.1/10 (code page 936) #103578

Closed
xond opened this issue Apr 16, 2023 · 5 comments
Labels
3.11 only security fixes 3.12 bugs and security fixes

Comments


xond commented Apr 16, 2023

How to reproduce the issue

Python 3.11.3/3.11.2/3.11.1/3.11.0 (python-3.11.3-amd64.exe etc)
Microsoft Windows [Version 10.0.19044.2130] (Active code page: 936)

tmp.py

#coding:utf-8
print("中文")

python -m pdb tmp.py

Traceback (most recent call last):
  File "C:\Users\xond\AppData\Local\Programs\Python\Python311\Lib\pdb.py", line 1774, in main
    pdb._run(target)
  File "C:\Users\xond\AppData\Local\Programs\Python\Python311\Lib\pdb.py", line 1652, in _run
    self.run(target.code)
             ^^^^^^^^^^^
  File "C:\Users\xond\AppData\Local\Programs\Python\Python311\Lib\pdb.py", line 167, in code
    return f"exec(compile({fp.read()!r}, {self!r}, 'exec'))"
                           ^^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 28: illegal multibyte sequence
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> c:\users\xond\appdata\local\programs\python\python311\lib\pdb.py(167)code()
-> return f"exec(compile({fp.read()!r}, {self!r}, 'exec'))"


gaogaotiantian (Member) commented

This is an encoding issue. We can simply use io.open_code to avoid it, since pdb always expects to be reading code at this point. A PR with a regression test is ready.
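
The difference between the two ways of reading the file can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: the helper names `read_in_text_mode`, `read_as_code`, and `demo` are hypothetical, and the `gbk` codec is passed explicitly to stand in for the locale default on a code page 936 machine, so the sketch behaves the same on any platform.

```python
import contextlib
import io
import os
import tempfile

# Same content as tmp.py in the report.
SOURCE = '#coding:utf-8\nprint("中文")\n'


def read_in_text_mode(path, codec="gbk"):
    """Read the file as text with a fixed codec (stand-in for the locale default)."""
    with open(path, encoding=codec) as fp:
        return fp.read()


def read_as_code(path):
    """Read the file as raw bytes via io.open_code (requires an absolute path)."""
    with io.open_code(path) as fp:
        return fp.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tmp.py")
        with open(path, "w", encoding="utf-8", newline="\n") as fp:
            fp.write(SOURCE)

        # Decoding UTF-8 bytes as GBK either raises or yields different text.
        try:
            text_mode_ok = read_in_text_mode(path) == SOURCE
        except UnicodeDecodeError:
            text_mode_ok = False

        # compile() accepts bytes and honors the coding declaration itself.
        raw = read_as_code(path)
        code = compile(raw, path, "exec")
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, {"__name__": "__main__"})
        return text_mode_ok, raw, buffer.getvalue()


if __name__ == "__main__":
    ok, raw, printed = demo()
    print("text-mode read matched source:", ok)
    print("script output:", printed.strip())
```

Because the bytes are handed to `compile()` undecoded, the `#coding:utf-8` line in the file decides the encoding rather than the active code page.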

hauntsaninja pushed a commit that referenced this issue Apr 26, 2023
`pdb` should use `io.open_code` to open code to avoid encoding issue.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Apr 26, 2023
…H-103581)

`pdb` should use `io.open_code` to open code to avoid encoding issue.
(cherry picked from commit 31acfd7)

Co-authored-by: Tian Gao <[email protected]>
hauntsaninja pushed a commit that referenced this issue Apr 26, 2023
) (#103867)

`pdb` should use `io.open_code` to open code to avoid encoding issue.
(cherry picked from commit 31acfd7)

Co-authored-by: Tian Gao <[email protected]>
itamaro pushed a commit to itamaro/cpython that referenced this issue Apr 26, 2023
…103581)

`pdb` should use `io.open_code` to open code to avoid encoding issue.
hauntsaninja (Contributor) commented

Thanks again!

Labels added Apr 27, 2023: 3.10 only security fixes, 3.9 only security fixes, 3.8 (EOL) end of life
Member

This is a case that always should've gone through io.open_code, and people may be caught out by having code execution bypass that path.

We should backport to 3.8+.

Member

I take that back: 3.8-3.10 were already fixed, and it was a change in 3.11 that regressed this. So no backports are required.

Labels added Apr 27, 2023: 3.11 only security fixes, 3.12 bugs and security fixes. Labels removed: 3.10 only security fixes, 3.9 only security fixes, 3.8 (EOL) end of life
hauntsaninja (Contributor) commented

Thanks for checking. Sorry, I could have mentioned this more explicitly! :-)
