Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undefined behavior in struct.unpack #96735

Closed
kumaraditya303 opened this issue Sep 10, 2022 · 3 comments
Closed

Undefined behavior in struct.unpack #96735

kumaraditya303 opened this issue Sep 10, 2022 · 3 comments
Assignees
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@kumaraditya303
Copy link
Contributor

Reproducer:

import struct
struct.unpack('!q', b'\xff\xff\xff\xff\xff\xff\xff\xff')

Error:

Modules/_struct.c:842:15: runtime error: left shift of 72057594037927935 by 8 places cannot be represented in type 'long long int'

Compiler options:

./configure -C --with-pydebug --with-address-sanitizer --with-undefined-behavior-sanitizer 

Python version: Python 3.12.0a0 (heads/main:30cc1901efa, Sep 10 2022, 10:14:58) [GCC 9.4.0] on linux

@kumaraditya303 kumaraditya303 added type-bug An unexpected behavior, bug, or error extension-modules C modules in the Modules dir 3.11 only security fixes 3.10 only security fixes type-crash A hard crash of the interpreter, possibly with a core dump 3.12 bugs and security fixes labels Sep 10, 2022
@kumaraditya303
Copy link
Contributor Author

cc @mdickinson

@mdickinson
Copy link
Member

@kumaraditya303 Thanks; this should be an easy fix.

@mdickinson mdickinson self-assigned this Sep 10, 2022
mdickinson added a commit that referenced this issue Sep 25, 2022
This PR fixes undefined behaviour in the struct module unpacking support functions `bu_longlong`, `lu_longlong`, `bu_int` and `lu_int`; thanks to @kumaraditya303 for finding these.

The fix is to accumulate the bytes in an unsigned integer type instead of a signed integer type, then to convert to the appropriate signed type. In cases where the width matches, that conversion will typically be compiled away to a no-op.
(Evidence from Godbolt: https://godbolt.org/z/5zvxodj64 .)

To make the conversions efficient, I've specialised the relevant functions for their output size: for `bu_longlong` and `lu_longlong`, this only entails checking that the output size is indeed `8`. But `bu_int` and `lu_int` were used for format sizes `2` and `4` - I've split those into two separate functions each.

No tests, because all of the affected cases are already exercised by the test suite.
@mdickinson
Copy link
Member

Fixed for 3.12 in #96739. I agree with @matthiasgoergens that it's not worth fixing in 3.10 and 3.11, because the issue doesn't manifest itself in release builds, where -fwrapv is used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

2 participants