
TypeError: descriptor 'some_method' for 'A' objects doesn't apply to a 'B' object #121368

Closed
ambv opened this issue Jul 4, 2024 · 10 comments
Labels
topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@ambv
Contributor

ambv commented Jul 4, 2024

Bug report

I found a bug that seems to be code corruption.

While working on an example project with ~70 threads, I occasionally (once every hour or so) get the following exception from various locks:

Exception in thread Sequence 2:
Traceback (most recent call last):
  File "/Volumes/RAMDisk/installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/threading.py", line 1039, in _bootstrap_inner
    self.run()
    ~~~~~~~~^^
  ...
  File "/Volumes/RAMDisk/installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/threading.py", line 656, in wait
    with self._cond:
         ^^^^^^^^^^
  File "/Volumes/RAMDisk/installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/threading.py", line 304, in __enter__
    return self._lock.__enter__()
           ~~~~~~~~~~~~~~~~~~~~^^
TypeError: descriptor '__exit__' for '_thread.RLock' objects doesn't apply to a '_thread.lock' object

or

Exception in thread Clock:
Traceback (most recent call last):
  File "/Volumes/RAMDisk/installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/threading.py", line 1039, in _bootstrap_inner
    self.run()
    ~~~~~~~~^^
  File "europython.py", line 113, in run
    for message in input:
  File "/Volumes/RAMDisk/nogil-ep-temp/lib/python3.14/site-packages/mido/ports.py", line 243, in __iter__
    yield self.receive()
          ~~~~~~~~~~~~^^
  File "/Volumes/RAMDisk/nogil-ep-temp/lib/python3.14/site-packages/mido/ports.py", line 215, in receive
    with self._lock:
         ^^^^^^^^^^
TypeError: descriptor '__exit__' for '_thread.lock' objects doesn't apply to a '_thread.RLock' object

This looks like a bug related to locks, but it isn't. It isn't even related to descriptors; the descriptor machinery just happens to refuse to run invalid code, which is what surfaces the corruption.
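For context, the TypeErrors above come from the generic receiver type check that method descriptors perform before any code runs. A minimal, thread-free illustration of the same style of error:

```python
# Calling an unbound method descriptor with a receiver of the wrong
# type is rejected up front with the same kind of TypeError as above.
try:
    str.upper(42)  # 'upper' expects a str instance, not an int
except TypeError as exc:
    print(exc)
```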


This issue is also externally reported in PyWavelets/pywt#758 with the same Lock descriptor error message I've seen, and I can reproduce the failure locally, albeit with a different exception:

TypeError: descriptor 'sort' for 'numpy.ndarray' objects doesn't apply to a 'ThreadPoolExecutor' object

To reproduce this with cpython main, do the following:

  • make a venv with a free-threaded build of Python
  • install Cython from main with pip install -e .
  • install Numpy from main with pip install . --no-build-isolation (important: no -e in this case)
  • install pywt from main with pip install -e . --no-build-isolation (important: you DO need -e in this case)
  • run pytest in a loop (or with autoclave) like this: PYTHON_GIL=0 pytest pywt/tests/test_concurrent.py

You will need to run this for a longer while to get to a failure.

By doing this, I managed to find this particular failure case:

self = <concurrent.futures.thread.ThreadPoolExecutor object at 0x225be124750>, fn = functools.partial(<function dwtn at 0x225be6d3b40>, wavelet='haar')
args = (array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., ...., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]]),)
kwargs = {}, f = <Future at 0x225bea44310 state=finished returned dict>, w = <concurrent.futures.thread._WorkItem object at 0x225bc875bf0>

    def submit(self, fn, /, *args, **kwargs):
        with self._shutdown_lock, _global_shutdown_lock:
            if self._broken:
                raise BrokenThreadPool(self._broken)

            if self._shutdown:
                raise RuntimeError('cannot schedule new futures after shutdown')
            if _shutdown:
                raise RuntimeError('cannot schedule new futures after '
                                   'interpreter shutdown')

            f = _base.Future()
            w = _WorkItem(f, fn, args, kwargs)

            self._work_queue.put(w)
>           self._adjust_thread_count()
E           TypeError: Future.set_result() missing 1 required positional argument: 'result'

args       = (array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., ...., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]]),)
f          = <Future at 0x225bea44310 state=finished returned dict>
fn         = functools.partial(<function dwtn at 0x225be6d3b40>, wavelet='haar')
kwargs     = {}
self       = <concurrent.futures.thread.ThreadPoolExecutor object at 0x225be124750>
w          = <concurrent.futures.thread._WorkItem object at 0x225bc875bf0>

../installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/concurrent/futures/thread.py:179: TypeError
======================================================================================================== short test summary info ========================================================================================================
FAILED pywt/tests/test_concurrent.py::test_concurrent_dwt - TypeError: Future.set_result() missing 1 required positional argument: 'result'
====================================================================================================== 1 failed, 3 passed in 0.44s ======================================================================================================
-- 863 runs, 862 passes, 1 failure, 734486 msec

Observe how Python intends to call self._adjust_thread_count() (which takes no arguments) but ends up executing f.set_result() instead, which raises because no result argument is passed.
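For reference, the reported TypeError is exactly what a plain, uncorrupted call of set_result() with no argument produces; here it is triggered directly:

```python
from concurrent.futures import Future

# Calling set_result() without an argument raises the same TypeError
# that appears in the test failure above.
f = Future()
try:
    f.set_result()
except TypeError as exc:
    print(exc)
```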


Tested on macOS Sonoma on M1 Max with Python 3.14.0a0 experimental free-threading build (heads/main:7a807c3efaa, Jul 2 2024, 11:58:38).

AFAICT the problem only occurs with the GIL actually disabled.

Linked PRs

@ngoldbaum
Contributor

By the way, I hit this last week with 3.13.0b2, so this is likely a problem on the 3.13 branch as well and not something new in 3.14.

@ambv
Contributor Author

ambv commented Jul 4, 2024

Another interesting failure from running the pywt tests that shows this is beyond descriptors (despite the exception):

../installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/concurrent/futures/_base.py:322: in __init__
    self._condition = threading.Condition()
        self       = <[AttributeError("'Future' object has no attribute '_condition'") raised in repr()] Future object at 0x5ddd8796110>
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[AttributeError("'Condition' object has no attribute '_waiters'") raised in repr()] Condition object at 0x5ddd8795790>
lock = <unlocked _thread.RLock object owner=0 count=0 at 0x5ddd881a300>

    def __init__(self, lock=None):
        if lock is None:
            lock = RLock()
        self._lock = lock
        # Export the lock's acquire() and release() methods
        self.acquire = lock.acquire
>       self.release = lock.release
E       TypeError: descriptor 'result_type' for 'numpy._core._multiarray_umath._array_converter' objects doesn't apply to a '_thread.RLock' object

Look at the failing self reprs: in both cases the attributes that appear to be missing are set in __init__ and never unset or deleted.

@colesbury
Contributor

I have tried to reproduce the above issue with pywt, and I'm routinely seeing errors like:

TypeError: fft() got an unexpected keyword argument 'axes'

Has the NumPy argument parsing been made thread-safe yet?

@ngoldbaum
Contributor

No, that PR is still in-flight: numpy/numpy#26780

@ambv
Contributor Author

ambv commented Jul 4, 2024

I should probably mention, I'm testing all this with CFLAGS="-g0 -O3" and ./configure --enable-optimizations --with-lto --disable-gil.

@ambv
Contributor Author

ambv commented Jul 4, 2024

Minimal repro so far

This is somewhat frustrating to reproduce without any larger dependencies or codebases. The smallest thing I managed to get to crash is this:

# repro4.py
#
# Repeatedly spins up thread pools that map a small function over lists,
# exercising method lookups and the Future machinery from many threads.

from concurrent import futures
import os
import random

def calculate(arr):
    return sum(arr) / len(arr)

# Print the hash seed so a failing run can be replayed.
print(os.environ["PYTHONHASHSEED"], end=" ", flush=True)

for _ in range(100):
    with futures.ThreadPoolExecutor(max_workers=os.cpu_count()) as ex:
        arrs = [[random.random() for _ in range(100)] for _ in range(50)]
        results = list(ex.map(calculate, arrs))

Run it with

for seed in (seq -f%1.0f 1000000 1100000)
    PYTHONHASHSEED=$seed python repro4.py
end

with the fish shell. On the bash shell (seq -f%1.0f 1000000 1099999 | while read s; do PYTHONHASHSEED=$s python3.14 repro4.py; done) it reproduces much less often.

The hash seed is a red herring but, believe me, if I remove that bit, I can no longer reproduce it.

The result of running this for a longer while is a similar unexpected code execution, one example being:

  File "/Users/ambv/Documents/Python/europython/repro4.py", line 13, in <module>
    results = list(ex.map(calculate, arrs))
  File "/Volumes/RAMDisk/installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/concurrent/futures/_base.py", line 611, in result_iterator
    yield _result_or_cancel(fs.pop())
          ~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/Volumes/RAMDisk/installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/concurrent/futures/_base.py", line 309, in _result_or_cancel
    return fut.result(timeout)
           ~~~~~~~~~~^^^^^^^^^
  File "/Volumes/RAMDisk/installed-nogil-main/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/threading.py", line 526, in release
    if n < 1:
       ^^^^^
TypeError: '<' not supported between instances of 'NoneType' and 'int'

where a call to concurrent.futures.Future.result(timeout) (with timeout=None) unexpectedly ends up executing threading.Semaphore.release(n=1), which doesn't expect n to ever be None. It's very weird. I managed to attach pdb when this happens, and calling fut.result(None) there returns the expected outcome. So this must be some momentary corruption.
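The release() failure can be provoked directly by passing n=None (a deliberate mis-call here, standing in for the corrupted argument), which reaches the same internal `if n < 1:` comparison:

```python
import threading

# Passing None as n reaches Semaphore.release()'s internal `if n < 1:`
# check, producing the same TypeError as in the corrupted run above.
sem = threading.Semaphore()
try:
    sem.release(None)
except TypeError as exc:
    print(exc)
```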

@colesbury
Contributor

Thanks, I'm able to reproduce it intermittently now. So far only on my macOS arm64 laptop, not on x86-64 Linux. Reducing MCACHE_SIZE_EXP in pycore_typeobject.h seems to make it happen more quickly -- I changed it to 6 locally.

@colesbury
Contributor

The bug is in our seq lock implementation:

cpython/Python/lock.c

Lines 550 to 560 in cb688ba

uint32_t _PySeqLock_EndRead(_PySeqLock *seqlock, uint32_t previous)
{
    // Synchronize again and validate that the entry hasn't been updated
    // while we were readying the values.
    if (_Py_atomic_load_uint32_acquire(&seqlock->sequence) == previous) {
        return 1;
    }
    _Py_yield();
    return 0;
}

cpython/Objects/typeobject.c

Lines 5390 to 5402 in cb688ba

    int sequence = _PySeqLock_BeginRead(&entry->sequence);
    uint32_t entry_version = _Py_atomic_load_uint32_relaxed(&entry->version);
    uint32_t type_version = _Py_atomic_load_uint32_acquire(&type->tp_version_tag);
    if (entry_version == type_version &&
        _Py_atomic_load_ptr_relaxed(&entry->name) == name) {
        OBJECT_STAT_INC_COND(type_cache_hits, !is_dunder_name(name));
        OBJECT_STAT_INC_COND(type_cache_dunder_hits, is_dunder_name(name));
        PyObject *value = _Py_atomic_load_ptr_relaxed(&entry->value);
        // If the sequence is still valid then we're done
        if (value == NULL || _Py_TryIncref(value)) {
            if (_PySeqLock_EndRead(&entry->sequence, sequence)) {
                return value;
            }
The memory ordering on _PySeqLock_EndRead isn't sufficient. We need an explicit fence (i.e., atomic_thread_fence). The "acquire" on the final sequence load prevents subsequent loads from being reordered before it, but in _PySeqLock_EndRead we want to prevent the earlier loads from being reordered after the sequence re-check.

The bug occurs on arm64, but not x86-64, because x86-64 enforces ordering of loads relative to each other.

colesbury added a commit to colesbury/cpython that referenced this issue Jul 4, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Jul 4, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Jul 4, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Jul 8, 2024
The `_PySeqLock_EndRead` function needs an acquire fence to ensure that
the load of the sequence happens after any loads within the read side
critical section.

The acquire on the final sequence load isn't sufficient (or necessary).
colesbury added a commit to colesbury/cpython that referenced this issue Jul 8, 2024
colesbury added a commit that referenced this issue Jul 8, 2024
The `_PySeqLock_EndRead` function needs an acquire fence to ensure that
the load of the sequence happens after any loads within the read side
critical section. The missing fence can trigger bugs on macOS arm64.

Additionally, we need a release fence in `_PySeqLock_LockWrite` to
ensure that the sequence update is visible before any modifications to
the cache entry.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 8, 2024
colesbury added a commit that referenced this issue Jul 8, 2024 (#121505)
@itamaro
Contributor

itamaro commented Jul 10, 2024

With the PRs merged, is there anything left to do here?

@vstinner
Member

I close the issue.

noahbkim pushed a commit to hudson-trading/cpython that referenced this issue Jul 11, 2024
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
5 participants