Possible race condition on multiprocessing.Manager().dict() on macOS #87934
I am not sure if this is a bug or expected behavior. Long story short, I tried to print the content of a [...] I encountered this error only when the number of pools is rather large (>20) and only on [...] Specs: [...]
The minimal code that reproduces the error is attached:

#!/usr/bin/env python3
from contextlib import suppress
import multiprocessing as mp
import time

def run():
    # Each worker writes to the shared manager dict (D is inherited via fork).
    D[mp.current_process().name] = 'some val'
    time.sleep(0.5)

if __name__ == '__main__':
    mp.set_start_method('fork')
    D, rets = mp.Manager().dict(), []
    with mp.Pool(25) as p:
        for _ in range(33):
            rets.append(p.apply_async(run))
        # Poll the pending results without blocking until all have finished.
        while rets:
            for ret in rets[:]:
                with suppress(mp.TimeoutError):
                    ret.get(timeout=0)
                    rets.remove(ret)
                    print(len(D))

Error:
I tested the script on my machine (macOS 13.0.1; Python 3.9, 3.10 and 3.11, all installed using the python.org installer), and the error occurs intermittently, likewise with a fresh build of 3.12. Disabling the local firewall does not avoid this problem.
This appears to be a timing problem: the main process is not yet listening on the socket when the child tries to connect.
Below is a crude hack that implements a retry loop and appears to fix the issue for me. I added it as an inline patch instead of a PR because I'm far from convinced that this is a correct fix. I've barely used multiprocessing myself and know too little about its design to know where the correct place to implement a retry loop would be.

diff --git a/Lib/multiprocessing/connection.py b/Lib/multiprocessing/connection.py
index b08144f7a1..7954fefd62 100644
--- a/Lib/multiprocessing/connection.py
+++ b/Lib/multiprocessing/connection.py
@@ -625,9 +625,13 @@ def SocketClient(address):
     '''
     family = address_type(address)
     with socket.socket( getattr(socket, family) ) as s:
-        s.setblocking(True)
-        s.connect(address)
-        return Connection(s.detach())
+        for _ in range(3):
+            try:
+                s.setblocking(True)
+                s.connect(address)
+                return Connection(s.detach())
+            except socket.error:
+                time.sleep(0.5)
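
For readers who want to experiment with the retry idea outside of CPython's source tree, here is a minimal standalone sketch using plain sockets. It is not the patch above and not how multiprocessing does it: `connect_with_retry` and `demo` are hypothetical names, and the attempt count and delay are arbitrary. The demo imitates the suspected race by binding a server socket and only calling `listen()` after a short delay, so the first connection attempts should be refused.

```python
import socket
import threading
import time


def connect_with_retry(address, attempts=5, delay=0.1):
    """Connect to `address`, retrying while the peer refuses the connection.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    last_error = None
    for _ in range(attempts):
        # Use a fresh socket for every attempt: reusing a socket after a
        # failed connect() is not portable across platforms.
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.connect(address)
            return s
        except ConnectionRefusedError as exc:
            s.close()
            last_error = exc
            time.sleep(delay)
    # Surface the failure instead of silently returning None.
    raise last_error


def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # bound, but not listening yet
    address = server.getsockname()

    # Start listening only after a short delay to imitate the race.
    timer = threading.Timer(0.3, server.listen, args=(8,))
    timer.start()
    try:
        client = connect_with_retry(address, attempts=20, delay=0.1)
        client.close()
        return True
    finally:
        timer.join()
        server.close()


if __name__ == "__main__":
    print(demo())
```

Compared with the inline patch, this sketch opens a new socket per attempt and re-raises the last error once the attempts are exhausted. Whether either behavior would be appropriate inside `SocketClient` is an open question.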
This is the same problem as #101225, but with a different limit in the backlog.
The race condition doesn't happen for me with the fix for #101225. That's technically just reducing the size of the window in which the race condition can happen, but it should be fine given that I've increased the backlog far beyond what's needed to avoid hitting the race (famous last words...)
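
To illustrate what the backlog has to do with this, here is a rough sketch using the public `multiprocessing.connection.Listener` and `Client` classes. `Listener` accepts a `backlog` argument that is handed to the listening socket. The function name `burst_connect`, the client count, and the backlog value are made up for the example and are not the values used by the actual fix. All clients connect and send before the server accepts anything, so every connection has to wait in the listen queue.

```python
import threading
from multiprocessing.connection import Client, Listener


def burst_connect(n_clients=20, backlog=64):
    """Let n_clients connect before the server accepts any of them.

    Hypothetical demo; the numbers are arbitrary, not taken from CPython.
    """
    received = []
    # Port 0 asks the OS for a free port; listener.address reports it.
    with Listener(("127.0.0.1", 0), backlog=backlog) as listener:
        address = listener.address

        def client(i):
            with Client(address) as conn:
                conn.send(i)

        threads = [
            threading.Thread(target=client, args=(i,))
            for i in range(n_clients)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        # Only now drain the queue of pending connections.
        for _ in range(n_clients):
            with listener.accept() as conn:
                received.append(conn.recv())
    return sorted(received)


if __name__ == "__main__":
    print(burst_connect())
```

With a backlog comfortably larger than the number of simultaneous clients, all of them should get through. With a backlog smaller than the burst, the outcome depends on the platform, which fits the report that the error only showed up on macOS.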