Semaphore slowdown #97545
Here my question is whether these callbacks are the same thing as what used to wake up futures and resume tasks. We are not using
Yeah, when a Future is made done, its callbacks are scheduled using call_soon. And the callbacks are what wakes up the tasks waiting for it.
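A small self-contained script can make this visible (the variable names here are mine, not from the issue): setting a Future's result does not run its done-callbacks immediately; they are scheduled with call_soon and run on a later event-loop pass, before any task that yields afterwards resumes.

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    order = []
    fut.add_done_callback(lambda f: order.append("callback"))
    fut.set_result(42)          # marks the future done...
    order.append("after set_result")
    await asyncio.sleep(0)      # ...but the callback only runs via call_soon,
                                # i.e. on a later event-loop iteration
    order.append("after sleep")
    return order

order = asyncio.run(main())
print(order)  # ['after set_result', 'callback', 'after sleep']
```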
(cherry picked from commit 68c46ae) Co-authored-by: Cyker Way <[email protected]>
Thanks!
This started in #93222 (comment).
Here are two of my posts there, repeated:
The test program spawns oodles of trivial tasks and tries to rate-limit them by making each task acquire a semaphore first. The semaphore allows 50 tasks at a time. With Python 3.11rc2, on my Mac it does nearly 25,000 iterations per second. With the new code it does about 900, roughly 27 times slower.
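The original test script isn't reproduced in this thread, but from the description it was roughly along these lines (the function names, task count, and timing scaffolding here are my reconstruction, not the original code):

```python
import asyncio
import time

async def bench(n_tasks=5000, limit=50):
    """Spawn many trivial tasks, rate-limited by a semaphore of `limit`."""
    sem = asyncio.Semaphore(limit)
    count = 0

    async def worker():
        nonlocal count
        async with sem:
            await asyncio.sleep(0)  # yield once, as the original test did
            count += 1

    t0 = time.perf_counter()
    await asyncio.gather(*(worker() for _ in range(n_tasks)))
    dt = time.perf_counter() - t0
    return count, count / dt

count, rate = asyncio.run(bench())
print(f"{count} tasks, {rate:.0f} iterations/sec")
```

Comparing the printed rate across interpreter versions is what exposes the regression: the work per task is identical, so any difference comes from how the semaphore wakes its waiters.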
The first 50 tasks take the easy path through acquire(); every following task gets put in the queue first. What makes it so slow is that once 50 tasks have acquired the semaphore in a single event loop iteration, they will also all release it in a single iteration -- but the new algorithm only wakes up the first waiting task, and the next 49 release() calls do nothing to the queue. So from then on we only wake up one task per iteration.
Of course, it's easy to complain when the broken code is fast. :-) But I think we can do better.
(UPDATE: The same code without using a semaphore can do over 200,000 loops/sec. Same if I keep the semaphore but set the initial value to 50,000. If I also remove the sleep(0) from the task it can do over 400,000 loops/sec. A loop that doesn't create tasks but calls sleep(0) and bumps the counter runs around 50,000.)
Actually the behavior is guaranteed. Under “Scheduling Callbacks” I read:
So we can definitely rely on this. (Sorry I hadn't mentioned this earlier, I wasn't actually aware of the guarantee, just of how the code works.)
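The guarantee being relied on is that callbacks queued with call_soon run in the order they were scheduled. A quick check (my own demo, not from the issue):

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    order = []
    # queue five callbacks; the docs guarantee FIFO execution order
    for i in range(5):
        loop.call_soon(order.append, i)
    await asyncio.sleep(0)  # yield so the queued callbacks run
    return order

result = asyncio.run(main())
print(result)  # [0, 1, 2, 3, 4]
```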
Something that doesn't affect correctness but may affect performance is that the event loop goes through a cycle:
(*) The timeout for the I/O wait is zero if there are callbacks that are immediately ready, otherwise the time until the first callback scheduled for a particular time in the future (call_later()). The I/O wait has expensive fixed overhead, so we want to call as many callbacks in a single iteration as possible.
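That timeout selection can be sketched as follows (a simplified model of the idea, not asyncio's actual event-loop code; the function and variable names are mine):

```python
import heapq

def select_timeout(ready, scheduled, now):
    """Pick the I/O poll timeout the way the description above says.

    ready:     callbacks runnable immediately (e.g. queued via call_soon)
    scheduled: heap of (when, callback) pairs from call_later/call_at
    now:       current loop time
    """
    if ready:
        return 0                   # don't block: work is already waiting
    if scheduled:
        when, _ = scheduled[0]     # earliest timer is at the top of the heap
        return max(0, when - now)  # sleep only until that timer fires
    return None                    # nothing queued: block until I/O arrives

timers = []
heapq.heappush(timers, (105.0, "t1"))
heapq.heappush(timers, (102.5, "t2"))

print(select_timeout(["cb"], timers, now=100.0))  # 0
print(select_timeout([], timers, now=100.0))      # 2.5
print(select_timeout([], [], now=100.0))          # None
```

The zero-timeout branch is why packing many ready callbacks into one iteration pays off: each pass through the poll call has fixed overhead regardless of how many callbacks follow it.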
Therefore I think it behooves us to make ready all futures for which we have room. I think it can be like this abstract algorithm:
(I.e., we should not promise to seat more guests than we have open tables.)
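The abstract algorithm itself isn't preserved in this excerpt, but the idea can be sketched like this: on release, wake one waiter per free slot instead of exactly one. This is a synchronous toy model with made-up names, not asyncio's actual Semaphore:

```python
import collections

class WakeManySemaphore:
    """Toy sketch of the proposal: release() wakes as many waiters as
    there are free slots. Not asyncio's real implementation."""

    def __init__(self, value):
        self._value = value
        self._waiters = collections.deque()

    def acquire(self, waiter):
        # `waiter` is an opaque token; returns True if acquired immediately.
        if self._value > 0 and not self._waiters:
            self._value -= 1
            return True
        self._waiters.append(waiter)
        return False

    def release(self):
        self._value += 1
        woken = []
        # Wake one waiter per free slot -- never promise more guests
        # a seat than there are open tables.
        while self._waiters and self._value > 0:
            self._value -= 1
            woken.append(self._waiters.popleft())
        return woken

sem = WakeManySemaphore(2)
print(sem.acquire("a"), sem.acquire("b"))  # True True
print(sem.acquire("c"))                    # False -- queued
print(sem.release())                       # ['c']
```

In the real async version the "woken" waiters would be Futures made done (and hence scheduled via call_soon), so a burst of 50 release() calls in one iteration readies up to 50 waiting tasks for the next iteration instead of just one.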