Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use higher precision time source in retry decorator/tests and avoid flaky failures #12839

Merged
merged 1 commit into from
Jul 13, 2024

Conversation

ichard26
Copy link
Member

@ichard26 ichard26 commented Jul 10, 2024

Despite my best efforts to account for Windows' coarse timer precision, the test_retry_wait and test_retry_time_limit tests are still randomly failing as it isn't sleeping for long enough by a small margin:

Example 1: 472.171 - 472.156 >= 0.015 (~1% too low)
Example 2: 554.406 - 554.265 >= 0.15 (~7% too low)

(call_time - test_start_time >= expected sleep duration between retries)

To avoid these random failures, I've adjusted the required sleep duration to be 10% less which should be enough.

I've also replaced time.monotonic() with time.perf_counter() in the retry utility and test suite since it's effectively monotonic1 while providing much higher resolution (esp. on Windows).

@pradyunsg this should fix the CI failures on main. Sorry for the trouble!

Footnotes

  1. It's not guaranteed to be monotonic in the Python docs, but it is
    indeed monotonic on all platforms we care about.

@ichard26 ichard26 added skip news Does not need a NEWS file entry (eg: trivial changes) C: tests Testing and related things labels Jul 10, 2024
@notatallshaw
Copy link
Member

If time resolution is important, why not switch to perf_counter()?

On Windows:

>>> mono = [time.monotonic() for x in range(10)]
>>> [mono[-1] - m for m in mono]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

>>> perf = [time.perf_counter() for x in range(10)]
>>> [perf[-1] - p for p in perf]
[3.4999102354049683e-06, 8.996576070785522e-07, 8.00006091594696e-07, 5.997717380523682e-07, 3.995373845100403e-07, 2.998858690261841e-07, 1.993030309677124e-07, 1.993030309677124e-07, 9.96515154838562e-08, 0.0]

@ichard26
Copy link
Member Author

ichard26 commented Jul 10, 2024

I mean, we could do that. I used monotonic() originally as that's what the tenacity implementation used. Are there any downsides to time.perf_counter()? I can concede this is an extreme edge case, but can it go backwards due to a poorly timed system time update?

Time resolution is not that important in practice, it's just that poor resolution becomes a pain when trying to write unit tests for the retry decorator.

@notatallshaw
Copy link
Member

notatallshaw commented Jul 10, 2024

Are there any downsides to time.perf_counter()? I can concede this is an extreme edge case, but can it go backwards due to a poorly timed system time update?

My understanding is perf_counter is also guaranteed to be monotonic.

Looking at PEP 418 the difference seems to be different API calls (GetTickCount64 vs. QueryPerformanceCounter), I do not know enough about Windows APIs to know much about the difference beyond googling them.

But my personal experience is there is no noticable downside to perf_counter when high resolution timings in Python on Windows are important.

Time resolution is not that important in practice, it's just that poor resolution becomes a pain when trying to write unit tests for the retry decorator.

Yeah, this was a drive by suggestion, I didn't quite grok what was the issue here, just from my windows experience perf_counter has been helpful for high resolution timing.

@pradyunsg
Copy link
Member

Yea, let's switch to perf_counter instead of monotonic here -- it has a higher resolution and doesn't really seem to have major downsides.

@morotti
Copy link
Contributor

morotti commented Jul 11, 2024

FYI: This appears to be a bug in the python interpreter, monotonic() is using a low precision timer, they fixed in main branch (python 3.13+ I think) to use perf_counter instead. python/cpython#116781

@notatallshaw
Copy link
Member

notatallshaw commented Jul 11, 2024

FYI: This appears to be a bug in the python interpreter, monotonic() is using a low precision timer, they fixed in main branch (python 3.13+ I think) to use perf_counter instead. python/cpython#116781

Thanks, I just read through python/cpython#88494 and the tl;dr is Python 3.13 now uses precise timers on Windows for time.monotonic and time.time, and that QueryPerformanceCounter used to be problematic on Windows back when time.monotonic was first implemented but now it is considered highly reliable.

However, I don’t see any effort to backport so for now it seems pip must use time.perf_counter for high resolution timing on Windows, at least until it drops Python 3.12 support.

@ichard26 ichard26 force-pushed the fix-flaky-retry-tests-windows branch 2 times, most recently from 578510a to 98cfa14 Compare July 11, 2024 20:33
@ichard26 ichard26 changed the title Tweak retry tests to account for Windows timer's coarse precision @ichard26 Use higher precision time source in retry decorator/tests and add a margin of error Jul 11, 2024
@ichard26 ichard26 changed the title @ichard26 Use higher precision time source in retry decorator/tests and add a margin of error Use higher precision time source in retry decorator/tests and add a margin of error to avoid flaky failures Jul 11, 2024
@ichard26
Copy link
Member Author

OK, I think this should actually be good now. I took a look at the other Windows CI failures and discovered that test_retry_time_limit was also subject to random failures so I added a margin to its asserts as well.

Sorry all.

@ichard26
Copy link
Member Author

OK, so sleep() on Windows can also apparently suspend execution for shorter than requested...?!1 I think I'm going to only assert the timestamps on Linux and macOS because I have utterly no faith in the reliability of the Windows tests.

Footnotes

  1. Yes, this edge case is documented but AFAIU, there is no signal handler raising an exception involved here...

@ichard26 ichard26 force-pushed the fix-flaky-retry-tests-windows branch from 98cfa14 to f5a6fd0 Compare July 11, 2024 21:05
@ichard26
Copy link
Member Author

🙃 alright, let's just skip these tests entirely on Windows

@ichard26 ichard26 added this to the 24.2 milestone Jul 11, 2024
…laky failures

I've replaced time.monotonic() with time.perf_counter() in the
retry utility and test suite since it's effectively monotonic[^1] while
providing much higher resolution (esp. on Windows).

In addition, to avoid flaky failures (originally on Windows) I've added
a margin of error to the timed tests. I've also outright disabled the
timed tests on Windows as further testing revealed that they were simply
too unreliable to be useful (I once observed a deviation of 30%).

[^1]: It's not guaranteed to be monotonic in the Python docs, but it is
      _indeed_ monotonic on all platforms we care about.
@ichard26 ichard26 force-pushed the fix-flaky-retry-tests-windows branch from f5a6fd0 to 2bec84f Compare July 11, 2024 21:26
@ichard26 ichard26 changed the title Use higher precision time source in retry decorator/tests and add a margin of error to avoid flaky failures Use higher precision time source in retry decorator/tests and avoid flaky failures Jul 11, 2024
@ichard26 ichard26 mentioned this pull request Jul 12, 2024
3 tasks
@pradyunsg pradyunsg merged commit 3228d76 into pypa:main Jul 13, 2024
28 checks passed
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 29, 2024
@ichard26 ichard26 deleted the fix-flaky-retry-tests-windows branch November 2, 2024 16:04
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
C: tests Testing and related things skip news Does not need a NEWS file entry (eg: trivial changes)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants