Use higher precision time source in retry decorator/tests and avoid flaky failures #12839

ichard26 · 2024-07-10T15:44:40Z

Despite my best efforts to account for Windows' coarse timer precision, the test_retry_wait and test_retry_time_limit tests are still randomly failing as it isn't sleeping for long enough by a small margin:

Example 1: 472.171 - 472.156 >= 0.015 (~1% too low)
Example 2: 554.406 - 554.265 >= 0.15 (~7% too low)

(call_time - test_start_time >= expected sleep duration between retries)

To avoid these random failures, I've adjusted the required sleep duration to be 10% less which should be enough.

I've also replaced time.monotonic() with time.perf_counter() in the retry utility and test suite since it's effectively monotonic¹ while providing much higher resolution (esp. on Windows).

@pradyunsg this should fix the CI failures on main. Sorry for the trouble!

It's not guaranteed to be monotonic in the Python docs, but it is
indeed monotonic on all platforms we care about. ↩

notatallshaw · 2024-07-10T16:05:20Z

If time resolution is important, why not switch to perf_counter()?

On Windows:

>>> mono = [time.monotonic() for x in range(10)]
>>> [mono[-1] - m for m in mono]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

>>> perf = [time.perf_counter() for x in range(10)]
>>> [perf[-1] - p for p in perf]
[3.4999102354049683e-06, 8.996576070785522e-07, 8.00006091594696e-07, 5.997717380523682e-07, 3.995373845100403e-07, 2.998858690261841e-07, 1.993030309677124e-07, 1.993030309677124e-07, 9.96515154838562e-08, 0.0]

ichard26 · 2024-07-10T16:25:14Z

I mean, we could do that. I used monotonic() originally as that's what the tenacity implementation used. Are there any downsides to time.perf_counter()? I can concede this is an extreme edge case, but can it go backwards due to a poorly timed system time update?

Time resolution is not that important in practice, it's just that poor resolution becomes a pain when trying to write unit tests for the retry decorator.

notatallshaw · 2024-07-10T17:27:02Z

Are there any downsides to time.perf_counter()? I can concede this is an extreme edge case, but can it go backwards due to a poorly timed system time update?

My understanding is perf_counter is also guaranteed to be monotonic.

Looking at PEP 418 the difference seems to be different API calls (GetTickCount64 vs. QueryPerformanceCounter), I do not know enough about Windows APIs to know much about the difference beyond googling them.

But my personal experience is there is no noticable downside to perf_counter when high resolution timings in Python on Windows are important.

Time resolution is not that important in practice, it's just that poor resolution becomes a pain when trying to write unit tests for the retry decorator.

Yeah, this was a drive by suggestion, I didn't quite grok what was the issue here, just from my windows experience perf_counter has been helpful for high resolution timing.

pradyunsg · 2024-07-11T07:08:08Z

Yea, let's switch to perf_counter instead of monotonic here -- it has a higher resolution and doesn't really seem to have major downsides.

morotti · 2024-07-11T10:01:27Z

FYI: This appears to be a bug in the python interpreter, monotonic() is using a low precision timer, they fixed in main branch (python 3.13+ I think) to use perf_counter instead. python/cpython#116781

notatallshaw · 2024-07-11T11:59:16Z

FYI: This appears to be a bug in the python interpreter, monotonic() is using a low precision timer, they fixed in main branch (python 3.13+ I think) to use perf_counter instead. python/cpython#116781

Thanks, I just read through python/cpython#88494 and the tl;dr is Python 3.13 now uses precise timers on Windows for time.monotonic and time.time, and that QueryPerformanceCounter used to be problematic on Windows back when time.monotonic was first implemented but now it is considered highly reliable.

However, I don’t see any effort to backport so for now it seems pip must use time.perf_counter for high resolution timing on Windows, at least until it drops Python 3.12 support.

ichard26 · 2024-07-11T20:36:23Z

OK, I think this should actually be good now. I took a look at the other Windows CI failures and discovered that test_retry_time_limit was also subject to random failures so I added a margin to its asserts as well.

Sorry all.

ichard26 · 2024-07-11T21:02:24Z

OK, so sleep() on Windows can also apparently suspend execution for shorter than requested...?!¹ I think I'm going to only assert the timestamps on Linux and macOS because I have utterly no faith in the reliability of the Windows tests.

Yes, this edge case is documented but AFAIU, there is no signal handler raising an exception involved here... ↩

ichard26 · 2024-07-11T21:20:15Z

🙃 alright, let's just skip these tests entirely on Windows

…laky failures I've replaced time.monotonic() with time.perf_counter() in the retry utility and test suite since it's effectively monotonic[^1] while providing much higher resolution (esp. on Windows). In addition, to avoid flaky failures (originally on Windows) I've added a margin of error to the timed tests. I've also outright disabled the timed tests on Windows as further testing revealed that they were simply too unreliable to be useful (I once observed a deviation of 30%). [^1]: It's not guaranteed to be monotonic in the Python docs, but it is _indeed_ monotonic on all platforms we care about.

ichard26 added skip news Does not need a NEWS file entry (eg: trivial changes) C: tests Testing and related things labels Jul 10, 2024

morotti mentioned this pull request Jul 11, 2024

PERF: download and compute hashes in chunks of 1MB, did you know the progress bar was 30% of the runtime! #12810

Merged

ichard26 mentioned this pull request Jul 11, 2024

Ignore --require-virtualenv for check and freeze commands #12842

Merged

ichard26 force-pushed the fix-flaky-retry-tests-windows branch 2 times, most recently from 578510a to 98cfa14 Compare July 11, 2024 20:33

ichard26 changed the title ~~Tweak retry tests to account for Windows timer's coarse precision~~ @ichard26 Use higher precision time source in retry decorator/tests and add a margin of error Jul 11, 2024

ichard26 changed the title ~~@ichard26 Use higher precision time source in retry decorator/tests and add a margin of error~~ Use higher precision time source in retry decorator/tests and add a margin of error to avoid flaky failures Jul 11, 2024

ichard26 force-pushed the fix-flaky-retry-tests-windows branch from 98cfa14 to f5a6fd0 Compare July 11, 2024 21:05

ichard26 added this to the 24.2 milestone Jul 11, 2024

ichard26 force-pushed the fix-flaky-retry-tests-windows branch from f5a6fd0 to 2bec84f Compare July 11, 2024 21:26

ichard26 changed the title ~~Use higher precision time source in retry decorator/tests and add a margin of error to avoid flaky failures~~ Use higher precision time source in retry decorator/tests and avoid flaky failures Jul 11, 2024

pradyunsg approved these changes Jul 12, 2024

View reviewed changes

ichard26 mentioned this pull request Jul 12, 2024

Truststore by default #11647

Merged

3 tasks

pradyunsg merged commit 3228d76 into pypa:main Jul 13, 2024
28 checks passed

github-actions bot locked as resolved and limited conversation to collaborators Jul 29, 2024

ichard26 deleted the fix-flaky-retry-tests-windows branch November 2, 2024 16:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use higher precision time source in retry decorator/tests and avoid flaky failures #12839

Use higher precision time source in retry decorator/tests and avoid flaky failures #12839

ichard26 commented Jul 10, 2024 •

edited

Loading

notatallshaw commented Jul 10, 2024

ichard26 commented Jul 10, 2024 •

edited

Loading

notatallshaw commented Jul 10, 2024 •

edited

Loading

pradyunsg commented Jul 11, 2024

morotti commented Jul 11, 2024

notatallshaw commented Jul 11, 2024 •

edited

Loading

ichard26 commented Jul 11, 2024

ichard26 commented Jul 11, 2024

ichard26 commented Jul 11, 2024

Use higher precision time source in retry decorator/tests and avoid flaky failures #12839

Use higher precision time source in retry decorator/tests and avoid flaky failures #12839

Conversation

ichard26 commented Jul 10, 2024 • edited Loading

Footnotes

notatallshaw commented Jul 10, 2024

ichard26 commented Jul 10, 2024 • edited Loading

notatallshaw commented Jul 10, 2024 • edited Loading

pradyunsg commented Jul 11, 2024

morotti commented Jul 11, 2024

notatallshaw commented Jul 11, 2024 • edited Loading

ichard26 commented Jul 11, 2024

ichard26 commented Jul 11, 2024

Footnotes

ichard26 commented Jul 11, 2024

ichard26 commented Jul 10, 2024 •

edited

Loading

ichard26 commented Jul 10, 2024 •

edited

Loading

notatallshaw commented Jul 10, 2024 •

edited

Loading

notatallshaw commented Jul 11, 2024 •

edited

Loading