-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test_typing leaked [1, 1, 1] memory blocks: fail randomly on Refleak buildbots #121084
Comments
It does not reproduce on macos, so it is windows-specific. |
|
Even if I run more iterations, I only see a total of sum=3. "Leaks" are only in the first 5 runs (after the 3 warmups):
|
I tried tracemalloc: diff --git a/Lib/test/libregrtest/refleak.py b/Lib/test/libregrtest/refleak.py
index 25e71305402..5da93117945 100644
--- a/Lib/test/libregrtest/refleak.py
+++ b/Lib/test/libregrtest/refleak.py
@@ -120,6 +120,7 @@ def get_pooled_int(value):
numbers = numbers[:warmups] + ':' + numbers[warmups:]
print(numbers, file=sys.stderr, flush=True)
+ import tracemalloc
xml_filename = 'refleak-xml.tmp'
result = None
dash_R_cleanup(fs, ps, pic, zdc, abcs)
@@ -137,6 +138,11 @@ def get_pooled_int(value):
dash_R_cleanup(fs, ps, pic, zdc, abcs)
support.gc_collect()
+ if i == 2:
+ tracemalloc.take_snapshot().dump('/tmp/snapshot.before')
+ elif i == 3:
+ tracemalloc.take_snapshot().dump('/tmp/snapshot.after')
+
# Read memory statistics immediately after the garbage collection.
# Also, readjust the reference counts and alloc blocks by ignoring
# any strings that might have been interned during test_func. These Script: import tracemalloc
snapshot1 = tracemalloc.Snapshot.load('/tmp/snapshot.before')
snapshot2 = tracemalloc.Snapshot.load('/tmp/snapshot.after')
top_stats = snapshot2.compare_to(snapshot1, 'lineno')
print("[ Top 10 differences ]")
for stat in top_stats[:10]:
print(stat) Run tests: Example of script output:
Only growing counts are interesting:
Many creation of classes. The question is why these classes still seem to be alive after the test completed. Maybe something is delaying the deletion of these temporary classes. Maybe they are registered in caches and caches are not fully cleared "immediately" but "later". |
I recall @encukou telling me it had something to do with materialization of |
This is a different pattern than the one related to dict rematerialization (#115822) |
I don't know if its helpful but for i in {1..5} ; do ./python -m test test_typing -R3:4 -m test_protocols_isinstance_py36 ; done You'll see patterns of the form # 123:4567
XX* *...
# or
XX. *... (The test invokes In addition, we seem to have some pattern: Either it is
or it is
(either you have a 2 at position 4 and nothing afterwards or a 1 at 4 and a 1 at 9). |
GenericAlias and Union have no clear() function, but it doesn't seem to make any difference to add them: diff --git a/Objects/genericaliasobject.c b/Objects/genericaliasobject.c
index f5fefd65653..59ef57acc4e 100644
--- a/Objects/genericaliasobject.c
+++ b/Objects/genericaliasobject.c
@@ -50,6 +50,16 @@ ga_traverse(PyObject *self, visitproc visit, void *arg)
return 0;
}
+static int
+ga_clear(PyObject *self)
+{
+ gaobject *alias = (gaobject *)self;
+ Py_CLEAR(alias->origin);
+ Py_CLEAR(alias->args);
+ Py_CLEAR(alias->parameters);
+ return 0;
+}
+
static int
ga_repr_item(PyUnicodeWriter *writer, PyObject *p)
{
@@ -974,6 +984,7 @@ PyTypeObject Py_GenericAliasType = {
.tp_getattro = ga_getattro,
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_VECTORCALL,
.tp_traverse = ga_traverse,
+ .tp_clear = ga_clear,
.tp_richcompare = ga_richcompare,
.tp_weaklistoffset = offsetof(gaobject, weakreflist),
.tp_methods = ga_methods,
diff --git a/Objects/unionobject.c b/Objects/unionobject.c
index 7931f4345f7..192ba4ef1ed 100644
--- a/Objects/unionobject.c
+++ b/Objects/unionobject.c
@@ -36,6 +36,15 @@ union_traverse(PyObject *self, visitproc visit, void *arg)
return 0;
}
+static int
+union_clear(PyObject *self)
+{
+ unionobject *alias = (unionobject *)self;
+ Py_CLEAR(alias->args);
+ Py_CLEAR(alias->parameters);
+ return 0;
+}
+
static Py_hash_t
union_hash(PyObject *self)
{
@@ -380,6 +389,7 @@ PyTypeObject _PyUnion_Type = {
.tp_free = PyObject_GC_Del,
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
.tp_traverse = union_traverse,
+ .tp_clear = union_clear,
.tp_hash = union_hash,
.tp_getattro = union_getattro,
.tp_members = union_members, |
test_functools has a similar issue: #120163 |
If I understand correctly, they don't need clear() because it is impossible for a cycle to exist that contains exclusively GenericAlias or Union objects, because both are immutable. |
About test_functools, I can reproduce a "leak" of 1 memory allocation with this simplified test suite. It's a random failure, like 1 run on 50 runs. According to tracemalloc, there are 2 "leaks":
I'm not sure why this P class is not deallocated immediately.
I'm not 100% sure of tracemalloc result. import collections.abc
import functools
import unittest
class TestSingleDispatch(unittest.TestCase):
def test_mro_conflicts(self):
c = collections.abc
@functools.singledispatch
def g(arg):
return "base"
class O(c.Sized):
def __len__(self):
return 0
o = O()
g.register(c.Iterable, lambda arg: "iterable")
g.register(c.Container, lambda arg: "container")
g.register(c.Sized, lambda arg: "sized")
g.register(c.Set, lambda arg: "set")
c.Iterable.register(O)
c.Container.register(O)
c.Set.register(O)
class P: # <========== HERE ============
pass
p = P()
c.Iterable.register(P)
c.Container.register(P)
with self.assertRaises(RuntimeError) as re_one:
g(p)
class Q(c.Sized):
def __len__(self):
return 0
q = Q()
c.Iterable.register(Q) # <========== HERE ============
c.Set.register(Q) |
Does it reproduce if you get rid of the |
This comment was marked as off-topic.
This comment was marked as off-topic.
test_typing and test_functools "leaks" are very random. I'm using a loop to run the test until it fails. Sometimes, it fails at the first attempt. Sometimes, it takes 100 runs. |
Yes, sadly, I can still reproduce the "leak" if I remove assertRaises(). And you're right that storing an exception in a local variable creates a reference cycle! |
This "leak" is really random, I'm not sure that it's a real leak. It's really hard to debug, I'm considering to give up for my mental health sanity :-) Maybe libregrtest should just ignore "leaks" of [1, 0, 0] memory blocks, [1, 1, 1] memory blocks, etc. Maybe only complain if the total is greater than 3? Or maybe even remove the "memory block" feature? I'm not sure. Another option is to ignore it only for "test_functools" and "test_typing". |
I often see this issue when running the test suite on my Mac OS & Linux setups. So it's not specific to Windows. |
FWIW, I've executed |
You can make the problem more likely with this change: diff --git a/Lib/test/libregrtest/refleak.py b/Lib/test/libregrtest/refleak.py
index 25e71305402..571d4cfeb22 100644
--- a/Lib/test/libregrtest/refleak.py
+++ b/Lib/test/libregrtest/refleak.py
@@ -201,7 +201,7 @@ def check_fd_deltas(deltas):
failed = False
for deltas, item_name, checker in [
(rc_deltas, 'references', check_rc_deltas),
- (alloc_deltas, 'memory blocks', check_rc_deltas),
+ (alloc_deltas, 'memory blocks', lambda deltas: any(delta >= 1 for delta in deltas)),
(fd_deltas, 'file descriptors', check_fd_deltas)
]:
# ignore warmup runs |
Here is a reproducer of test_functools leak in less than 120 lines without unittest or anything: leak.py. Run the script in a loop until it fails with the shell command:
Example of output:
|
IMO the code to clear Abstract Base Classes is not reliable: def runtest_refleak(test_name, test_func):
# (...)
abcs = {}
for abc in [getattr(collections.abc, a) for a in collections.abc.__all__]:
if not isabstract(abc):
continue
for obj in abc.__subclasses__() + [abc]:
abcs[obj] = _get_dump(obj)[0]
# (...)
def dash_R_cleanup(abcs):
abs_classes = [getattr(collections.abc, a) for a in collections.abc.__all__]
abs_classes = filter(isabstract, abs_classes)
for abc in abs_classes:
for obj in abc.__subclasses__() + [abc]:
for ref in abcs.get(obj, set()):
if ref() is not None:
obj.register(ref())
obj._abc_caches_clear() |
According to tracemalloc, the only additional memory allocation is:
|
dash_R_cleanup() now calls _abc_registry_clear() before calling again register().
I wrote PR gh-121191 which seems to fix test_functools. |
dash_R_cleanup() now calls _abc_registry_clear() before calling again register().
…ythonGH-121191) dash_R_cleanup() now calls _abc_registry_clear() before calling again register(). (cherry picked from commit c766ad2) Co-authored-by: Victor Stinner <[email protected]>
…ythonGH-121191) dash_R_cleanup() now calls _abc_registry_clear() before calling again register(). (cherry picked from commit c766ad2) Co-authored-by: Victor Stinner <[email protected]>
Example of leak:
The problem is that the ABC cache is not cleared explicitly. Example of local fix to show the problem, but a more generic one should be written: diff --git a/Lib/test/test_typing.py b/Lib/test/test_typing.py
index a931da55908..b2b22668e07 100644
--- a/Lib/test/test_typing.py
+++ b/Lib/test/test_typing.py
@@ -4137,6 +4137,8 @@ def __bytes__(self):
self.assertIsSubclass(B, typing.SupportsBytes)
self.assertNotIsSubclass(str, typing.SupportsBytes)
+ typing.SupportsBytes._abc_caches_clear()
+
def test_supports_abs(self):
self.assertIsSubclass(float, typing.SupportsAbs)
self.assertIsSubclass(int, typing.SupportsAbs) |
More generic fix: diff --git a/Lib/test/libregrtest/utils.py b/Lib/test/libregrtest/utils.py
index 8253d330b95..056fc9ba919 100644
--- a/Lib/test/libregrtest/utils.py
+++ b/Lib/test/libregrtest/utils.py
@@ -263,6 +263,13 @@ def clear_caches():
for f in typing._cleanups:
f()
+ import inspect
+ abs_classes = [getattr(typing, attr) for attr in typing.__all__]
+ abs_classes = filter(inspect.isabstract, abs_classes)
+ for abc in abs_classes:
+ for obj in abc.__subclasses__() + [abc]:
+ obj._abc_caches_clear()
+
try:
fractions = sys.modules['fractions']
except KeyError: |
…ython#121191) dash_R_cleanup() now calls _abc_registry_clear() before calling again register().
Call _abc_caches_clear() on typing abstract classes and their subclasses when running tests to check for refleaks (-R option).
Clear typing ABC caches when running tests for refleaks (-R option): call _abc_caches_clear() on typing abstract classes and their subclasses.
Clear typing ABC caches when running tests for refleaks (-R option): call _abc_caches_clear() on typing abstract classes and their subclasses.
Clear typing ABC caches when running tests for refleaks (-R option): call _abc_caches_clear() on typing abstract classes and their subclasses. (cherry picked from commit 5f660e8) Co-authored-by: Victor Stinner <[email protected]>
Clear typing ABC caches when running tests for refleaks (-R option): call _abc_caches_clear() on typing abstract classes and their subclasses. (cherry picked from commit 5f660e8) Co-authored-by: Victor Stinner <[email protected]>
test_typing was fixed by 5f660e8. |
gh-121084: Fix test_typing random leaks (GH-121360) Clear typing ABC caches when running tests for refleaks (-R option): call _abc_caches_clear() on typing abstract classes and their subclasses. (cherry picked from commit 5f660e8) Co-authored-by: Victor Stinner <[email protected]>
gh-121084: Fix test_typing random leaks (GH-121360) Clear typing ABC caches when running tests for refleaks (-R option): call _abc_caches_clear() on typing abstract classes and their subclasses. (cherry picked from commit 5f660e8) Co-authored-by: Victor Stinner <[email protected]>
Thanks Victor for solving this. I was a little late to the "party", but I can confirm that it solves the leak at least in every one of my setups (Linux, Windows, MacOS). |
…ython#121191) dash_R_cleanup() now calls _abc_registry_clear() before calling again register().
Clear typing ABC caches when running tests for refleaks (-R option): call _abc_caches_clear() on typing abstract classes and their subclasses.
…ython#121191) dash_R_cleanup() now calls _abc_registry_clear() before calling again register().
Clear typing ABC caches when running tests for refleaks (-R option): call _abc_caches_clear() on typing abstract classes and their subclasses.
Examples of failure:
Linked PRs
The text was updated successfully, but these errors were encountered: