Optimise the way tracemalloc and PyRefTracer hooks work #125790

Open
pablogsal opened this issue Oct 21, 2024 · 1 comment
Labels
performance Performance or resource usage

Comments

@pablogsal
Member

In #125703, @markshannon raised concerns about the performance implications of where these hooks are placed. In a call, we discussed his ideas for making them more performant by moving them elsewhere or adapting them.

I am opening this issue to track and coordinate these improvements for 3.14 and beyond.

@markshannon
Member

I think we can fix the performance issues by raising the level at which allocation/free goes through a function pointer.

Instead of a malloc-like interface, void *malloc(size_t size), we should be returning partially initialized objects.
PyObject *obj_malloc(PyTypeObject *tp, size_t size, size_t presize) would allocate a chunk of size + presize bytes and return a PyObject * pointing presize bytes into that chunk, with the ob_type field set to tp and ob_refcnt set to one.

This is low-enough level to be fully general, but with enough context to support tracemalloc.
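To make the proposed interface concrete, here is a minimal sketch of what such an allocation function might look like. It is not CPython code: sketch_type and sketch_object are hypothetical stand-ins for PyTypeObject and PyObject so the example compiles without Python.h, and sketch_obj_free is an assumed counterpart that the comment above does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for PyTypeObject / PyObject (the real structs
   have more fields and differ between builds). */
typedef struct sketch_type { const char *tp_name; } sketch_type;
typedef struct sketch_object {
    size_t ob_refcnt;
    sketch_type *ob_type;
} sketch_object;

/* Allocate presize + size bytes and return a pointer presize bytes into
   the chunk, with the object header partially initialized. */
static sketch_object *
sketch_obj_malloc(sketch_type *tp, size_t size, size_t presize)
{
    /* Assumption: the caller picks a presize that keeps the header
       pointer-aligned, which is enough for this stand-in struct. */
    assert(presize % sizeof(void *) == 0);
    if (size < sizeof(sketch_object) || size > SIZE_MAX - presize) {
        return NULL;
    }
    char *raw = malloc(presize + size);
    if (raw == NULL) {
        return NULL;
    }
    sketch_object *op = (sketch_object *)(raw + presize);
    op->ob_refcnt = 1;   /* new object starts with one reference */
    op->ob_type = tp;    /* type is known at allocation time */
    return op;
}

/* Assumed counterpart: step back over the pre-header before freeing. */
static void
sketch_obj_free(sketch_object *op, size_t presize)
{
    if (op != NULL) {
        free((char *)op - presize);
    }
}
```

Because the type pointer is available inside the allocation function, a tracing variant could record it without a separate hook on the object-creation path.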

I think we would need the following implementations, switchable at runtime:

  • default
  • tracemalloc
  • custom (used when the underlying PEP 445 allocator is changed)
  • custom-tracemalloc (used when the underlying PEP 445 allocator is changed)
  • freethreading
  • freethreading-tracemalloc

We don't need (or want) to switch between the free-threading and default allocators, but it keeps the rest of the code simpler if they have the same interface.
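One way to picture "switchable at runtime" is a table of function pointers per implementation, with a single pointer selecting the active table. The sketch below shows only a default and a tracing variant. All names (sketch_allocator, sketch_set_allocator, traced_live_objects, and the stand-in object types) are hypothetical, and the live-object counter is a placeholder for real tracemalloc bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for PyTypeObject / PyObject. */
typedef struct sketch_type { const char *tp_name; } sketch_type;
typedef struct sketch_object {
    size_t ob_refcnt;
    sketch_type *ob_type;
} sketch_object;

/* One table per implementation; every variant shares this interface. */
typedef struct sketch_allocator {
    const char *name;
    sketch_object *(*obj_malloc)(sketch_type *tp, size_t size, size_t presize);
    void (*obj_free)(sketch_object *op, size_t presize);
} sketch_allocator;

static sketch_object *
default_obj_malloc(sketch_type *tp, size_t size, size_t presize)
{
    char *raw = malloc(presize + size);
    if (raw == NULL) {
        return NULL;
    }
    sketch_object *op = (sketch_object *)(raw + presize);
    op->ob_refcnt = 1;
    op->ob_type = tp;
    return op;
}

static void
default_obj_free(sketch_object *op, size_t presize)
{
    if (op != NULL) {
        free((char *)op - presize);
    }
}

/* Placeholder for tracemalloc state: just counts live traced objects. */
static size_t traced_live_objects;

static sketch_object *
traced_obj_malloc(sketch_type *tp, size_t size, size_t presize)
{
    sketch_object *op = default_obj_malloc(tp, size, presize);
    if (op != NULL) {
        traced_live_objects++;   /* tp and size are available to record */
    }
    return op;
}

static void
traced_obj_free(sketch_object *op, size_t presize)
{
    if (op != NULL && traced_live_objects > 0) {
        traced_live_objects--;
    }
    default_obj_free(op, presize);
}

static const sketch_allocator sketch_default =
    {"default", default_obj_malloc, default_obj_free};
static const sketch_allocator sketch_tracemalloc =
    {"tracemalloc", traced_obj_malloc, traced_obj_free};

/* The active implementation; callers always go through this pointer. */
static const sketch_allocator *sketch_current = &sketch_default;

static void
sketch_set_allocator(const sketch_allocator *alloc)
{
    assert(alloc != NULL);
    sketch_current = alloc;
}
```

A caller would allocate with sketch_current->obj_malloc(tp, size, presize) and free with sketch_current->obj_free(op, presize). The sketch ignores objects that are allocated under one table and freed under another, as well as thread safety, both of which a real design would have to address.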
