New sysconfig API 🤞 (meta-issue) #103480

Labels
topic-sysconfig type-feature A feature request or enhancement

Member

Apr 12, 2023

Feature or enhancement

Provide a new sysconfig API that is aligned with how things have evolved.

Fixes GH-99942
Fixes GH-99560

Pitch

The sysconfig module was added in Python 3.2, over 10 years ago, and though there were some minor changes and additions, its API stayed mostly the same. Since then things have evolved, with new platform support (e.g. WASM) and features being added, and we are now at a point where the current design is not great and has been the source of several issues, especially since the distutils deprecation (see GH-99942 for an example).

Changing the existing API to better represent modern Python installations would cause a lot of breakage in existing code. Considering that, on top of several of its shortcomings, we think the best step at the moment would be to introduce a new API and mark part of the existing one as either deprecated or pending deprecation, depending on how development goes and on feedback from the impacted parties.

Here are the main points we'd like to improve on:

  1. Users currently rely on sysconfig.get_config_vars for a lot of things

On Posix, sysconfig.get_config_vars exports all the variables present in the installation's Makefile, and on Windows, it exports a handful of selected variables.

cpython/Lib/sysconfig.py

Lines 540 to 556 in 52f96d3

def _init_non_posix(vars):
    """Initialize the module as appropriate for NT"""
    # set basic install directories
    import _imp
    vars['LIBDEST'] = get_path('stdlib')
    vars['BINLIBDEST'] = get_path('platstdlib')
    vars['INCLUDEPY'] = get_path('include')
    try:
        # GH-99201: _imp.extension_suffixes may be empty when
        # HAVE_DYNAMIC_LOADING is not set. In this case, don't set EXT_SUFFIX.
        vars['EXT_SUFFIX'] = _imp.extension_suffixes()[0]
    except IndexError:
        pass
    vars['EXE'] = '.exe'
    vars['VERSION'] = _PY_VERSION_SHORT_NO_DOT
    vars['BINDIR'] = os.path.dirname(_safe_realpath(sys.executable))
    vars['TZPATH'] = ''

There are several issues with this, the main one being that the exported variables are not documented, so there are no guarantees about what information is going to be available, or about the variable names.

While this is a good escape-hatch mechanism, it is a very poor option for the intended interface for key information. We should be exporting that information in a proper API.
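To illustrate the point, here is a minimal sketch of the defensive lookups that callers end up writing today. The helper name `lookup_config_var` is hypothetical; only `sysconfig.get_config_vars` and `sysconfig.get_config_var` are real APIs, and which variables are actually present depends on the platform and build.

```python
import sysconfig


def lookup_config_var(name, default=None):
    """Return a config variable, or `default` when it is not exported.

    Hypothetical helper: sysconfig.get_config_var() returns None for
    names that the current platform does not provide.
    """
    value = sysconfig.get_config_var(name)
    return default if value is None else value


all_vars = sysconfig.get_config_vars()
print(f"{len(all_vars)} config variables exported on this platform")

# EXT_SUFFIX is commonly present, but nothing documents that it must be.
print("EXT_SUFFIX:", lookup_config_var("EXT_SUFFIX", "<not set>"))

# Makefile-derived variables such as CFLAGS are typically absent on Windows.
print("CFLAGS:", lookup_config_var("CFLAGS", "<not set>"))
```

Every caller has to guess both the variable name and a sensible fallback, which is the kind of guesswork a documented API would remove.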

  2. The installation paths model is outdated

There are a couple of issues and things to improve here:

  • A scheme should be able to not provide non-critical paths (scripts for eg.)
  • We should introduce the concept of writable and read-only paths (users should not install to any path in the installed base, like include, for eg.)
  • We should introduce the concept of active location schemes (eg. posix_user should not be available on virtual environments)
    • This would extend to vendor schemes outside the default environment, if we ever get around to standardize that

Discussion on this should go to GH-103482
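For context, a rough sketch of what the current API exposes about schemes, and how a caller might approximate the "writable path" idea today. The `describe_scheme` helper and its `os.access` check are assumptions made for illustration; they are not part of sysconfig or of this proposal.

```python
import os
import sysconfig


def describe_scheme(scheme=None):
    """Map each path name to (path, writable-by-the-current-user).

    Hypothetical helper. With scheme=None, sysconfig.get_paths() uses
    the default scheme for the running interpreter.
    """
    paths = sysconfig.get_paths() if scheme is None else sysconfig.get_paths(scheme)
    # os.access() returns False for paths that do not exist.
    return {name: (path, os.access(path, os.W_OK)) for name, path in paths.items()}


# Every scheme is listed, whether or not it makes sense in this environment
# (for example, the user schemes are still listed inside a virtual environment).
print("known schemes:", sysconfig.get_scheme_names())

for name, (path, writable) in describe_scheme().items():
    print(f"{name:12} writable={writable!s:5} {path}")
```

The current model gives no way to tell which of these paths are intended install targets and which belong to the base installation, so a check like the one above can only approximate it from file system permissions.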

  3. Cross-compilation support is very poor

Cross-compilation at the moment is done by patching several aspects of Python, with the sysconfig module being the main one. We provide several undocumented features to support this, like allowing the project base path and sysconfig data module name to be overridden via the _PYTHON_PROJECT_BASE and _PYTHON_SYSCONFIGDATA_NAME environment variables, respectively.

We believe one of the reasons users choose to support cross-compilation by monkey patching Python, instead of adding cross-compilation support directly to the 3rd party build code, is that the current sysconfig design would require a lot of custom code to properly support cross builds.

Our proposal to solve this is to make the new API standalone, meaning it would provide all the information users may need for binary interfacing (eg. building extension modules) and similar applications. On top of this, we'd shift the design from using module functions to data holder objects, meaning all the relevant details of the Python installation would be represented by just a couple objects. Users that adopt this new API should then be able to support the cross build by simply changing the data objects for ones that represent the target installation.

This has the drawback that we'd be re-exporting information already present in other places (eg. sys.version, sys.implementation), but we believe it is worth it.
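A minimal sketch of the data holder idea, under stated assumptions: the class `InstallationInfo`, its fields, and the helper functions are hypothetical names invented for illustration, and the target values are made up. The point is only that build code written against the object works unchanged when the object describes a different installation.

```python
import sys
import sysconfig
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InstallationInfo:
    """Hypothetical data holder; field names are illustrative only."""
    implementation: str
    version: Tuple[int, int]
    platform: str
    extension_suffix: Optional[str]


def current_installation() -> InstallationInfo:
    """Describe the running interpreter, re-exporting sys/sysconfig data."""
    return InstallationInfo(
        implementation=sys.implementation.name,
        version=(sys.version_info[0], sys.version_info[1]),
        platform=sysconfig.get_platform(),
        extension_suffix=sysconfig.get_config_var("EXT_SUFFIX"),
    )


def extension_filename(module_name: str, info: InstallationInfo) -> str:
    """Build code that only consults the data holder it is given."""
    if info.extension_suffix is None:
        raise ValueError("installation does not support extension modules")
    return module_name + info.extension_suffix


host = current_installation()

# Made-up description of a cross-compilation target.
target = InstallationInfo(
    implementation="cpython",
    version=(3, 12),
    platform="linux-aarch64",
    extension_suffix=".cpython-312-aarch64-linux-gnu.so",
)

print(extension_filename("spam", target))
```

Switching from a native build to a cross build is then a matter of passing `target` instead of `host`, with no patching of module-level state.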

Future work

In a future step we would also like to get Python to provide a static file detailing all the information required by the new sysconfig API. With this, we could then allow the new API to use an external installation/interpreter config file as the data source, making cross builds even simpler to support.
See the discussion at https://discuss.python.org/t/what-information-is-useful-to-know-statically-about-an-interpreter/25563 for examples of users that would benefit from this feature.
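As a rough illustration of what such a static file could look like, the sketch below serializes a few details to JSON and reads them back. The key names and the choice of JSON are assumptions for this example; the issue does not specify a format or schema.

```python
import json
import sys
import sysconfig


def dump_interpreter_config() -> str:
    """Serialize a few interpreter details; the keys are hypothetical."""
    data = {
        "implementation": sys.implementation.name,
        "version": [sys.version_info[0], sys.version_info[1]],
        "platform": sysconfig.get_platform(),
        # May be null when the build does not support extension modules.
        "extension_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }
    return json.dumps(data, indent=2, sort_keys=True)


def load_interpreter_config(text: str) -> dict:
    """Read the description back without running the described interpreter."""
    return json.loads(text)


text = dump_interpreter_config()
print(text)
config = load_interpreter_config(text)
```

A build tool could consume a file like this for a target interpreter that cannot be executed on the build machine.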

Migration plan

To help migration, and to make these improvements available on older Python versions, we plan to provide a backport package on PyPI, similar to what we already do with importlib_metadata (for importlib.metadata), importlib_resources (for importlib.resources), and typing_extensions (for typing).

Sub-issues

The proposal is quite big, and may require extensive discussions with different parties, so I am splitting it into a couple separate issues.

Previous discussion

This has been mentioned in a couple places, here are some of the most relevant:

https://discuss.python.org/t/building-extensions-modules-in-a-post-distutils-world/23938
https://discuss.python.org/t/sysconfig-should-provide-more-information-to-help-packaging/22950
https://discuss.python.org/t/pep-582-python-local-packages-directory/963/391
https://discuss.python.org/t/linux-distro-patches-to-sysconfig-are-changing-pip-install-prefix-outside-virtual-environments/18240/28
#102522 (comment)
https://github.com/PyO3/maturin/blob/5d5b96a9974eac26b8cdf601cd2faf64f4999de9/src/python_interpreter/sysconfig-freebsd.json

Linked PRs

@encukou
Member

encukou commented Jul 10, 2023

I don't know if this is tracked anywhere, but could you consider the CLI as well?

  • python -m sysconfig ... should be CLI-compatible with python-config .... That is, it should support as much of --prefix|--exec-prefix|--includes|--libs|--cflags|--ldflags|--extension-suffix|--abiflags|--configdir|--embed as possible.
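A small sketch of how a few of these flags could be answered from data that sysconfig already exposes. This is not the existing `python -m sysconfig` CLI and not python-config itself: the function names are hypothetical, only a subset of the flags is covered, and `--includes` is simplified to a single include directory.

```python
import argparse
import sys
import sysconfig


def _flag_values():
    """Hypothetical mapping from python-config style flags to values."""
    return {
        "prefix": sysconfig.get_config_var("prefix"),
        "exec_prefix": sysconfig.get_config_var("exec_prefix"),
        "extension_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        # sys.abiflags is not defined on every platform.
        "abiflags": getattr(sys, "abiflags", ""),
        # Simplification: only the main include directory is reported.
        "includes": "-I" + sysconfig.get_path("include"),
    }


def query(argv):
    """Return one output line per requested flag, in a fixed order."""
    values = _flag_values()
    parser = argparse.ArgumentParser(prog="sysconfig-cli-sketch")
    for name in values:
        parser.add_argument("--" + name.replace("_", "-"), action="store_true")
    args = parser.parse_args(argv)
    # Missing values (None) are printed as empty lines.
    return [values[name] or "" for name in values if getattr(args, name)]


for line in query(["--prefix", "--extension-suffix"]):
    print(line)
```

Flags such as `--libs`, `--cflags`, `--ldflags` and `--embed` depend on Makefile variables and are left out of the sketch.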

Member Author

Yep. That's a parallel improvement, and is being tracked in GH-77620. It is definitely something that I'd like to look at, but I think the new API proposal and subsequent work (see the "Future work" section above) may change things a bit, so IMO that's something that should be done after we have a good understanding of the direction we are going, at least.

Member Author

After discussions with a couple people, and realizing the goals and priorities of this work are not well-defined, I am gonna document them here.


The main goal of this API is to provide information missing in the current API, and other parts of the interpreter. This includes:

  • Information that is entirely missing
    • Example: Determining when native modules should link against libpython.
  • Information that is available but incomplete
    • Example: Supported file extensions for modules, which are currently available in importlib.machinery but without the necessary specificity (eg. we can find the file extensions for native modules, but cannot tell which one corresponds to the stable ABI)
  • Information which is available but via an unreliable method, such as sysconfig.get_config_vars
    • Example: Using sysconfig.get_config_var('SOABI') to find the PEP 3149 ABI string.
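The second and third examples can be reproduced with existing APIs. In the sketch below, `guess_stable_abi_suffixes` is a hypothetical heuristic (it looks for the `abi3` marker used in POSIX file names), which is exactly the kind of guess a proper API would make unnecessary.

```python
import importlib.machinery
import sysconfig

# All suffixes the import system accepts for extension modules.
# Nothing here says which entry, if any, is the stable ABI one.
suffixes = importlib.machinery.EXTENSION_SUFFIXES
print("extension suffixes:", suffixes)

# Undocumented config variable; may be None on some builds and platforms.
soabi = sysconfig.get_config_var("SOABI")
print("SOABI:", soabi)


def guess_stable_abi_suffixes(candidates):
    """Heuristic guess, not an official API: match the 'abi3' marker."""
    return [suffix for suffix in candidates if "abi3" in suffix]


print("stable ABI guess:", guess_stable_abi_suffixes(suffixes))
```

On platforms whose stable ABI suffix carries no `abi3` marker, the heuristic returns an empty list even though stable ABI modules may be loadable.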

As a secondary goal, given we are introducing a new API, we have the opportunity to design it in such a way that helps with the current challenges around cross-compilation of 3rd party modules. Specifically, one way we can improve it is by shifting the burden of handling cross-build specificities from build backends, who are generally not subject-experts, to the user performing the build, who generally is. Thus making it much easier for build backends, and projects with similar applications, to support cross-compilation workflows.

The proposal in this issue achieves this by specifying a set of (typing) protocols for data holder objects that expose all the details (as far as reasonably possible) required for compiling code, and by providing in sysconfig the implementation for the current environment. Using this approach, any code using these data holder objects could support cross-compilation by simply using external versions of these objects, provided by the (expert) user.
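One way the protocol approach could look, as a sketch only: `BuildDetails`, both implementing classes, and `normalized_platform` are hypothetical names, and the attribute set is far smaller than what a real protocol would need.

```python
import sysconfig
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class BuildDetails(Protocol):
    """Hypothetical protocol; attribute names are illustrative only."""
    platform: str
    extension_suffix: Optional[str]


class CurrentBuildDetails:
    """Implementation for the running interpreter."""

    def __init__(self) -> None:
        self.platform = sysconfig.get_platform()
        self.extension_suffix = sysconfig.get_config_var("EXT_SUFFIX")


class UserProvidedBuildDetails:
    """External implementation, filled in by whoever sets up a cross build."""

    def __init__(self, platform: str, extension_suffix: Optional[str]) -> None:
        self.platform = platform
        self.extension_suffix = extension_suffix


def normalized_platform(details: BuildDetails) -> str:
    """Backend code depends only on the protocol, not on sysconfig itself."""
    return details.platform.replace("-", "_").replace(".", "_")


native = CurrentBuildDetails()
cross = UserProvidedBuildDetails("linux-aarch64", ".cpython-312-aarch64-linux-gnu.so")

for details in (native, cross):
    print(normalized_platform(details), details.extension_suffix)
```

Because the protocol is structural, the user-provided class does not need to inherit from anything in the standard library; it only needs the expected attributes.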


Here's the summarized priority list:

  • Provide currently missing information (see comments above and the original for examples)
    • Target users: just overall sysconfig users -- packaging utility code, build backends, external build systems, etc. (eg. pypa/packaging)
  • Design the API in a way that makes it easier to handle cross-compilation
    • Target users: projects that are interested in working with cross-compilation, primarily build backends, and cross-compilation users (Pyodide, conda-forge, etc.)

And here's a list of other related work that might be impacted, or impact, the new sysconfig API work:

@rgommers

Thanks @FFY00, that is a useful summary.

Specifically, one way we can improve it is by shifting the burden of handling cross-build specificities from build backends, who are generally not subject-experts, to the user performing the build, who generally is. Thus making it much easier for build backends, and projects with similar applications, to support cross-compilation workflows.

I'd suggest refraining from this type of qualification. Most users certainly are not cross-compilation experts, even when they need to do a cross build. I'd personally have higher expectations from build backend authors than from users. Either way, there's no need to determine who is an expert. I think you can say something like: "we can improve here by avoiding the need to monkeypatch sysconfig as much as possible (xref PEP 720), and in the limited cases where information from outside sysconfig/stdlib may be required, allow providing that in a structured way by either the build backend or the user invoking the build".

Member Author

Aug 15, 2023

I'd suggest refraining from this type of qualification. Most users certainly are not cross-compilation experts, even when they need to do a cross build.

That is fair. The core motivation here still stands, which is shifting the burden from the build backends (et al.) to the user that is setting up the cross-build. Even if the user is not necessarily an expert, generally they should be much better equipped to handle it.

Member

The core motivation here still stands, which is shifting the burden from the build backends (et al.) to the user that is setting up the cross-build.

I think this is the right motivation, as the user is the one who knows they want a cross-build. There's no real way for a backend to detect it on their own, or to know which platform to build for, so if we have a standard API in new-sysconfig that provides the info, and a mechanism for the user to specify it, build backends don't have to get that info themselves (and we can cross-build a set of packages that may all be using different backends).


4 participants