Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create Lib/_pydatetime.py file to optimize "import datetime" when _datetime is available #84976

Closed
vstinner opened this issue May 28, 2020 · 26 comments
Labels
3.10 only security fixes stdlib Python modules in the Lib dir

Comments

@vstinner
Copy link
Member

vstinner commented May 28, 2020

BPO 40799
Nosy @vstinner, @ezio-melotti, @serhiy-storchaka, @pganssle, @shihai1991
PRs
  • bpo-40799: Add _pydatetime module (Python datetime impl) #20472
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-05-28.00:06:58.019>
    labels = ['library', '3.10']
    title = 'Create Lib/_pydatetime.py file to optimize "import datetime" when _datetime is available'
    updated_at = <Date 2020-06-26.14:58:45.408>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2020-06-26.14:58:45.408>
    actor = 'p-ganssle'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2020-05-28.00:06:58.019>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40799
    keywords = ['patch']
    message_count = 8.0
    messages = ['370153', '370165', '370216', '370222', '370223', '370584', '372425', '372427']
    nosy_count = 5.0
    nosy_names = ['vstinner', 'ezio.melotti', 'serhiy.storchaka', 'p-ganssle', 'shihai1991']
    pr_nums = ['20472']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue40799'
    versions = ['Python 3.10']

    @vstinner
    Copy link
    Member Author

    vstinner commented May 28, 2020

    Currently, "import datetime" starts by importing time, math, sys and operator modules, and then execute 2500 lines of Python code, define 7 classes, etc. For what? Just to remove all classes, functions, etc. to replace them with symbols from _decimal module:

    try:
        from _datetime import *
    except ImportError:
        pass
    else:
        # Clean up unused names
        del (_DAYNAMES, _DAYS_BEFORE_MONTH, _DAYS_IN_MONTH, _DI100Y, _DI400Y,
             _DI4Y, _EPOCH, _MAXORDINAL, _MONTHNAMES, _build_struct_time,
             _check_date_fields, _check_time_fields,
             _check_tzinfo_arg, _check_tzname, _check_utc_offset, _cmp, _cmperror,
             _date_class, _days_before_month, _days_before_year, _days_in_month,
             _format_time, _format_offset, _index, _is_leap, _isoweek1monday, _math,
             _ord2ymd, _time, _time_class, _tzinfo_class, _wrap_strftime, _ymd2ord,
             _divide_and_round, _parse_isoformat_date, _parse_isoformat_time,
             _parse_hh_mm_ss_ff, _IsoCalendarDate)
        # XXX Since import * above excludes names that start with _,
        # docstring does not get overwritten. In the future, it may be
        # appropriate to maintain a single module level docstring and
        # remove the following line.
        from _datetime import __doc__
    

    I would prefer to use the same approach than the decimal module which also has large C and Python implementation. Lib/decimal.py is just:

    try:
        from _decimal import *
        from _decimal import __doc__
        from _decimal import __version__
        from _decimal import __libmpdec_version__
    except ImportError:
        from _pydecimal import *
        from _pydecimal import __doc__
        from _pydecimal import __version__
        from _pydecimal import __libmpdec_version__
    

    Advantages:

    • Faster import time
    • Avoid importing indirectly time, math, sys and operator modules, whereas they are not used

    IMO it also better separate the C and the Python implementations.

    Attached PR implements this idea.

    Currently, "import datetime" imports 4 modules:

      ['_operator', 'encodings.ascii', 'math', 'operator']
    

    With the PR, "import datetime" imports only 1 module:

      ['encodings.ascii']
    

    Import performance:

    • [ref] 814 us +- 32 us -> [change] 189 us +- 4 us: 4.31x faster (-77%)

    Measured by:

    env/bin/python -m pyperf timeit -s 'import sys' 'import datetime; del sys.modules["datetime"]; del sys.modules["_datetime"]; del datetime'
    

    Note: I noticed that "import datetime" imports the math module while working on minimizing "import test.support" imports, bpo-40275.

    @vstinner vstinner added 3.10 only security fixes stdlib Python modules in the Lib dir labels May 28, 2020
    @serhiy-storchaka
    Copy link
    Member

    What do decimals have to datetime?

    @vstinner vstinner changed the title Create Lib/_pydecimal.py file to optimize "import datetime" when _decimal is available Create Lib/_pydatetime.py file to optimize "import datetime" when _datetime is available May 28, 2020
    @vstinner vstinner changed the title Create Lib/_pydecimal.py file to optimize "import datetime" when _decimal is available Create Lib/_pydatetime.py file to optimize "import datetime" when _datetime is available May 28, 2020
    @vstinner
    Copy link
    Member Author

    What do decimals have to datetime?

    Oops. Sorry, I was confused between "datetime" and "decimal" when I created this issue. I fixed the issue title.

    My idea is to mimick Lib/decimal.py design for Lib/datetime.py.

    @pganssle
    Copy link
    Member

    I basically agree with this — this is one of the reasons I structured the zoneinfo module the way I did rather than mimicking the pattern in datetime.

    I believe that there are other modules that have similar situations like heapq, but datetime is probably the worst offender.

    I am inclined to say that we should restructure datetime into a folder, containing __init__.py, _datetime.py and possibly _strptime.py (which I think is also only used in datetime), but I think that sort of restructuring is way more sensitive to weird import bugs than this one.

    As it is now, I would be shocked if this didn't break *someone*, because people are always relying on weird implementation details (knowingly or unknowingly), but I think it's worth doing; it's good to tackle it this early in the cycle.

    @vstinner What do you think about restructuring into a folder-based submodule rather than _pydatetime.py? It's way more likely to break someone, but I think it might be the better way to organize the code, and I don't want to have to go through *two* refactors of this sort.

    @vstinner
    Copy link
    Member Author

    I believe that there are other modules that have similar situations like heapq, but datetime is probably the worst offender.

    heapq seems to be a little bit different. _heapq is not a drop-in replacement of heapq.py. For example, nlargest() seems to only be implemented in pure Python.

    I am inclined to say that we should restructure datetime into a folder, containing __init__.py, _datetime.py and possibly _strptime.py (which I think is also only used in datetime), but I think that sort of restructuring is way more sensitive to weird import bugs than this one.

    I have no idea what are the side effects of converting datetime.py file into a package.

    A single file _pydatetime.py seems more convenient to me. I'm aware of _strptime.py but I don't see it as a datetime submodule and I don't see the value of moving it as a datetime submodule.

    I'm fine with _datetime accessing _strptime module. It sounds more complex to me if _datetime would be imported by datetime which contains datetime._strptime. I see a higher risk of subtle import issues, since datetime has two implementations (C and Python). But it may be wrong :-)

    Also, all other stdlib modules which have a C implementation are designed with files, not folders: io.py (_io and _pyio) and decimal.py (_decimal and _pydecimal) are good examples.

    I mostly case about reducing the number of indirect imports and import performance. I don't have a strong opinion about file vs folder.

    As it is now, I would be shocked if this didn't break *someone*, because people are always relying on weird implementation details (knowingly or unknowingly), but I think it's worth doing; it's good to tackle it this early in the cycle.

    I'm fine with breaking applications relying on implementation details. Also, we can adjust the code to fix such corner cases later if it's needed, possible and justified :-)

    @vstinner
    Copy link
    Member Author

    vstinner commented Jun 2, 2020

    About _strptime, I see that the time.strptime() imports internally the _strptime module. If we move _strptime inside datetime: does it mean that calling time.strptime() would have to import the datetime module? It doesn't sound right to me. I see the time as the low-level interface: it should not rely on the high-level interface. I prefer to separate _strptime module from the datetime module.

    @pganssle
    Copy link
    Member

    bout _strptime, I see that the time.strptime() imports internally the _strptime module.

    Ah, sorry, my remark about including _strptime was off the cuff — I thought it was only used in datetime, which is why I said "possibly _strptime". If it's used for time as well, we should leave it where it is.

    @pganssle
    Copy link
    Member

    As for deciding between moving to datetime/ and moving to _pydatetime, I think we should send an e-mail to Python-Dev about it to get a wider perspective, because the import machinery is a lot of black magic, and I know that there are large proprietary code bases out there that pile weird stuff on top of it. I'm not sure I can fully appreciate the trade-offs.

    The biggest advantage I see to moving datetime into its own folder is that it gives us a lot more freedom to expand into smaller sub-modules in the future. For example, in zoneinfo, we have zoneinfo/_common.py (https://github.com/python/cpython/blob/2e0a920e9eb540654c0bb2298143b00637dc5961/Lib/zoneinfo/_common.py), which is some logic shared between the C and Python implementations; _zoneinfo.c is able to rely directly on _common.py without importing zoneinfo/_zoneinfo.py (which saves us a bunch of other module imports as well).

    Right now the C implementation of datetime only directly imports time and _strptime, but I could imagine future enhancements that would be stupidly inconvenient to implement in C, but where we wouldn't want to implement all of _pydatetime just to get a pure-Python implementation. Having a namespace available for such packages would be useful.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 3, 2022

    I don't have the bandwidth to work on this issue, so I just close it.

    @pganssle
    Copy link
    Member

    pganssle commented Nov 4, 2022

    I don't think we need to close this just because you won't be fixing it yourself. It is still a valid issue

    @pganssle pganssle reopened this Nov 4, 2022
    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2022

    It is still a valid issue

    Well, I made a first attempt in 2020: PR #20472 which it went nowhere.

    Let me re-create the change: PR #99090. It's basically the same one, but I made tiny doc changes (ex: use GitHub issue number, not bpo number) and datetimetester.py tolerates that _pydatetime.py doesn't exist (use datetime in this case).

    I ran two benchmarks.

    Startup time:

    env/bin/python -m pyperf command -v -o pydatetime.json -- ./python -IsS -c 'import datetime'
    

    Import time:

    env/bin/python -m pyperf timeit -v -o import_change.json -s 'import sys' 'import datetime; del sys.modules["datetime"]; del sys.modules["_datetime"]; del datetime'
    

    Results:

    • Startup time: Mean +- std dev: [startup_ref] 6.93 ms +- 0.26 ms -> [startup_change] 6.12 ms +- 0.47 ms: 1.13x faster
    • Import time: Mean +- std dev: [import_ref] 688 us +- 17 us -> [import_change] 133 us +- 12 us: 5.18x faster

    IMO the speedup is quite interesting.

    @serhiy-storchaka
    Copy link
    Member

    You need to add del sys.modules["_pydatetime"] and the same for other indirectly imported modules (math, operator, _operator).

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2022

    The biggest advantage I see to moving datetime into its own folder is that it gives us a lot more freedom to expand into smaller sub-modules in the future.

    I don't think that this constraint (convert datetime into a package) should only this optimization. For me, it can be done later. Honestly, I'm not convinced that converting datetime into a module right now brings any benefits. I prefer a simple .py file (a module, not a package). I don't expect major datetime redesign soon.

    If tomorrow, there is strong pressure for large datetime feature which requires creating a package, sure, we can consider that.

    Today, Python stdlib is mostly flat and IMO it's a good thing and also a deliberate choice.

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2022

    Ok, I wrote a more elaborated script to measure import datetime:

    import pyperf
    import sys
    
    
    def bench_import_datetime(modules):
        import datetime
    
        del modules["datetime"]
        del modules["_datetime"]
        modules.pop("time", None)
        modules.pop("math", None)
        modules.pop("operator", None)
        modules.pop("_operator", None)
        modules.pop("_pydatetime", None)
    
        del datetime
    
    runner = pyperf.Runner()
    del pyperf
    
    runner.bench_func('import', bench_import_datetime, sys.modules)

    Result (main => PR): Mean +- std dev: [bench_ref] 1.07 ms +- 0.02 ms -> [bench_change] 116 us +- 5 us: 9.20x faster.

    @pganssle
    Copy link
    Member

    pganssle commented Nov 4, 2022

    So I think on the mailing list we decided that switching to just _pydatetime was fine, but the blocker last time we tried to do this was now making sure that miss-islington would be able to do backports. I can't exactly remember where we left off on that — whether it's the case that the change wouldn't break miss-islington and if it would how big of a problem that is.

    I think I vaguely recall it being the case that datetime.py has had very few backports in the last few years — almost all of the code change in datetime is coming from new features that don't get backported rather than bugs in existing features, so if we're looking at 2-3 manual backports per year for the next few years it's probably worth doing this even if it breaks miss-islington, but obviously if there is a way to do it that does not break miss-islington, it's preferable to do it that way.

    I'm not sure how to test the miss-islington thing because I'm not sure how easy it is to point miss islington at a different repo and say, "Treat this like it's CPython main, and this other branch like it's 3.11" or whatever. We can try using the same git commands miss-islington does as a first pass check, but before merge I'd be way more comfortable if we could see an actual miss-islington instance creating a backport.

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2022

    By default, git log Lib/_pydatetime.py only shows my commit as if I created it from scratch. But git log --follow Lib/_pydatetime.py works as expected and shows 81 commits up to:

    commit cf86e368ebd17e10f68306ebad314eea31daaa1e
    Author: Alexander Belopolsky <[email protected]>
    Date:   Fri Jul 23 19:25:47 2010 +0000
    
        Issue #7989: Added pure python implementation of the datetime module.
    

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2022

    For miss-islington, I did a test locally.

    I create a local bugfix from my PR pydatetime branch:

    $ git switch -c pydatetime_bugfix
    $ vim
    $ git diff
    diff --git a/Lib/_pydatetime.py b/Lib/_pydatetime.py
    index fb1e139379..d1a735e485 100644
    --- a/Lib/_pydatetime.py
    +++ b/Lib/_pydatetime.py
    @@ -2373,7 +2373,7 @@ def tzname(self, dt):
             if isinstance(dt, datetime) or dt is None:
                 if self._name is None:
                     return self._name_from_offset(self._offset)
    -            return self._name
    +            return self._name + "bugfix"
             raise TypeError("tzname() argument must be a datetime instance"
                             " or None")
    
    $ git ci -a -m bugfix
    [pydatetime_bugfix 2a89c2221b] bugfix
     1 file changed, 1 insertion(+), 1 deletion(-)
    

    Backport the bugfix from main to the 3.11 branch:

    # move the the 3.11 branch
    $ cd ../3.11
    $ git switch -c pydatetime_bugfix311 
    $ git cherry-pick -x pydatetime_bugfix
    CONFLICT (modify/delete): Lib/_pydatetime.py deleted in HEAD and modified in 2a89c2221b (bugfix).  Version 2a89c2221b (bugfix) of Lib/_pydatetime.py left in tree.
    error: could not apply 2a89c2221b... bugfix
    hint: After resolving the conflicts, mark them with
    hint: "git add/rm <pathspec>", then run
    hint: "git cherry-pick --continue".
    hint: You can instead skip this commit with "git cherry-pick --skip".
    hint: To abort and get back to the state before "git cherry-pick",
    hint: run "git cherry-pick --abort".
    
    $ git status
    On branch pydatetime_bugfix311
    You are currently cherry-picking commit 2a89c2221b.
      (fix conflicts and run "git cherry-pick --continue")
      (use "git cherry-pick --skip" to skip this patch)
      (use "git cherry-pick --abort" to cancel the cherry-pick operation)
    
    Unmerged paths:
      (use "git add/rm <file>..." as appropriate to mark resolution)
    	deleted by us:   Lib/_pydatetime.py
    
    no changes added to commit (use "git add" and/or "git commit -a")
    

    Well. It doesn't work as expected. git cherry-pick -x is unable to understand that changes in _pydatetime.py should be backported to datetime.py.


    Maybe the problem is that in my PR, datetime.py is renamed to _pydatetime.py... but at the same time, datetime.py is recreated with a new content. Maybe at the end, Git doesn't understand that it's a rename.

    Example if I recreated my PR quickly (first steps) from my existing main (reference) and pydatetime (PR) branches:

    $ git switch -c clean_pydatetime  main
    $ git mv Lib/datetime.py Lib/_pydatetime.py 
    $ git status
    On branch clean_pydatetime
    Changes to be committed:
      (use "git restore --staged <file>..." to unstage)
    	renamed:    Lib/datetime.py -> Lib/_pydatetime.py
    

    Good, at this stage, Git understand that the file was renamed.

    Create a new datetime.py file:

    $ git show pydatetime:Lib/datetime.py > Lib/datetime.py
    $ git add Lib/datetime.py
    $ git status
    On branch clean_pydatetime
    Changes to be committed:
      (use "git restore --staged <file>..." to unstage)
    	new file:   Lib/_pydatetime.py
    	modified:   Lib/datetime.py
    

    Not good: _pydatetime.py is now seen as a "new file" :-(

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2022

    Another test where the rename is a dedicated commit. Spoiler: not good, I get the same conflict on backport.

    First commit, only to rename:

    $ git switch -c clean_pydatetime  main
    $ git mv Lib/datetime.py Lib/_pydatetime.py 
    $ git ci -a -m "Rename Lib/datetime.py to Lib/_pydatetime.py"
    

    Second commit, to recreate datetime.py:

    $ git show pydatetime:Lib/datetime.py > Lib/datetime.py
    $ git add Lib/datetime.py
    $ git ci -a -m "Recreate Lib/datetime.py"
    

    Now I created a bugfix (in the "main" branch);

    $ git switch -c clean_pydatetime_bugfix
    $ vim Lib/_pydatetime.py 
    $ git ci -a -m "bugfix"
    $ git show
     git show
    commit f5e444261a1321b79bf27a4e0a05a422d6069c30 (HEAD -> clean_pydatetime_bugfix)
    Author: Victor Stinner <[email protected]>
    Date:   Fri Nov 4 20:19:33 2022 +0100
    
        bugfix
    
    diff --git a/Lib/_pydatetime.py b/Lib/_pydatetime.py
    index 01742680a9..c22c6ccfb6 100644
    --- a/Lib/_pydatetime.py
    +++ b/Lib/_pydatetime.py
    @@ -2371,7 +2371,7 @@ def tzname(self, dt):
             if isinstance(dt, datetime) or dt is None:
                 if self._name is None:
                     return self._name_from_offset(self._offset)
    -            return self._name
    +            return self._name + "bugfix"
             raise TypeError("tzname() argument must be a datetime instance"
                             " or None")
    

    What if I backport it to the 3.11 branch?

    $ cd ../3.11
    $ git switch -c clean_pydatetime_bugfix311
    $ git cherry-pick -x clean_pydatetime_bugfix
    CONFLICT (modify/delete): Lib/_pydatetime.py deleted in HEAD and modified in f5e444261a (bugfix).  Version f5e444261a (bugfix) of Lib/_pydatetime.py left in tree.
    error: could not apply f5e444261a... bugfix
    hint: After resolving the conflicts, mark them with
    hint: "git add/rm <pathspec>", then run
    hint: "git cherry-pick --continue".
    hint: You can instead skip this commit with "git cherry-pick --skip".
    hint: To abort and get back to the state before "git cherry-pick",
    hint: run "git cherry-pick --abort".
    
    $ git status
    On branch clean_pydatetime_bugfix311
    You are currently cherry-picking commit f5e444261a.
      (fix conflicts and run "git cherry-pick --continue")
      (use "git cherry-pick --skip" to skip this patch)
      (use "git cherry-pick --abort" to cancel the cherry-pick operation)
    
    Unmerged paths:
      (use "git add/rm <file>..." as appropriate to mark resolution)
    	deleted by us:   Lib/_pydatetime.py
    
    no changes added to commit (use "git add" and/or "git commit -a")
    

    Well, I get the exact same conflict. git cherry-pick -x doesn't get that changes made in _pydatetime.py must be backported to datetime.py.

    Note: git cherry-pick -x clean_pydatetime_bugfix is the same than git cherry-pick -x f5e444261a1321b79bf27a4e0a05a422d6069c30 and both commands only backport a single commit: the one changing a single line:

    -            return self._name
    +            return self._name + "bugfix"
    

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2022

    For me, the quick conclusion is that we have 2 choices:

    • No change: keep 2642 lines of Python code in datetime.py
    • Create Lib/_pydatetime.py and backport manually changes from 3.12 to 3.11, and then rely on miss-ilington to automate the backport from 3.11 to 3.10

    About manual backports, in 2 years (since 2021-01-01), Lib/datetime.py got 7 commits in the main branch. For comparison, it only got 2 commits in the 3.10 branch since 2021-01-01.

    $ git log --after=2021-01-01 main Lib/datetime.py|grep ^commit|wc -l
    7
    $ git log --after=2021-01-01 3.10 Lib/datetime.py|grep ^commit|wc -l
    2
    

    IMO the low number of changes make it manageable to be done manually. What do you think?

    Maybe a Git guru can help us here?

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 9, 2022

    Create Lib/_pydatetime.py and backport manually changes from 3.12 to 3.11, and then rely on miss-ilington to automate the backport from 3.11 to 3.10

    In clear, I'm in favor of this option.

    @pganssle: Are you ok with that?

    @pganssle
    Copy link
    Member

    pganssle commented Nov 10, 2022

    I think we should explore the option of a merge strategy that doesn't require manual backports, but if we can't find one, or if someone with git expertise can explain why there's no way to accomplish this without breaking backports, I'm fine with breaking backports.

    I wonder if we could even special-case datetime in miss islington to preserve automated backports even without clean cherry picks.

    @vstinner
    Copy link
    Member Author

    I think we should explore the option of a merge strategy that doesn't require manual backports,

    This issue is open for 2 years. I did many tests with Git.

    So far, I didn't find anything obvious. Git doesn't give much freedom for that. Git doesn't store renames: it "recomputes" them in all operations, check if two files look similar or not.

    @ezio-melotti
    Copy link
    Member

    I think we should explore the option of a merge strategy that doesn't require manual backports,

    FWIW I did some additional tests, without finding an actual solution.

    In the first test I tried to rename datetime.py to _pydatetime.py in a separate branch (let's call it B), then edit datetime.py in the previous branch (where it still existed -- let's call it A), and then merge B into A. When I did this, I found the edits I did in A to datetime.py ended up in _pydatetime.py.

    Then I tried another strategy. I started from A and created a B branch where I renamed datetime.py to _pydatetime.py (just like above). Then I created a C branch from A where I edited datetime.py. By doing this git supposedly "knew" that in B, the content of _pydatetime.py came from datetime.py, and that in C we still had the same (now edited) datetime.py file (instead of having a new one (re)created from scratch). However when I tried to octopus merge B and C into A (with git merge B C while in A) I got some conflict. Swapping C and B also resulted in a conflict.

    In the third and final attempt I was able to duplicate the file while preserving the history of both copies, by doing:

    $ git switch -c B
    Switched to a new branch 'B'
    $ git mv Lib/datetime.py Lib/_pydatetime.py 
    $ git commit -m 'Rename datetime.py to _pydatetime.py.'
    [B a932519123] Rename datetime.py to _pydatetime.py.
     1 file changed, 0 insertions(+), 0 deletions(-)
     rename Lib/{datetime.py => _pydatetime.py} (100%)
    $ git show A:Lib/datetime.py > Lib/datetime.py
    $ git add Lib/datetime.py
    $ git commit -m 'Add back datetime.py.'
    [B e1e1c92a5b] Add back datetime.py.
     1 file changed, 2642 insertions(+)
     create mode 100644 Lib/datetime.py
    $ git switch A 
    Switched to branch 'A'
    $ git merge --no-ff B  # apparently the --no-ff is important here
    Merge made by the 'ort' strategy.
     Lib/_pydatetime.py | 2642 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
     1 file changed, 2642 insertions(+)
     create mode 100644 Lib/_pydatetime.py
    $ vim Lib/datetime.py  # edit/remove stuff
    $ git add Lib/datetime.py 
    $ git commit -m 'Update datetime.py'
    [A 1d46a8cd30] Update datetime.py
     1 file changed, 21 insertions(+), 2642 deletions(-)
     rewrite Lib/datetime.py (99%)

    Now _pydatetime.py contains the content of the original datetime.py (with all the history) and datetime.py contains the updated content (also with the history of whatever is left in there) and git should "know" that both files come from the original datetime.py. However I'm not sure if git is able to figure out how to backport changes correctly and I ran out of time before being able to test backports.

    @vstinner
    Copy link
    Member Author

    I think we should explore the option of a merge strategy that doesn't require manual backports, but if we can't find one, or if someone with git expertise can explain why there's no way to accomplish this without breaking backports, I'm fine with breaking backports.

    This issue is open for 2 years. @ezio-melotti and me investigated options for backports and no solution was found.

    @pganssle: So are you ok to create Lib/_pydateime.py to make "import datetime" faster?

    I have no idea of miss-islington. The bot just runs git cherry-pick -x. I don't know how to customize this command.

    @pganssle
    Copy link
    Member

    This issue is open for 2 years.

    Sounds like it's not very urgent, then.

    Seriously, Victor, I appreciate the work you've put in on this, but I don't appreciate the rush you are imparting here. I didn't ask you to revive your PR when you closed the issue, I just said that it's still an open issue. I think it is worth taking a little bit of time investigating, but no one has provided a compelling explanation as to why this cannot work, nor have I had time to investigate this sufficiently to rule out a solution that doesn't require several years of manual backports. The feature freeze isn't for ~half a year.

    @vstinner
    Copy link
    Member Author

    What I mean is that people had 2 years to look into the Git issue and so far nobody was able to provide a solution to the Git issue. So for me, there is no solution, and there are only 2 choices: do nothing (keep datetime.py as it is), or create Lib/_pydatetime.py and handle backports manually.

    You need more time, ok. In this case, I abandon my PR and I leave the issue for someone else. I consider that @pganssle is the datetime maintainer, and I prefer to have his bless for this change. But I don't have the bandwidth to spend more time on this topic (investigate the Git question). I care about "import datetime" performance, but not enough to spend more time. For me, I don't think that it's worth it.

    Anyone is free to clone my PR #99090 and restart from that. By the way, there is now a conflict, I guess that datetime.py was modified since I created my PR. That's one of the reason why I don't like to keep a PR open for too long. The other reason is that PR that I created are my TODO list and I want to keep this list as short as possible. Otherwise, I lost track.

    I closed this issue 1 month ago because the issues that I created are my second "TODO list" (and I want to keep it short). Here it seems like the consensus is towards "do nothing". But well, I will not close again the issue. I will just learn to skip it from my "TODO list" :-) Or maybe there is a way in GitHub to transfer the "ownership" of an issue to someone else. I don't know. I don't care much in fact :-)

    pganssle added a commit to pganssle/cpython that referenced this issue Apr 27, 2023
    This breaks the tests, but we are keeping it as a separate commit so
    that the move operation and editing of the moved files are separate, for
    a cleaner history.
    pganssle added a commit to pganssle/cpython that referenced this issue Apr 27, 2023
    Without the change to the reprs, pure-python classes would have a repr
    of `datetime._pydatetime.time`, etc.
    pganssle added a commit to pganssle/cpython that referenced this issue May 2, 2023
    This breaks the tests, but we are keeping it as a separate commit so
    that the move operation and editing of the moved files are separate, for
    a cleaner history.
    pganssle added a commit to pganssle/cpython that referenced this issue May 2, 2023
    Without the change to the reprs, pure-python classes would have a repr
    of `datetime._pydatetime.time`, etc.
    May 3, 2023
    This breaks the tests, but we are keeping it as a separate commit so
    that the move operation and editing of the moved files are separate, for
    a cleaner history.
    May 3, 2023
    Without the change to the reprs, pure-python classes would have a repr
    of `datetime._pydatetime.time`, etc.
    pganssle pushed a commit that referenced this issue May 25, 2023
    gh-84796: Add back UTC to datetime.__all__
    
    This was mistakenly dropped in #103637
    
    Noticed when updating typeshed for Python 3.12
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 23, 2023
    pythongh-84796: Add back UTC to datetime.__all__
    
    This was mistakenly dropped in pythonGH-103637
    
    Noticed when updating typeshed for Python 3.12
    (cherry picked from commit 076b620)
    
    Co-authored-by: Shantanu <[email protected]>
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir
    Projects
    Archived in project
    Development

    No branches or pull requests

    5 participants