Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve name suggestions for NameError/AttributeError by respecting underscore conventions #116871

Closed
zahlman opened this issue Mar 15, 2024 · 9 comments
Labels
type-feature A feature request or enhancement

Comments

@zahlman
Copy link

zahlman commented Mar 15, 2024

Feature or enhancement

Proposal:

Problem description

In more recent versions of Python, for uncaught NameErrors and AttributeErrors, the system tries to suggest names that might have been typoed:

>>> intt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'intt' is not defined. Did you mean: 'int'?

However, the suggestion logic apparently does not take underscore conventions into account:

>>> _foo = 1
>>> foo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'foo' is not defined. Did you mean: '_foo'?

Commonly, leading underscores on names are used by convention to mark names that should not be referenced directly in client code. (Of course, it's occasionally necessary or desirable to call dunder methods directly, particularly when inheritance is involved, but generally not outside of classes.)

This has the negative effect, that Python itself could recommend that users invoke undocumented, unstable or unintentional APIs without good reason. One current real-world example of this occurs with Pandas, where version 2.0 removed the append method from DataFrames, but there happens to be an _append method left behind as an implementation detail:

>>> import pandas as pd
>>> pd.DataFrame().append
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/to/lib/python3.11/site-packages/pandas/core/generic.py", line 6296, in __getattr__
    return object.__getattribute__(self, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?

Proposal

I propose that: when the invalid identifier or attribute name does not already start with an underscore, valid names that do start with an underscore should be excluded from suggestions. As mentioned in the linked discussion, there is already precedent for this -

  1. As implemented by help:
>>> import pydoc
>>> pydoc.pager = print # to simplify behaviour for the example
>>> class Example:
...     __slots__ = [] # to simplify behaviour for the example
...     def method(self): pass
...     def _method(self): pass
... 
>>> help(Example)
Help on class Example in module __main__:

class Example(builtins.object)
 |  Methods defined here:
 |  
 |  method(self)


>>>
  1. When using star-imports:
>>> _, name, result = 1, 2, {}
>>> exec('from __main__ import *', result)
>>> '_' in result
False
>>> 'name' in result
True

It still makes sense IMO to suggest underscore names when the invalid name already starts with an underscore - even mixing and matching single- and double-underscore cases. For example, _init_ is a very plausible typo for __init__, especially for beginners who are learning from a book or an old PDF.

Has this already been discussed elsewhere?

I have already discussed this feature proposal on Discourse

Links to previous discussion of this feature:

https://discuss.python.org/t/name-suggestions-for-attributeerrors-and-possibly-nameerrors-should-not-include-names-with-single-leading-underscores/48588

Linked PRs

@zahlman zahlman added the type-feature A feature request or enhancement label Mar 15, 2024
@gaogaotiantian
Copy link
Member

I agree with the pandas example, but I disagree with the _foo one. Variables starting with an underscore is semantically suggesting that they should not be used with external code. It's perfectly valid to use it internally.

_foo = 3
print(foo)

For the code above, suggesting using _foo is a correct response. However, the following is not ideal:

import bar
bar.foo  # there's a bar._foo

For the same reason, self.foo should get a suggestion if there's only a self._foo, but obj.foo should not. I don't believe simply removing all the names starting with an underscore is the correct solution. It's debatable whether it's a better solution than the current one - it might be for some (a lot of?) people.

Both existing behaviors that exclude underscores listed are a very clear usage from external. They are valid, but that does not justify we remove underscored variables from NameError suggestions.

@sobolevn
Copy link
Member

I would also note that help() and suggestions are not the same thing.
help() is a documentation tool, which documents public entities.
SUggestions are more about raw names, without any context.

I think that deciding "can we use _-names here?" is way too hard and will have inconsistent results in the end.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Mar 17, 2024
Only include underscored names in name suggestions for AttributeError and
ImportError if the original name was underscored.
@serhiy-storchaka
Copy link
Member

It is difficult to distinguish self.foo from obj.foo. It is impossible in most cases, and unreliable in the rest.

@gaogaotiantian
Copy link
Member

It is difficult to distinguish self.foo from obj.foo. It is impossible in most cases, and unreliable in the rest.

_compute_suggestion_error already checks self in frame.f_locals, which would generate both false negative and positive output in extreme cases (where the first argument of a method is not named self and there's a random self local variable). I'm not saying it would be 100% accurate, but making it ignore _ is not 100% accurate either. One could argue that false negative is always better than false positive, but checking for exc_value.obj is frame.f_locals[self] will cover 99% method cases.

@zahlman
Copy link
Author

zahlman commented Mar 20, 2024

I agree generally that it would be difficult to distinguish self.foo from obj.foo cases, although a simple heuristic might be good enough. But honestly I feel like the suggestion should still be suppressed - unless, as I described, the incorrect, attempted name starts with an underscore already.

If some obj has a _foo attribute and I type obj._fo, it seems likely that I know what I'm doing and simply had a typo. But if I type obj.foo, it seems much more likely (to me, at least) that I made a false assumption about how obj works, or ran into a deprecated API, and that I shouldn't actually use obj._foo for what I'm doing. And, honestly, I think the same is true in the self cases. If I don't remember that I marked something (by convention) "for internal use" in my own class implementation, that's different from simply typing too quickly and/or not looking at it (and also a bit worrying). It's pretty hard to "accidentally" not type an underscore, because (at least for standard US keyboards) it's way off in the corner of the keyboard and requires the shift key.

Of course, IDEs will implement their own suggestions (and auto-completions) anyway....

@gaogaotiantian
Copy link
Member

It's very common that someone forgets a method/attribute is internal only. I don't think the suggestion is only for typos. It's helpful to remind the developers that they might've forgotten the underscore.

If someone types self.foo, and there's no such attribute, what would be a better user experience? Remind them that there's a self._foo, or tell them nothing / suggest a completely irrelavent attribute? I don't believe when the developers implement their classes and forget something is internal, they would want the interpreter to shield the "private attributes" - by definition those can be used internally.

@wimglenn
Copy link
Contributor

wimglenn commented Mar 25, 2024

They may not be considered important enough to add special-case handling for in suggestions, but in some instances there are underscore names which are public/documented, e.g. namedtuple types have _make, _asdict, _replace, _fields, _fields_defaults etc

@serhiy-storchaka
Copy link
Member

I implemented @gaogaotiantian's suggestions. Is it all?

serhiy-storchaka added a commit that referenced this issue May 6, 2024
Only include underscored names in name suggestions for AttributeError and
ImportError if the original name was underscored.
@serhiy-storchaka
Copy link
Member

I have merged the PR with @gaogaotiantian's suggestions. If there are other suggestions, they can be implemented in a separate PR.

SonicField pushed a commit to SonicField/cpython that referenced this issue May 8, 2024
)

Only include underscored names in name suggestions for AttributeError and
ImportError if the original name was underscored.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

5 participants