According to the Classes section of the Python docs:
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
Since there is a valid use-case for class-private members (namely to avoid name clashes of names with names defined by subclasses), there is limited support for such a mechanism, called name mangling. Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, as long as it occurs within the definition of a class.
_something indicates to others that something isn't part of the API and can/will be changed without notice, i.e. it should be treated as internal/private.
If you are using inheritance, then __something is a better choice as it both indicates it's an implementation detail and avoids name conflicts with subclasses.
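A minimal sketch of the subclass name clash described above. The class and attribute names (Base, Child, _token, __token) are hypothetical and chosen only for illustration:

```python
class Base:
    def __init__(self):
        self._token = "base"     # single underscore: convention only
        self.__token = "base"    # stored on the instance as _Base__token

    def base_tokens(self):
        # Inside Base, self.__token always means self._Base__token
        return self._token, self.__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self._token = "child"    # silently overwrites Base's _token
        self.__token = "child"   # stored as _Child__token; Base's copy is untouched


child = Child()
print(child.base_tokens())   # ('child', 'base')
print(sorted(vars(child)))   # ['_Base__token', '_Child__token', '_token']
```

The single-underscore attribute is shared between the two classes, so the subclass clobbers it; the double-underscore attribute gets one slot per class.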
I’m taking a class in Python that teaches me to use a single underscore (_classname), but I’ve noticed that at work my colleagues often use a double underscore in some code (__classname). What’s the difference between using a single and a double underscore? Both seem to make things private.
Single Underscore
In a class, names with a leading underscore indicate to other programmers that the attribute or method is intended to be used only inside that class. However, privacy is not enforced in any way. Using a leading underscore for a function in a module indicates that it should not be imported from elsewhere.
From the PEP-8 style guide:
_single_leading_underscore: weak "internal use" indicator. E.g. from M import * does not import objects whose name starts with an underscore.
Double Underscore (Name Mangling)
From the Python docs:
Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, so it can be used to define class-private instance and class variables, methods, variables stored in globals, and even variables stored in instances. private to this class on instances of other classes.
And a warning from the same page:
Name mangling is intended to give classes an easy way to define “private” instance variables and methods, without having to worry about instance variables defined by derived classes, or mucking with instance variables by code outside the class. Note that the mangling rules are designed mostly to avoid accidents; it still is possible for a determined soul to access or modify a variable that is considered private.
Example
>>> class MyClass():
...     def __init__(self):
...         self.__superprivate = "Hello"
...         self._semiprivate = ", world!"
...
>>> mc = MyClass()
>>> print mc.__superprivate
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: MyClass instance has no attribute '__superprivate'
>>> print mc._semiprivate
, world!
>>> print mc.__dict__
{'_MyClass__superprivate': 'Hello', '_semiprivate': ', world!'}
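Two details from the docs quote are easy to miss: leading underscores are stripped from the class name, and mangling applies wherever the identifier appears inside the class body, not only after self. A small sketch, with _Vault and its members as hypothetical names:

```python
_Vault__default = "module-level value"   # a global whose name matches the mangled form


class _Vault:                        # the class name itself starts with an underscore
    def __init__(self):
        self.__code = 1234           # stored as _Vault__code, not __Vault__code

    def same_code(self, other):
        # other.__code is mangled too, even though "other" is not self
        return self.__code == other.__code

    def default(self):
        # A bare name is mangled as well, so this reads the global _Vault__default
        return __default


a, b = _Vault(), _Vault()
print(vars(a))          # {'_Vault__code': 1234}
print(a.same_code(b))   # True
print(a.default())      # module-level value
```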
_foo: Only a convention. A way for the programmer to indicate that the variable is private (whatever that means in Python).
__foo: This has real meaning. The interpreter replaces this name with _classname__foo as a way to ensure that the name will not overlap with a similar name in another class.
__foo__: Only a convention. A way for the Python system to use names that won't conflict with user names.
No other form of underscores has meaning in the Python world. Also, there's no difference between class, variable, global, etc. in these conventions.
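The forms can be compared side by side by looking at what actually ends up in the class namespace. A minimal sketch; Demo and its attributes are hypothetical, and __foo_ is included only to illustrate the "at most one trailing underscore" rule from the docs:

```python
class Demo:
    _foo = "single"          # convention only, name unchanged
    __foo = "double"         # mangled to _Demo__foo
    __foo_ = "one trailing"  # still mangled: _Demo__foo_
    __foo__ = "dunder"       # two trailing underscores: not mangled


names = sorted(name for name in vars(Demo) if "foo" in name)
print(names)   # ['_Demo__foo', '_Demo__foo_', '__foo__', '_foo']
```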
In many places, I've just seen a single underscore meant mostly as a hint to the developer.
What's the best practice here?
Note: I'm aware that Python isn't truly private. This question is focusing more on the principles and rationale driving code-level design choices.
Private attributes and methods are supposed to be accessible by only the class and its instantiated objects, not external parties.
In Python, prefixing class attributes/methods with one underscore is, from what I've read, just a symbolic indication to readers of the code that this is intended to be private, but it's not enforced. Prefixing with two underscores implements name-mangling, which is seen as a stricter enforcement of privacy (within Python's language constraints). For example:
class Test:
    abc = 1
    _abc = 2
    __abc = 3

print(Test.abc)  # This works
print(Test._abc)  # This also works
print(Test.__abc)  # This throws an error
print(Test._Test__abc)  # This works

So, this is how I'm currently interpreting these two prefixes: Single underscore has no practical benefit. If someone intentionally wishes to access names with either prefix, they're able to do it regardless. What double-underscore-prefixes can do, which single underscore can't, is protect against accidental access: the IDEs I've worked with to date don't auto-suggest mangled names, so it takes conscious effort to bypass double-underscore-prefixes.
tldr: names with a single underscore prefix are functionally the same as not having any at all, while double underscore prefixes at least protect against unintended direct access. If my intent is to make something private, why would I use single underscore at all? So, under what circumstances should I use single underscore?