For classes in general, you can access the __annotations__:
>>> class Foo:
...     bar: int
...     baz: str
...
>>> Foo.__annotations__
{'bar': <class 'int'>, 'baz': <class 'str'>}
This returns a dict mapping attribute name to annotation.
However, dataclasses use dataclasses.Field objects to encapsulate a lot of this information. You can use dataclasses.fields on an instance or on the class:
>>> import dataclasses
>>> @dataclasses.dataclass
... class Foo:
...     bar: int
...     baz: str
...
>>> dataclasses.fields(Foo)
(Field(name='bar',type=<class 'int'>,default=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD), Field(name='baz',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD))
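If all you need is the names and types, the Field objects can be reduced to a plain dict. A minimal sketch, reusing the Foo dataclass from above (the field_types and instance_types names are only illustrative):

```python
import dataclasses


@dataclasses.dataclass
class Foo:
    bar: int
    baz: str


# dataclasses.fields accepts the class itself...
field_types = {f.name: f.type for f in dataclasses.fields(Foo)}

# ...or an instance of it; both give the same Field objects.
instance_types = {f.name: f.type for f in dataclasses.fields(Foo(1, "a"))}

print(field_types)  # {'bar': <class 'int'>, 'baz': <class 'str'>}
```

Each Field also carries default, default_factory, init, repr, compare and metadata, so the same comprehension can pull out any of those instead of type.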
NOTE:
Starting in Python 3.7, the evaluation of annotations can be postponed:
>>> from __future__ import annotations
>>> class Foo:
...     bar: int
...     baz: str
...
>>> Foo.__annotations__
{'bar': 'int', 'baz': 'str'}
Note that the annotation is kept as a string. This affects dataclasses as well:
>>> @dataclasses.dataclass
... class Foo:
...     bar: int
...     baz: str
...
>>> dataclasses.fields(Foo)
(Field(name='bar',type='int',default=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD), Field(name='baz',type='str',default=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD))
So, just be aware: postponed evaluation (PEP 563) was originally planned to become the default in Python 3.10, but that change was deferred, so you cannot assume either behavior. Code that inspects annotations should work whether they are real objects or strings, for example by testing it with the __future__ import enabled.
The motivation behind this behavior is that the following currently raises an error:
>>> class Node:
...     def foo(self) -> Node:
...         return Node()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in Node
NameError: name 'Node' is not defined
But with the new behavior:
>>> from __future__ import annotations
>>> class Node:
...     def foo(self) -> Node:
...         return Node()
...
>>>
One way to handle this is to use typing.get_type_hints, which, as far as I can tell, essentially evaluates the string annotations for you:
>>> import typing
>>> typing.get_type_hints(Node.foo)
{'return': <class '__main__.Node'>}
>>> class Foo:
...     bar: int
...     baz: str
...
>>> Foo.__annotations__
{'bar': 'int', 'baz': 'str'}
>>> import typing
>>> typing.get_type_hints(Foo)
{'bar': <class 'int'>, 'baz': <class 'str'>}
I'm not sure how reliable this function is in every case, but basically it takes care of finding the appropriate globals and locals for where the class was defined. So, consider:
(py38) $ cat test.py
from __future__ import annotations
import typing
class Node:
    next: Node
(py38) $ python
Python 3.8.5 (default, Sep 4 2020, 02:22:02)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import test
>>> test.Node
<class 'test.Node'>
>>> import typing
>>> typing.get_type_hints(test.Node)
{'next': <class 'test.Node'>}
Naively, you might try something like:
>>> test.Node.__annotations__
{'next': 'Node'}
>>> eval(test.Node.__annotations__['next'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'Node' is not defined
You could hack together something like:
>>> eval(test.Node.__annotations__['next'], vars(test))
<class 'test.Node'>
But it can get tricky, for example when an annotation refers to a name that is not a module-level global.
Answer from juanpa.arrivillaga on Stack Overflow
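As an illustration of where resolving string annotations gets tricky (a sketch under stated assumptions, not part of the answer above): when a class is created inside a function, its own name is not a module-level global, so a string annotation that refers to it generally cannot be resolved unless you supply the name yourself through localns. The make_node_class and LocalNode names are hypothetical, and a plain string annotation stands in for the __future__ import so that the snippet is self-contained:

```python
import typing


def make_node_class():
    # The class name only exists inside this function's scope.
    class LocalNode:
        next: "LocalNode"  # string annotation, as under postponed evaluation

    return LocalNode


node_cls = make_node_class()

try:
    typing.get_type_hints(node_cls)
    resolved_unaided = True
except NameError:
    # 'LocalNode' is not a global of the defining module.
    resolved_unaided = False

# Supplying the missing name explicitly lets the lookup succeed.
hints = typing.get_type_hints(node_cls, localns={"LocalNode": node_cls})
print(resolved_unaided, hints["next"] is node_cls)
```

The same localns (or globalns) argument is the supported alternative to calling eval by hand with vars(module).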
Check this out:
from dataclasses import dataclass
@dataclass
class Point:
    x: int
    y: int
Point.__annotations__ returns {'x': <class 'int'>, 'y': <class 'int'>}.
Print all values within a class?
If I have the below code, how can I write a print statement to show the values of self.1, self.2, and self.3?
class Name:
    def __init__(self):
        self.1 = "a"
        self.2 = "b"
        self.3 = "c"
Inspecting __annotations__ gives you the raw annotations, but those don't necessarily correspond to a dataclass's field types. Things like ClassVar and InitVar show up in __annotations__, even though they're not fields, and inherited fields don't show up.
Instead, call dataclasses.fields on the dataclass, and inspect the field objects:
field_types = {field.name: field.type for field in fields(MyClass)}
Neither __annotations__ nor fields will resolve string annotations. If you want to resolve string annotations, the best way is probably typing.get_type_hints. get_type_hints will include ClassVars and InitVars, so we use fields to filter those out:
resolved_hints = typing.get_type_hints(MyClass)
field_names = [field.name for field in fields(MyClass)]
resolved_field_types = {name: resolved_hints[name] for name in field_names}
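The snippets above can be combined into one runnable sketch. The Base and MyClass definitions here are hypothetical, chosen only so that every case mentioned (an inherited field, a string annotation, a ClassVar and an InitVar) appears once:

```python
import typing
from dataclasses import InitVar, dataclass, fields
from typing import ClassVar


@dataclass
class Base:
    id: int


@dataclass
class MyClass(Base):
    name: "str"                   # string annotation on purpose
    instances: ClassVar[int] = 0  # class variable, not a field
    scale: InitVar[int] = 1       # init-only variable, not a field

    def __post_init__(self, scale):
        pass


# Raw annotations: only this class's own names, pseudo-fields included.
raw_names = list(MyClass.__annotations__)

# fields() sees the inherited field and skips ClassVar/InitVar,
# but leaves the string annotation unresolved.
field_types = {f.name: f.type for f in fields(MyClass)}

# get_type_hints resolves strings; fields() filters out the non-fields.
resolved_hints = typing.get_type_hints(MyClass)
resolved_field_types = {f.name: resolved_hints[f.name] for f in fields(MyClass)}

print(raw_names)
print(field_types)
print(resolved_field_types)
```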
from dataclasses import dataclass
@dataclass
class MyClass:
    id: int = 0
    name: str = ''
myclass = MyClass()
myclass.__annotations__
>> {'id': int, 'name': str}
myclass.__dataclass_fields__
>> {'id': Field(name='id',type=<class 'int'>,default=0,default_factory=<dataclasses._MISSING_TYPE object at 0x0000000004EED668>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD),
'name': Field(name='name',type=<class 'str'>,default='',default_factory=<dataclasses._MISSING_TYPE object at 0x0000000004EED668>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)}
On a side note, there is also:
myclass.__dataclass_params__
>> _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)
This example only has a name, type and default value for each field; however, __dataclass_fields__ is a dict of Field objects, each of which also carries information such as default_factory, init, repr, compare and metadata.
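To read that information programmatically, one option is to go through the public dataclasses.fields() function and compare defaults against the dataclasses.MISSING sentinel. A minimal sketch; the tags field and the summary dict are additions made only for illustration:

```python
import dataclasses


@dataclasses.dataclass
class MyClass:
    id: int = 0
    name: str = ''
    tags: list = dataclasses.field(default_factory=list)  # hypothetical extra field


summary = {}
for f in dataclasses.fields(MyClass):
    if f.default is not dataclasses.MISSING:
        default = f.default            # plain default value
    elif f.default_factory is not dataclasses.MISSING:
        default = f.default_factory()  # default built by a factory
    else:
        default = None                 # field has no default at all
    summary[f.name] = (f.type, default)

print(summary)
```

Unlike __dataclass_fields__, which also holds the ClassVar and InitVar pseudo-fields, fields() returns only the real fields.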
Using dataclasses.fields()
Using dataclasses.fields() you can access fields you defined in your dataclass.
fields = dataclasses.fields(dataclass_instance)
Using inspect.getmembers()
Using inspect.getmembers() you can access all fields in your dataclass.
members = inspect.getmembers(type(dataclass_instance))
fields = list(dict(members)['__dataclass_fields__'].values())
Complete code solution
import dataclasses
import inspect
@dataclasses.dataclass
class Test:
    a: str = "a value"
    b: str = "b value"
def print_data_class(dataclass_instance):
    # option 1: fields
    fields = dataclasses.fields(dataclass_instance)
    # option 2: inspect
    members = inspect.getmembers(type(dataclass_instance))
    fields = list(dict(members)['__dataclass_fields__'].values())
    for v in fields:
        print(f'{v.name}: ({v.type.__name__}) = {getattr(dataclass_instance, v.name)}')
print_data_class(Test())
# a: (str) = a value
# b: (str) = b value
print_data_class(Test(a="1", b="2"))
# a: (str) = 1
# b: (str) = 2
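One caveat with v.type.__name__: if an annotation is stored as a string (for example under from __future__ import annotations), v.type is a str and has no __name__. A hedged variant that resolves the types first; the describe function is a hypothetical helper, and it returns the lines instead of printing them so they are easy to check:

```python
import dataclasses
import typing


@dataclasses.dataclass
class Test:
    a: "str" = "a value"  # string annotation on purpose
    b: int = 0


def describe(instance):
    # Resolve string annotations to real types once, up front.
    hints = typing.get_type_hints(type(instance))
    lines = []
    for f in dataclasses.fields(instance):
        tp = hints[f.name]
        # Not every typing construct has __name__, so fall back to repr().
        type_name = getattr(tp, "__name__", repr(tp))
        lines.append(f"{f.name}: ({type_name}) = {getattr(instance, f.name)!r}")
    return lines


for line in describe(Test()):
    print(line)
# a: (str) = 'a value'
# b: (int) = 0
```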
Also, you can use __annotations__, because dataclass fields are always annotated. This is the essence of dataclass usage.
It works with classes:
fields = list(Test.__annotations__)
and with instances:
fields = list(test.__annotations__)
Note that this doesn't work with dataclass subclasses, because __annotations__ only contains the annotations declared on the class itself, not the inherited ones. However, its simplicity gives you the field names directly, with no extra code to extract them from Field objects.
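A short sketch of that subclass limitation, using hypothetical Parent and Child classes:

```python
import dataclasses


@dataclasses.dataclass
class Parent:
    a: str = "a value"


@dataclasses.dataclass
class Child(Parent):
    b: str = "b value"


# Only the annotations declared on Child itself.
own_names = list(Child.__annotations__)

# dataclasses.fields() also includes the inherited field.
all_names = [f.name for f in dataclasses.fields(Child)]

print(own_names)  # ['b']
print(all_names)  # ['a', 'b']
```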
The dataclasses module doesn't provide built-in support for this use case, i.e. loading YAML data to a nested class model.
In such a scenario, I would turn to a ser/de library such as dataclass-wizard, which provides OOTB support for (de)serializing YAML data, via the PyYAML library.
Disclaimer: I am the creator and maintainer of this library.
Step 1: Generate a Dataclass Model
Note: I will likely need to make this step easier for generating a dataclass model from YAML data; it is perhaps worth creating an issue to look into as time allows. Ideally this would be done from the CLI, but since we have YAML data it is tricky, because the utility tool expects JSON.
So for now it is easiest to do this in Python itself:
from json import dumps
# pip install PyYAML dataclass-wizard
from yaml import safe_load
from dataclass_wizard.wizard_cli import PyCodeGenerator
yaml_string = """
account: 12345
clusters:
  - name: cluster_1
    endpoint: https://cluster_2
    certificate: abcdef
  - name: cluster_1
    endpoint: https://cluster_2
    certificate: abcdef
"""
py_code = PyCodeGenerator(experimental=True, file_contents=dumps(safe_load(yaml_string))).py_code
print(py_code)
Prints:
from __future__ import annotations
from dataclasses import dataclass
from dataclass_wizard import JSONWizard
@dataclass
class Data(JSONWizard):
    """
    Data dataclass
    """
    account: int
    clusters: list[Cluster]
@dataclass
class Cluster:
    """
    Cluster dataclass
    """
    name: str
    endpoint: str
    certificate: str
Step 2: Use Generated Dataclass Model, alongside YAMLWizard
Contents of my_file.yml:
account: 12345
clusters:
  - name: cluster_1
    endpoint: https://cluster_5
    certificate: abcdef
  - name: cluster_2
    endpoint: https://cluster_7
    certificate: xyz
Python code:
from __future__ import annotations
from dataclasses import dataclass
from pprint import pprint
from dataclass_wizard import YAMLWizard
@dataclass
class Data(YAMLWizard):
    account: int
    clusters: list[Cluster]
@dataclass
class Cluster:
    name: str
    endpoint: str
    certificate: str
data = Data.from_yaml_file('./my_file.yml')
pprint(data)
for c in data.clusters:
    print(c.endpoint)
Result:
Data(account=12345,
     clusters=[Cluster(name='cluster_1',
                       endpoint='https://cluster_5',
                       certificate='abcdef'),
               Cluster(name='cluster_2',
                       endpoint='https://cluster_7',
                       certificate='xyz')])
https://cluster_5
https://cluster_7
As Barmar points out in a comment, even though you have correctly typed the _clusters key in your AWSInfo dataclass...
@dataclass
class AWSInfo:
    _account: int
    _clusters: list[ClusterInfo]
...the dataclasses module isn't smart enough to automatically convert the members of the clusters list in your input data into the appropriate data type. If you use a more comprehensive data model library like Pydantic, things will work like you expect:
import yaml
from pydantic import BaseModel
class ClusterInfo(BaseModel):
    name: str
    endpoint: str
    certificate: str
class AWSInfo(BaseModel):
    account: int
    clusters: list[ClusterInfo]
with open('clusters.yml', 'r') as fd:
    clusters = yaml.safe_load(fd)
a = AWSInfo(**clusters)
print(a.account)  # prints 12345
print(a.clusters)  # prints the list of both ClusterInfo objects
print(a.clusters[0])  # prints the first ClusterInfo object
# With Pydantic the nested items are ClusterInfo objects, so attribute
# access works instead of failing with AttributeError on a dict
print(a.clusters[0].endpoint)
for c in a.clusters:
    print(c.endpoint)
Running the above code (with your sample input) produces:
12345
[ClusterInfo(name='cluster_1', endpoint='https://cluster_2', certificate='abcdef'), ClusterInfo(name='cluster_1', endpoint='https://cluster_2', certificate='abcdef')]
name='cluster_1' endpoint='https://cluster_2' certificate='abcdef'
https://cluster_2
https://cluster_2
https://cluster_2
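For comparison, if you would rather stay with the standard library, the conversion that dataclasses does not do for you can be written by hand in __post_init__. This is only a sketch under the assumption that the input has the shape shown in the question; the raw dict stands in for the result of yaml.safe_load so that the snippet runs without a file:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterInfo:
    name: str
    endpoint: str
    certificate: str


@dataclass
class AWSInfo:
    account: int
    clusters: List[ClusterInfo]

    def __post_init__(self):
        # dataclasses does not coerce nested dicts, so convert them by hand.
        self.clusters = [
            c if isinstance(c, ClusterInfo) else ClusterInfo(**c)
            for c in self.clusters
        ]


# Stand-in for yaml.safe_load(fd) on the sample input.
raw = {
    "account": 12345,
    "clusters": [
        {"name": "cluster_1", "endpoint": "https://cluster_2", "certificate": "abcdef"},
        {"name": "cluster_1", "endpoint": "https://cluster_2", "certificate": "abcdef"},
    ],
}

info = AWSInfo(**raw)
endpoints = [c.endpoint for c in info.clusters]
print(endpoints)  # ['https://cluster_2', 'https://cluster_2']
```

This does no validation and handles only one level of nesting, which is the part that libraries such as Pydantic or dataclass-wizard take care of for you.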