Cleaning the argument list before passing it to the constructor is probably the best way to go about it. I'd advise against writing your own __init__ function though, since the dataclass' __init__ does a couple of other convenient things that you'll lose by overriding it.
Also, since the argument-cleaning logic is very tightly bound to the behavior of the class and returns an instance, it might make sense to put it into a classmethod:
from dataclasses import dataclass
import inspect

@dataclass
class Config:
    var_1: str
    var_2: str

    @classmethod
    def from_dict(cls, env):
        # keep only the keys that the generated __init__ accepts
        return cls(**{
            k: v for k, v in env.items()
            if k in inspect.signature(cls).parameters
        })

# usage:
params = {'var_1': 'a', 'var_2': 'b', 'var_3': 'c'}
c = Config.from_dict(params)  # works without raising a TypeError
print(c)
# prints: Config(var_1='a', var_2='b')
Answer from Arne on Stack Overflow
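As a variation on the classmethod idea (a sketch, not part of the answer above), the filter can also be built from dataclasses.fields and moved into a small mixin so several dataclasses can share it. The name FromDictMixin and the derived field are made up for illustration. Because the generated __init__ is left alone, defaults and __post_init__ keep working:

```python
from dataclasses import dataclass, field, fields


class FromDictMixin:
    """Hypothetical helper: build a dataclass from a dict, dropping unknown keys."""

    @classmethod
    def from_dict(cls, data):
        # Only fields with init=True are valid keyword arguments for __init__.
        init_names = {f.name for f in fields(cls) if f.init}
        return cls(**{k: v for k, v in data.items() if k in init_names})


@dataclass
class Config(FromDictMixin):
    var_1: str
    var_2: str = "default"
    derived: str = field(init=False, default="")

    def __post_init__(self):
        # Still runs, because the generated __init__ is untouched.
        self.derived = f"{self.var_1}-{self.var_2}"


c = Config.from_dict({"var_1": "a", "var_3": "c", "derived": "ignored"})
print(c)
# prints: Config(var_1='a', var_2='default', derived='a-default')
```

Checking f.init means keys that match init=False fields are dropped as well, instead of raising a TypeError.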
I would just provide an explicit __init__ instead of using the autogenerated one. The body of the loop only sets recognized values, ignoring unexpected ones.
Note that this won't complain about missing values without defaults until later, though.
import dataclasses

@dataclasses.dataclass(init=False)
class Config:
    VAR_NAME_1: str
    VAR_NAME_2: str

    def __init__(self, **kwargs):
        names = {f.name for f in dataclasses.fields(self)}
        for k, v in kwargs.items():
            if k in names:
                setattr(self, k, v)
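If the late failure on missing values is a problem, the explicit __init__ can check for them itself. The following is only a sketch of one way to do that: the default on VAR_NAME_2 and the error message are made up for illustration, and it still skips everything else the generated __init__ would do (such as calling __post_init__).

```python
import dataclasses


@dataclasses.dataclass(init=False)
class Config:
    VAR_NAME_1: str
    VAR_NAME_2: str = "fallback"  # illustrative default, not in the original

    def __init__(self, **kwargs):
        missing = []
        for f in dataclasses.fields(self):
            if f.name in kwargs:
                setattr(self, f.name, kwargs[f.name])
            elif f.default is not dataclasses.MISSING:
                setattr(self, f.name, f.default)
            elif f.default_factory is not dataclasses.MISSING:
                setattr(self, f.name, f.default_factory())
            else:
                missing.append(f.name)
        if missing:
            # Fail at construction time instead of on first attribute access.
            raise TypeError(f"missing required fields: {', '.join(missing)}")


print(Config(VAR_NAME_1="a", UNKNOWN="x"))
# prints: Config(VAR_NAME_1='a', VAR_NAME_2='fallback')

try:
    Config(UNKNOWN="x")
except TypeError as exc:
    print(exc)
# prints: missing required fields: VAR_NAME_1
```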
Alternatively, you can pass a filtered environment to the default Config.__init__.
import os

field_names = {f.name for f in dataclasses.fields(Config)}
c = Config(**{k: v for k, v in os.environ.items() if k in field_names})
Example code below. This works, but only if the .csv has only name, age, and city fields. If the .csv has more fields than the dataclass has defined, it throws an error like: TypeError: Person.__init__() got an unexpected keyword argument 'state'
Is there a way to have it ignore extra fields? I'm trying to avoid having to remove the fields first from the .csv, or iterate row by row, value by value...but obvs will do that if there's no 'smart' way to ignore. Like, wondering if we can pass desired fields to csv.DictReader? I see it has a fieldnames parameter, but the docs seem to suggest that is for generating a header row when one is missing (meaning I'd have to pass a value for each column, so I'm back where I started).
Thanks!
import csv
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
    city: str

with open('people.csv', 'r') as f:
    reader = csv.DictReader(f)
    people = [Person(**row) for row in reader]

print(people)
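One way to ignore the extra columns without touching the .csv is to filter each row against the dataclass's own field names before unpacking it. A minimal sketch: the io.StringIO sample stands in for people.csv, and its rows are made up for illustration.

```python
import csv
import io
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int
    city: str


# Stand-in for open('people.csv'); the sample rows are invented.
sample = io.StringIO(
    "name,age,city,state\n"
    "Ada,36,London,n/a\n"
    "Grace,45,New York,NY\n"
)

# Field names the generated __init__ accepts.
wanted = {f.name for f in fields(Person)}

reader = csv.DictReader(sample)
people = [
    Person(**{k: v for k, v in row.items() if k in wanted})
    for row in reader
]
print(people)
```

Note that csv.DictReader yields every value as a string and dataclasses do not convert types, so age is still '36' here, not 36. Convert it explicitly if you need an int.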
I've ended up defining dict_factory in the dataclass as a staticmethod and then using it in asdict(). I found it more straightforward than messing with metadata.
from typing import Optional, Tuple
from dataclasses import asdict, dataclass

@dataclass
class Space:
    size: Optional[int] = None
    dtype: Optional[str] = None
    shape: Optional[Tuple[int, ...]] = None

    @staticmethod
    def dict_factory(x):
        # x is a list of (name, value) pairs supplied by asdict()
        exclude_fields = ("shape", )
        return {k: v for (k, v) in x if ((v is not None) and (k not in exclude_fields))}

s1 = Space(size=2)
s1_dict = asdict(s1, dict_factory=Space.dict_factory)
print(s1_dict)
# {'size': 2}

s2 = Space(dtype='int', shape=(2, 5))
s2_dict = asdict(s2, dict_factory=Space.dict_factory)
print(s2_dict)
# {'dtype': 'int'}
# no 'shape' key, because it is excluded in dict_factory of the class.
You can add custom metadata to a field, like field(metadata={"include_in_dict": True}), and in the dict_factory you can check this before anything else and skip the field if needed.
if not field_.metadata.get("include_in_dict", False):
    continue
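One caveat with the metadata idea: the dict_factory passed to asdict() only receives a list of (name, value) pairs, so the Field objects (and their metadata) have to be looked up separately. A rough sketch of one way to do that, using a hypothetical to_dict helper that walks dataclasses.fields directly. Here a field is included only if it is marked include_in_dict: True, and unlike asdict() the helper is shallow and does not recurse into nested dataclasses:

```python
from dataclasses import dataclass, field, fields
from typing import Optional, Tuple


@dataclass
class Space:
    size: Optional[int] = field(default=None, metadata={"include_in_dict": True})
    dtype: Optional[str] = field(default=None, metadata={"include_in_dict": True})
    shape: Optional[Tuple[int, ...]] = None  # no metadata, so it is left out


def to_dict(obj):
    """Hypothetical helper: shallow dict of the fields marked include_in_dict."""
    result = {}
    for field_ in fields(obj):
        if not field_.metadata.get("include_in_dict", False):
            continue
        value = getattr(obj, field_.name)
        if value is not None:
            result[field_.name] = value
    return result


print(to_dict(Space(dtype="int", shape=(2, 5))))
# prints: {'dtype': 'int'}
```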
» pip install dataclass-wizard
» pip install dataclassy