Python Dataclasses
The @dataclass decorator, introduced in Python 3.7 (PEP 557), automatically generates boilerplate special methods like __init__, __repr__, and __eq__ based on class-level field annotations. This lets you define data-holding classes cleanly, without writing repetitive initialization code.
The Problem Dataclasses Solve
Before dataclasses, a simple data-holding class required a lot of ceremony:
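As a rough illustration, a hand-written class with just two fields might look like the sketch below. The `Point` name and its `x`/`y` fields are assumptions chosen for the example.

```python
class Point:
    """Hand-written data holder; the class and field names are illustrative."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"Point(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other: object) -> bool:
        # Defer to the other operand when the types do not match.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


print(Point(1.0, 2.0))                     # Point(x=1.0, y=2.0)
print(Point(1.0, 2.0) == Point(1.0, 2.0))  # True
```

Every new field has to be repeated in all three methods, which is where mistakes tend to creep in.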
With @dataclass, this collapses to:
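A minimal sketch of the dataclass form, assuming a hypothetical two-field `Point` class:

```python
from dataclasses import dataclass


@dataclass
class Point:
    # Annotated class variables become fields (names here are illustrative).
    x: float
    y: float


p = Point(1.0, 2.0)
print(p)                     # Point(x=1.0, y=2.0)
print(p == Point(1.0, 2.0))  # True
```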
The decorator reads the annotated class variables and generates __init__, __repr__, and __eq__ for you.
Generated Methods
By default, @dataclass generates three methods:
- `__init__`: Accepts all fields as parameters and assigns them.
- `__repr__`: Returns a string like `ClassName(field=value, ...)`.
- `__eq__`: Compares all fields for equality.
You can control what gets generated with decorator parameters:
Setting order=True generates __lt__, __le__, __gt__, __ge__ so instances can be sorted.
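The sketch below shows two of these switches in use. The `Version` and `Handle` classes are hypothetical examples, not from any library.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Version:
    # With order=True, instances compare as tuples of their fields, in order.
    major: int
    minor: int


@dataclass(repr=False, eq=False)
class Handle:
    # No generated __repr__ or __eq__: the defaults from object apply,
    # so equality falls back to identity.
    fd: int


versions = sorted([Version(1, 2), Version(1, 0), Version(0, 9)])
print(versions[0])             # Version(major=0, minor=9)
print(Handle(3) == Handle(3))  # False
```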
Default Values
Fields can have default values. Fields with defaults must come after fields without:
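As a small sketch of the ordering rule, the hypothetical `Connection` class below puts its required field first. The second definition deliberately breaks the rule to show that the decorator rejects it when the class is created.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    host: str             # required: no default
    port: int = 5432      # optional
    use_tls: bool = True  # optional


conn = Connection("db.example.internal")
print(conn.port)  # 5432

ordering_error = None
try:
    @dataclass
    class Broken:
        port: int = 5432
        host: str         # a field without a default after one with a default
except TypeError as exc:
    ordering_error = exc  # raised while the class is being defined

print(type(ordering_error).__name__)  # TypeError
```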
The field() Function
For more complex defaults and field configuration, use field():
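The following sketch combines several `field()` options on a hypothetical `Task` class; the field names are assumptions for illustration, and the `list[str]` annotation assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    # default_factory is called once per instance, so lists are never shared.
    tags: list[str] = field(default_factory=list)
    # Excluded from the generated __repr__.
    secret: str = field(default="", repr=False)
    # Not an __init__ parameter and ignored by __eq__.
    cache: dict = field(default_factory=dict, init=False, compare=False)


a = Task("write docs")
b = Task("write docs")
a.tags.append("urgent")
a.cache["hits"] = 1

print(b.tags)  # []
print(a == Task("write docs", ["urgent"]))  # True
```

The last comparison holds even though the `cache` values differ, because `compare=False` removes that field from equality checks.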
Key field() parameters:
- `default`: A static default value (for immutable types).
- `default_factory`: A zero-argument callable called to produce the default. Use this for mutable defaults like lists and dicts; never set `default=[]` directly, because a single list would be shared by every instance (the decorator rejects list, dict, and set defaults with a `ValueError` for this reason).
- `repr`: Include this field in `__repr__` output (default `True`).
- `compare`: Include this field in `__eq__` and ordering comparisons (default `True`).
- `init`: Include this field as a parameter in `__init__` (default `True`).
Frozen Dataclasses
Setting frozen=True makes instances immutable. Attempting to assign to any field raises a FrozenInstanceError:
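A brief sketch, using a hypothetical `Coordinate` class, of what happens on assignment:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float


home = Coordinate(52.52, 13.405)

rejected = False
try:
    home.lat = 0.0  # any field assignment is blocked on a frozen instance
except FrozenInstanceError:
    rejected = True

print(rejected)  # True
print(home.lat)  # 52.52
```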
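Frozen dataclasses are also hashable by default: with eq=True (the default) and frozen=True, the decorator generates a __hash__ from the fields. This works as long as every field value is itself hashable, and it means instances can be used as dictionary keys or in sets: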
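For instance, assuming a hypothetical frozen `Coordinate` class, equal instances hash alike and so collapse to a single key or set member:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float


city_names = {Coordinate(52.52, 13.405): "Berlin"}
# A separately constructed but equal instance finds the same entry.
print(city_names[Coordinate(52.52, 13.405)])  # Berlin

unique = {Coordinate(0.0, 0.0), Coordinate(0.0, 0.0)}
print(len(unique))  # 1
```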
__post_init__ for Validation
__post_init__ is called by the generated __init__ after all fields have been assigned. Use it for validation and computed fields:
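One way this might look is sketched below with a hypothetical `Rectangle` class: the hook validates the inputs and then fills in a computed field that is excluded from `__init__`.

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:
    width: float
    height: float
    # Computed in __post_init__, so it is not an __init__ parameter.
    area: float = field(init=False)

    def __post_init__(self) -> None:
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")
        self.area = self.width * self.height


r = Rectangle(3.0, 4.0)
print(r.area)  # 12.0

try:
    Rectangle(-1.0, 4.0)
except ValueError as exc:
    print(exc)  # width and height must be positive
```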
InitVar: Parameters That Don't Become Fields
InitVar lets you accept parameters in __init__ that are passed to __post_init__ but not stored as fields:
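The sketch below assumes a hypothetical `User` class that takes a plain-text password at construction time but stores only a digest. SHA-256 is used purely to keep the example short and is not a recommendation for real password storage.

```python
import hashlib
from dataclasses import InitVar, dataclass, field, fields


@dataclass
class User:
    name: str
    password: InitVar[str]                      # __init__ parameter only
    password_hash: str = field(init=False)      # stored, set in __post_init__

    def __post_init__(self, password: str) -> None:
        # InitVar values arrive here as arguments, in declaration order.
        self.password_hash = hashlib.sha256(password.encode()).hexdigest()


user = User("ada", "s3cret")
print([f.name for f in fields(user)])  # ['name', 'password_hash']
```

Because `password` is not a real field, it is absent from `fields()`, `__repr__`, and `__eq__`.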
Dataclass Inheritance
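Dataclasses support inheritance. A child class can add new fields, which are placed after the parent's fields in the generated __init__. Because of that ordering, if the parent has any field with a default, every field the child adds must also have a default; otherwise defining the child class raises a TypeError: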
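A short sketch of both outcomes, using hypothetical `Base`, `Child`, and `BrokenChild` classes:

```python
from dataclasses import dataclass


@dataclass
class Base:
    id: int
    label: str = "untitled"


@dataclass
class Child(Base):
    # Needs a default because Base.label, which precedes it, has one.
    priority: int = 0


print(Child(1))  # Child(id=1, label='untitled', priority=0)

inheritance_error = None
try:
    @dataclass
    class BrokenChild(Base):
        weight: float  # no default, but it follows Base.label
except TypeError as exc:
    inheritance_error = exc

print(type(inheritance_error).__name__)  # TypeError
```

On Python 3.10 and later, declaring the child with `@dataclass(kw_only=True)` is one way around this, since keyword-only fields are not subject to the positional ordering rule.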
asdict() and astuple()
The dataclasses module provides utility functions to convert instances to plain Python structures:
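As an illustration, the hypothetical `Point` and `Segment` classes below show that both functions recurse into nested dataclass instances:

```python
from dataclasses import asdict, astuple, dataclass


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Segment:
    start: Point
    end: Point


seg = Segment(Point(0, 0), Point(3, 4))
print(asdict(seg))   # {'start': {'x': 0, 'y': 0}, 'end': {'x': 3, 'y': 4}}
print(astuple(seg))  # ((0, 0), (3, 4))
```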
asdict() is particularly useful for serializing to JSON:
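A minimal round-trip sketch, assuming a hypothetical `Order` class whose fields hold only JSON-compatible values:

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Order:
    order_id: int
    items: list = field(default_factory=list)


order = Order(7, ["tea", "cup"])

payload = json.dumps(asdict(order))
print(payload)  # {"order_id": 7, "items": ["tea", "cup"]}

# Rebuilding works here because the JSON keys match the field names.
restored = Order(**json.loads(payload))
print(restored == order)  # True
```

Fields holding values that `json` cannot encode on its own, such as `datetime` objects, would need a custom encoder or a conversion step first.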
Dataclass vs namedtuple vs Plain Class
| Feature | @dataclass | namedtuple | Plain class |
|---|---|---|---|
| Mutable | Yes (or frozen) | No | Yes |
| Auto `__repr__` | Yes | Yes | No |
| Auto `__eq__` | Yes | Yes | No |
| Hashable | Only if frozen (or `unsafe_hash=True`) | Yes | Yes (by identity) |
| Default values | Yes (field()) | Limited | Yes |
| Inheritance | Full support | Limited | Full support |
| Type hints | Required | Optional | Optional |
| `__post_init__` | Yes | No | N/A |
Use @dataclass for mutable records with validation. Use namedtuple for lightweight immutable records where memory is critical. Use plain classes when you need full control or the object has complex behavior beyond data storage.
Knowledge Check
Why should you use `field(default_factory=list)` instead of `field(default=[])` for a list field in a dataclass?
What does setting `frozen=True` on a dataclass enable?
When is `__post_init__` called in a dataclass?