@dataclass: fields, frozen, __post_init__

Python Dataclasses

The @dataclass decorator, introduced in Python 3.7 (PEP 557), automatically generates boilerplate special methods like __init__, __repr__, and __eq__ based on class-level field annotations. This lets you define data-holding classes cleanly, without writing repetitive initialization code.

The Problem Dataclasses Solve

Before dataclasses, a simple data-holding class required a lot of ceremony:
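As a minimal sketch, assume a hypothetical Point class with two fields (the class and field names are illustrative, not taken from the lesson). Written by hand, it might look like this:

```python
class Point:
    """A hand-written data holder (hypothetical example)."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"Point(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other: object) -> bool:
        # Only compare against other Point instances.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1.0, 2.0)
print(p)                     # Point(x=1.0, y=2.0)
print(p == Point(1.0, 2.0))  # True
```

Every new field has to be added in three places, which is where mistakes tend to creep in.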

With @dataclass, this collapses to:
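A minimal sketch of the dataclass form, assuming a hypothetical Point class with x and y fields (illustrative names):

```python
from dataclasses import dataclass


@dataclass
class Point:
    # Each annotated name becomes a field and an __init__ parameter.
    x: float
    y: float


p = Point(1.0, 2.0)
print(p)                     # Point(x=1.0, y=2.0)
print(p == Point(1.0, 2.0))  # True
```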

The decorator reads the annotated class variables and generates __init__, __repr__, and __eq__ for you.

Generated Methods

By default, @dataclass generates three methods:

  • __init__: Accepts all fields as parameters and assigns them.
  • __repr__: Returns a string like ClassName(field=value, ...).
  • __eq__: Compares two instances of the same class field by field, in definition order.

You can control what gets generated with decorator parameters:

Setting order=True generates __lt__, __le__, __gt__, and __ge__, which compare instances as tuples of their fields in definition order, so instances can be sorted.
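The decorator parameters can be sketched as follows. Version and Handle are hypothetical classes chosen only to show order=True and the effect of turning generated methods off:

```python
from dataclasses import dataclass


@dataclass(order=True)
class Version:
    major: int
    minor: int


@dataclass(repr=False, eq=False)
class Handle:
    ident: int


versions = [Version(1, 4), Version(0, 9), Version(1, 0)]
print(sorted(versions))
# [Version(major=0, minor=9), Version(major=1, minor=0), Version(major=1, minor=4)]

# With eq=False no __eq__ is generated, so == falls back to identity.
print(Handle(1) == Handle(1))  # False
```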

Default Values

Fields can have default values. Fields with defaults must come after fields without:
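A short sketch of the ordering rule, using a hypothetical Server class. The second definition is wrapped in try/except only to show that the error occurs when the class is created, not when it is instantiated:

```python
from dataclasses import dataclass


@dataclass
class Server:
    host: str              # required: no default
    port: int = 8080       # optional: has a default
    use_tls: bool = False


print(Server("example.org"))
# Server(host='example.org', port=8080, use_tls=False)

try:
    @dataclass
    class Broken:
        port: int = 8080
        host: str          # field without a default after one with a default
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```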

The field() Function

For more complex defaults and field configuration, use field():
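As an illustration, here is a hypothetical Inventory class that uses one field() option per attribute. The class and field names are assumptions made for this sketch, and the list[str] annotation assumes Python 3.9 or newer:

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    owner: str
    items: list[str] = field(default_factory=list)     # fresh list per instance
    secret: str = field(default="", repr=False)        # hidden from __repr__
    cache_hits: int = field(default=0, compare=False)  # ignored by __eq__
    item_count: int = field(default=0, init=False)     # not an __init__ parameter


a = Inventory("ann")
b = Inventory("ann", cache_hits=5)
a.items.append("sword")

print(b.items)  # []  (b has its own list)
print(a)        # Inventory(owner='ann', items=['sword'], cache_hits=0, item_count=0)
print(Inventory("ann") == b)  # True: cache_hits is excluded from comparison
```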

Key field() parameters:

  • default: A static default value (for immutable types).
  • default_factory: A zero-argument callable called to produce the default. Use this for mutable defaults like lists and dicts. Writing default=[] directly is rejected with a ValueError when the class is defined.
  • repr: Include this field in __repr__ output (default True).
  • compare: Include this field in __eq__ and ordering comparisons (default True).
  • init: Include this field as a parameter in __init__ (default True).

Frozen Dataclasses

Setting frozen=True makes instances immutable. Attempting to assign to any field raises a FrozenInstanceError:
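A minimal sketch, assuming a hypothetical Coordinate class:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float


home = Coordinate(52.5, 13.4)
try:
    home.lat = 0.0  # assignment is blocked on a frozen instance
except FrozenInstanceError as exc:
    print(type(exc).__name__)  # FrozenInstanceError

print(home.lat)  # 52.5 (unchanged)
```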

Because eq=True is the default, a frozen dataclass also gets a generated __hash__ based on its fields. Instances can therefore be used as dictionary keys or in sets, provided every field value is itself hashable:
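A brief sketch of both uses, again with a hypothetical Coordinate class (redefined here so the example stands alone):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float


# Equal field values produce equal hashes, so lookups by value work.
names = {Coordinate(52.5, 13.4): "Berlin"}
print(names[Coordinate(52.5, 13.4)])  # Berlin

visited = {Coordinate(0.0, 0.0), Coordinate(0.0, 0.0)}
print(len(visited))  # 1
```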

__post_init__ for Validation

__post_init__ is called by the generated __init__ after all fields have been assigned. Use it for validation and computed fields:
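One way this might look, using a hypothetical Rectangle class whose area field is computed rather than passed in:

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # computed, not an __init__ parameter

    def __post_init__(self) -> None:
        # Validate first, then derive the computed field.
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")
        self.area = self.width * self.height


print(Rectangle(3.0, 4.0).area)  # 12.0

try:
    Rectangle(-1.0, 4.0)
except ValueError as exc:
    print(exc)  # width and height must be positive
```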

InitVar: Parameters That Don't Become Fields

InitVar lets you accept parameters in __init__ that are passed to __post_init__ but not stored as fields:
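A sketch under stated assumptions: Account is a hypothetical class, and the SHA-256 digest only stands in for "derive something from the parameter". It is not meant as a recommendation for storing passwords.

```python
import hashlib
from dataclasses import InitVar, dataclass, field, fields


@dataclass
class Account:
    username: str
    password: InitVar[str]                  # __init__ parameter only
    password_hash: str = field(init=False)  # stored field, set below

    def __post_init__(self, password: str) -> None:
        # InitVar values arrive as arguments, in declaration order.
        self.password_hash = hashlib.sha256(password.encode()).hexdigest()


acct = Account("ann", "hunter2")
print([f.name for f in fields(acct)])  # ['username', 'password_hash']
print(hasattr(acct, "password"))       # False
```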

Dataclass Inheritance

Dataclasses support inheritance. A child class's new fields are appended after the fields inherited from its parents. Because of that ordering, if a parent field has a default, adding a child field without a default raises a TypeError when the child class is defined:
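A small sketch with hypothetical Animal, Dog, and Cat classes. The failing definition is wrapped in try/except purely to show the error:

```python
from dataclasses import dataclass


@dataclass
class Animal:
    name: str
    legs: int = 4


@dataclass
class Dog(Animal):
    breed: str = "unknown"  # fine: the new field has a default


print(Dog("Rex"))  # Dog(name='Rex', legs=4, breed='unknown')

try:
    @dataclass
    class Cat(Animal):
        indoor: bool  # no default, but follows the inherited default on legs
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```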

asdict() and astuple()

The dataclasses module provides utility functions to convert instances to plain Python structures:
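For instance, with hypothetical Point and Line classes, both functions recurse into nested dataclasses:

```python
from dataclasses import asdict, astuple, dataclass


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Line:
    start: Point
    end: Point


line = Line(Point(0, 0), Point(3, 4))
print(asdict(line))   # {'start': {'x': 0, 'y': 0}, 'end': {'x': 3, 'y': 4}}
print(astuple(line))  # ((0, 0), (3, 4))
```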

asdict() is particularly useful for serializing to JSON:
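A minimal sketch, assuming a hypothetical User class whose fields are all JSON-compatible types (the list[str] annotation assumes Python 3.9 or newer):

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class User:
    name: str
    tags: list[str] = field(default_factory=list)


payload = json.dumps(asdict(User("ann", ["admin"])))
print(payload)  # {"name": "ann", "tags": ["admin"]}
```

Fields holding values that json cannot encode directly, such as datetime objects, would still need a custom encoder or a conversion step.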

Dataclass vs namedtuple vs Plain Class

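| Feature | @dataclass | namedtuple | Plain class |
| --- | --- | --- | --- |
| Mutable | Yes (or frozen) | No | Yes |
| Auto __repr__ | Yes | Yes | No |
| Auto __eq__ | Yes | Yes | No |
| Hashable | Only if frozen (with the default eq=True) | Yes | Yes (identity-based, by default) |
| Default values | Yes (field()) | Limited | Yes |
| Inheritance | Full support | Limited | Full support |
| Type hints | Required | Optional | Optional |
| __post_init__ | Yes | No | N/A |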

Use @dataclass for mutable records with validation. Use namedtuple for lightweight immutable records where memory is critical. Use plain classes when you need full control or the object has complex behavior beyond data storage.

Knowledge Check

Why should you use `field(default_factory=list)` instead of `field(default=[])` for a list field in a dataclass?

What does setting `frozen=True` on a dataclass enable?

When is `__post_init__` called in a dataclass?