How to use Python dataclasses


Everything in Python is an object, or so the saying goes. If you want to create your own custom objects, with their own properties and methods, you define a class to make that happen. But creating classes in Python sometimes means writing loads of repetitive, boilerplate code to set up the class instance from the parameters passed to it or to create common functions like comparison operators.

Dataclasses, introduced in Python 3.7, provide a handy way to make classes less verbose. Many of the common things you do in a class, like instantiating properties from the arguments passed to the class, can be reduced to a few basic instructions.

Python dataclass example

Here is a simple example of a conventional class in Python:

class Book:
    '''Object for tracking physical books in a collection.'''
    def __init__(self, name: str, weight: float, shelf_id: int = 0):
        self.name = name
        self.weight = weight # in grams, for calculating shipping
        self.shelf_id = shelf_id
    def __repr__(self):
        return (f"Book(name={self.name!r}, "
                f"weight={self.weight!r}, shelf_id={self.shelf_id!r})")

The biggest headache here is the way each of the arguments passed to __init__ has to be copied to the object’s properties. This isn’t so bad if you’re only dealing with Book, but what if you have to deal with Bookshelf, Library, Warehouse, and so on? Plus, the more code you have to type by hand, the greater the chances you’ll make a mistake.

Here is the same Python class, implemented as a Python dataclass:

from dataclasses import dataclass

@dataclass
class Book:
    '''Object for tracking physical books in a collection.'''
    name: str
    weight: float 
    shelf_id: int = 0

When you specify properties, called fields, in a dataclass, @dataclass automatically generates all of the code needed to initialize them. It also preserves the type information for each property, so if you use a static type checker like mypy, it will ensure that you’re supplying the right types of values to the class constructor.
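As a rough illustration of what the decorator generates, the sketch below repeats the Book dataclass and exercises the generated __init__, __repr__, and __eq__ methods. The sample title and weight are made up for the example.

```python
from dataclasses import dataclass, fields

@dataclass
class Book:
    '''Object for tracking physical books in a collection.'''
    name: str
    weight: float
    shelf_id: int = 0

# __init__ is generated from the field definitions, defaults included
book = Book("Example Title", 350.0)

# __repr__ is generated as well
print(book)  # Book(name='Example Title', weight=350.0, shelf_id=0)

# __eq__ compares two instances field by field
print(book == Book("Example Title", 350.0, 0))  # True

# The field definitions remain available for inspection at runtime
print([f.name for f in fields(book)])  # ['name', 'weight', 'shelf_id']
```

Note that the type hints are not enforced at runtime; Book("Example Title", "heavy") would still be accepted by Python itself, and only a static checker would flag it.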

One common use for dataclasses is as a replacement for the namedtuple. Dataclasses offer the same behaviors and more, and they can be made immutable (as namedtuples are) by simply using @dataclass(frozen=True) as the decorator.
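A minimal sketch of the frozen=True behavior, reusing the Book fields from earlier with made-up values. The use of dataclasses.replace to produce a modified copy is one possible approach, not something the decorator requires.

```python
from dataclasses import dataclass, FrozenInstanceError, replace

@dataclass(frozen=True)
class Book:
    '''Immutable record for a physical book.'''
    name: str
    weight: float
    shelf_id: int = 0

book = Book("Example Title", 350.0)

try:
    book.shelf_id = 4  # assigning to a field of a frozen instance raises
except FrozenInstanceError:
    print("Book instances are read-only")

# replace() builds a new instance with the selected fields changed
moved = replace(book, shelf_id=4)
print(moved.shelf_id)  # 4
print(book.shelf_id)   # 0

# Frozen dataclasses (with the default eq=True) are hashable,
# so equal instances collapse to one entry in a set
print(len({book, Book("Example Title", 350.0)}))  # 1
```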

Another possible use case is replacing nested dictionaries, which can be clumsy to work with, with nested instances of dataclasses. If you have a dataclass Library, with a list property shelves, you could use a dataclass ReadingRoom to populate that list, and then add methods to make it easy to access nested items (e.g., a book on a shelf in a particular room).
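The nesting described above might look something like the following sketch. The Library, ReadingRoom, and shelves names come from the text; the books field, the find_book method, and the sample data are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Book:
    name: str
    weight: float
    shelf_id: int = 0

@dataclass
class ReadingRoom:
    name: str
    # default_factory gives every instance its own empty list
    books: List[Book] = field(default_factory=list)

@dataclass
class Library:
    shelves: List[ReadingRoom] = field(default_factory=list)

    def find_book(self, room_name: str, shelf_id: int) -> Optional[Book]:
        '''Return the first book on the given shelf in the named room, if any.'''
        for room in self.shelves:
            if room.name == room_name:
                for book in room.books:
                    if book.shelf_id == shelf_id:
                        return book
        return None

library = Library()
library.shelves.append(
    ReadingRoom("East Wing", [Book("Example Title", 350.0, shelf_id=2)])
)

found = library.find_book("East Wing", 2)
print(found.name)  # Example Title
```

Compared with nested dictionaries, a misspelled attribute here fails immediately with an AttributeError, and a type checker can follow the structure all the way down.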

But not every Python class needs to be a dataclass. If you’re creating a class mainly as a way to group together a bunch of static methods, rather than as a container for data, you don’t need to make it a dataclass. For instance, a common pattern with parsers is to have a class that takes in an abstract syntax tree, walks the tree, and dispatches calls to different methods in the class based on the node type. Because the parser class has very little data of its own, a dataclass isn’t useful here.
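For contrast, here is a small sketch of the dispatch pattern just described, built on the standard library's ast module. The NameCollector class and its method names are illustrative assumptions. Almost all of the class is behavior, so @dataclass would add little.

```python
import ast

class NameCollector:
    '''Walks a syntax tree and dispatches on node type.'''
    def __init__(self):
        self.names = []  # the only state the walker keeps

    def visit(self, node):
        # Look up a handler named after the node's class, e.g. visit_Name
        handler = getattr(self, f"visit_{type(node).__name__}", self.generic_visit)
        handler(node)

    def generic_visit(self, node):
        # No specific handler: descend into the child nodes
        for child in ast.iter_child_nodes(node):
            self.visit(child)

    def visit_Name(self, node):
        self.names.append(node.id)

collector = NameCollector()
collector.visit(ast.parse("total = price * quantity"))
print(collector.names)  # ['total', 'price', 'quantity']
```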


Copyright © 2020 IDG Communications, Inc.