3.7. Data Classes

3.7.1. Data Classes vs Normal Classes

If you want to use classes to store data, use the dataclasses module. This module is part of the standard library in Python 3.7+.

With the @dataclass decorator, you can create a class with attributes, type hints, and a readable representation of the data in a few lines of code. To use it, simply add the @dataclass decorator on top of a class.

from dataclasses import dataclass


@dataclass
class DataClassDog:
    color: str
    age: int


DataClassDog(color="black", age=9)

Without dataclass, you need to write __init__ to assign the values to instance attributes and __repr__ to create a readable representation of the data, which quickly becomes cumbersome.

class Dog:
    def __init__(self, color, age):
        self.color = color
        self.age = age

    def __repr__(self):
        # !r quotes string values, matching the repr a data class generates
        return f"Dog(color={self.color!r}, age={self.age!r})"


Dog(color="black", age=9)
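
To make the comparison concrete, here is a minimal sketch of what the decorator provides for free. The plain Dog class below deliberately omits __repr__ (an assumption made for contrast, unlike the version above), and fields() is a helper from the dataclasses module that lists the collected fields.

```python
from dataclasses import dataclass, fields


@dataclass
class DataClassDog:
    color: str
    age: int


class Dog:
    # Plain class with no __repr__, for contrast
    def __init__(self, color, age):
        self.color = color
        self.age = age


pepper = DataClassDog(color="black", age=9)

# The generated __repr__ is built from the field names and values.
print(pepper)  # DataClassDog(color='black', age=9)

# Without __repr__, a plain class falls back to the default object repr.
print(Dog(color="black", age=9))

# fields() returns the fields the decorator collected from the annotations.
field_names = [f.name for f in fields(DataClassDog)]
print(field_names)  # ['color', 'age']
```

The generated __init__ also accepts the fields positionally, in the order they are declared, so DataClassDog("black", 9) builds the same object.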

3.7.2. frozen=True: Make Your Data Classes Read-Only

If you don’t want anybody to modify the attributes of an instance, use @dataclass(frozen=True).

from dataclasses import dataclass


@dataclass(frozen=True)
class DataClassDog:
    color: str
    age: int

Now assigning a new value to the color attribute of a DataClassDog instance raises dataclasses.FrozenInstanceError.

pepper = DataClassDog(color="black", age=9)
pepper.color = "golden"
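
A short sketch of how a frozen instance can be handled in practice: the assignment is caught as FrozenInstanceError, and dataclasses.replace is used to build a modified copy instead of mutating the original. The variable name golden_pepper is only illustrative.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class DataClassDog:
    color: str
    age: int


pepper = DataClassDog(color="black", age=9)

try:
    pepper.color = "golden"
except FrozenInstanceError as error:
    print(f"Cannot modify a frozen instance: {error}")

# replace() creates a new instance with the given fields changed;
# the original instance is left untouched.
golden_pepper = replace(pepper, color="golden")
print(pepper.color, golden_pepper.color)  # black golden
```

Because frozen=True is combined with the default eq=True, the decorator also generates __hash__, so frozen instances can be used as dictionary keys or set members.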

3.7.3. Compare Between Two Data Classes

Normally, you need to implement the __eq__ method so that you can compare two instances of a class by value.

class Dog:
    def __init__(self, type, age):
        self.type = type
        self.age = age

    def __eq__(self, other):
        return self.type == other.type and self.age == other.age


pepper = Dog(type="Dachshund", age=7)
bim = Dog(type="Dachshund", age=7)
pepper == bim

The @dataclass decorator automatically implements the __eq__ method for you. You only declare the attributes, and two instances compare equal when they belong to the same class and all of their fields are equal.

from dataclasses import dataclass


@dataclass
class DataClassDog:
    type: str
    age: int


pepper = DataClassDog(type="Dachshund", age=7)
bim = DataClassDog(type="Dachshund", age=7)
pepper == bim
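
The sketch below probes the generated __eq__ a little further. DataClassCat and old_bim are hypothetical names added only to show the cases where the comparison returns False.

```python
from dataclasses import dataclass


@dataclass
class DataClassDog:
    type: str
    age: int


@dataclass
class DataClassCat:
    # Hypothetical second class with identical fields
    type: str
    age: int


pepper = DataClassDog(type="Dachshund", age=7)
bim = DataClassDog(type="Dachshund", age=7)
old_bim = DataClassDog(type="Dachshund", age=12)
cat = DataClassCat(type="Dachshund", age=7)

print(pepper == bim)      # True: same class, same field values
print(pepper == old_bim)  # False: the age field differs
print(pepper == cat)      # False: different classes, even with identical fields
print(pepper is bim)      # False: equal by value, but two distinct objects
```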

3.7.4. Post-init: Add Init Method to a Data Class

With a data class, you don’t need to write an __init__ method to assign values to its attributes. However, sometimes you might want extra initialization logic, such as deriving one attribute from others. That is when the data class’s __post_init__ method comes in handy: the generated __init__ calls it after the fields have been assigned.

In the code below, I use __post_init__ to initialize the attribute info using the attributes names and ages.

from dataclasses import dataclass
from typing import List


@dataclass
class Dog:
    name: str
    age: int


@dataclass
class Dogs:
    names: List[str]
    ages: List[int]

    def __post_init__(self):
        self.info = [Dog(name, age) for name, age in zip(self.names, self.ages)]


names = ["Bim", "Pepper"]
ages = [5, 6]
dogs = Dogs(names, ages)
dogs.info
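
Type hints in a data class are not enforced at runtime, so a value of the wrong type is accepted without complaint. To reject it, you need an explicit check:

from dataclasses import dataclass


@dataclass
class Dog:
    name: str
    age: int


# The string "ten" is accepted even though age is annotated as int
dog = Dog(name="Bim", age="ten")
if not isinstance(dog.age, int):
    raise ValueError("Dog's age must be an integer.")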
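
The two ideas in this section can be combined. The following is a sketch, not the only way to do it: it moves the type check into __post_init__ so every Dog is validated on creation, and it declares info with field(init=False) so the derived attribute is a real field. The field name name and the placement of the check are assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dog:
    name: str
    age: int

    def __post_init__(self):
        # Type hints are not enforced at runtime, so check explicitly.
        if not isinstance(self.age, int):
            raise ValueError("Dog's age must be an integer.")


@dataclass
class Dogs:
    names: List[str]
    ages: List[int]
    # init=False keeps info out of the generated __init__ while still
    # making it a field, so it shows up in the repr and in comparisons.
    info: List[Dog] = field(init=False)

    def __post_init__(self):
        self.info = [Dog(name, age) for name, age in zip(self.names, self.ages)]


dogs = Dogs(names=["Bim", "Pepper"], ages=[5, 6])
print(dogs.info)  # [Dog(name='Bim', age=5), Dog(name='Pepper', age=6)]

try:
    Dog(name="Bim", age="ten")
except ValueError as error:
    print(error)  # Dog's age must be an integer.
```

With the check inside __post_init__, an invalid Dog can never be constructed, including the ones that Dogs builds internally.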

3.7.5. Python Best Practices: Using default_factory for Mutable Defaults

When defining classes in Python, using mutable default values for instance variables can lead to unexpected behavior.

For example, if you use a list as a default value in a class’s __init__ method, the default is evaluated only once, when the method is defined. Every instance created without an explicit authors argument then shares the same list object:

class Book:
    def __init__(self, title, authors=[]):
        self.title = title
        self.authors = authors


book1 = Book("Book 1")
book1.authors.append("Author 1")

book2 = Book("Book 2")
print(book2.authors)  # ['Author 1']

In this example, book1 and book2 share the same list object, which is why modifying the list in book1 affects book2.

To avoid this issue, you can use the default_factory parameter of dataclasses.field, which calls the given factory to create a new object for each instance:

from dataclasses import dataclass, field


@dataclass
class Book:
    title: str
    authors: list = field(default_factory=list)


book1 = Book("Book 1")
book1.authors.append("Author 1")

book2 = Book("Book 2")
print(book2.authors)  # []

Now, each instance has its own separate list object, and modifying one instance’s list does not affect others.
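
As a further sketch: a data class does not allow the shared-list mistake in the first place, because a list, dict, or set default is rejected with a ValueError when the class is created. The BadBook class, the tags field, and the error_message variable are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field

error_message = None
try:

    @dataclass
    class BadBook:
        title: str
        authors: list = []  # mutable default, rejected at class creation

except ValueError as error:
    error_message = str(error)
    print(f"Rejected: {error_message}")


@dataclass
class Book:
    title: str
    authors: list = field(default_factory=list)
    # A lambda works when the default should not be empty.
    tags: list = field(default_factory=lambda: ["unsorted"])


book1 = Book("Book 1")
book2 = Book("Book 2")
book1.authors.append("Author 1")
book1.tags.append("fiction")

print(book2.authors)  # []
print(book2.tags)     # ['unsorted']
```

default_factory accepts any zero-argument callable, so list, dict, set, or a lambda returning a pre-filled value all work.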

3.7.6. Simplify Nested Structures with Python Data Classes

Working with nested structures typically requires manual management of nested dictionaries or objects. This can lead to errors and increase code complexity.

# Example of managing nested structures manually
person_data = {
    "name": "Alice",
    "address": {"street": "123 Maple St", "city": "Springfield", "zip": "12345"},
    "contacts": {"email": "alice@example.com", "phone": "555-1234"},
}

# Accessing nested data
street = person_data["address"]["street"]
email = person_data["contacts"]["email"]

print(f"Street: {street}, Email: {email}")
Street: 123 Maple St, Email: alice@example.com

The dataclasses module in Python simplifies the creation and handling of nested structures by providing a clean, declarative syntax.

Define nested data classes to represent more complex hierarchical structures:

from dataclasses import dataclass


# Define nested data classes
@dataclass
class Address:
    street: str
    city: str
    zip: str


@dataclass
class Contacts:
    email: str
    phone: str


@dataclass
class Person:
    name: str
    address: Address
    contacts: Contacts


# Create and use nested data classes
person = Person(
    name="Alice",
    address=Address(street="123 Maple St", city="Springfield", zip="12345"),
    contacts=Contacts(email="alice@example.com", phone="555-1234"),
)

print(f"Street: {person.address.street}, Email: {person.contacts.email}")
Street: 123 Maple St, Email: alice@example.com

In the code above:

  • dataclass is used to define simple structures with declared field types. Instances are mutable unless frozen=True is set, and the type hints are not enforced at runtime.

  • Address, Contacts, and Person are nested classes representing various levels of the hierarchy.

  • Accessing nested data like person.address.street is more intuitive and less error-prone compared to using dictionaries.

The output displays the street and email address of the person. With nested data classes, each level of the hierarchy has declared field types that a static type checker can verify, and each level is easy to access.
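
Data often arrives as a nested dictionary, as in the first example of this section. The sketch below shows one way to move between the two forms: each nested level is built explicitly with ** unpacking, which assumes the dictionary keys match the field names exactly, and dataclasses.asdict converts the result back into plain dictionaries.

```python
from dataclasses import asdict, dataclass


@dataclass
class Address:
    street: str
    city: str
    zip: str


@dataclass
class Contacts:
    email: str
    phone: str


@dataclass
class Person:
    name: str
    address: Address
    contacts: Contacts


person_data = {
    "name": "Alice",
    "address": {"street": "123 Maple St", "city": "Springfield", "zip": "12345"},
    "contacts": {"email": "alice@example.com", "phone": "555-1234"},
}

# Nested dictionaries are not converted automatically,
# so each level is constructed explicitly.
person = Person(
    name=person_data["name"],
    address=Address(**person_data["address"]),
    contacts=Contacts(**person_data["contacts"]),
)
print(person.address.street)  # 123 Maple St

# asdict() recurses into nested data classes and returns plain dictionaries.
print(asdict(person) == person_data)  # True
```

A missing or misspelled key in the input raises a TypeError when the data class is constructed, so the problem surfaces at creation time rather than at a later lookup.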