Wednesday, October 30, 2024

Using Pydantic models instead of dictionaries increases maintainability in Python apps

The dictionary is a fundamental data structure in Python, and it shows up everywhere. We get parameters from a request to an endpoint, receive responses from APIs, and pass data between internal functions and methods. A dictionary works in many of these cases and is easy to create, but that ease comes at the cost of maintainability and of clarity about the intentions and responsibilities of the data object.
In this post I outline the advantages of using Pydantic to represent data internally and when integrating with external sources.
 

Objects have responsibilities


With dictionaries, you carry the mental overhead of remembering what the data is, where it came from, and what it is used for.

Improved code readability

Pydantic models are explicitly defined, making it clear what data is expected and what types are required. This improves code readability, as the structure and constraints of the data are evident from the model definition.

Dataclasses and dictionaries can become unwieldy and hard to read, especially when dealing with complex data structures.
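
To make the readability point concrete, here is a minimal sketch of the same nested data as a raw dictionary and as Pydantic models. The Order, Customer and OrderItem names and their fields are hypothetical, chosen only for illustration.

```python
from typing import List, Optional

from pydantic import BaseModel

# A raw dictionary: nothing here says which keys exist or what types they hold.
order_dict = {
    "id": 17,
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [{"sku": "A-1", "quantity": 2}],
}


# Hypothetical models: the structure is readable from the definitions alone.
class Customer(BaseModel):
    name: str
    email: str


class OrderItem(BaseModel):
    sku: str
    quantity: int


class Order(BaseModel):
    id: int
    customer: Customer
    items: List[OrderItem]
    note: Optional[str] = None


# Nested dictionaries are converted into the nested models.
order = Order(**order_dict)
print(order.customer.name)
print(order.items[0].quantity)
```

Attribute access such as order.customer.name replaces chained key lookups, and a reader only needs the class definitions to know what an order contains.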

Stronger typing and validation


Pydantic models provide strong typing and validation out of the box. When you define a Pydantic model, you specify the type of each attribute, and Pydantic automatically validates the data at runtime. This ensures that your data conforms to the expected structure and types, reducing the likelihood of errors. It is also easier to add complex validation and keep it within the scope of the object.
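
As a rough sketch of what that runtime validation looks like, the hypothetical Payment model below declares its types and one constraint, and Pydantic reports every field that fails. The model and field names are assumptions made for this example.

```python
from pydantic import BaseModel, Field, ValidationError


# Hypothetical model: types and constraints live next to the attributes.
class Payment(BaseModel):
    account_id: int
    amount: float = Field(gt=0)  # must be strictly positive
    currency: str = "USD"


# Valid input is accepted, and the numeric string is coerced to an int.
ok = Payment(account_id="42", amount=19.99)

# Invalid input raises a ValidationError that lists each failing field.
try:
    Payment(account_id="not-a-number", amount=-5)
    errors = []
except ValidationError as exc:
    errors = exc.errors()

failed_fields = sorted(error["loc"][0] for error in errors)
print(failed_fields)
```

Because the constraint sits on the field itself, any code that receives a Payment instance can rely on the amount being positive without checking again.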

When using mypy to check type safety, the explicit types and naming lead to more coherent code and easier collaboration.
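
A small sketch of why explicit types help a static checker: with a dictionary, a misspelled key is just a string, while a misspelled attribute on a model is something a tool like mypy can report before the code runs. The Address model and both helper functions are hypothetical.

```python
from typing import Any, Dict

from pydantic import BaseModel


# Hypothetical model used only for this illustration.
class Address(BaseModel):
    street: str
    city: str


def label_from_dict(address: Dict[str, Any]) -> str:
    # A typo such as address["citty"] would only surface at runtime as a KeyError.
    return f'{address["street"]}, {address["city"]}'


def label_from_model(address: Address) -> str:
    # A typo such as address.citty is an unknown attribute that a static
    # checker can flag without running the code.
    return f"{address.street}, {address.city}"


raw = {"street": "1 Main St", "city": "Springfield"}
dict_label = label_from_dict(raw)
model_label = label_from_model(Address(**raw))
```

Both functions return the same label here; the difference is in how early a mistake in either one would be caught.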

In contrast, dataclasses and dictionaries do not provide built-in validation. While you can add validation manually, it's easy to forget or make mistakes, and Pydantic's automatic validation removes that burden. A case can be made that dataclasses and BaseModel are on par in expressing an object's responsibilities through class naming, but the Pydantic syntax is easier to read.
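
The contrast with dataclasses can be sketched in a few lines. The UserRecord and UserModel names are hypothetical; the point is only that the dataclass stores whatever it is given, while the model checks it.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ValidationError


@dataclass
class UserRecord:
    name: str
    age: int  # the annotation is not enforced at runtime


class UserModel(BaseModel):
    name: str
    age: int  # validated when the instance is created


# The dataclass accepts the wrong type without complaint.
record = UserRecord(name="Ada", age="thirty")

# The Pydantic model rejects the same input.
try:
    UserModel(name="Ada", age="thirty")
    rejected = False
except ValidationError:
    rejected = True

print(type(record.age).__name__, rejected)
```

Both class definitions read almost the same, so the validation comes without extra syntax.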

Example:

https://gist.github.com/jseller/65be3a2b30adc749496b6854dee097fc

Conclusion


For quality code that is easier to understand (in your own head and in PRs), try using a BaseModel instead of a dictionary and see the difference in the quality of your code. Dataclasses provided a big improvement in the developer experience, but Pydantic has taken this a step further.

Objects have responsibilities and behaviour, so use a specific object to fulfil each responsibility. Data objects are responsible for their attributes and for validating them. Full stop!
Use logical classes to operate on these data objects and maintain a cleaner separation of responsibilities. When unit testing the behaviour of the logical classes, the higher quality of the data objects they use will show.
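
One way to picture that separation, as a sketch under assumed names: a hypothetical LineItem data object owns its attributes and their validation, and a hypothetical InvoiceCalculator class owns the behaviour.

```python
from typing import List

from pydantic import BaseModel, Field


# Data object: responsible only for its attributes and their validation.
class LineItem(BaseModel):
    sku: str
    quantity: int = Field(gt=0)
    unit_price: float = Field(ge=0)


# Logical class: responsible only for behaviour, and it can trust its inputs.
class InvoiceCalculator:
    def __init__(self, tax_rate: float) -> None:
        self.tax_rate = tax_rate

    def total(self, items: List[LineItem]) -> float:
        subtotal = sum(item.quantity * item.unit_price for item in items)
        return round(subtotal * (1 + self.tax_rate), 2)


items = [
    LineItem(sku="A-1", quantity=2, unit_price=10.0),
    LineItem(sku="B-2", quantity=1, unit_price=5.0),
]
calculator = InvoiceCalculator(tax_rate=0.1)
total = calculator.total(items)
print(total)
```

A unit test for InvoiceCalculator only has to cover the arithmetic, because negative quantities or missing fields are already ruled out when the LineItem objects are built.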