Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python 3.7: Utility of Dataclasses and SimpleNameSpace

Python 3.7 provides new dataclasses which have predefined special functions.

From an overview point, dataclasses and SimpleNamespace both provide nice data encapsulating facility.

@dataclass
class MyData:
    name:str
    age: int

data_1 = MyData(name = 'JohnDoe' , age = 23)

data_2 = SimpleNamespace(name = 'JohnDoe' , age = 23)

A lot of times I use SimpleNamespace just to wrap data and move it around.

I even subclass it to add special functions:

from types import SimpleNamespace

class NewSimpleNameSpace(SimpleNamespace):
    def __hash__(self):
        return some_hashing_func(self.__dict__)

For my question:

  1. How does someone choose between SimpleNamespace and dataclasses?
  2. Why were they necessary, when the same effect can be achieved with extending the SimpleNamespace?
  3. What all other use cases dataclasses cater to?
like image 253
xssChauhan Avatar asked Jun 28 '18 12:06

xssChauhan


People also ask

What are Python Dataclasses for?

Python introduced the dataclass in version 3.7 (PEP 557). The dataclass allows you to define classes with less code and more functionality out of the box.

What is SimpleNamespace Python?

SimpleNamespace. A simple object subclass that provides attribute access to its namespace, as well as a meaningful repr. Unlike object , with SimpleNamespace you can add and remove attributes. If a SimpleNamespace object is initialized with keyword arguments, those are directly added to the underlying namespace.

What are Dataclasses used for?

A dataclass is a Python module with which user-defined classes can be decorated and can be exclusively used to store data/attributes in Python. All these days we never had classes in python just to store attributes like we do in other languages.


2 Answers

Dataclasses is much more like namedtuple and the popular attrs package than SimpleNamespace (which isn't even mentioned in the PEP). They serve two different intended purposes.

Dataclasses

  • Structured
  • Typed (by default, but optional)
  • Writes most of the boilerplate for basic dunder methods (__init__, __hash__, __eq__, and many more)
  • Provide easy mechanism for default values for attributes
  • Can easily add __slots__ and methods

SimpleNamespace

  • "Grab bag" data structure
  • Used where you need more than a dictionary but less than a class
  • Not intended to be use things like __slots__

From the SimpleNamespace documentation:

SimpleNamespace may be useful as a replacement for class NS: pass. However, for a structured record type use namedtuple() instead.

Since @dataclass is supposed to replace a lot of the use cases of namedtuple, named records/structs should be done with @dataclass, not SimpleNamespace.

You may also want to look at this PyCon talk by Raymond Hettinger, where he goes into the backstory of @dataclass and it's use.

like image 74
Edward Minnix Avatar answered Oct 02 '22 07:10

Edward Minnix


The short answer is that this is all covered by PEP 557. Taking your questions slightly out of order...

Why?

  1. To leverage PEP 526 to provide a simple way to define such classes.
  2. To support static type checkers.

How to pick when to use them?

The PEP is quite clear that they are not a replacement and expect the other solutions to have their own place.

Like any other design decision, you'll therefore need to decide exactly what features you care about. If that includes the following, you definitely don't want dataclasses.

Where is it not appropriate to use Data Classes?

API compatibility with tuples or dicts is required. Type validation beyond that provided by PEPs 484 and 526 is required, or value validation or conversion is required.

That said, the same is true for SimpleNameSpace, so what else can we look at to decide? Let's take a closer look at the extra features provided by dataclasses...

The existing definition of SimpleNameSpace is as follows:

A simple object subclass that provides attribute access to its namespace, as well as a meaningful repr.

The python docs then go on to say that it provides a simple __init__, __repr__ and __eq__ implementation. Comparing this to PEP 557, dataclasses also give you options for:

  • ordering - comparing the class as if it were a tuple of its fields, in order.
  • immutability - where assigning to fields will generate an exception
  • control of the hashing - though this isn't recommended.

Clearly, then, you should use dataclasses if you care about ordering or immutability (or need the niche hashing control).

Other use cases?

None that I can see, though you could argue that the initial "why?" covers other use cases.

like image 20
Peter Brittain Avatar answered Oct 03 '22 07:10

Peter Brittain