Chapter 6: Descriptors: The Power Behind Attribute Access

If you have ever used a property to create a managed attribute, or wondered how a method call like my_instance.my_method() automatically passes self as the first argument, you have used descriptors. Descriptors are the low-level mechanism that powers much of Python's attribute access magic.

A descriptor is an object whose class defines at least one of the __get__, __set__, or __delete__ methods. Descriptors are stored as class attributes and manage access to that attribute for instances of the class. Understanding them is key to mastering object-oriented programming in Python and to writing flexible, reusable code.

This chapter covers:

  1. The descriptor protocol and its methods.

  2. The critical difference between data and non-data descriptors.

  3. How property is just a user-friendly descriptor.

  4. Using __slots__ for memory optimization, a feature powered by descriptors.

The Descriptor Protocol

The protocol consists of three optional methods. A class only needs to implement one of them to be considered a descriptor.

  1. __get__(self, instance, owner): Called when the descriptor's attribute is retrieved.

    • self: The descriptor instance itself.

    • instance: The instance the attribute was accessed through, or None if accessed through the class.

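    • owner: The class through which the attribute was accessed, which is type(instance) for instance access. This may be a subclass of the class where the descriptor is defined.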

  2. __set__(self, instance, value): Called when the attribute is set on an instance.

    • instance: The instance whose attribute is being set.

    • value: The value being assigned to the attribute.

  3. __delete__(self, instance): Called when del is used on the attribute.
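To see when each method fires, here is a minimal sketch using a hypothetical Traced descriptor and Sample class (neither name comes from a library). It records every protocol call so the sequence can be inspected afterwards.

```python
class Traced:
    """Hypothetical descriptor that records which protocol method ran."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner=None):
        # instance is None when the attribute is read through the class.
        self.calls.append(("get", instance is None))
        return self if instance is None else instance.__dict__.get("_value")

    def __set__(self, instance, value):
        self.calls.append(("set", value))
        instance.__dict__["_value"] = value

    def __delete__(self, instance):
        self.calls.append(("delete", None))
        instance.__dict__.pop("_value", None)


class Sample:
    attr = Traced()


s = Sample()
s.attr = 10        # triggers Traced.__set__
value = s.attr     # triggers Traced.__get__ with instance=s
Sample.attr        # triggers Traced.__get__ with instance=None
del s.attr         # triggers Traced.__delete__

# Reading the class __dict__ directly bypasses the protocol.
calls = Sample.__dict__["attr"].calls
```

Returning the descriptor itself when instance is None is a common convention, not a requirement of the protocol.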

Let's build a practical descriptor that validates data, ensuring an attribute is a positive number.
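A minimal sketch of such a descriptor follows. It assumes a PositiveNumber descriptor used for the price attribute of a Product class. The validation rules and error messages are illustrative choices, not a fixed recipe.

```python
class PositiveNumber:
    """Descriptor that only accepts numbers greater than zero."""

    def __init__(self, name):
        # The attribute name has to be repeated here by hand.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed through the class
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{self.name} must be a number")
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        # Safe to reuse the same key: a data descriptor wins over __dict__.
        instance.__dict__[self.name] = value


class Product:
    price = PositiveNumber("price")

    def __init__(self, price):
        self.price = price  # goes through PositiveNumber.__set__


item = Product(9.99)
try:
    item.price = -5
except ValueError as exc:
    error = str(exc)
```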

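Notice a slight problem: we have to pass the name 'price' to the descriptor, which repeats the attribute name. We can avoid this with __set_name__, a special method (available since Python 3.6) that Python calls on a descriptor when the class containing it is created, passing the owning class and the attribute name.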
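A sketch of that improvement, assuming Python 3.6 or later. The weight attribute is a hypothetical addition that shows one descriptor class serving several attributes.

```python
class PositiveNumber:
    """Validating descriptor that learns its own attribute name."""

    def __set_name__(self, owner, name):
        # Called once per attribute when the owning class is created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{self.name} must be a number")
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        instance.__dict__[self.name] = value


class Product:
    price = PositiveNumber()   # no name argument needed
    weight = PositiveNumber()  # hypothetical second validated attribute

    def __init__(self, price, weight):
        self.price = price
        self.weight = weight


item = Product(9.99, 2)
names = (Product.__dict__["price"].name, Product.__dict__["weight"].name)
```

Note that __set_name__ runs only during class creation. A descriptor attached to a class afterwards would need its name set manually.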

Data vs. Non-Data Descriptors

The distinction between these two is critical to understanding attribute lookup order.

  • A data descriptor is a descriptor that implements __set__ or __delete__.

  • A non-data descriptor only implements __get__.

This matters because data descriptors have a higher precedence in the attribute lookup chain. When you access an attribute like obj.attr:

  1. Python looks up attr on the class of obj and its base classes. If it finds a data descriptor there, its __get__ method is called and that's the end of it.

  2. If not, it checks the instance's __dict__ for attr.

  3. If not found there, it checks if attr is a non-data descriptor on the class.

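  4. Finally, if attr is an ordinary class attribute that is not a descriptor, that value is returned as-is. If none of these steps finds anything, Python calls __getattr__ when the class defines it and raises AttributeError otherwise.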

This is why you can't shadow a property (a data descriptor) by setting an instance attribute of the same name, but you can shadow a regular method (a non-data descriptor).
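The precedence rules can be checked with a short experiment. This sketch uses a hypothetical Demo class. It writes to the instance __dict__ directly to show that even a stored instance value does not win against a data descriptor.

```python
class Demo:
    @property
    def managed(self):   # property: a data descriptor
        return "from property"

    def method(self):    # plain function: a non-data descriptor
        return "from method"


d = Demo()

# Writing straight into the instance __dict__ bypasses property.__set__,
# yet lookup still finds the data descriptor on the class first.
d.__dict__["managed"] = "from instance"
managed_result = d.managed

# A non-data descriptor loses to the instance __dict__.
d.method = lambda: "from instance"
method_result = d.method()

# Ordinary assignment to a property without a setter is rejected.
try:
    d.managed = "new"
except AttributeError:
    assignment_blocked = True
else:
    assignment_blocked = False
```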

property: A Convenient Descriptor

The built-in property is a class whose instances are data descriptors, so it is simply a concise, high-level way to create them. Our PositiveNumber example could be rewritten inside the Product class using properties.
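One way the property-based version might look is sketched below. It assumes the validated value is stored in a private _price attribute, and it simplifies the validation to a single check.

```python
class Product:
    def __init__(self, price):
        self.price = price  # routed through the setter below

    @property
    def price(self):
        return self._price

    @price.setter
    def price(self, value):
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("price must be a positive number")
        if value <= 0:
            raise ValueError("price must be a positive number")
        self._price = value


item = Product(9.99)

# The object stored on the class is a property instance, and property
# defines __set__, which is what makes it a data descriptor.
is_property = isinstance(Product.__dict__["price"], property)
is_data_descriptor = hasattr(property, "__set__")
```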

While this version is often cleaner for simple cases within a single class, the descriptor approach is far more reusable. We can use PositiveNumber in any class that needs validated attributes, adhering to the Don't Repeat Yourself (DRY) principle.

__slots__: Memory Optimization with Descriptors

By default, every instance of a class has a __dict__ to store its attributes. For applications creating millions of small objects, the memory overhead of these dictionaries can be significant.

The __slots__ class attribute provides a solution. It's a sequence of strings that declares the instance attributes. When __slots__ is defined:

  1. Python does not create a __dict__ for each instance, provided every class in the inheritance chain also defines __slots__.

  2. Instead, it uses a more compact, fixed-size array for the attributes.

  3. Behind the scenes, Python creates a descriptor for each attribute name in __slots__.

The memory savings can be substantial, but they come at the cost of flexibility: you can no longer add attributes that are not listed in __slots__ to instances on the fly.
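The behavior described above can be observed directly. This is a minimal sketch with a hypothetical Point class. The descriptor type name mentioned in the comment is a CPython implementation detail.

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
has_dict = hasattr(p, "__dict__")  # no per-instance dictionary exists

try:
    p.z = 3                        # "z" is not declared in __slots__
except AttributeError:
    rejected = True
else:
    rejected = False

# Each slot name is exposed on the class as a descriptor object
# (its type is named member_descriptor on CPython).
slot = Point.__dict__["x"]
x_via_descriptor = slot.__get__(p, Point)
```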

Summary

Descriptors are a fundamental part of Python's object model. They provide a powerful, low-level mechanism for intercepting and customizing attribute access. Understanding the protocol, the difference between data and non-data descriptors, and their relationship to features like property and __slots__ allows you to write more efficient, reusable, and robust object-oriented code.
