Exception groups

Probably the most prominent new feature is the concept of an exception group. Exception groups were proposed in PEP 654. In short, they allow raising multiple exceptions at once.

This can be helpful when running complex code where numerous exceptions can occur. Python 3.11 provides two new builtins: BaseExceptionGroup (deriving from BaseException) and ExceptionGroup (deriving from BaseExceptionGroup and Exception). Let's examine the basic syntax of ExceptionGroup:

import traceback as tb

eg = ExceptionGroup(
    "Many things went wrong",
    [
        TypeError(1),
        ValueError(2),
        ExceptionGroup("Even more things went wrong!", [ArithmeticError(3)]),
    ],
)
tb.print_exception(eg)
 | ExceptionGroup: Many things went wrong (3 sub-exceptions)
  +-+---------------- 1 ----------------
    | TypeError: 1
    +---------------- 2 ----------------
    | ValueError: 2
    +---------------- 3 ----------------
    | ExceptionGroup: Even more things went wrong! (1 sub-exception)
    +-+---------------- 1 ----------------
      | ArithmeticError: 3
      +------------------------------------

It's also possible to "filter" exceptions by means of the subgroup method. subgroup returns an exception group with the same metadata, but containing only the exceptions that satisfy the provided condition:

filtered = eg.subgroup(lambda e: isinstance(e, ArithmeticError))
tb.print_exception(filtered)
  | ExceptionGroup: Many things went wrong (1 sub-exception)
  +-+---------------- 1 ----------------
    | ExceptionGroup: Even more things went wrong! (1 sub-exception)
    +-+---------------- 1 ----------------
      | ArithmeticError: 3
      +------------------------------------

Similarly, it is possible to obtain the complement of the filtered set by using the split method, which returns both the matching part and the rest:

filtered, others = eg.split(lambda e: isinstance(e, ArithmeticError))
tb.print_exception(filtered)
  | ExceptionGroup: Many things went wrong (1 sub-exception)
  +-+---------------- 1 ----------------
    | ExceptionGroup: Even more things went wrong! (1 sub-exception)
    +-+---------------- 1 ----------------
      | ArithmeticError: 3
      +------------------------------------
tb.print_exception(others)
  | ExceptionGroup: Many things went wrong (2 sub-exceptions)
  +-+---------------- 1 ----------------
    | TypeError: 1
    +---------------- 2 ----------------
    | ValueError: 2
    +------------------------------------

Naturally, Python 3.11 also allows handling these new exception groups. They can be handled just like the "old" Exception type - in a try ... except statement.
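
As a minimal sketch of that (the group below is purely illustrative), a plain except clause receives the whole group as a single object:

try:
    raise ExceptionGroup("Many things went wrong", [TypeError(1), ValueError(2)])
except Exception as e:
    # ExceptionGroup derives from Exception, so a regular handler
    # catches the whole group at once.
    print(f"Caught: {e!r}")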

If we imagine handling a complex group of exceptions where we're only concerned with one kind of error, we may use, for example, the subgroup method. Here's an idea of how it might be implemented, again, based on PEP 654.

import traceback as tb

# Some imaginary exception type we want to handle.
class LogicError(Exception):
    pass


# Factory function to create handlers reacting to different exception types.
# Handlers will then be passed to the subgroup method.
def handler_factory(handler_func, exception_type):
    def handler(exception):
        # We do not want to return True on exception groups themselves;
        # subgroup would then return the whole group without traversing
        # its sub-exceptions.
        if isinstance(exception, ExceptionGroup):
            return False

        if isinstance(exception, exception_type):
            # We first apply injected handler func to the exception
            handler_func(exception)

            # Then return False on handled exceptions, 
            # since we later want to reraise anything that wasn't handled
            return False

        return True

    return handler


def logging_handler(exception):
    print(f"{exception.__class__.__name__} was handled.")


try:
    raise ExceptionGroup("Some runtime errors", [LogicError(), ZeroDivisionError()])
except ExceptionGroup as eg:
    eg = eg.subgroup(handler_factory(logging_handler, LogicError))

    # Examine what would be then reraised as a remainder
    tb.print_exception(eg)

LogicError was handled.

  + Exception Group Traceback (most recent call last):
  |   File "...", line 34, in <module>
  |     raise ExceptionGroup("Some runtime errors", [LogicError(), ZeroDivisionError()])
  | ExceptionGroup: Some runtime errors (1 sub-exception)
  +-+---------------- 1 ----------------
    | ZeroDivisionError
    +------------------------------------

That way, we reused a regular exception handler, logging_handler, which might have existed in our code before 3.11, to handle an exception group.


The except* clause

This new clause was introduced in Python 3.11 to make working with exception groups easier, and it is again based on PEP 654. It is a generalization of the standard except clause.

In short, it extracts all exceptions that are subtypes of a given type into an exception group and leaves the remainder for further propagation. Here's an example of code using this new syntax:

class LogicError(Exception):
    pass


class LogicErrorA(LogicError):
    pass


class LogicErrorB(LogicError):
    pass


def some_operation():
    raise ExceptionGroup(
        "Operation errors",
        [
            LogicErrorA("First subtype of logic error"),
            LogicErrorB("Second subtype of logic error"),
            # Matching done by the except* clause is recursive
            ZeroDivisionError(),
            ExceptionGroup("", [ZeroDivisionError(), ArithmeticError()]),
            TypeError(),
            ExceptionGroup("", [TypeError(), UnicodeError()]),
        ],
    )


try:
    try:
        some_operation()
    except* LogicError as eg:
        print(eg.exceptions)
    except* ArithmeticError as eg:
        print(eg.exceptions)

# Wrapping in external try ... except statement is necessary 
# because except* can't be mixed with except.
except ExceptionGroup as eg:
    print(f"Not handled: {eg!r}")
(LogicErrorA('First subtype of logic error'), 
LogicErrorB('Second subtype of logic error'))
(ZeroDivisionError(), 
ExceptionGroup('', [ZeroDivisionError(), ArithmeticError()]))
Not handled: ExceptionGroup('Operation errors', 
[TypeError(), ExceptionGroup('', [TypeError(), UnicodeError()])])

A regular exception caught by except* (a so-called naked exception) is wrapped in an ExceptionGroup for consistency:

try:
    raise TypeError("Something went wrong!")
except* TypeError as eg:
    print(f"Exception group caught: {eg!r}")
Exception group caught: ExceptionGroup('', (TypeError('Something went wrong!'),))

If the raised exception is not matched by any except* clause, then, once caught by a regular except clause, it appears as a plain exception again:

try:
    try:
        raise TypeError("Something went wrong!")
    except* ValueError as eg:
        print(f"Exception group caught: {eg!r}")
except TypeError as e:
    print(f"Exception caught: {e!r}")
Exception caught: TypeError('Something went wrong!')

Typing improvements

In 3.11, a significant number of typing features were introduced.

Variadic generics

As proposed by PEP-646, TypeVarTuple was introduced to the typing module. Where a TypeVar stands in for a single type, a TypeVarTuple stands in for an arbitrary number of types, thus enabling variadic generics.

Use cases for this were originally identified in numerical computing libraries like NumPy or TensorFlow, where array types can be parametrized with the array shape. More about potential use cases can be read here. We may imagine a multidimensional array generic defined as follows:

from typing import Generic, TypeVar, TypeVarTuple

DType = TypeVar("DType", int, float)
Shape = TypeVarTuple("Shape")


class Array(Generic[DType, *Shape]):
    def __add__(self, other: "Array[DType, *Shape]") -> "Array[DType, *Shape]":
        ...


# And then, we can specify a 2D float array as follows:
float_2d_array: Array[float, int, int] = Array()

Variadic generics aren't limited to array shapes. They can also describe callables that take an arbitrary number of differently typed arguments, for example a simple task wrapper:

from typing import Generic, Callable, TypeVarTuple

Ts = TypeVarTuple("Ts")


class Task(Generic[*Ts]):
    def __init__(self, handler: Callable[[*Ts], None], *args: *Ts):
        self._handler = handler
        self._args = args

    def apply(self) -> None:
        self._handler(*self._args)


def logging_float_sum_handler(x: float, y: float):
    print(f"Logged sum: {x}+{y}={x+y}")


logging_task: Task[float, float] = Task(logging_float_sum_handler, 1.0, 2.0)

# ... some time later in the queue:
logging_task.apply()
Logged sum: 1.0+2.0=3.0

Required and NotRequired in TypedDict

Proposed in PEP-655, this enables declaring required and optional keys of a TypedDict in a simple and readable way:

from typing import TypedDict, Required, NotRequired


class EmailData(TypedDict):
    to: Required[str]
    subject: Required[str]
    body: NotRequired[str]


valid_email: EmailData = {
    "to": "john.smith@example.org",
    "subject": "Welcome",
    "body": "Hi John!",
}
also_a_valid_email: EmailData = {
    "to": "john.smith@example.org",
    "subject": "Notification: You were mentioned",
}  # the email body is not really required
invalid_email: EmailData = {
    "to": "john.smith@example.org",
    "body": "John, we're just spamming you at this point.",
}  # missing the required "subject" key - a type checker will flag this

Self type

The Self type was implemented as proposed in PEP-673. It is of great help in designing any kind of fluent interface and is much more readable than the TypeVar approach used prior to 3.11:

from typing import Self


class EmailBuilder:
    def __init__(self):
        self._to = None
        self._subject = None
        self._body = None

    def set_recipient(self, to: str) -> Self:
        self._to = to
        return self

    def set_subject(self, subject: str) -> Self:
        self._subject = subject
        return self

    def set_body(self, body: str) -> Self:
        self._body = body
        return self

    def build(self) -> EmailData:
        return {"to": self._to, "subject": self._subject, "body": self._body}


email = (
    EmailBuilder()
    .set_recipient("john.smith@example.org")
    .set_subject("Hello again!")
    .set_body("We're not done with you yet, John!")
    .build()
)

print(email)
{'to': 'john.smith@example.org', 'subject': 'Hello again!', 
'body': "We're not done with you yet, John!"}

LiteralString annotation

As proposed in PEP-675. The main motivation for this feature was, again, to make Python more secure. For a secure API, it's best to accept only literal strings or strings constructed from literal values.

As described in the original PEP, code that executes raw SQL queries can greatly benefit from this:

from typing import LiteralString


def run_query(query: LiteralString):
    pass  # here an SQL statement is being run


# run_query won't allow writing code that carelessly
# passes user-controlled input strings as the query
def register_user_for_mailing_list(user_email: str):
    run_query(
        f"INSERT INTO mailing_list (user_email) "
        f"VALUES ({user_email})"
    )  # this is a type error
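
For contrast, here is a sketch of a call that a type checker would accept. Note that run_query_with_params and its params argument are hypothetical; the point is that the query text itself stays a literal while user-controlled values are passed out-of-band:

from typing import LiteralString


# Hypothetical variant of run_query that also accepts driver-bound parameters;
# the query text itself must still be a LiteralString.
def run_query_with_params(query: LiteralString, params: tuple = ()):
    pass  # the database driver would substitute params safely


def register_user_for_mailing_list_safely(user_email: str):
    # The query is a literal string, so it type-checks; the user-controlled
    # value is passed separately as a bound parameter.
    run_query_with_params(
        "INSERT INTO mailing_list (user_email) VALUES (?)", (user_email,)
    )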

Data class transforms

Proposed in PEP-681. While many libraries with dataclass-like behavior exist, there was no standard way to annotate them with typing. By dataclass-like we mean decorators, classes, and metaclasses that synthesize dunder methods, define frozen behavior, and support field specifiers.

Therefore, a new decorator, dataclass_transform, was introduced for use with classes, metaclasses, and decorator functions that provide said behavior.

Applying this decorator indicates to a static type checker that the decorated function, class, or metaclass alters the target class so that it behaves in a dataclass-like manner.

The dataclass_transform decorator's runtime effect is limited to setting a __dataclass_transform__ dict on the decorated item. Below is a simplistic example of dataclass_transform usage with a function decorator.

The decorated as_model function itself creates an __init__ method that accepts all class annotations as kwargs and sets them as attributes at initialization. We can also observe __dataclass_transform__ being set on the as_model decorator:

from typing import dataclass_transform, Type, TypeVar

T = TypeVar("T")


@dataclass_transform(kw_only_default=True)
def as_model(cls: Type[T]) -> Type[T]:
    print(f"__dataclass_transform__: {as_model.__dataclass_transform__}")

    def init_instance_with_kwargs(instance: T, **kwargs):
        for item, _type in instance.__annotations__.items():
            argument = kwargs.get(item)
            assert isinstance(argument, _type)
            setattr(instance, item, argument)

    cls.__init__ = init_instance_with_kwargs
    return cls


@as_model
class BookModel:
    id: int
    title: str
    author: str


book = BookModel(id=1, title="Blade runner", author="Philip K. Dick")
print(f"id: {book.id}")
print(f"title: {book.title}")
print(f"author: {book.author}")
__dataclass_transform__: {'eq_default': True, 'order_default': False, 
'kw_only_default': True, 'field_specifiers': (), 'kwargs': {}}
id: 1
title: Blade runner
author: Philip K. Dick

Here is the class-based equivalent of the above:

@dataclass_transform(kw_only_default=True)
class ModelBase:
    def __init_subclass__(cls):
        print(f"__dataclass_transform__: {ModelBase.__dataclass_transform__}")

        def init_instance_with_kwargs(instance: T, **kwargs):
            for item, _type in instance.__annotations__.items():
                argument = kwargs.get(item)
                assert isinstance(argument, _type)
                setattr(instance, item, argument)

        cls.__init__ = init_instance_with_kwargs


class BookModel(ModelBase):
    id: int
    title: str
    author: str


book = BookModel(
    id=2, title="The Three Stigmata of Palmer Eldritch", author="Philip K. Dick"
)
print(f"id: {book.id}")
print(f"title: {book.title}")
print(f"author: {book.author}")
__dataclass_transform__: {'eq_default': True, 'order_default': False, 
'kw_only_default': True, 'field_specifiers': (), 'kwargs': {}}
id: 2
title: The Three Stigmata of Palmer Eldritch
author: Philip K. Dick

And, for the sake of completeness, the metaclass variant:

@dataclass_transform(kw_only_default=True)
class ModelMeta(type):
    def __new__(mcs, name, bases, dict_):
        print(f'__dataclass_transform__: {mcs.__dataclass_transform__}')
        obj = super(ModelMeta, mcs).__new__(mcs, name, bases, dict_)

        def init_instance_with_kwargs(instance: T, **kwargs):
            for item, _type in instance.__annotations__.items():
                argument = kwargs.get(item)
                assert isinstance(argument, _type)
                setattr(instance, item, argument)

        obj.__init__ = init_instance_with_kwargs
        return obj


class BookModel(metaclass=ModelMeta):
    id: int
    title: str
    author: str


book = BookModel(
    id=3, 
    title='Flow My Tears, the Policeman Said', 
    author='Philip K. Dick'
)
print(f'id: {book.id}')
print(f'title: {book.title}')
print(f'author: {book.author}')
__dataclass_transform__: {'eq_default': True, 'order_default': False, 
'kw_only_default': True, 'field_specifiers': (), 'kwargs': {}}
id: 3
title: Flow My Tears, the Policeman Said
author: Philip K. Dick

New tomllib module

A new module to support the TOML language was added, as proposed by PEP-680. It is only right to include such a module in the standard library, given the general drive towards the TOML format in Python.

As of now, TOML is used for packaging (pyproject.toml) and for configuring multiple popular tools. The inclusion of a dedicated TOML parsing module is certainly a step towards making the Python ecosystem more consistent, as it will encourage more projects to support configuration via TOML.

The new module allows reading from a binary file-like object via the tomllib.load function. The result is a dict consisting of basic Python types:

import tomllib
from pprint import pprint

# resources/data.toml:
#
# [person]
# name = "John Smith"
# email = "john.smith@example.org"

with open('resources/data.toml', 'rb') as f:
    result = tomllib.load(f)

pprint(result)
{'person': {'email': 'john.smith@example.org', 'name': 'John Smith'}}

Similarly to other parsing libraries, like json for instance, tomllib comes equipped with a loads function to read data from a string:

toml_data = '''
    [event]
    name = "Johns birthday"
    date = 1968-08-28
'''

result = tomllib.loads(toml_data)

pprint(result)

Unfortunately, this is where the similarities end - tomllib doesn't offer a way to export data. There is no dump or dumps functionality provided. The reasons are described here. In short, reading is sufficient for most use cases in the standard Python ecosystem.

Additionally, both load and loads provide a hook for parsing floats in TOML files. This makes it possible to configure how float values are represented in the resulting dict, so parsed floats can use types other than the built-in float:

import decimal

toml_data = '''
    [stock.shoe]
    qty = 40
    ean = "2230000000000"
    price = 10.23
    currency = "USD"

    [stock.boot]
    qty = 20
    ean = "2240000000000"
    price = 23.10
    currency = "USD"
'''

regular_floats = tomllib.loads(toml_data, parse_float=float)

print("As regular float:")
pprint(regular_floats)

decimals = tomllib.loads(toml_data, parse_float=decimal.Decimal)

print("As decimal:")
pprint(decimals)
As regular float:
{'stock': {'boot': {'currency': 'USD',
                    'ean': '2240000000000',
                    'price': 23.1,
                    'qty': 20},
           'shoe': {'currency': 'USD',
                    'ean': '2230000000000',
                    'price': 10.23,
                    'qty': 40}}}
As decimal:
{'stock': {'boot': {'currency': 'USD',
                    'ean': '2240000000000',
                    'price': Decimal('23.10'),
                    'qty': 20},
           'shoe': {'currency': 'USD',
                    'ean': '2230000000000',
                    'price': Decimal('10.23'),
                    'qty': 40}}}

Other module changes

WSGI static typing

Another new module, wsgiref.types, was added for static type checking of WSGI-related interfaces. More here.
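
A minimal sketch of how these types might be used to annotate a WSGI callable (the application itself is illustrative):

from collections.abc import Iterable
from wsgiref.types import StartResponse, WSGIEnvironment


def app(environ: WSGIEnvironment, start_response: StartResponse) -> Iterable[bytes]:
    # WSGIEnvironment and StartResponse describe the dict and callable
    # that any WSGI server passes to the application.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from a statically typed WSGI app"]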

Changes in existing modules

Python 3.11 introduced a lot of exciting changes to existing modules. A full list is available here.

Optimizations

In version 3.11, Python received a significant performance boost. A few builtins, like lists and dicts, were optimized. List comprehensions are now between 20 and 30% faster, and the list.append() method was improved as well, by ~15% on average.

For dictionaries, when all keys are Unicode objects, hash values are no longer stored, which reduces dictionary size by about 20%. Other optimizations target integers: for integers smaller than 2**30, both sum() and the // operator are significantly faster on x86-64 architectures.
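
These numbers will vary between machines and builds; if you want to check them yourself, a rough timeit sketch (run the same snippets under 3.10 and 3.11 and compare) could look like this:

import timeit

# Micro-benchmarks are only indicative - absolute times depend on hardware
# and compiler options; compare across interpreter versions, not in isolation.
listcomp_time = timeit.timeit("[x * 2 for x in range(1_000)]", number=10_000)
division_time = timeit.timeit("x // 3", setup="x = 2**29 - 1", number=1_000_000)

print(f"list comprehension: {listcomp_time:.3f}s")
print(f"integer floor division: {division_time:.3f}s")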

On top of those optimizations, the CPython interpreter itself was greatly optimized, for both startup and runtime. Code objects of the core modules needed at startup are now statically allocated ("frozen") by the interpreter instead of being unmarshalled from __pycache__ files.

This results in a 10-15% speedup of interpreter startup, which matters most for short-lived Python programs, where startup accounts for the bulk of the total run time.

Runtime was boosted by optimizing Python frames. Frame objects are now smaller and contain less low-level data, and creating and allocating them is cheaper.

In fact, full frame objects aren't created at all for most user code; they're created lazily, only when requested by debuggers or introspection functions. The other performance improvement was the inlining of Python function calls.

Prior to 3.11, whenever a Python function was called, an underlying C function was invoked to interpret it. In the new version, whenever the interpreter detects one Python function calling another, it sets up a new frame for the callee and executes it directly, without the intermediate C call, which effectively inlines the call. This allows significantly deeper recursion as well as faster-running recursive functions.

Finally, 3.11 introduces the specializing adaptive interpreter proposed in PEP-659. The general idea is that while Python is a dynamic language, in most code types rarely change at runtime - a concept known as type stability.

The interpreter detects such code fragments and replaces their bytecode with more efficient, specialized variants in a process called quickening. This optimization is only applied to hot code - code that runs multiple times - and yields up to a 25% performance gain in some scenarios.
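
If you're curious to see specialization in action, the dis module in 3.11 gained an adaptive flag; here is a small sketch (the exact specialized instruction names you'll see depend on the CPython build and how warm the code is):

import dis


def add_many(values: list[int]) -> int:
    total = 0
    for v in values:
        total += v
    return total


# Warm the function up so the adaptive interpreter has a chance to specialize it...
for _ in range(1_000):
    add_many(list(range(100)))

# ...then inspect the (possibly specialized) bytecode.
dis.dis(add_many, adaptive=True)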

Want to learn more from professional backend developers? Check out the other blog post from our Feel The Tech series.