gh-137855: Lazy import inspect module in dataclasses #144387

Open
danielhollas wants to merge 9 commits into python:main from danielhollas:dataclass-lazy-inspect


Contributor

@danielhollas danielhollas commented Feb 2, 2026

The inspect module is slow to import (see #117865) and is dragging dataclasses down with it.

There are currently only two uses of inspect in dataclasses, but they are a bit tricky to defer with an inline import, since both sit directly on the code path that runs when the @dataclass decorator is executed.

  1. inspect.signature is used to autogenerate the class docstring (if one is not already provided).
  2. inspect.unwrap is used in a rather esoteric code path that applies only to slotted classes (added in #124692, "gh-90562: Improve zero argument support for super() in dataclasses when slots=True").

For 1., I used the descriptor protocol to generate the __doc__ attribute on demand (this is my first time working with descriptors, apologies if I overlooked something).
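To illustrate the idea of an on-demand __doc__, here is a minimal sketch. It is not the code from this PR: `LazyDoc` and `Point` are hypothetical names, and the signature formatting only roughly mirrors what dataclasses does today. It assumes the class defines its own `__init__`, so that inspect.signature does not need to look at `__doc__` itself.

```python
class LazyDoc:
    """Non-data descriptor that builds a class docstring on first access.

    Hypothetical helper, for illustration only.
    """

    def __init__(self):
        self._doc = None  # cached docstring, filled in lazily

    def __get__(self, instance, owner=None):
        if self._doc is None:
            # Deferred import: inspect is only needed once __doc__ is read.
            import inspect

            cls = owner if owner is not None else type(instance)
            text_sig = str(inspect.signature(cls)).replace(" -> None", "")
            self._doc = cls.__name__ + text_sig
        return self._doc


class Point:
    def __init__(self, x: int, y: int = 0) -> None:
        self.x = x
        self.y = y


# Assigning the descriptor does not compute the signature yet.
Point.__doc__ = LazyDoc()

# The signature is computed here, on first access, and cached afterwards.
print(Point.__doc__)
```

The cache lives on the descriptor instance, so each class needs its own `LazyDoc()`; whether the real change caches the result, and where, is not something this sketch tries to answer.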

For 2., the import can be deferred by calling inspect.unwrap only when really necessary (and hopefully this path is not common).
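A minimal sketch of that guard, assuming a hypothetical helper name `unwrap_if_wrapped` (the PR applies the check inline rather than through a helper). The condition is the one quoted in the review comment below; inspect is only imported when a member actually carries `__wrapped__`.

```python
import functools


def unwrap_if_wrapped(member):
    """Return the innermost wrapped callable, importing inspect only if needed.

    Hypothetical helper, for illustration only.
    """
    if not isinstance(member, type) and hasattr(member, "__wrapped__"):
        # Only wrapped, non-class members pay for the inspect import.
        import inspect

        return inspect.unwrap(member)
    return member


def original():
    return "original"


@functools.wraps(original)  # sets wrapper.__wrapped__ = original
def wrapper():
    return original()


print(unwrap_if_wrapped(wrapper) is original)
print(unwrap_if_wrapped(original) is original)
```

Plain functions and classes skip the `inspect` import entirely, which is the common case the description is hoping for.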

Benchmarks

./python -Ximporttime -c "import dataclasses"

Before

[image: -X importtime output]
hyperfine -w 10 './python -c "import dataclasses"' 
Benchmark 1: ./python -c "import dataclasses"
  Time (mean ± σ):      19.2 ms ±   2.1 ms    [User: 14.9 ms, System: 4.1 ms]
  Range (min … max):    16.6 ms …  24.1 ms    154 runs

After

[image: -X importtime output]
hyperfine -w 10 './python -c "import dataclasses"' 
Benchmark 1: ./python -c "import dataclasses"
  Time (mean ± σ):      15.2 ms ±   1.8 ms    [User: 11.2 ms, System: 3.7 ms]
  Range (min … max):    12.9 ms …  19.4 ms    212 runs

Overall, this seems to be a solid 20-30% improvement (the mean drops from 19.2 ms to 15.2 ms, roughly 21%).
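For anyone without hyperfine, a rough stand-in can be sketched in Python. This is an assumption-laden approximation, not the method used for the numbers above: `time_import` is a hypothetical helper, it times whichever interpreter runs the script (sys.executable), and like hyperfine it includes interpreter startup in every sample.

```python
import statistics
import subprocess
import sys
import time


def time_import(module, runs=20, warmup=3):
    """Return (mean, stdev) in seconds for `python -c "import <module>"`.

    Hypothetical helper; each sample starts a fresh interpreter.
    """
    cmd = [sys.executable, "-c", f"import {module}"]
    for _ in range(warmup):
        # Warm-up runs are discarded, similar to hyperfine's -w option.
        subprocess.run(cmd, check=True)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean, stdev = time_import("dataclasses")
    print(f"import dataclasses: {mean * 1000:.1f} ms ± {stdev * 1000:.1f} ms")
```

Running it once on main and once on this branch gives a comparable before/after pair, though the absolute numbers will differ from the hyperfine output.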

TODO (if things done here seem acceptable):

  • Add blurb
  • Add test to guard against import regressions, using ensure_lazy_imports test fixture
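The regression test in the second item could take roughly the following shape. This sketch does not use the ensure_lazy_imports fixture itself; `eagerly_imported` is a hypothetical standalone helper that checks, in a fresh interpreter, which of the listed modules end up in sys.modules after importing the module under test.

```python
import subprocess
import sys


def eagerly_imported(module, blocked):
    """Return the subset of `blocked` loaded as a side effect of `import module`.

    Hypothetical helper; runs in a fresh, isolated interpreter so that the
    current process's already-imported modules do not affect the result.
    """
    script = (
        "import sys\n"
        f"import {module}\n"
        f"loaded = set({sorted(blocked)!r}) & set(sys.modules)\n"
        "print(*sorted(loaded), sep='\\n')\n"
    )
    result = subprocess.run(
        [sys.executable, "-I", "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    return set(result.stdout.split())


if __name__ == "__main__":
    # With this change applied, the expectation is that inspect is not listed.
    print(eagerly_imported("dataclasses", ["inspect"]))
```

A test built on this would assert that the returned set is empty for `("dataclasses", ["inspect"])`.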


# If this is a wrapped function, unwrap it.
member = inspect.unwrap(member)
if not isinstance(member, type) and hasattr(member, '__wrapped__'):
Contributor Author

This check was copied from the while loop condition in inspect.unwrap.

@danielhollas
Contributor Author

Deferring the call to inspect.signature that creates __doc__ might also speed up dataclass creation a bit. For example, here is what I see in a cProfile run of the _colorize module, which uses dataclasses heavily (on the main branch):

$ ./python -m cProfile -m _colorize | head -n 20
         5604 function calls (5484 primitive calls) in 0.007 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      9/1    0.004    0.000    0.007    0.007 {built-in method builtins.exec}
        1    0.000    0.000    0.007    0.007 <string>:1(<module>)
        1    0.000    0.000    0.007    0.007 <frozen runpy>:199(run_module)
        1    0.000    0.000    0.007    0.007 <frozen runpy>:65(_run_code)
        1    0.000    0.000    0.007    0.007 _colorize.py:1(<module>)
        7    0.000    0.000    0.006    0.001 dataclasses.py:1432(wrap)
        7    0.000    0.000    0.006    0.001 dataclasses.py:986(_process_class)
        7    0.000    0.000    0.004    0.001 dataclasses.py:478(add_fns_to_class)
        4    0.000    0.000    0.001    0.000 inspect.py:3343(signature)
        4    0.000    0.000    0.001    0.000 inspect.py:3056(from_callable)
     12/4    0.000    0.000    0.001    0.000 inspect.py:2437(_signature_from_callable)
        4    0.000    0.000    0.001    0.000 inspect.py:2331(_signature_from_function)
    71/11    0.000    0.000    0.000    0.000 annotationlib.py:907(get_annotations)
        8    0.000    0.000    0.000    0.000 dataclasses.py:541(__annotate__)
     17/9    0.000    0.000    0.000    0.000 annotationlib.py:1114(_get_and_call_annotate)

@danielhollas danielhollas force-pushed the dataclass-lazy-inspect branch from ca36c68 to 6bc6199 Compare February 12, 2026 01:19
import abc
from reprlib import recursive_repr

lazy import inspect
Contributor Author

At this point, this is perhaps the first use of lazy import in the stdlib. I'm not sure where it should go in terms of sorting the imports.

Member

@johnslavik johnslavik Feb 13, 2026

Not sure we follow any particular guideline. Personally I'd keep it on the same line as the original import -- the change is then more localized. Not my call though!

Contributor Author

@danielhollas danielhollas Feb 13, 2026

Agreed, I moved it, since another core dev expressed a similar sentiment over at #144756.

Member

cc @JelleZijlstra 😁

@sergey-miryanov
Contributor

sergey-miryanov commented Feb 12, 2026

I believe we should add a NEWS entry, because this is a user-facing change (at least in terms of performance).
