@andrewloux andrewloux commented Feb 13, 2026

Summary

This PR replaces PyTuple_Pack with PyTuple_FromArray in Objects/dictobject.c within the dictitems_xor_lock_held function.

By avoiding the variadic-argument (va_list) processing overhead of PyTuple_Pack, we reduce the per-item cost of symmetric difference operations (d1.items() ^ d2.items()) that involve value mismatches. The change fills a two-element stack-allocated array with the key and value and passes it directly to the tuple constructor.

Benchmarks (PGO+LTO)

Validated using pyperf in --rigorous mode on a full production build.

  • Platform: macOS arm64 (Apple M-series)
  • Build: --enable-optimizations --with-lto
  • Tool: pyperf (--rigorous mode)
  • Baseline: upstream/main
  • Candidate: pytuple-dictitems-xor-fromarray (714fb11)

| Benchmark | Baseline (Mean ± Std Dev) | Candidate (Mean ± Std Dev) | Speedup |
|---|---|---|---|
| dict_items_xor_overlap_neq | 75.3 ms ± 4.6 ms | 74.3 ms ± 2.5 ms | 1.01x faster |
| dict_items_xor_disjoint | 142 ms ± 5 ms | 141 ms ± 4 ms | Neutral |
| dict_items_xor_overlap_equal_control | 52.1 ms ± 2.1 ms | 51.9 ms ± 1.8 ms | Neutral |

Geometric mean: 1.00x faster (1.01x on target path)
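
As a rough cross-check, the speedup figures can be recomputed from the rounded means in the benchmark results. This is only a sketch: the variable names are illustrative, and because it starts from rounded values its output may differ slightly from what pyperf reports from the raw samples.

```python
from math import prod

# (baseline mean, candidate mean) in milliseconds, copied from the results above.
results = {
    "dict_items_xor_overlap_neq": (75.3, 74.3),
    "dict_items_xor_disjoint": (142.0, 141.0),
    "dict_items_xor_overlap_equal_control": (52.1, 51.9),
}

# Speedup is baseline / candidate, so values above 1.0 mean the candidate is faster.
speedups = {name: base / cand for name, (base, cand) in results.items()}

# Geometric mean of the per-benchmark speedups.
geomean = prod(speedups.values()) ** (1 / len(speedups))

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.3f}x")
print(f"geometric mean: {geomean:.3f}x")
```

Every ratio comes out within about 1.5% of 1.0, which is smaller than the reported standard deviations.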

Repro commands
# Target workload: High overlap, mismatched values (stresses tuple creation)
python -m pyperf command --rigorous --name dict_items_xor_overlap_neq \
  ./python.exe -c "d1={i:i for i in range(4000)}; d2={i:i+1 for i in range(4000)}; print(sum(len(d1.items() ^ d2.items()) for _ in range(120)))"

# Control workload: Identical dicts (no tuple creation)
python -m pyperf command --rigorous --name dict_items_xor_overlap_equal_control \
  ./python.exe -c "d1={i:i for i in range(4000)}; d2={i:i for i in range(4000)}; print(sum(len(d1.items() ^ d2.items()) for _ in range(120)))"

# Disjoint workload: No overlapping keys
python -m pyperf command --rigorous --name dict_items_xor_disjoint \
  ./python.exe -c "d1={i:i for i in range(4000)}; d2={i+4000:i for i in range(4000)}; print(sum(len(d1.items() ^ d2.items()) for _ in range(120)))"

Analysis

The dict_items_xor_overlap_neq workload specifically exercises the modified path by comparing dictionaries with overlapping keys but unequal values, triggering a tuple creation for every mismatched entry.
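
To make the three dictionary shapes concrete, here is a minimal sketch that builds the same dictionaries as the repro commands and checks the size of each symmetric difference. The helper name `xor_size` is illustrative. The result sizes only describe the observable behaviour; they do not show which C code path produced each tuple.

```python
def xor_size(d1, d2):
    """Return the number of (key, value) pairs in the symmetric difference."""
    return len(d1.items() ^ d2.items())

n = 4000
base = {i: i for i in range(n)}

# Same keys, different values: every key contributes one pair from each side.
overlap_neq = {i: i + 1 for i in range(n)}
# Identical contents: nothing differs.
equal = {i: i for i in range(n)}
# No shared keys: every pair from both dicts is in the result.
disjoint = {i + n: i for i in range(n)}

sizes = {
    "overlap_neq": xor_size(base, overlap_neq),
    "equal": xor_size(base, equal),
    "disjoint": xor_size(base, disjoint),
}
print(sizes)
```

In the mismatched case each of the 4000 shared keys yields two result pairs, one per dictionary, so the result holds 8000 items, while the identical case yields an empty set.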

While the aggregate effect is a micro-optimization, the results show a small improvement in the mean on the target path (75.3 ms → 74.3 ms) together with a lower standard deviation (±4.6 ms → ±2.5 ms) across multiple runs. Control workloads (equal and disjoint) remain neutral, indicating no regressions in non-target dictionary shapes.

python-cla-bot bot commented Feb 13, 2026

All commit authors signed the Contributor License Agreement.

bedevere-app bot commented Feb 13, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@andrewloux andrewloux marked this pull request as ready for review February 13, 2026 01:55
@andrewloux andrewloux changed the title gh-140009: Use PyTuple_FromArray in dictitems_xor_lock_held gh-140009: Use PyTuple_FromArray in dict.items() symmetric difference Feb 13, 2026
@andrewloux andrewloux force-pushed the pytuple-dictitems-xor-fromarray branch from 714fb11 to 451bec2 Compare February 13, 2026 03:01
@andrewloux andrewloux changed the title gh-140009: Use PyTuple_FromArray in dict.items() symmetric difference gh-140009: Optimize dict.items() symmetric difference via PyTuple_FromArray Feb 13, 2026
@eendebakpt
Contributor

@andrewloux Your own benchmarks show this change is performance neutral.

Unless there are additional results that show why this change is a significant improvement, I suggest we close this.

(I do believe this is a tiny net improvement, but in general we avoid making such small changes to reduce churn and potential unforeseen issues)

@skirpichev skirpichev added the pending The issue will be closed if no feedback is provided label Feb 13, 2026

andrewloux commented Feb 13, 2026

> (I do believe this is a tiny net improvement, but in general we avoid making such small changes to reduce churn and potential unforeseen issues)

Yup, totally makes sense - let's close this 👍🏽 Thanks @eendebakpt

@andrewloux andrewloux closed this Feb 13, 2026
@andrewloux andrewloux deleted the pytuple-dictitems-xor-fromarray branch February 13, 2026 11:36
@skirpichev skirpichev removed the pending The issue will be closed if no feedback is provided label Feb 13, 2026