gh-139156: Use PyBytesWriter in the UTF-7 encoder #139248
vstinner merged 1 commit into python:main
Conversation
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the PyBytesWriter API.
Benchmark:

```python
import pyperf

runner = pyperf.Runner()
sizes = (3, 100, 1000)
for size in sizes:
    runner.timeit(f'{size:,} ASCII chars',
                  setup=f's = "x" * {size}',
                  stmt='s.encode("utf7")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-1 chars',
                  setup=f's = chr(0xe9) * {size}',
                  stmt='s.encode("utf7")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-2 chars',
                  setup=f's = chr(0x20ac) * {size}',
                  stmt='s.encode("utf7")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-4 chars',
                  setup=f's = chr(0x10ffff) * {size}',
                  stmt='s.encode("utf7")')
```

Results with CPU isolation and
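For context, a small sketch (plain Python, not part of the patch) of what the UTF-7 encoder produces for the benchmarked inputs: ASCII passes through unchanged, while non-ASCII code points are encoded as UTF-16BE and wrapped in a `+...-` modified-base64 run, which is why the output size is hard to predict up front and the encoder needs a resizable buffer:

```python
# ASCII characters are emitted directly, one byte per character.
assert "x".encode("utf7") == b"x"

# Non-ASCII code points are encoded as UTF-16BE, then wrapped
# in a '+...-' modified-base64 section.
assert chr(0xE9).encode("utf7") == b"+AOk-"       # U+00E9 (UCS-1 benchmark input)
assert chr(0x20AC).encode("utf7") == b"+IKw-"     # U+20AC (UCS-2 benchmark input)

# Non-BMP code points become a UTF-16 surrogate pair inside the run.
assert chr(0x10FFFF).encode("utf7") == b"+2//f/w-"  # UCS-4 benchmark input

# The encoding round-trips through the UTF-7 decoder.
for ch in ("x", chr(0xE9), chr(0x20AC), chr(0x10FFFF)):
    assert ch.encode("utf7").decode("utf7") == ch
```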
Benchmark hidden because not significant (2): 100 UCS-1 chars, 3 UCS-4 chars

I'm not sure what's going on with 1,000 UCS-1 chars. It looks like a hiccup in the benchmark, not a real regression. The change uses the same memory allocation strategy, so it should have basically no impact on performance.
If I re-run the benchmark with this change, I get: 5.84 us: 1.19x faster. Hmm, it seems like the benchmark is not reliable :-(
Benchmark on this change:
New try with CPU isolation and
If I re-run the benchmark, it becomes faster: