gh-139156: Use PyBytesWriter in the UTF-7 encoder #139248
vstinner merged 1 commit into python:main
Conversation
Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the PyBytesWriter API.
Benchmark:

```python
import pyperf

runner = pyperf.Runner()
sizes = (3, 100, 1000)
for size in sizes:
    runner.timeit(f'{size:,} ASCII chars',
                  setup=f's = "x" * {size}',
                  stmt='s.encode("utf7")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-1 chars',
                  setup=f's = chr(0xe9) * {size}',
                  stmt='s.encode("utf7")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-2 chars',
                  setup=f's = chr(0x20ac) * {size}',
                  stmt='s.encode("utf7")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-4 chars',
                  setup=f's = chr(0x10ffff) * {size}',
                  stmt='s.encode("utf7")')
```

Results with CPU isolation and
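For context, a small sketch (plain Python, not part of the patch) of what the UTF-7 encoder produces for the benchmarked inputs: ASCII passes through unchanged, while non-ASCII code points are encoded as UTF-16BE and wrapped in a `+...-` modified-base64 run, which is why the output size is hard to predict up front and the encoder needs a resizable buffer:

```python
# ASCII characters are emitted directly, one byte per character.
assert "x".encode("utf7") == b"x"

# Non-ASCII code points are encoded as UTF-16BE, then wrapped
# in a '+...-' modified-base64 section.
assert chr(0xE9).encode("utf7") == b"+AOk-"       # U+00E9 (UCS-1 benchmark input)
assert chr(0x20AC).encode("utf7") == b"+IKw-"     # U+20AC (UCS-2 benchmark input)

# Non-BMP code points become a UTF-16 surrogate pair inside the run.
assert chr(0x10FFFF).encode("utf7") == b"+2//f/w-"  # UCS-4 benchmark input

# The encoding round-trips through the UTF-7 decoder.
for ch in ("x", chr(0xE9), chr(0x20AC), chr(0x10FFFF)):
    assert ch.encode("utf7").decode("utf7") == ch
```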
Benchmark hidden because not significant (2): 100 UCS-1 chars, 3 UCS-4 chars

I'm not sure what's going on with 1,000 UCS-1 chars. It looks like a hiccup in the benchmark, not a real regression. The change uses the same memory allocation strategy, so it should have basically no impact on performance.
If I re-run the benchmark with this change, I get: 5.84 us: 1.19x faster. Hmm, it seems like the benchmark is not reliable :-(
Benchmark on this change:
New try with CPU isolation and
If I re-run the benchmark, it becomes faster: