Merged
Implements Unicode Standard Annex #29 grapheme cluster boundaries. Handles Hangul syllables, emoji ZWJ sequences, regional indicators, combining characters, and Indic scripts. New exports: iter_graphemes, _bisearch
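To illustrate what "grapheme cluster" means here, the following is a minimal standard-library sketch that only attaches combining marks to the preceding base character. It is not the `iter_graphemes` implementation from this PR; the name `sketch_clusters` is hypothetical, and the sketch deliberately ignores Hangul, emoji ZWJ sequences, regional indicators, and the Indic rules that UAX #29 covers.

```python
import unicodedata


def sketch_clusters(text):
    """Yield a base character together with any combining marks that follow it.

    A deliberately simplified illustration, not a UAX #29 implementation.
    """
    cluster = ""
    for ch in text:
        # General categories Mn, Mc and Me are the combining marks.
        if cluster and unicodedata.category(ch).startswith("M"):
            cluster += ch
        else:
            if cluster:
                yield cluster
            cluster = ch
    if cluster:
        yield cluster


# "e" followed by U+0301 COMBINING ACUTE ACCENT stays together as one cluster.
clusters = list(sketch_clusters("cafe\u0301!"))
```

Iterating by code point would split the accent from its base letter; grouping them is the simplest case of what a full grapheme iterator has to get right.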
New width() function handles:
- Terminal escape sequences (SGR, OSC, CSI, etc.)
- Control codes with parse/strict/ignore modes
- Tab stops and cursor movement
- Always returns non-negative integer (never -1)

New exports: width
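As a rough picture of the behaviour listed above, here is a standard-library sketch that strips CSI (including SGR) and OSC sequences, skips control and zero-width characters, and counts wide East Asian characters as two cells. The name `sketch_width` and the regular expression are assumptions made for illustration, not the `width()` added by this PR; the sketch handles neither tab stops nor cursor movement, and corresponds only loosely to an "ignore" treatment of control codes.

```python
import re
import unicodedata

# CSI: ESC [ parameter bytes, intermediate bytes, one final byte (covers SGR).
# OSC: ESC ] payload, terminated by BEL or by ST (ESC \).
# Simplified patterns, assumed for this sketch only.
_SEQUENCE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
)

# Combining marks, format characters and control codes take no cells here.
_ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf", "Cc"}


def sketch_width(text: str) -> int:
    """Return an approximate printable width; the result is never negative."""
    total = 0
    for ch in _SEQUENCE.sub("", text):
        if unicodedata.category(ch) in _ZERO_WIDTH_CATEGORIES:
            continue
        # Wide (W) and Fullwidth (F) characters occupy two terminal cells.
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total
```

Because unknown or control characters contribute zero rather than a sentinel, the total can be summed and compared without special-casing -1.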
For years we suggested using ``wcwidth<2`` when it should have been ``wcwidth<1``. I really hope nobody copied & pasted our recommendation .. :(
It's a private function anyway, so it's still OK. Below the turtles, 0/1 is very much the definition of Falsey and Truthy.
- Test lone ESC character handling in iter_sequences and width
- Test backspace at column 0 (no negative position)
- Test carriage return column reset
- Test tab with tabstop=0 in parse mode
- Test vertical control (LF) in parse mode
Adds iter_graphemes() function for Unicode grapheme cluster iteration.
0126606 to 22e4624
And remove the optional 'column=0': if you want to measure text containing tabs that starts at a different column .. then expand your own!
Text wrapping that properly handles:
- Terminal escape sequences
- Unicode grapheme clusters
- Hyphenation at proper boundaries
- Width calculation for CJK and emoji

New exports: wrap, SequenceTextWrapper
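For context on why sequence-aware wrapping is needed, this small sketch uses only the standard library to show that `textwrap.wrap()` counts the bytes of an SGR colour sequence as if they were printable. The `SGR` pattern and the variable names are illustrative assumptions; nothing here calls the new `wrap` or `SequenceTextWrapper`.

```python
import re
import textwrap

# Matches SGR (colour/style) sequences only; assumed pattern for this demo.
SGR = re.compile(r"\x1b\[[0-9;]*m")

text = "\x1b[31mhello\x1b[0m world"   # shows as "hello world", 11 cells wide
visible = SGR.sub("", text)

# The standard library measures len(), so the 9 escape characters are counted
# and the text no longer fits on one 11-column line.
naive = textwrap.wrap(text, width=11)

# With the sequences stripped, the same text fits on a single line.
plain = textwrap.wrap(visible, width=11)
```

Here `plain` is a single line while `naive` is broken into several, even though both render identically on a terminal.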
These files were leftovers from a previous iteration. The textwrap module now uses escape_sequences.py instead.
especially regarding tabs. I've also decided to "take the easy way out" and use str.expandtabs() instead of processing tabs ourselves, which reduces our complexity and removes the need for an additional column=1 argument
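A short sketch of what relying on `str.expandtabs()` implies: tabs are expanded relative to column 0, so a caller who needs a different starting column can pad, expand, and slice. The helper name `expand_from` is hypothetical and not part of this PR.

```python
def expand_from(text: str, column: int, tabsize: int = 8) -> str:
    """Expand tabs as if `text` began at `column` (hypothetical helper)."""
    # Pad to the starting column, let str.expandtabs() do the arithmetic,
    # then drop the padding again.
    return (" " * column + text).expandtabs(tabsize)[column:]


line = "ab\tcd"

# From column 0 the tab advances to column 8: "ab" + 6 spaces + "cd".
at_zero = line.expandtabs(8)

# From column 6 the tab sits at column 8 and advances to column 16:
# "ab" + 8 spaces + "cd".
at_six = expand_from(line, 6)
```

Measuring the expanded string then needs no column argument at all, which is the simplification described above.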
ea6f1f1 to b055e8a
This was referenced Jan 16, 2026
more test coverage around tabs
This PR builds on #168 and #165 combined
New wrap() function is an emoji, control and terminal sequence, wide, zero-width, and grapheme-aware version of textwrap.wrap().