Skip to content

Conversation

@jquast
Copy link
Owner

@jquast jquast commented Jan 11, 2026

Closes #267.

  1. Support emojis sequences, we already had automatic tests for them, but they have been adjusted to match that they are now "correct".

  2. There was a long-standing bug, that if the first line of text sent to textwrap() begins with a sequence, that the sequence was lost!

  3. Also match the "ST"-terminated variant of OSC 8 (hyperlink), rarely used, minor.

For "truncate" method, we just update it to the more complex technique used in wcwidth.wcswidth() to account for ZWJ and VS-16 and add covering tests.

For the other, we now use a translation table to remove any possible C0 or C1 control characters before calling wcwidth.wcswidth() directly to prevent defense against erroneous -1 return values, so we can call wcswidth directly, which fixes measurement of ZWJ and VS16.

@jquast jquast force-pushed the jq/support-zwj-and-vs16-emojis branch from cd8348c to 8f981af Compare January 11, 2026 00:23
@jquast jquast force-pushed the jq/support-zwj-and-vs16-emojis branch from 8f981af to ae83d7c Compare January 11, 2026 00:24
@codecov
Copy link

codecov bot commented Jan 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.43%. Comparing base (0214108) to head (e891355).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #338      +/-   ##
==========================================
+ Coverage   98.42%   98.43%   +0.01%     
==========================================
  Files          11       11              
  Lines        2604     2627      +23     
  Branches      463      468       +5     
==========================================
+ Hits         2563     2586      +23     
  Misses         31       31              
  Partials       10       10              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jquast jquast marked this pull request as ready for review January 11, 2026 01:06
@jquast
Copy link
Owner Author

jquast commented Jan 16, 2026

I found a bug in our implementation I will try to address in this branch, sequences are sometimes lost in rare cases.

Found while developing a blessed-free version of this to be included in the wcwidth library, jquast/wcwidth#169

@jquast jquast changed the title Support Variation Selector-16 and ZWJ Emojis Bugfix: Variation Selector-16/ZWJ and starting sequences Jan 16, 2026
@jquast jquast changed the title Bugfix: Variation Selector-16/ZWJ and starting sequences Bugfix: Variation Selector-16/ZWJ and starting sequences in wrap() Jan 16, 2026
@jquast
Copy link
Owner Author

jquast commented Jan 16, 2026

I have tested a "non-blessed" version of our wrap(), ljust(), rjust(), and center() methods for inclusion in upstream library, wcwidth

The above version is approximately 4-5x faster than blessed in my testing (on UDHR data), because it is not instantiating so many Sequence() temporarily and pre-processing with .padd(), but, instead running iterators for graphemes and sequences and tracking lots of indicies. Though alike, they are also a bit different in approaches. The use of grapheme clustering should fix line breaking on some languages that we may not be familiar with enough to test properly.

If they are accepted to wcwidth library, we should probably just bump our wcwidth requirement in blessed and outsource the logic to the wcwidth library and maintain it there, it will probably get more attention and help there.

Some key differences,

  • blessed wraps text containing newlines, preserving their newlines. I chose to keep the exact same behavior as textwrap.wrap() and convert them to "soft spaces", sort of like html does.
  • in conditions where linebreaks occur around sequences, but are followed by zero-width sequences, blessed puts them trailing on the previous line, while wcwidth wrap will put them leading onto the next line. This has a minor practical difference, and wcwidth's "sequences leading next line" is preferred, in the example case like one line at a time is displayed while switching attributes between line display, we want to match the next line's attribute.

@jquast jquast requested a review from grayjk January 16, 2026 07:24
@jquast jquast merged commit c218bf7 into master Jan 19, 2026
25 checks passed
@jquast jquast deleted the jq/support-zwj-and-vs16-emojis branch January 19, 2026 16:53
@jquast
Copy link
Owner Author

jquast commented Jan 22, 2026

One more key difference, wcwidth.width() measures different than blessed.length() did in regards to BACKSPACE \b,

  • wcwidth's width() returns the maximum cursor extent - the rightmost column reached at any point, wcwidth.width("abc\b\b\b") == 3
  • blessed's length() returns the effective length - after processing cursor movements destructively, blessed.Terminal().length("abc\b\b\b") == 1

When I migrated the code into wcwidth, my first pass had an option for "measure_width='extent' or 'cursor'" to provide both of these features, but I cut it out in the final version.

Because after careful review I think blessed is doing this wrong by default -- the most common use case is only, "how much screen real estate this takes up", and, likely to use direct cursor positioning yourself to display each wrapped/centered/etc line -- the ending position of the cursor has no consequence. If you wish to also know "what is the final cursor position" you are probably very low-level, like pyte.

jquast added a commit that referenced this pull request Jan 27, 2026
- Drops Python 3.7.
- Upgrade dependency wcwidth to 0.5.0 (today's release)
  - Replace custom `Sequence` method implementations with wcwidth's new functions: `ljust()`, `rjust()`, `center()`,
     `clip()`, and `width()` using `control_codes='ignore'`, new since 0.3.0.
  - See benchmark [results below](#344 (comment))

2 Changing Behaviors
--------------------------------

1. The `length()` method now returns **maximum cursor extent** rather than **final relative cursor position**. This [is preferred](#338 (comment)) !

2. The `truncate()` method now **fills with space** when a wide character doesn't fit: For leading space this new behavior is absolutely necessary, like if we wanted to do horizontal scrolling of CJK characters, to do this 1 cell at a time will require an oscillating leading ``' '`` at the beginning of the string.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

term.ljust, center etc. incorrect for sequences containing U+FE0F (Variation Selector-16)

2 participants