-
Notifications
You must be signed in to change notification settings - Fork 81
Bugfix: Variation Selector-16/ZWJ and starting sequences in wrap() #338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cd8348c to
8f981af
Compare
8f981af to
ae83d7c
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #338 +/- ##
==========================================
+ Coverage 98.42% 98.43% +0.01%
==========================================
Files 11 11
Lines 2604 2627 +23
Branches 463 468 +5
==========================================
+ Hits 2563 2586 +23
Misses 31 31
Partials 10 10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I found a bug in our implementation I will try to address in this branch, sequences are sometimes lost in rare cases. Found while developing a blessed-free version of this to be included in the wcwidth library, jquast/wcwidth#169 |
|
I have tested a "non-blessed" version of our wrap(), ljust(), rjust(), and center() methods for inclusion in upstream library, wcwidth
The above version is approximately 4-5x faster than blessed in my testing (on UDHR data), because it is not instantiating so many Sequence() temporarily and pre-processing with .padd(), but, instead running iterators for graphemes and sequences and tracking lots of indicies. Though alike, they are also a bit different in approaches. The use of grapheme clustering should fix line breaking on some languages that we may not be familiar with enough to test properly. If they are accepted to wcwidth library, we should probably just bump our wcwidth requirement in blessed and outsource the logic to the wcwidth library and maintain it there, it will probably get more attention and help there. Some key differences,
|
|
One more key difference, wcwidth.width() measures different than blessed.length() did in regards to BACKSPACE
When I migrated the code into wcwidth, my first pass had an option for "measure_width='extent' or 'cursor'" to provide both of these features, but I cut it out in the final version. Because after careful review I think blessed is doing this wrong by default -- the most common use case is only, "how much screen real estate this takes up", and, likely to use direct cursor positioning yourself to display each wrapped/centered/etc line -- the ending position of the cursor has no consequence. If you wish to also know "what is the final cursor position" you are probably very low-level, like pyte. |
- Drops Python 3.7.
- Upgrade dependency wcwidth to 0.5.0 (today's release)
- Replace custom `Sequence` method implementations with wcwidth's new functions: `ljust()`, `rjust()`, `center()`,
`clip()`, and `width()` using `control_codes='ignore'`, new since 0.3.0.
- See benchmark [results below](#344 (comment))
2 Changing Behaviors
--------------------------------
1. The `length()` method now returns **maximum cursor extent** rather than **final relative cursor position**. This [is preferred](#338 (comment)) !
2. The `truncate()` method now **fills with space** when a wide character doesn't fit: For leading space this new behavior is absolutely necessary, like if we wanted to do horizontal scrolling of CJK characters, to do this 1 cell at a time will require an oscillating leading ``' '`` at the beginning of the string.
Closes #267.
Support emojis sequences, we already had automatic tests for them, but they have been adjusted to match that they are now "correct".
There was a long-standing bug, that if the first line of text sent to textwrap() begins with a sequence, that the sequence was lost!
Also match the "ST"-terminated variant of OSC 8 (hyperlink), rarely used, minor.
For "truncate" method, we just update it to the more complex technique used in wcwidth.wcswidth() to account for ZWJ and VS-16 and add covering tests.
For the other, we now use a translation table to remove any possible C0 or C1 control characters before calling
wcwidth.wcswidth()directly to prevent defense against erroneous -1 return values, so we can call wcswidth directly, which fixes measurement of ZWJ and VS16.