Similar to #47392, support [Unicode 15.1](https://www.unicode.org/versions/Unicode15.1.0/) by bumping utf8proc to 2.9.0 (JuliaStrings/utf8proc#253). This allows us to use [118 exciting new emoji characters](https://blog.emojipedia.org/whats-new-in-unicode-15-1-and-emoji-15-1/) as identifiers, including "edible mushroom" `"\U1f344\u200d\U1f7eb"` (but still no superscript "q"). Interestingly, they also updated the [Unicode recommendations on programming-language identifiers (UAX#31)](https://www.unicode.org/reports/tr31/tr31-39.html#Mathematical_Compatibility_Notation_Profile) to finally "bless" identifiers beginning with `∂` and `∇` and/or ending with numeric sub/superscripts. They still don't recommend nearly the range of identifiers accepted by Julia, however.
Do you think this, i.e. the JuliaLang PR JuliaLang/julia#51799, could be backported? I reviewed the code here, which is rather small (and that PR is trivial), except for the Ruby generator, which I don't think I need to scrutinize. It seems safe, and preferable, to backport, though I did not look at utf8proc_data.c since it's quite large (and generated?). I don't think your change is a breaking change, but I'm not sure.
That is because of the following, from Wikipedia:
Also:
What I find likely to be breaking regarding GB18030-2022, and thus I think Unicode 15.1 (but not at the level of utf8proc?):
It's not a breaking change, I think (mainly just adding new characters, and tweaking some grapheme-break rules), but it's a new feature and thus probably not eligible for backport.
Support for Unicode 15.1, which means updating the tables but also adding a new rule to the grapheme-break algorithm to account for the new `Indic_Conjunct_Break` property. Fixes #252.

Currently a work-in-progress. To do:
Update: should be ready now.
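
For anyone reading along who hasn't seen the new grapheme-break rule: here is a rough Python sketch of the idea behind rule GB9c in UAX #29 (no break before a consonant that follows a consonant plus a linker, with optional extenders in between). This is not utf8proc's C implementation. The function names and the tiny hand-written `Indic_Conjunct_Break` table are assumptions made up for illustration; the real property values come from the generated Unicode data tables.

```python
# Toy sketch of the Unicode 15.1 rule GB9c (Indic conjuncts).
# NOT utf8proc's implementation: the table below is a hand-picked,
# illustrative subset of Indic_Conjunct_Break (InCB) values.

CONSONANT, LINKER, EXTEND, NONE = "Consonant", "Linker", "Extend", "None"


def incb(ch):
    """Return a toy InCB class for a single character (assumed subset only)."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:   # Devanagari consonants KA..HA
        return CONSONANT
    if cp == 0x094D:             # Devanagari sign virama
        return LINKER
    if cp == 0x200D:             # zero width joiner
        return EXTEND
    return NONE


def gb9c_no_break(text):
    """For each boundary between text[i] and text[i+1], report whether
    GB9c alone forbids a break there (other rules are ignored)."""
    result = []
    # state 0: not inside a conjunct sequence
    # state 1: consonant seen, no linker yet
    # state 2: consonant seen, followed by at least one linker
    state = 0
    for i, ch in enumerate(text):
        cls = incb(ch)
        if i > 0:
            result.append(state == 2 and cls == CONSONANT)
        if cls == CONSONANT:
            state = 1
        elif cls == LINKER and state in (1, 2):
            state = 2
        elif cls == EXTEND and state in (1, 2):
            pass                 # extenders keep the current state
        else:
            state = 0
    return result


if __name__ == "__main__":
    # KA + VIRAMA + SSA: GB9c joins the virama to the following consonant.
    print(gb9c_no_break("\u0915\u094d\u0937"))   # [False, True]
    # KA + SSA with no virama: GB9c does not apply.
    print(gb9c_no_break("\u0915\u0937"))         # [False]
```

Note that the first boundary in the KA + VIRAMA + SSA example is still not a break in the full algorithm, since the virama is an extending character covered by an older rule; the sketch only reports what GB9c adds.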