Fixes allowing for “Full” folding and NFKC_CaseFold compliance. #102

Closed
nomoon wants to merge 3 commits into JuliaStrings:master from nomoon:case_folding_fixes
nomoon commented Mar 5, 2017

Changes:

  • Only include C (Common) & F (Full) foldings from CaseFolding.txt data. Removed S (Simple) since F & S are specified to be exclusive. I'm not sure if it was causing issues, but it's technically an error.
  • Extend UTF8PROC_IGNORE to also ignore unassigned codepoints (such as \u2065) which are specified as being discarded by NFKC_CF.
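
To make the first change concrete, here is a minimal sketch of filtering CaseFolding.txt entries by status so that only C and F mappings are kept. It is an illustration, not the project's actual data generator: the names `fold_entry`, `parse_fold_line`, and `keep_for_full_folding` are hypothetical, and it assumes the standard `code; status; mapping; # name` line layout of CaseFolding.txt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Full foldings in CaseFolding.txt expand to at most three code points. */
#define MAX_FOLD_LEN 3

/* One parsed entry of the form "<code>; <status>; <mapping>; # <name>".
 * (Hypothetical type, for illustration only.) */
typedef struct {
    unsigned long code;                  /* source code point */
    char status;                         /* 'C', 'F', 'S' or 'T' */
    unsigned long mapping[MAX_FOLD_LEN]; /* folded code point(s) */
    size_t mapping_len;
} fold_entry;

/* Parse one line. Returns false for comments, blank lines and
 * anything else that does not match the expected layout. */
bool parse_fold_line(const char *line, fold_entry *out)
{
    int consumed = 0;
    if (sscanf(line, "%lx; %c; %n", &out->code, &out->status, &consumed) < 2
        || consumed == 0)
        return false;

    /* Read the space-separated hex code points of the mapping field;
     * scanning stops at the ';' that ends the field. */
    const char *p = line + consumed;
    unsigned long cp;
    int n = 0;
    out->mapping_len = 0;
    while (out->mapping_len < MAX_FOLD_LEN && sscanf(p, "%lx%n", &cp, &n) == 1) {
        out->mapping[out->mapping_len++] = cp;
        p += n;
    }
    return out->mapping_len > 0;
}

/* Full case folding uses the C (common) and F (full) entries.
 * S (simple) entries are the single-code-point alternative to F,
 * and T entries are the optional Turkic special cases, so both are skipped. */
bool keep_for_full_folding(char status)
{
    return status == 'C' || status == 'F';
}
```

For example, CaseFolding.txt lists U+1E9E twice, as `1E9E; F; 0073 0073` and as `1E9E; S; 00DF`. With the filter above only the F entry survives, so the code point has a single, unambiguous folding.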

With these changes, compliant NFKC_Casefold normalization should be achievable with the options UTF8PROC_STABLE | UTF8PROC_COMPOSE | UTF8PROC_COMPAT | UTF8PROC_CASEFOLD | UTF8PROC_IGNORE.

Should resolve #54 (though it was already mostly resolved?)

(This PR doesn't include the new utf8proc_data.c file since there's no point in generating it multiple times if changes are needed.)

nomoon added 3 commits March 4, 2017 20:23
* Only include C (Common) and F (Full) foldings from CaseFolding.txt. Removed S (Simple) since F & S are specified to be exclusive.
* Extend UTF8PROC_IGNORE to also ignore unassigned codepoints (such as \u2065) which are specified as being discarded by NFKC_CF.

nomoon commented Mar 5, 2017

It appears the tests are failing because I didn't generate and commit the new utf8proc_data.c.

stevengj (Member) commented

Would be good to merge an updated version of this PR...

* NFKC_Casefold normalization (@ref UTF8PROC_COMPOSE and @ref UTF8PROC_COMPAT
* and @ref UTF8PROC_CASEFOLD and @ref UTF8PROC_IGNORE).
**/
UTF8PROC_DLLEXPORT utf8proc_uint8_t *utf8proc_NFKC_CF(const utf8proc_uint8_t *str);
Review comment (Member):

Probably just call it NFKC_Casefold since that is the standard name.

stevengj mentioned this pull request Apr 30, 2018
stevengj (Member) commented

Closed in favor of the updated PR, #133.

stevengj commented May 2, 2018

Closed by #133.

stevengj closed this May 2, 2018

Development

Successfully merging this pull request may close these issues.

Feature request: Full Case Folding
