
[Bug]: Decode WideStrings More Gracefully #13

Merged
jecisc merged 1 commit into pharo-contributions:master from seandenigris:bug_widestring-null-bytes
Feb 16, 2026

@seandenigris
Contributor

Previously, passing a WideString to Soup naively converted it to bytes as if it were an ASCII ByteString, leaving many null values interleaved with the actual characters.

Old result: [Screenshot 2026-02-05 at 3:04:24 PM]
New result: [Screenshot 2026-02-05 at 3:05:27 PM]

But the big payoff is that the old behavior could cause Soup to parse a hierarchy of HTML tags as one giant SoupString, which the fix prevents.

  • Includes a test that failed previously and passes now
  • Future work might correctly handle the charset and/or include non-ASCII smart quote escaping
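
The null-interleaving problem described above can be reproduced outside Pharo. The following is only an illustrative Python sketch, not the code from this PR: it assumes the wide string is stored as 32-bit big-endian code units (an assumption about the internal layout), and the helper names `naive_wide_bytes`, `decode_naively`, and `decode_wide` are hypothetical.

```python
def naive_wide_bytes(text: str) -> bytes:
    # Assumption: each character occupies one 32-bit big-endian code unit.
    return text.encode("utf-32-be")


def decode_naively(raw: bytes) -> str:
    # Treats every byte as its own single-byte character,
    # which is the mistake the PR description refers to.
    return raw.decode("latin-1")


def decode_wide(raw: bytes) -> str:
    # Decodes four bytes per character, matching how the bytes were produced.
    return raw.decode("utf-32-be")


html = "<b>hi</b>"
raw = naive_wide_bytes(html)
broken = decode_naively(raw)
fixed = decode_wide(raw)

# Each ASCII character is preceded by three null characters in `broken`,
# so a tag such as "<b>" no longer appears as a contiguous substring.
print(repr(broken))
print(repr(fixed))
```

Because the tag delimiters are no longer adjacent to the tag names in the naively decoded text, a parser has nothing recognizable to match, which is consistent with the whole input collapsing into a single text node.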

@jecisc jecisc merged commit b26f248 into pharo-contributions:master Feb 16, 2026
2 checks passed
@seandenigris seandenigris deleted the bug_widestring-null-bytes branch February 16, 2026 00:05
@seandenigris
Contributor Author

Thanks, @jecisc!
