Skip to content

feat: add custom_string_literal_override to unparser Dialect trait#20590

Merged
alamb merged 5 commits intoapache:mainfrom
goldmedal:feat/unparse-unicode-literal
Mar 16, 2026
Merged

feat: add custom_string_literal_override to unparser Dialect trait#20590
alamb merged 5 commits intoapache:mainfrom
goldmedal:feat/unparse-unicode-literal

Conversation

@goldmedal
Copy link
Contributor

Which issue does this PR close?

  • Closes #.

Rationale for this change

When unparsing queries targeting databases like MSSQL, non-ASCII string literals need special handling. MSSQL requires the N'...' (national string literal) prefix for strings containing Unicode characters. Currently the unparser always emits single-quoted strings with no way for dialects to customize this behavior.

What changes are included in this PR?

  • Add a new custom_string_literal_override method to the Dialect trait with a default implementation returning None (no override).
  • Consolidate the Utf8, Utf8View, and LargeUtf8 match arms in scalar_value_to_sql and route them through the new dialect hook.

Are these changes tested?

Yes. A test-only MsSqlDialect is defined in the test module to verify:

  • ASCII strings produce standard single-quoted literals (no N prefix)
  • Non-ASCII strings produce national string literals (N'...')
  • The default dialect is unaffected (no N prefix regardless of content)

It's used by Wren AI in production for a while: Canner#8

Are there any user-facing changes?

Yes. The Dialect trait gains a new method custom_string_literal_override. This is a non-breaking change since the method has a default implementation. Dialect implementors can override it to customize string literal unparsing.

@github-actions github-actions bot added the sql SQL Planner label Feb 27, 2026
Copy link
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@goldmedal
Thanks for working on this.

///
/// For example, MSSQL requires non-ASCII strings to use national string
/// literal syntax (`N'datafusion資料融合'`).
fn custom_string_literal_override(&self, _s: &str) -> Option<ast::Expr> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this only affects scalar UTF8 literal unparsing today, a narrower name or a helper scoped to scalar string literals may make the API easier to understand and extend. Perhaps string_literal_to_sql – to keep the focus on literals, not “any scalar”?

let unparser = Unparser::new(&dialect);

let expr =
Expr::Literal(ScalarValue::Utf8(Some("national string".to_string())), None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regression test currently exercises ScalarValue::Utf8, but the change also covers Utf8View and LargeUtf8.

Adding tests for each would make the coverage line up with the implementation.

@goldmedal
Copy link
Contributor Author

Thanks @kosiew for reviewing. All the comments has been addressed.

@goldmedal goldmedal requested a review from kosiew March 15, 2026 11:07
Copy link
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
🚀

@alamb alamb added this pull request to the merge queue Mar 16, 2026
@alamb
Copy link
Contributor

alamb commented Mar 16, 2026

Thanks -- looks good to me @kosiew and @goldmedal

Merged via the queue into apache:main with commit bd071be Mar 16, 2026
30 checks passed
@goldmedal goldmedal deleted the feat/unparse-unicode-literal branch March 17, 2026 01:42
@goldmedal
Copy link
Contributor Author

Thanks @kosiew and @alamb 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants