Skip to content

fix: cases cases/11-Functions/01-Scalar/test_scalar_crypto.py failed.#34508

Merged
guanshengliang merged 1 commit intomainfrom
fix/main/scalarCrypto
Feb 5, 2026
Merged

fix: cases cases/11-Functions/01-Scalar/test_scalar_crypto.py failed.#34508
guanshengliang merged 1 commit intomainfrom
fix/main/scalarCrypto

Conversation

@xiao-77
Copy link
Copy Markdown
Contributor

@xiao-77 xiao-77 commented Feb 5, 2026

Description

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

Copilot AI review requested due to automatic review settings February 5, 2026 03:09
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @xiao-77, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses and fixes parsing failures related to scalar crypto functions, as indicated by issues in test_scalar_crypto.py. The core of the solution involves refining the SQL parser's token handling for these functions by adjusting their declaration in the grammar file. Additionally, a minor code formatting improvement has been applied to enhance readability in a related source file.

Highlights

  • Parser Token Handling Refinement: The pull request modifies sql.y to adjust how crypto-related function tokens (e.g., FROM_BASE64, MD5, AES_ENCRYPT) are declared. They have been moved from explicit %token declarations to a %fallback list, which likely resolves parsing ambiguities and ensures these functions are correctly recognized in various SQL contexts.
  • Code Readability Improvement: In parInsertSql.c, a lengthy if condition has been reformatted with additional newlines. This change enhances the readability of the code without altering its underlying logical behavior.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • source/libs/parser/inc/sql.y
    • Removed the explicit %token declaration for a list of crypto functions including FROM_BASE64, TO_BASE64, MD5, SHA, SHA1, SHA2, AES_ENCRYPT, AES_DECRYPT, SM4_ENCRYPT, and SM4_DECRYPT.
    • Added the aforementioned crypto function names to the %fallback directive. This allows the parser to treat these names as identifiers if they are not matched as specific tokens in other grammar rules.
  • source/libs/parser/src/parInsertSql.c
    • Refactored a long conditional statement within the checkAndTrimValue function by introducing line breaks to improve its visual structure and readability, without changing its functionality.
Activity
  • No specific human activity (comments, reviews, or progress updates) has been recorded for this pull request since its creation.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes failed test cases in test_scalar_crypto.py by adjusting how cryptographic function tokens are handled in the SQL parser.

Changes:

  • Modified token declaration for cryptographic functions (FROM_BASE64, TO_BASE64, MD5, SHA, SHA1, SHA2, AES_ENCRYPT, AES_DECRYPT, SM4_ENCRYPT, SM4_DECRYPT) from explicit tokens to fallback tokens that resolve to NK_ID
  • Reformatted multi-line token type checking condition in parInsertSql.c for better readability

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
source/libs/parser/inc/sql.y Changed crypto function tokens from explicit %token declaration to %fallback NK_ID directive, allowing them to be treated as identifiers in function name contexts
source/libs/parser/src/parInsertSql.c Reformatted multi-line condition with additional line breaks for improved readability; no functional change

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses test failures for scalar crypto functions by modifying source/libs/parser/inc/sql.y to parse crypto function names as regular identifiers. However, the introduction or re-enabling of these SQL scalar cryptographic functions has exposed significant security vulnerabilities. A security audit identified a buffer over-read in AES IV processing, the use of insecure ECB mode when an IV is missing, and a silent encryption failure in SM4 for certain build configurations. These issues could lead to information leaks or the storage of sensitive data in plaintext and require immediate attention.

pToken->type != TK_MD5 && pToken->type != TK_SHA && pToken->type != TK_SHA1 && pToken->type != TK_SHA2 &&
pToken->type != TK_AES_ENCRYPT && pToken->type != TK_AES_DECRYPT && pToken->type != TK_SM4_ENCRYPT &&
pToken->type != TK_SM4_DECRYPT) ||
pToken->type != TK_AES_ENCRYPT && pToken->type != TK_AES_DECRYPT &&
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The checkAndTrimValue function allows TK_AES_ENCRYPT and TK_AES_DECRYPT tokens, which are subsequently processed in parseBinary. In parseBinary (lines 946 and 1072), a user-supplied Initialization Vector (IV) is passed to taosAesEncrypt or taosAesDecrypt without length validation. These functions then call AES_init_ctx_iv (in source/libs/crypt/src/crypt.c), which expects a 16-byte IV. If a user provides a shorter IV, the system will perform a buffer over-read, potentially leaking sensitive stack data into the IV and the resulting ciphertext, or causing a crash.

pToken->type != TK_AES_ENCRYPT && pToken->type != TK_AES_DECRYPT && pToken->type != TK_SM4_ENCRYPT &&
pToken->type != TK_SM4_DECRYPT) ||
pToken->type != TK_AES_ENCRYPT && pToken->type != TK_AES_DECRYPT &&
pToken->type != TK_SM4_ENCRYPT && pToken->type != TK_SM4_DECRYPT) ||
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The checkAndTrimValue function allows TK_SM4_ENCRYPT and TK_SM4_DECRYPT tokens. However, the underlying implementation in source/libs/crypt/src/crypt.c (specifically Builtin_CBC_EncryptImpl at line 51) is a no-op memcpy in build configurations where TD_ENTERPRISE and TD_ASTRA are not defined. This results in a silent encryption failure where sensitive data is stored in plaintext despite the user requesting SM4 encryption.

pToken->type != TK_MD5 && pToken->type != TK_SHA && pToken->type != TK_SHA1 && pToken->type != TK_SHA2 &&
pToken->type != TK_AES_ENCRYPT && pToken->type != TK_AES_DECRYPT && pToken->type != TK_SM4_ENCRYPT &&
pToken->type != TK_SM4_DECRYPT) ||
pToken->type != TK_AES_ENCRYPT && pToken->type != TK_AES_DECRYPT &&
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

Allowing AES encryption without enforcing a mandatory IV leads to the use of AES in Electronic Codebook (ECB) mode when the optional IV argument is omitted (see source/libs/crypt/src/crypt.c:163). ECB mode is cryptographically insecure for multi-block data as it preserves patterns in the plaintext, making it susceptible to various attacks.

@guanshengliang guanshengliang merged commit 7cf18bb into main Feb 5, 2026
18 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants