feat: support zeroth state in state window#33294
Conversation
|
/gemini review |
This comment was marked as off-topic.
This comment was marked as off-topic.
There was a problem hiding this comment.
Code Review
This pull request introduces support for a 'zeroth state' in state window functions, which allows for filtering out a baseline or initial state from the window results. The changes are comprehensive, touching the parser, query AST, serialization/deserialization logic, and the stream processing engine. A new test suite has been added to validate this functionality. My review identified a couple of issues: a potential correctness bug in the state comparison logic due to incomplete type handling, and a minor memory leak in an error path within the parser's translation phase. Overall, this is a solid feature addition.
| if (IS_INTEGER_TYPE(type)) { | ||
| return memcmp(pStateData, &pZerothState->datum.i, bytes) == 0; | ||
| } | ||
| // is boolean type | ||
| return memcmp(pStateData, &pZerothState->datum.b, bytes) == 0; |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
| if (TSDB_CODE_SUCCESS == code) { | ||
| code = scalarCalculateConstants(pCastFunc, &pStateWin->pZeroth); | ||
| } |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
There was a problem hiding this comment.
pCastFunc is the new node, no memleak here
There was a problem hiding this comment.
Code Review
This pull request introduces support for a "zeroth state" in state windows, which is a valuable addition for filtering out initial or uninteresting states. The changes are comprehensive, touching data structures, serialization, parsing, query translation, and execution logic. The inclusion of a new test suite is commendable.
I have identified a few issues that should be addressed:
- A correctness bug in the state comparison logic where floating-point types are not handled correctly.
- A couple of memory leaks in the parser and translator due to improper handling of temporary objects.
Fixing these issues will ensure the feature is robust and reliable. Overall, this is a good contribution.
| bool stIsStateEqualZeroth(char* pStateData, void* pZeroth, int32_t bytes) { | ||
| if (pStateData == NULL || pZeroth == NULL) { | ||
| return false; | ||
| } | ||
|
|
||
| SValueNode* pZerothState = (SValueNode*)pZeroth; | ||
| int8_t type = pZerothState->node.resType.type; | ||
| if (IS_VAR_DATA_TYPE(type)) { | ||
| return memcmp(pStateData, pZerothState->datum.p, bytes) == 0; | ||
| } | ||
| if (IS_INTEGER_TYPE(type)) { | ||
| return memcmp(pStateData, &pZerothState->datum.i, bytes) == 0; | ||
| } | ||
| // is boolean type | ||
| return memcmp(pStateData, &pZerothState->datum.b, bytes) == 0; | ||
| } |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
There was a problem hiding this comment.
no need to handle float type.
| if (pOptions != NULL) { | ||
| if (pOptions->length >= 1) { | ||
| state->pExtend = nodesListGetNode(pOptions, 0); | ||
| } | ||
| if (pOptions->length == 2) { | ||
| state->pZeroth = nodesListGetNode(pOptions, 1); | ||
| } | ||
| } | ||
| nodesClearList(pOptions); |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
| SNode* pCastFunc = NULL; | ||
| int32_t code = createCastFunc(pCxt, pStateWin->pZeroth, targetDt, &pCastFunc); | ||
| if (TSDB_CODE_SUCCESS == code) { | ||
| code = scalarCalculateConstants(pCastFunc, &pStateWin->pZeroth); | ||
| } |
This comment was marked as duplicate.
This comment was marked as duplicate.
Sorry, something went wrong.
f50c3f8 to
da5cd73
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for a 'zeroth state' in state windows, which allows a specific state value to be ignored when forming windows. The changes span parsing, query translation, serialization, and stream processing, and include a comprehensive new test suite. The implementation is solid, but I've identified one issue where floating-point types are not handled in the stream processing logic for state comparison. My review includes a suggestion to address this.
| if (IS_VAR_DATA_TYPE(type)) { | ||
| return memcmp(pStateData, pZerothState->datum.p, bytes) == 0; | ||
| } | ||
| if (IS_INTEGER_TYPE(type)) { | ||
| return memcmp(pStateData, &pZerothState->datum.i, bytes) == 0; | ||
| } | ||
| if (IS_BOOLEAN_TYPE(type)) { | ||
| return memcmp(pStateData, &pZerothState->datum.b, bytes) == 0; | ||
| } | ||
| return false; |
This comment was marked as duplicate.
This comment was marked as duplicate.
Sorry, something went wrong.
There was a problem hiding this comment.
这里的类型是不是不完全?比如 timestamp ?如果不期望走到最后的 return false,可以加上报错。
|
Tony Zhang seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
| } | ||
| if (pTask->notifyEventType & STRIGGER_EVENT_WINDOW_CLOSE) { | ||
| if (stIsStateEqualZeroth(pStateData, pTask->pStateZeroth, bytes)) { | ||
| pWin = taosArrayPop(pContext->pWindows); |
There was a problem hiding this comment.
这是实时计算的逻辑,历史计算那里也要加,对应函数: stHistoryGroupDoStateCheck
测试历史计算:写数据 => 使用 fill_history 选项建流 => 检查流计算结果
There was a problem hiding this comment.
其实最通顺的方式应该是加在 stRealtimeGroupGenCalcParams,参考里面关于 ignoreNoDataTrigger 的逻辑
There was a problem hiding this comment.
这是实时计算的逻辑,历史计算那里也要加,对应函数: stHistoryGroupDoStateCheck 测试历史计算:写数据 => 使用 fill_history 选项建流 => 检查流计算结果
fixed,支持历史数据触发,已添加测试用例
There was a problem hiding this comment.
其实最通顺的方式应该是加在 stRealtimeGroupGenCalcParams,参考里面关于 ignoreNoDataTrigger 的逻辑
这个目前做不到,因为在这个函数里拿不到每个窗口的状态值,只能在关闭窗口时立即判断
| - window_clause: 指定数据按照窗口进行切分并进行聚合,是时序数据库特色查询。详细信息可参阅特色查询章节 [TDengine TSDB 特色查询](../distinguished)。 | ||
| - SESSION: 会话窗口,ts_col 指定时间戳主键列,tol_val 指定时间间隔,正值,时间精度可选 1n、1u、1a、1s、1m、1h、1d、1w,如 SESSION(ts, 12s)。 | ||
| - STATE_WINDOW: 状态窗口,extend 指定窗口在开始结束时的扩展策略,可选值为 0(默认值)、1、2,分别代表无扩展、向后扩展、向前扩展;TRUE_FOR 指定窗口最小持续时长,时间范围为正值,精度可选 1n、1u、1a、1s、1m、1h、1d、1w,如 TRUE_FOR(1a)。 | ||
| - STATE_WINDOW: 状态窗口,extend 指定窗口在开始结束时的扩展策略,可选值为 0(默认值)、1、2,分别代表无扩展、向后扩展、向前扩展;zeroth state 指定“零状态”,状态列为此状态的窗口将不会被计算和输出;TRUE_FOR 指定窗口最小持续时长,时间范围为正值,精度可选 1n、1u、1a、1s、1m、1h、1d、1w,如 TRUE_FOR(1a)。 |
There was a problem hiding this comment.
zeroth_state可以写什么没有说的很清楚,比如常量、列、运算符、函数可以吗
| SValueNode* pZerothState = (SValueNode*)pZeroth; | ||
| int8_t type = pZerothState->node.resType.type; | ||
| if (IS_VAR_DATA_TYPE(type)) { | ||
| return memcmp(pStateData, pZerothState->datum.p, bytes) == 0; |
|
/gemini review |
There was a problem hiding this comment.
Code Review
The pull request introduces the zeroth_state parameter to the STATE_WINDOW function, allowing users to exclude windows with a specific state from calculations. The changes involve modifications to documentation files, header files, source code, and test cases to accommodate this new functionality. The code review focuses on correctness and potential issues arising from the added functionality.
| TAOS_CHECK_EXIT(tEncodeI16(pEncoder, pMsg->trigger.stateWin.extend)); | ||
| TAOS_CHECK_EXIT(tEncodeI64(pEncoder, pMsg->trigger.stateWin.trueForDuration)); | ||
| int32_t stateWindowZerothLen = | ||
| pMsg->trigger.stateWin.zeroth == NULL ? 0 : (int32_t)strlen((char*)pMsg->trigger.stateWin.zeroth) + 1; |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
| int32_t stIsStateEqualZeroth(void* pStateData, void* pZeroth, bool* pIsEqual) { | ||
| int32_t code = TSDB_CODE_SUCCESS; | ||
| if (pStateData == NULL || pZeroth == NULL) { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
If pStateData is NULL, the function should return true to indicate that the state data is equal to the zeroth state, since there is no state data to compare against. This is a critical bug that needs to be fixed.
int32_t stIsStateEqualZeroth(void* pStateData, void* pZeroth, bool* pIsEqual) {
int32_t code = TSDB_CODE_SUCCESS;
if (pStateData == NULL || pZeroth == NULL) {
*pIsEqual = true; // Treat NULL state data as equal to zeroth state
return code;
}| int32_t stIsStateEqualZeroth(void* pStateData, void* pZeroth, bool* pIsEqual) { | ||
| int32_t code = TSDB_CODE_SUCCESS; | ||
| if (pStateData == NULL || pZeroth == NULL) { | ||
| return false; |
There was a problem hiding this comment.
This return statement will cause the function to return false without setting the pIsEqual parameter, which is unexpected. The return type is int32_t, but the function is returning a boolean value. This is a high severity issue that needs to be fixed.
if (pStateData == NULL || pZeroth == NULL) {
*pIsEqual = true; // Treat NULL state data as equal to zeroth state
return code;
}| int32_t code = createCastFunc(pCxt, pStateWin->pZeroth, targetDt, &pCastFunc); | ||
| if (TSDB_CODE_SUCCESS == code) { | ||
| code = scalarCalculateConstants(pCastFunc, &pStateWin->pZeroth); | ||
| } | ||
| if (TSDB_CODE_SUCCESS != code) { | ||
| code = generateSyntaxErrMsgExt(&pCxt->msgBuf, TSDB_CODE_PAR_WRONG_VALUE_TYPE, "Zeroth value type mismatch"); | ||
| } |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
| if (IS_VAR_DATA_TYPE(type)) { | ||
| *pIsEqual = compareLenPrefixedStr(pStateData, pZerothState->datum.p) == 0; |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
| tdSql.error("select count(*) from ntb state_window(s, 1.5)") | ||
| tdSql.error("select count(*) from ntb state_window(s, *)") | ||
| tdSql.error("select count(*) from ntb state_window(s, '2025-09-03')") | ||
| tdSql.error("select count(*) from ntb state_window(s, 1, 2)") |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
feat: support zeroth state in state window (#33294)
Description
Please briefly describe the code changes in this pull request.
Checklist
Please check the items in the checklist if applicable.