Skip to content

Enhance AI query validation by adding support for query refinement and explanation handling#1628

Merged
Artuomka merged 2 commits intomainfrom
backend_ai_query_validation_check
Feb 24, 2026
Merged

Enhance AI query validation by adding support for query refinement and explanation handling#1628
Artuomka merged 2 commits intomainfrom
backend_ai_query_validation_check

Conversation

@Artuomka
Copy link
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings February 24, 2026 14:02
@Artuomka Artuomka enabled auto-merge February 24, 2026 14:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request enhances the AI-powered panel position generation feature by adding query validation and refinement using database EXPLAIN statements. The implementation introduces an iterative feedback loop where AI-generated SQL queries are validated through EXPLAIN, and the AI is given opportunities to correct any issues or optimize the query based on execution plan analysis.

Changes:

  • Added query refinement system with EXPLAIN-based validation for supported database types (Postgres, MySQL, ClickHouse and their agent variants)
  • Implemented iterative AI feedback loop (up to 3 iterations) to correct query errors or optimize based on EXPLAIN output
  • Refactored TableInfo into a reusable interface for better code organization

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +348 to +356
let cleaned = aiResponse.trim();
if (cleaned.startsWith('```sql')) {
cleaned = cleaned.slice(6);
} else if (cleaned.startsWith('```')) {
cleaned = cleaned.slice(3);
}
if (cleaned.endsWith('```')) {
cleaned = cleaned.slice(0, -3);
}
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanQueryResponse method doesn't handle cases where the AI might return the query wrapped in other markdown formats like postgresql, mysql, or clickhouse. Since the prompt is database-type specific, the AI might use database-specific code fences. Consider using a more generic approach that strips any code fence markers (e.g., using regex to match /^[a-z]*\n?/ and /```$/).

Suggested change
let cleaned = aiResponse.trim();
if (cleaned.startsWith('```sql')) {
cleaned = cleaned.slice(6);
} else if (cleaned.startsWith('```')) {
cleaned = cleaned.slice(3);
}
if (cleaned.endsWith('```')) {
cleaned = cleaned.slice(0, -3);
}
// Remove leading and trailing markdown code fences, regardless of language tag
let cleaned = aiResponse.trim();
// Strip opening fence like ```sql, ```postgresql, ```mysql, etc., plus optional newline
cleaned = cleaned.replace(/^```[a-z]*\n?/i, '');
// Strip closing fence ```
cleaned = cleaned.replace(/```$/, '');

Copilot uses AI. Check for mistakes.
extends AbstractUseCase<GeneratePanelPositionWithAiDs, GeneratedPanelWithPositionDto>
implements IGeneratePanelPositionWithAi
{
private readonly logger = new Logger(GeneratePanelPositionWithAiUseCase.name);
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logger is instantiated using NestJS's built-in Logger class, which deviates from the codebase convention. Other use cases in the codebase (e.g., find-tables-in-connection.use.case.ts:38) use WinstonLogger injected via constructor dependency injection. This pattern should be followed for consistency and to ensure proper logging configuration across the application.

Copilot uses AI. Check for mistakes.
Comment on lines +303 to +306
const result = await (dao as IDataAccessObject).executeRawQuery(explainQuery, tableName);
return { success: true, result: JSON.stringify(result, null, 2) };
} catch (error) {
return { success: false, error: error.message };
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The executeRawQuery method requires a userEmail parameter (third parameter) for agent connection types (agent_postgres, agent_mysql, agent_clickhouse). However, this implementation casts dao to IDataAccessObject and only passes two parameters. This will cause the executeRawQuery call to fail for agent connections that are included in EXPLAIN_SUPPORTED_TYPES. The userId should be extracted from inputData in the implementation method, converted to userEmail using isConnectionTypeAgent check and getUserEmailOrReturnNull, and then passed to validateAndRefineQueryWithExplain and ultimately to executeRawQuery.

Suggested change
const result = await (dao as IDataAccessObject).executeRawQuery(explainQuery, tableName);
return { success: true, result: JSON.stringify(result, null, 2) };
} catch (error) {
return { success: false, error: error.message };
const result = await (dao as IDataAccessObject).executeRawQuery(explainQuery, tableName, null);
return { success: true, result: JSON.stringify(result, null, 2) };
} catch (error) {
return { success: false, error: (error as Error).message };

Copilot uses AI. Check for mistakes.
tableName: string,
): Promise<{ success: boolean; result?: string; error?: string }> {
try {
const explainQuery = `EXPLAIN ${query.replace(/;\s*$/, '')}`;
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EXPLAIN query is constructed by simply prepending "EXPLAIN " to the original query. However, different database types have different EXPLAIN syntax. For example, MySQL uses "EXPLAIN" but ClickHouse might use "EXPLAIN PLAN" or have different options. PostgreSQL supports extended EXPLAIN options like "EXPLAIN ANALYZE". The implementation should account for database-specific EXPLAIN syntax differences based on the connectionType parameter.

Copilot uses AI. Check for mistakes.
Comment on lines +247 to +294
private async validateAndRefineQueryWithExplain(
dao: IDataAccessObject | IDataAccessObjectAgent,
generatedWidget: AIGeneratedWidgetResponse,
tableInfo: TableInfo,
connectionType: ConnectionTypesEnum,
chartDescription: string,
): Promise<AIGeneratedWidgetResponse> {
if (!EXPLAIN_SUPPORTED_TYPES.has(connectionType)) {
return generatedWidget;
}

let currentQuery = generatedWidget.query_text;

for (let iteration = 0; iteration < MAX_FEEDBACK_ITERATIONS; iteration++) {
const explainResult = await this.runExplainQuery(dao, currentQuery, tableInfo.table_name);

const correctionPrompt = this.buildQueryCorrectionPrompt(
currentQuery,
explainResult.success ? explainResult.result : explainResult.error,
!explainResult.success,
tableInfo,
connectionType,
chartDescription,
);

const aiResponse = await this.aiCoreService.completeWithProvider(AIProviderType.BEDROCK, correctionPrompt, {
temperature: 0.2,
});

const correctedQuery = this.cleanQueryResponse(aiResponse);

validateQuerySafety(correctedQuery, connectionType);

if (this.normalizeWhitespace(correctedQuery) === this.normalizeWhitespace(currentQuery)) {
this.logger.log(`Query accepted by AI without changes after EXPLAIN (iteration ${iteration + 1})`);
break;
}

this.logger.log(`Query corrected by AI after EXPLAIN (iteration ${iteration + 1})`);
currentQuery = correctedQuery;

if (explainResult.success) {
break;
}
}

return { ...generatedWidget, query_text: currentQuery };
}
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new validateAndRefineQueryWithExplain method adds significant complexity and multiple code paths (success/failure EXPLAIN, iterations, AI corrections) but appears to lack test coverage. Given that other similar use-cases in the codebase (request-info-from-table-with-ai-v5/v6/v7) have comprehensive test coverage, tests should be added to verify: 1) behavior for supported vs unsupported connection types, 2) handling of EXPLAIN failures vs successes, 3) iteration limits, 4) AI correction acceptance/rejection logic, and 5) error handling for unsafe queries generated by AI.

Copilot uses AI. Check for mistakes.
Comment on lines +287 to +290

if (explainResult.success) {
break;
}
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for breaking out of the refinement loop may be flawed. On line 288-290, the loop breaks if explainResult.success is true, even though the AI just corrected the query. This means if the original query fails EXPLAIN, the AI corrects it, and the corrected query succeeds EXPLAIN, the loop breaks immediately without verifying if the AI made additional changes. The break should occur after checking if the query is unchanged AND the EXPLAIN succeeded, or consider removing the second break condition entirely since the unchanged check on line 280-283 already provides a termination condition.

Suggested change
if (explainResult.success) {
break;
}

Copilot uses AI. Check for mistakes.
};
}

const MAX_FEEDBACK_ITERATIONS = 3;
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constant naming convention appears inconsistent with the rest of the codebase. Based on common TypeScript/NestJS conventions and the SCREAMING_SNAKE_CASE pattern used for EXPLAIN_SUPPORTED_TYPES, the constant should be named MAX_REFINEMENT_ITERATIONS or MAX_EXPLAIN_ITERATIONS instead of MAX_FEEDBACK_ITERATIONS to better describe its purpose in the context of query refinement with EXPLAIN.

Copilot uses AI. Check for mistakes.
Comment on lines +260 to +291
for (let iteration = 0; iteration < MAX_FEEDBACK_ITERATIONS; iteration++) {
const explainResult = await this.runExplainQuery(dao, currentQuery, tableInfo.table_name);

const correctionPrompt = this.buildQueryCorrectionPrompt(
currentQuery,
explainResult.success ? explainResult.result : explainResult.error,
!explainResult.success,
tableInfo,
connectionType,
chartDescription,
);

const aiResponse = await this.aiCoreService.completeWithProvider(AIProviderType.BEDROCK, correctionPrompt, {
temperature: 0.2,
});

const correctedQuery = this.cleanQueryResponse(aiResponse);

validateQuerySafety(correctedQuery, connectionType);

if (this.normalizeWhitespace(correctedQuery) === this.normalizeWhitespace(currentQuery)) {
this.logger.log(`Query accepted by AI without changes after EXPLAIN (iteration ${iteration + 1})`);
break;
}

this.logger.log(`Query corrected by AI after EXPLAIN (iteration ${iteration + 1})`);
currentQuery = correctedQuery;

if (explainResult.success) {
break;
}
}
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running EXPLAIN queries up to MAX_FEEDBACK_ITERATIONS times (3) with AI correction calls in between could significantly increase response time and AI token usage. Each iteration involves: 1) executing an EXPLAIN query, 2) calling the AI service with a potentially large EXPLAIN result, and 3) parsing and validating the AI response. Consider adding timeout handling or making the iteration count configurable based on the complexity of the operation. Additionally, consider logging execution time for each iteration to help monitor performance.

Copilot uses AI. Check for mistakes.

const correctedQuery = this.cleanQueryResponse(aiResponse);

validateQuerySafety(correctedQuery, connectionType);
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When validateQuerySafety fails on line 278, it will throw an exception that propagates up without any specific handling. This means that if the AI-generated correction introduces unsafe SQL (e.g., contains forbidden keywords), the entire operation fails. Consider catching this exception and either breaking the loop to return the previous safe query, or adding retry logic with a different prompt to the AI. The current behavior could be confusing to users since the EXPLAIN-based refinement is meant to improve the query, but could cause the operation to fail when the original query was already safe.

Copilot uses AI. Check for mistakes.
Comment on lines +303 to +304
const result = await (dao as IDataAccessObject).executeRawQuery(explainQuery, tableName);
return { success: true, result: JSON.stringify(result, null, 2) };
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method assumes that calling JSON.stringify on the EXPLAIN result will always produce a useful string for the AI to analyze. However, executeRawQuery can return different data structures depending on the connection type (as seen in execute-panel.use.case.ts lines 67-86 where postgres returns {rows: []}, mysql returns [[]], etc.). The result should be normalized or processed based on connectionType before stringifying to ensure the AI receives consistent, meaningful EXPLAIN output format.

Copilot uses AI. Check for mistakes.
@Artuomka Artuomka disabled auto-merge February 24, 2026 14:14
@Artuomka Artuomka enabled auto-merge February 24, 2026 14:15
@Artuomka Artuomka merged commit 21f4ed3 into main Feb 24, 2026
17 of 19 checks passed
@Artuomka Artuomka deleted the backend_ai_query_validation_check branch February 24, 2026 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants