fix(queries): check rows.Err() after Next() loop in bind()#1472
Open
marcoabreu wants to merge 2 commits intoaarondl:masterfrom
Open
fix(queries): check rows.Err() after Next() loop in bind()#1472marcoabreu wants to merge 2 commits intoaarondl:masterfrom
marcoabreu wants to merge 2 commits intoaarondl:masterfrom
Conversation
bind() did not check rows.Err() after the rows.Next() loop. When rows.Next() returned false due to a connection error (not an empty result set), bind() returned sql.ErrNoRows — silently swallowing the real error. This caused callers that distinguish "not found" from "transient failure" to make wrong decisions. In production, intermittent connection errors were misclassified as "row not found", causing permanent data loss in retry-based message processing systems. The fix adds a rows.Err() check after the loop, before the foundOne/ErrNoRows check. This covers all bind kinds (struct, slice, ptr slice) uniformly and is backwards-compatible.
Same bug pattern as the bind() fix: rows.Next() loops in driver schema introspection functions did not check rows.Err() afterward. Per the database/sql contract, Next() returns false both when the result set is exhausted and when an error occurs — Err() must be consulted to distinguish the two cases. Without this check, a connection error during code generation would silently produce incomplete schema information (missing tables, columns, or indexes), leading to incorrect generated code. Affected functions across all four drivers: - PostgreSQL: TableNames, ViewNames, Columns, loadUniqueColumns - MySQL: TableNames, ViewNames, Columns - MSSQL: TableNames, ViewNames, Columns - SQLite3: TableNames, ViewNames, tableInfo, indexes (nested loops)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
bind()inqueries/reflect.goand schema introspection functions across all four database drivers do not checkrows.Err()afterrows.Next()loops. Whenrows.Next()returnsfalsedue to any error — connection resets, context cancellation, query timeouts, TLS failures, etc. — these functions either returnsql.ErrNoRowsor silently return partial/empty results instead of the real error.Ref: #1324 (comment)
The bug
Per the
database/sqldocumentation:Runtime query path (
bind()): After therows.Next()loop,bind()checks!foundOneand returnssql.ErrNoRowswithout first consultingrows.Err(). The callingBind()method only checksrows.Err()on the success path — on the error path, the real error is permanently lost. This affects all bind kinds: struct (returns wrongsql.ErrNoRows), slice and pointer slice (silently returns partial results).Code generation path (all drivers): Schema introspection functions (
TableNames,ViewNames,Columns, etc.) iteraterows.Next()but never checkrows.Err(), silently returning incomplete schema information on connection errors.The fix
Added
rows.Err()checks after everyrows.Next()loop that was missing one:queries/reflect.go—bind():All four drivers — same pattern in each affected function:
TableNames,ViewNames,Columns,loadUniqueColumnsTableNames,ViewNames,ColumnsTableNames,ViewNames,ColumnsTableNames,ViewNames,tableInfo,indexes(nested loops — both inner and outer)The fix is backwards-compatible: callers using
errors.Is(err, sql.ErrNoRows)will no longer match on real errors, which is the correct behavior.Production impact
We discovered this bug in a production identity management platform running on GCP Cloud Run + Cloud SQL (PostgreSQL). Any error occurring between query dispatch and row iteration — Cloud SQL proxy restarts, context cancellations, TCP resets, query timeouts — was silently converted to
sql.ErrNoRows. Applications treating "not found" as a permanent/non-retryable condition permanently dropped millions of messages per day and assumed query results to be empty when they actually had an underlying error, causing correctness failures across the system.Test plan
18 new tests using
go-sqlmockwithRowError()to simulate mid-iteration connection failures:TestBindStructRowsError,TestBindSliceRowsError,TestBindPtrSliceRowsErrorTestTableNames_RowsErr,TestViewNames_RowsErr,TestLoadUniqueColumns_RowsErr,TestColumns_RowsErrTestTableNames_RowsErr,TestViewNames_RowsErr,TestColumns_RowsErrTestTableNames_RowsErr,TestViewNames_RowsErr,TestColumns_RowsErrTestTableNames_RowsErr,TestViewNames_RowsErr,TestTableInfo_RowsErr,TestIndexes_OuterRowsErr,TestIndexes_InnerRowsErr.Next()loops in the codebase