Fix: Critical database connection leaks in all database functions (#547)
Conversation
- Add finally blocks to ensure connections are always closed
- Fix connection leaks in folders.py, face_clusters.py, faces.py, albums.py, metadata.py, yolo_mapping.py
- Prevent database locking issues during image deletion and other operations
- Maintain backward compatibility while improving reliability

Fixes AOSSIE-Org#464
Walkthrough

Wraps many DB operations across backend and sync microservice modules in try/finally or context-managed blocks to guarantee deterministic commits/rollbacks and that SQLite connections are closed; adds image-existence validation for album operations and several new folder-management helpers.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant C as Caller
    participant F as DB_Function
    participant DB as sqlite3
    rect rgba(230,240,250,0.6)
        Note over F: Standardized DB pattern (try/finally or context manager)
        C->>F: invoke(...)
        F->>DB: connect()
        activate DB
        alt normal
            F->>DB: execute SQL
            opt write
                F->>DB: commit()
            end
            F-->>C: return result
        else exception
            F-->>C: raise/propagate
        end
        DB-->>F: close() [finally]
        deactivate DB
    end
```
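The standardized pattern in the diagram can be sketched as follows. This is a minimal illustration only: the `db_count_rows` function and `items` table are hypothetical, not code from the PR.

```python
import sqlite3

def db_count_rows(db_path: str = ":memory:") -> int:
    # Hypothetical example of the try/finally pattern: connect, execute,
    # commit on the write path, and close deterministically.
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.cursor()
        cursor.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY)")
        cursor.execute("INSERT INTO items DEFAULT VALUES")
        conn.commit()  # write path
        cursor.execute("SELECT COUNT(*) FROM items")
        return cursor.fetchone()[0]
    finally:
        conn.close()  # runs on both the return and the exception path
```

The key property is that `close()` sits in `finally`, so the connection is released whether the function returns normally or an exception propagates to the caller.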
```mermaid
sequenceDiagram
    autonumber
    participant U as Caller
    participant A as db_add_images_to_album
    participant DB as sqlite3
    rect rgba(240,255,240,0.6)
        Note over A: validate provided image IDs before inserting
        U->>A: add(album_id, image_ids)
        A->>DB: connect (try)
        A->>DB: SELECT existing image IDs
        A->>A: filter provided IDs -> valid_ids
        alt valid_ids empty
            A-->>U: raise ValueError
        else
            A->>DB: INSERT mappings for valid_ids
            A->>DB: commit()
            A-->>U: success/count
        end
        DB-->>A: close() [finally]
    end
```
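The validate-then-insert flow above can be sketched like this. The function, table names, and schema here are simplified assumptions for illustration, not the PR's actual implementation.

```python
import sqlite3

def add_images_to_album(conn, album_id, image_ids):
    # Hypothetical simplified version of the flow above: keep only the
    # provided IDs that exist in `images`, and fail fast if none do.
    if not image_ids:
        raise ValueError("image_ids cannot be empty")
    cursor = conn.cursor()
    placeholders = ",".join("?" for _ in image_ids)
    cursor.execute(f"SELECT id FROM images WHERE id IN ({placeholders})", image_ids)
    valid_ids = [row[0] for row in cursor.fetchall()]
    if not valid_ids:
        raise ValueError("No valid image IDs provided")
    cursor.executemany(
        "INSERT OR IGNORE INTO album_images (album_id, image_id) VALUES (?, ?)",
        [(album_id, image_id) for image_id in valid_ids],
    )
    conn.commit()
    return len(valid_ids)
```

Validating up front turns silent foreign-key violations (or orphaned mappings) into a clear `ValueError` at the API boundary.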
```mermaid
sequenceDiagram
    autonumber
    participant U as Caller
    participant R as db_remove_image_from_album
    participant DB as sqlite3
    rect rgba(255,245,230,0.6)
        Note over R: existence check before delete
        U->>R: remove(album_id, image_id)
        R->>DB: connect (try)
        R->>DB: SELECT mapping exists?
        alt not found
            R-->>U: raise ValueError
        else found
            R->>DB: DELETE mapping
            R->>DB: commit()
            R-->>U: True/ack
        end
        DB-->>R: close() [finally]
    end
```
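The check-then-delete flow can be sketched the same way; again, the helper and schema below are hypothetical simplifications of what the diagram describes.

```python
import sqlite3

def remove_image_from_album(conn, album_id, image_id):
    # Hypothetical simplified version of the flow above: verify the
    # mapping exists before deleting, so callers get a clear error
    # instead of a silent no-op.
    cursor = conn.cursor()
    cursor.execute(
        "SELECT 1 FROM album_images WHERE album_id = ? AND image_id = ?",
        (album_id, image_id),
    )
    if cursor.fetchone() is None:
        raise ValueError(f"Image {image_id} not found in album {album_id}")
    cursor.execute(
        "DELETE FROM album_images WHERE album_id = ? AND image_id = ?",
        (album_id, image_id),
    )
    conn.commit()
    return True
```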
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/database/faces.py (1)
164-181: Handle None bbox; current code can raise TypeError (not caught).

`json.loads(None)` raises TypeError, which escapes your JSONDecodeError except block. Guard before decoding.

```diff
-            try:
-                embeddings_json = json.loads(embeddings)
-                bbox_json = json.loads(bbox)
-            except json.JSONDecodeError:
-                continue
+            try:
+                embeddings_json = json.loads(embeddings) if embeddings else None
+                bbox_json = json.loads(bbox) if bbox else None
+            except (json.JSONDecodeError, TypeError):
+                continue
```

backend/app/database/albums.py (1)
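The failure mode described in this comment is easy to reproduce, and the fix is a small guard. The `safe_load` helper below is a hypothetical illustration covering both a NULL column and malformed JSON:

```python
import json

def safe_load(value):
    # json.loads(None) raises TypeError, not JSONDecodeError, so a bare
    # `except json.JSONDecodeError` lets a NULL column crash the loop.
    # Guarding on truthiness plus catching both exception types is safe.
    try:
        return json.loads(value) if value else None
    except (json.JSONDecodeError, TypeError):
        return None
```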
198-205: Avoid generating `IN ()` when no image IDs.

Passing an empty `image_ids` list now renders `SELECT ... WHERE id IN ()`, which SQLite rejects with `OperationalError`. This used to be a no-op but is now a hard failure on perfectly valid input. Please short-circuit before building the SQL.

Apply this diff:

```diff
 def db_add_images_to_album(album_id: str, image_ids: list[str]):
     with get_db_connection() as conn:
         cursor = conn.cursor()
+        if not image_ids:
+            return
+
         query = (
             f"SELECT id FROM images WHERE id IN ({','.join('?' for _ in image_ids)})"
         )
```
🧹 Nitpick comments (5)
backend/app/database/yolo_mapping.py (2)
14-31: Add explicit rollback on failure (avoid partial writes).

If any INSERT fails after some rows, the open transaction will be implicitly rolled back on close, but being explicit improves clarity and safety across drivers. Consider the standard pattern below.

```diff
     try:
         cursor.execute(
             """
             CREATE TABLE IF NOT EXISTS mappings (
                 class_id TEXT PRIMARY KEY,
                 name VARCHAR NOT NULL
             )
             """
         )
         for class_id, name in enumerate(class_names):
             cursor.execute(
                 "INSERT OR REPLACE INTO mappings (class_id, name) VALUES (?, ?)",
                 (str(class_id), name),  # Convert class_id to string since it's now TEXT
             )
         conn.commit()
+    except Exception:
+        conn.rollback()
+        raise
     finally:
         conn.close()
```
10-12: Remove stray print; prefer logging.

The print of cwd is noisy in production. Either drop it or gate it behind logging.

```diff
-import os
-print(os.getcwd())
+import logging, os
+log = logging.getLogger(__name__)
+# log.debug("cwd=%s", os.getcwd())
```

backend/app/database/faces.py (3)
71-90: Simplify/robustify embeddings serialization.

Current `[emb.tolist() for emb in embeddings]` iterates element-wise. Use `ravel().tolist()` to normalize shape and cut overhead.

```diff
-        embeddings_json = json.dumps([emb.tolist() for emb in embeddings])
+        embeddings_json = json.dumps(np.ravel(embeddings).tolist())
```
333-372: Compute means more simply with defaultdict; minor.

Cleaner grouping without manual key checks.

```diff
-    # Group embeddings by cluster_id
-    cluster_embeddings = {}
-    for row in rows:
-        cluster_id, embeddings_json = row
-        embeddings = np.array(json.loads(embeddings_json))
-
-        if cluster_id not in cluster_embeddings:
-            cluster_embeddings[cluster_id] = []
-        cluster_embeddings[cluster_id].append(embeddings)
+    from collections import defaultdict
+    cluster_embeddings = defaultdict(list)
+    for cluster_id, embeddings_json in rows:
+        cluster_embeddings[cluster_id].append(np.array(json.loads(embeddings_json)))
```
1-1: Ensure all sqlite3.connect calls are properly guarded by context managers or try/finally.

Several functions (e.g. in backend/app/database/faces.py, backend/app/utils/face_clusters.py, sync-microservice/app/database/folders.py) call `conn = sqlite3.connect(...)` outside of a `with` block or try/finally, risking unclosed connections. Wrap each `sqlite3.connect` in `with sqlite3.connect(...) as conn:` or move it inside a `try` with a `finally` that calls `conn.close()`.
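One caveat worth noting when applying the advice above: in the standard library, `sqlite3`'s connection context manager commits or rolls back the transaction but does NOT close the connection. Pairing it with `contextlib.closing` (or an explicit try/finally) gets both behaviors:

```python
import sqlite3
from contextlib import closing

# closing() guarantees conn.close(); the inner `with conn` scopes the
# transaction (commit on success, rollback on exception).
with closing(sqlite3.connect(":memory:")) as conn:
    with conn:
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
    row = conn.execute("SELECT x FROM t").fetchone()
```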
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- backend/app/database/albums.py (4 hunks)
- backend/app/database/face_clusters.py (7 hunks)
- backend/app/database/faces.py (5 hunks)
- backend/app/database/folders.py (3 hunks)
- backend/app/database/metadata.py (3 hunks)
- backend/app/database/yolo_mapping.py (1 hunks)
- sync-microservice/app/database/folders.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: PR Check
backend/app/database/faces.py
[error] 1-1: Black formatting changed this file during pre-commit. Please re-run commit after formatting (black) or run 'black backend/app/database/faces.py'.
backend/app/database/albums.py
[error] 1-1: Black formatting changed this file during pre-commit. Please re-run commit after formatting (black) or run 'black backend/app/database/albums.py'.
🔇 Additional comments (5)
backend/app/database/yolo_mapping.py (1)
14-31: Good: deterministic cleanup added. The try/finally that encloses commit/close reliably fixes the leak here. Nice.
backend/app/database/faces.py (4)
222-236: LGTM: leak fixed and types handled properly. Deterministic close with finally; decoding to numpy is correct.

251-274: LGTM: safe join + cleanup. Query, ordering, decode, and unconditional close look good.

300-320: Batch update path looks correct. Uses executemany + single commit + cleanup. Consider validating/normalizing cluster_id to int to avoid FK/type surprises from user input.

1-1: Pre-commit failed: run Black. CI indicates Black reformatted this file. Please run `black backend/app/database/faces.py` and re-push.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/database/face_clusters.py (1)
21-35: Add try/finally block for consistency.

While this function is not modified in the PR, the PR objectives state "Added finally blocks to 31+ database functions to ensure connections are always closed." For consistency, and to guarantee cleanup even if CREATE TABLE fails unexpectedly, wrap the operations in try/finally.

Apply this diff:

```diff
 def db_create_clusters_table() -> None:
     """Create the face_clusters table if it doesn't exist."""
     conn = sqlite3.connect(DATABASE_PATH)
     cursor = conn.cursor()
-    cursor.execute(
-        """
-        CREATE TABLE IF NOT EXISTS face_clusters (
-            cluster_id TEXT PRIMARY KEY,
-            cluster_name TEXT,
-            face_image_base64 TEXT
-        )
-        """
-    )
-    conn.commit()
-    conn.close()
+    try:
+        cursor.execute(
+            """
+            CREATE TABLE IF NOT EXISTS face_clusters (
+                cluster_id TEXT PRIMARY KEY,
+                cluster_name TEXT,
+                face_image_base64 TEXT
+            )
+            """
+        )
+        conn.commit()
+    finally:
+        conn.close()
```
🧹 Nitpick comments (7)
backend/app/database/faces.py (2)
239-241: Remove unnecessary parentheses in the return type annotation.

The extra parentheses around the return type annotation are syntactically valid but unconventional and inconsistent with Python typing standards.

Apply this diff:

```diff
-def db_get_all_faces_with_cluster_names() -> (
-    List[Dict[str, Union[FaceId, FaceEmbedding, Optional[str]]]]
-):
+def db_get_all_faces_with_cluster_names() -> List[Dict[str, Union[FaceId, FaceEmbedding, Optional[str]]]]:
```
281-283: Remove trailing comma in the function signature.

The trailing comma after the parameter is unconventional and serves no purpose in a single-parameter function signature.

Apply this diff:

```diff
 def db_update_face_cluster_ids_batch(
-    face_cluster_mapping: List[Dict[str, Union[FaceId, ClusterId]]],
+    face_cluster_mapping: List[Dict[str, Union[FaceId, ClusterId]]]
 ) -> None:
```

backend/app/database/folders.py (1)
393-395: Remove unnecessary parentheses in the return type annotation.

Similar to the issue in faces.py, the extra parentheses around the return type annotation are unconventional and inconsistent with Python typing standards.

Apply this diff:

```diff
-def db_get_all_folder_details() -> (
-    List[Tuple[str, str, Optional[str], int, bool, Optional[bool]]]
-):
+def db_get_all_folder_details() -> List[Tuple[str, str, Optional[str], int, bool, Optional[bool]]]:
```

backend/app/database/face_clusters.py (4)
55-78: Consider adding explicit rollback on exception.

The try/finally pattern ensures the connection closes, but there's no explicit rollback if the insert or commit fails. While SQLite may auto-rollback, explicitly handling rollback improves clarity and ensures clean transaction state.

Apply this diff:

```diff
     try:
         cluster_ids = []
         insert_data = []
         for cluster in clusters:
             cluster_id = cluster.get("cluster_id")
             cluster_name = cluster.get("cluster_name")
             face_image_base64 = cluster.get("face_image_base64")
             insert_data.append((cluster_id, cluster_name, face_image_base64))
             cluster_ids.append(cluster_id)
         cursor.executemany(
             """
             INSERT INTO face_clusters (cluster_id, cluster_name, face_image_base64)
             VALUES (?, ?, ?)
             """,
             insert_data,
         )
         conn.commit()
         return cluster_ids
+    except Exception:
+        conn.rollback()
+        raise
     finally:
         conn.close()
```
158-181: Good fix, but consider adding explicit rollback.

The try/finally pattern correctly ensures the connection closes (Python's finally executes even with the early return on line 168). However, like the batch insert function, consider adding explicit rollback on exception for transaction safety.

Apply this diff:

```diff
     try:
         # Build the update query dynamically based on provided parameters
         update_fields = []
         update_values = []
         if cluster_name is not None:
             update_fields.append("cluster_name = ?")
             update_values.append(cluster_name)
         if not update_fields:
             return False
         update_values.append(cluster_id)
         cursor.execute(
             f"UPDATE face_clusters SET {', '.join(update_fields)} WHERE cluster_id = ?",
             update_values,
         )
         updated = cursor.rowcount > 0
         conn.commit()
         return updated
+    except Exception:
+        conn.rollback()
+        raise
     finally:
         conn.close()
```
194-201: LGTM, but consider adding explicit rollback.

The try/finally pattern ensures connection cleanup. Like other write operations, consider adding explicit rollback on exception for transaction safety.

Apply this diff:

```diff
     try:
         cursor.execute("DELETE FROM face_clusters")
         deleted_count = cursor.rowcount
         conn.commit()
         return deleted_count
+    except Exception:
+        conn.rollback()
+        raise
     finally:
         conn.close()
```
265-319: LGTM!

The try/finally block ensures connection cleanup. The implementation correctly maintains the original behavior, including JSON bbox parsing.

Minor optimization: the `import json` on line 301 is inside the loop. Consider moving it to the module-level imports at the top of the file for better efficiency, though this is a pre-existing pattern and not critical.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/app/database/albums.py (4 hunks)
- backend/app/database/face_clusters.py (7 hunks)
- backend/app/database/faces.py (5 hunks)
- backend/app/database/folders.py (4 hunks)
🔇 Additional comments (26)
backend/app/database/faces.py (5)
71-89: LGTM! Proper resource management implemented. The try/finally block ensures the connection is always closed, and the return statement is correctly placed inside the try block.

222-236: LGTM! Connection cleanup guaranteed. The try/finally pattern properly ensures the connection closes after fetching and processing all unassigned face records.

251-278: LGTM! Resource cleanup properly implemented. The try/finally block ensures the connection is closed after fetching and processing all face records with cluster names.

304-323: LGTM! Batch update properly protected. The try/finally block ensures the connection is closed after the batch update operation completes or fails.

337-376: LGTM! Complex operation with proper cleanup. The try/finally block ensures connection closure even when processing multiple clusters and calculating mean embeddings.
backend/app/database/albums.py (13)
45-59: LGTM! Table creation now has guaranteed cleanup. The try/finally block ensures the connection is closed after creating the albums table.

65-79: LGTM! Junction table creation properly protected. The try/finally block ensures connection closure after creating the album_images table.

85-93: LGTM! Read operation with proper cleanup. The try/finally block ensures the connection is closed after fetching albums, regardless of the `show_hidden` parameter.

99-104: LGTM! Lookup with guaranteed cleanup. The try/finally block ensures connection closure after fetching the album by name.

110-115: LGTM! ID lookup properly protected. The try/finally block ensures the connection is closed after fetching the album by ID.

127-142: LGTM! Insert operation with proper cleanup. The try/finally block ensures connection closure after inserting an album, including proper password hashing when provided.

154-170: LGTM! Update operation properly protected. The try/finally block ensures the connection is closed after updating album details and password hash.

176-180: LGTM! Delete operation with guaranteed cleanup. The try/finally block ensures connection closure after deleting an album.

186-193: LGTM! Image retrieval with proper cleanup. The try/finally block ensures the connection is closed after fetching album images.

196-212: LGTM! Context manager with validation. The function uses the context manager for proper resource management and validates that at least one provided image ID exists before attempting the insert.

215-231: LGTM! Context manager with existence check. The function uses the context manager and validates that the image exists in the album before attempting deletion, providing clear error feedback.

237-244: LGTM! Batch deletion with proper cleanup. The try/finally block ensures connection closure after batch deleting images from an album.

250-259: LGTM! Password verification properly protected. The try/finally block ensures the connection is closed after verifying the album password, with proper handling of missing or null passwords.
backend/app/database/folders.py (5)
67-102: LGTM! Comprehensive folder insertion with cleanup. The try/finally block ensures connection closure after validating the directory, checking for existing folders, and inserting new folder records.

108-117: LGTM! Path lookup with proper cleanup. The try/finally block ensures the connection is closed after looking up the folder ID by path.

123-131: LGTM! ID lookup properly protected. The try/finally block ensures connection closure after fetching the folder path by ID.

143-148: LGTM! Bulk ID retrieval with guaranteed cleanup. The try/finally block ensures the connection is closed after fetching all folder IDs.

189-213: LGTM! Delete operation with validation and cleanup. The try/finally block ensures connection closure after enabling foreign keys, validating folder existence, and performing the deletion. The ValueError provides clear feedback when the folder doesn't exist.
backend/app/database/face_clusters.py (3)
94-108: LGTM! The try/finally block correctly ensures connection cleanup for this read-only operation. The logic preserves the original behavior while guaranteeing resource cleanup.

121-138: LGTM! The try/finally block ensures connection cleanup while maintaining the original return behavior. The implementation is correct.

216-247: LGTM! The try/finally block ensures connection cleanup for this read-only JOIN query. The implementation correctly maintains the original behavior while guaranteeing resource cleanup.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/database/albums.py (2)
200-216: Guard empty lists and respect SQLite variable limits.

- If image_ids is empty, SQL becomes "IN ()" (syntax error).
- Very large lists can exceed SQLite's max variable count (~999).

Apply:

```diff
 def db_add_images_to_album(album_id: str, image_ids: list[str]):
-    with get_db_connection() as conn:
+    if not image_ids:
+        raise ValueError("image_ids cannot be empty.")
+    with get_db_connection() as conn:
         cursor = conn.cursor()
-
-        query = (
-            f"SELECT id FROM images WHERE id IN ({','.join('?' for _ in image_ids)})"
-        )
-        cursor.execute(query, image_ids)
+        # Batch to stay under SQLite parameter limits
+        valid_images = []
+        BATCH = 800
+        for i in range(0, len(image_ids), BATCH):
+            chunk = image_ids[i : i + BATCH]
+            q = f"SELECT id FROM images WHERE id IN ({','.join('?' for _ in chunk)})"
+            cursor.execute(q, chunk)
+            valid_images.extend(row[0] for row in cursor.fetchall())
```

The insertion loop remains unchanged; it will dedupe via INSERT OR IGNORE.
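A standalone version of the batching idea can be written as follows. The `existing_ids` helper is hypothetical; the batch size of 800 is a conservative choice under SQLite's host-parameter limit (999 by default in older builds, 32766 since SQLite 3.32).

```python
import sqlite3

def existing_ids(conn, ids, batch=800):
    # Hypothetical helper: return which of `ids` exist in `images`,
    # chunking the IN (...) list to stay under the parameter limit.
    found = []
    cursor = conn.cursor()
    for i in range(0, len(ids), batch):
        chunk = ids[i:i + batch]
        marks = ",".join("?" for _ in chunk)
        cursor.execute(f"SELECT id FROM images WHERE id IN ({marks})", chunk)
        found.extend(row[0] for row in cursor.fetchall())
    return found
```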
8-33: Refactor raw sqlite3.connect calls and implement delete_image_db

- Replace all direct `sqlite3.connect` invocations in backend/app/database/* and sync-microservice/app/database/folders.py with `with get_db_connection():` to ensure proper commit/rollback and connection closure.
- Add the missing `delete_image_db` function implementation per issue #464.
🧹 Nitpick comments (16)
sync-microservice/app/database/folders.py (1)
64-66: Use logging and consider a context manager for auto-rollback.

Replace print with logging and prefer `with sqlite3.connect(...) as conn:` for implicit commit/rollback.

```diff
     except Exception as e:
-        print(f"Database connection error: {e}")
+        logging.exception("Database connection error")
         return False
```

Add once at file top if not present:

```python
import logging
```

backend/app/database/yolo_mapping.py (2)
7-11: Remove debug `print(os.getcwd())`.

Noisy in production; use logging if truly needed.

```diff
-# print current directory:
-import os
-
-print(os.getcwd())
+# Optional: use logging.debug("cwd: %s", os.getcwd()) if needed
```
11-35: Prefer a connection context manager for commit/rollback.

Simplifies and improves safety; avoids manual `commit` and `close`.

```diff
-    conn = None
-    try:
-        conn = sqlite3.connect(DATABASE_PATH)
-        cursor = conn.cursor()
+    with sqlite3.connect(DATABASE_PATH) as conn:
+        cursor = conn.cursor()
         cursor.execute(
             """
             CREATE TABLE IF NOT EXISTS mappings (
                 class_id INTEGER PRIMARY KEY,
                 name VARCHAR NOT NULL
             )
             """
         )
         for class_id, name in enumerate(class_names):
             cursor.execute(
                 "INSERT OR REPLACE INTO mappings (class_id, name) VALUES (?, ?)",
                 (
                     class_id,
                     name,
                 ),  # Keep class_id as integer to match image_classes.class_id
             )
-
-        conn.commit()
-    finally:
-        if conn is not None:
-            conn.close()
+        # with-conn will commit on success, rollback on exception
```

backend/app/database/metadata.py (3)
10-31: Add rollback (or use `with`) for transaction safety.

If an error occurs after `CREATE`/inserts, explicitly rollback; or use a connection context manager.

```diff
     conn = None
     try:
         conn = sqlite3.connect(DATABASE_PATH)
         cursor = conn.cursor()
         cursor.execute(
             """
             CREATE TABLE IF NOT EXISTS metadata (
                 metadata TEXT
             )
             """
         )
         # Insert initial row if table is empty
         cursor.execute("SELECT COUNT(*) FROM metadata")
         if cursor.fetchone()[0] == 0:
             cursor.execute("INSERT INTO metadata (metadata) VALUES (?)", ("{}",))
         conn.commit()
+    except Exception:
+        if conn is not None:
+            conn.rollback()
+        raise
     finally:
         if conn is not None:
             conn.close()
```
43-56: Use `with` to simplify and ensure cleanup.

Pure read; still safe to wrap for consistency.

```diff
-    conn = sqlite3.connect(DATABASE_PATH)
-    cursor = conn.cursor()
-    try:
-        cursor.execute("SELECT metadata FROM metadata LIMIT 1")
-        row = cursor.fetchone()
-        if row and row[0]:
-            try:
-                return json.loads(row[0])
-            except json.JSONDecodeError:
-                return None
-        return None
-    finally:
-        conn.close()
+    with sqlite3.connect(DATABASE_PATH) as conn:
+        cursor = conn.cursor()
+        cursor.execute("SELECT metadata FROM metadata LIMIT 1")
+        row = cursor.fetchone()
+        if row and row[0]:
+            try:
+                return json.loads(row[0])
+            except json.JSONDecodeError:
+                return None
+        return None
```
71-83: Rollback on failure in the write path.

Delete + insert should rollback on error to avoid leaving the DB in a pending transaction.

```diff
     try:
         metadata_json = json.dumps(metadata)
         # Delete all existing rows and insert new one
         cursor.execute("DELETE FROM metadata")
         cursor.execute("INSERT INTO metadata (metadata) VALUES (?)", (metadata_json,))
         updated = cursor.rowcount > 0
         conn.commit()
         return updated
+    except Exception:
+        conn.rollback()
+        raise
     finally:
         conn.close()
```

backend/app/database/faces.py (5)
75-93: Simplify embedding serialization and add rollback on error.

Use `embeddings.tolist()` and rollback on exceptions.

```diff
     try:
-        embeddings_json = json.dumps([emb.tolist() for emb in embeddings])
+        embeddings_json = json.dumps(embeddings.tolist())
         # Convert bbox to JSON string if provided
         bbox_json = json.dumps(bbox) if bbox is not None else None
         cursor.execute(
             """
             INSERT INTO faces (image_id, cluster_id, embeddings, confidence, bbox)
             VALUES (?, ?, ?, ?, ?)
             """,
             (image_id, cluster_id, embeddings_json, confidence, bbox_json),
         )
         face_id = cursor.lastrowid
         conn.commit()
         return face_id
+    except Exception:
+        conn.rollback()
+        raise
     finally:
         conn.close()
```
226-241: Guard JSON parsing failures.

A bad row can raise and abort the whole read; catch parsing errors per row.

```diff
     faces = []
     for row in rows:
         face_id, embeddings_json = row
-        # Convert JSON string back to numpy array
-        embeddings = np.array(json.loads(embeddings_json))
+        try:
+            embeddings = np.array(json.loads(embeddings_json))
+        except (TypeError, json.JSONDecodeError):
+            continue
         faces.append({"face_id": face_id, "embeddings": embeddings})
```
243-245: Nit: remove extraneous parentheses in return type.

Keeps annotations idiomatic.

```diff
-def db_get_all_faces_with_cluster_names() -> (
-    List[Dict[str, Union[FaceId, FaceEmbedding, Optional[str]]]]
-):
+def db_get_all_faces_with_cluster_names() -> List[
+    Dict[str, Union[FaceId, FaceEmbedding, Optional[str]]]
+]:
```
286-327: Add rollback for batch updates.

Ensure transactional safety on failure.

```diff
     try:
         # Prepare update data as tuples (cluster_id, face_id)
         update_data = []
         for mapping in face_cluster_mapping:
             face_id = mapping.get("face_id")
             cluster_id = mapping.get("cluster_id")
             update_data.append((cluster_id, face_id))
         cursor.executemany(
             """
             UPDATE faces
             SET cluster_id = ?
             WHERE face_id = ?
             """,
             update_data,
         )
         conn.commit()
+    except Exception:
+        conn.rollback()
+        raise
     finally:
         conn.close()
```
341-380: Optionally simplify mean computation.

`np.mean(np.vstack(embeddings_list), axis=0)` is concise; keep the current form if shapes are guaranteed equal.

backend/app/database/albums.py (1)
42-62: Adopt get_db_connection across this module for consistency (timeouts, WAL, FK).

Several functions still use raw sqlite3.connect without busy_timeout/WAL/foreign_keys. Migrate them to get_db_connection for uniform behavior and fewer "database is locked" errors.

Example (db_get_all_albums):

```diff
-def db_get_all_albums(show_hidden: bool = False):
-    conn = sqlite3.connect(DATABASE_PATH)
-    cursor = conn.cursor()
-    try:
-        if show_hidden:
-            cursor.execute("SELECT * FROM albums")
-        else:
-            cursor.execute("SELECT * FROM albums WHERE is_hidden = 0")
-        albums = cursor.fetchall()
-        return albums
-    finally:
-        conn.close()
+def db_get_all_albums(show_hidden: bool = False):
+    with get_db_connection() as conn:
+        cursor = conn.cursor()
+        if show_hidden:
+            cursor.execute("SELECT * FROM albums")
+        else:
+            cursor.execute("SELECT * FROM albums WHERE is_hidden = 0")
+        return cursor.fetchall()
```

Also applies to: 65-83, 90-98, 100-109, 111-120, 187-198, 251-263
backend/app/database/face_clusters.py (2)

208-251: Align connections with the shared helper for timeouts and FK behavior.

While these are reads, adopting a common get_db_connection (or a shared db utils module) keeps behavior consistent under load.

Example:

```diff
-conn = sqlite3.connect(DATABASE_PATH)
+from app.database.albums import get_db_connection  # or move helper to db/utils.py
 ...
-try:
-    cursor.execute(""" ... """)
+try:
+    with get_db_connection() as conn:
+        cursor = conn.cursor()
+        cursor.execute(""" ... """)
```
269-323: Prefer sqlite3.Row and a module-level json import for clarity.

Reduces tuple indexing and repeated imports.

Apply:

```diff
 import sqlite3
+import json
 ...
 conn = sqlite3.connect(DATABASE_PATH)
+conn.row_factory = sqlite3.Row
 cursor = conn.cursor()
 ...
-rows = cursor.fetchall()
-for row in rows:
-    (
-        image_id,
-        image_path,
-        thumbnail_path,
-        metadata,
-        face_id,
-        confidence,
-        bbox_json,
-    ) = row
-    bbox = None
-    if bbox_json:
-        import json
-        bbox = json.loads(bbox_json)
-    images.append(
-        {
-            "image_id": image_id,
-            "image_path": image_path,
-            "thumbnail_path": thumbnail_path,
-            "metadata": metadata,
-            "face_id": face_id,
-            "confidence": confidence,
-            "bbox": bbox,
-        }
-    )
+for row in cursor.fetchall():
+    bbox = json.loads(row["bbox"]) if row["bbox"] else None
+    images.append(
+        {
+            "image_id": row["image_id"],
+            "image_path": row["image_path"],
+            "thumbnail_path": row["thumbnail_path"],
+            "metadata": row["metadata"],
+            "face_id": row["face_id"],
+            "confidence": row["confidence"],
+            "bbox": bbox,
+        }
+    )
```

backend/app/database/folders.py (2)
338-355: Normalize prefix queries to an absolute path and consider boundary matching.

Stored paths are absolute elsewhere; using raw root_path may miss matches or over-match (e.g., "/root/path1" matches "/root/path10"). Normalize and match with a trailing separator.

Apply:

```diff
 def db_get_folder_ids_by_path_prefix(root_path: str) -> List[FolderIdPath]:
     """Get all folder IDs and paths whose path starts with the given root path."""
     conn = sqlite3.connect(DATABASE_PATH)
     cursor = conn.cursor()
     try:
-        # Use path LIKE with wildcard to match all subfolders
-        cursor.execute(
-            """
-            SELECT folder_id, folder_path FROM folders
-            WHERE folder_path LIKE ? || '%'
-            """,
-            (root_path,),
-        )
+        abs_root = os.path.abspath(root_path)
+        # Ensure boundary to avoid prefix collisions (…/root vs …/root2)
+        pattern = abs_root.rstrip(os.sep) + os.sep + '%'
+        cursor.execute(
+            """
+            SELECT folder_id, folder_path FROM folders
+            WHERE folder_path LIKE ?
+            """,
+            (pattern,),
+        )
         return cursor.fetchall()
     finally:
         conn.close()
```
155-188: Operational: add an index on folders(folder_path) to speed prefix lookups.

Given frequent lookups by path/prefix, an index on folder_path will help, especially on large datasets.

Consider:

```sql
CREATE INDEX IF NOT EXISTS idx_folders_path ON folders(folder_path);
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- backend/app/database/albums.py (4 hunks)
- backend/app/database/face_clusters.py (8 hunks)
- backend/app/database/faces.py (6 hunks)
- backend/app/database/folders.py (5 hunks)
- backend/app/database/metadata.py (3 hunks)
- backend/app/database/yolo_mapping.py (1 hunks)
- sync-microservice/app/database/folders.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests
🔇 Additional comments (12)
sync-microservice/app/database/folders.py (1)

51-69: Good fix: the outer finally now closes conn even if cursor() fails. This addresses the previously noted leak scenario. Nice.

backend/app/database/yolo_mapping.py (1)

17-31: Schema/type alignment looks correct. `class_id INTEGER PRIMARY KEY` plus integer inserts resolves the prior join mismatch with `image_classes.class_id`.

backend/app/database/albums.py (2)
8-33: Solid connection context manager with retries, WAL, busy timeout, and FK enforcement. Good foundation for consistent DB behavior and cleanup.

251-263: Password verification flow looks correct. bcrypt usage and None-handling are fine.
backend/app/database/face_clusters.py (3)
23-39: Table creation wrapped correctly; deterministic closure. Looks good.

59-83: Batch insert transaction is clean and safe. Use of executemany + commit + finally close is correct.

162-185: Dynamic UPDATE with early no-op return is fine. Rowcount check + commit pattern is correct.
backend/app/database/folders.py (5)
16-36: Table creation now safely closes the connection. LGTM.

71-106: Insert folder flow is correct and leak-safe. Absolute path normalization, existence check, and idempotency via pre-check look good.

147-152: All-folder-IDs retrieval wrapped with finally; good. No concerns.

193-217: Deletion enables FK and cleans up; good. Covers cascade constraints prior to delete and commits.

397-399: Signature clarification for db_get_all_folder_details looks good. No behavior change; readability improved.
Actionable comments posted: 0
🧹 Nitpick comments (2)
backend/app/database/albums.py (1)
42-269: Consider standardizing on the context manager for consistency.

Most functions use raw `sqlite3.connect()` with manual try/finally blocks, while a few (`db_delete_album`, `db_add_images_to_album`, `db_remove_image_from_album`) use the `get_db_connection()` context manager.

The context manager provides important benefits:

- Foreign key enforcement via `PRAGMA foreign_keys=ON`
- WAL mode for better concurrency
- Automatic commit/rollback handling
- Retry logic for database locks
- Longer busy timeout (30 seconds vs default 5 seconds)

Functions without these features may encounter database locking issues under concurrent access or fail to enforce foreign key constraints.

Consider refactoring the remaining functions to use `get_db_connection()` for consistency and robustness.

backend/app/database/faces.py (1)
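A helper with those properties can be sketched as below. This is an assumption-laden sketch, not the repository's actual `get_db_connection` (which, per the review, also retries on "database is locked" errors); the pragma set matches the benefits listed in the comment above.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def get_db_connection(path: str, busy_timeout_ms: int = 30000):
    # Hypothetical sketch: set pragmas once, commit on success,
    # rollback on exception, and always close the connection.
    conn = sqlite3.connect(path, timeout=busy_timeout_ms / 1000)
    try:
        conn.execute("PRAGMA foreign_keys = ON")
        conn.execute(f"PRAGMA busy_timeout = {busy_timeout_ms}")
        conn.execute("PRAGMA journal_mode = WAL")
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
```

Centralizing this means every call site gets the same locking behavior and FK enforcement without repeating boilerplate.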
55-381: Consider enabling foreign key enforcement on all connections. Only `db_create_faces_table` (Line 33) enables `PRAGMA foreign_keys = ON`, while all other functions use raw connections without this pragma. The `faces` table declares foreign key constraints (Lines 44-45) that reference `images(id)` and `face_clusters(cluster_id)`, but these constraints are not enforced unless the pragma is set on each connection. This means operations in functions like `db_insert_face_embeddings`, `db_update_face_cluster_ids_batch`, and others won't validate referential integrity, potentially allowing orphaned records. Consider one of these approaches:
- Add `conn.execute("PRAGMA foreign_keys = ON")` after each `sqlite3.connect()` call
- Create and use a `get_db_connection()` context manager (similar to `albums.py`) that centralizes pragma settings, WAL mode, busy timeout, and retry logic
Option 2 provides better consistency and additional robustness (retries, longer timeouts, automatic commit/rollback).
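As an illustration of the recommended Option 2, a minimal `get_db_connection()` helper might look like the sketch below. This is an assumption-laden sketch, not the project's actual helper: the pragmas and the 30-second busy timeout mirror the benefits listed above, the retry-on-lock logic is omitted for brevity, and `DATABASE_PATH` is a placeholder for the module-level constant.

```python
import sqlite3
from contextlib import contextmanager

DATABASE_PATH = ":memory:"  # placeholder; the real module defines its own path


@contextmanager
def get_db_connection(db_path=DATABASE_PATH, busy_timeout_ms=30_000):
    """Yield a configured SQLite connection; commit on success,
    roll back on error, and always close the connection."""
    conn = sqlite3.connect(db_path, timeout=busy_timeout_ms / 1000)
    try:
        conn.execute("PRAGMA foreign_keys = ON")   # enforce FK constraints
        conn.execute("PRAGMA journal_mode = WAL")  # better read concurrency
        conn.execute(f"PRAGMA busy_timeout = {busy_timeout_ms}")
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
```

Callers then use `with get_db_connection() as conn:` and get cleanup, commit/rollback, and pragma setup for free on every code path.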
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/app/database/albums.py (4 hunks)
- backend/app/database/faces.py (6 hunks)
- backend/app/routes/albums.py (1 hunks)
- backend/app/utils/images.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- backend/app/routes/albums.py
🔇 Additional comments (23)
backend/app/utils/images.py (2)
93-93: LGTM! Formatting improvement. Adding the trailing comma follows Python style guidelines and makes future parameter additions cleaner.
254-255: LGTM! Added return type annotation. The return type annotation improves type safety and code documentation. The trailing comma is also a nice formatting touch.
backend/app/database/albums.py (13)
42-61: LGTM! Proper connection cleanup added. The try/finally block ensures the connection is always closed, even on exceptions. Initializing `conn = None` allows safe cleanup.
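The defensive pattern described here can be sketched as follows. `db_count_albums` is a hypothetical helper invented for illustration, not one of the PR's functions, and `DATABASE_PATH` stands in for the module-level constant.

```python
import sqlite3

DATABASE_PATH = ":memory:"  # stand-in for the module-level path


def db_count_albums():
    """Illustrative helper using the conn = None + try/finally pattern."""
    conn = None
    try:
        conn = sqlite3.connect(DATABASE_PATH)
        cursor = conn.cursor()
        cursor.execute(
            "CREATE TABLE IF NOT EXISTS albums (id INTEGER PRIMARY KEY)"
        )
        cursor.execute("SELECT COUNT(*) FROM albums")
        return cursor.fetchone()[0]
    finally:
        # Runs even if connect() itself raised, in which case conn is None.
        if conn is not None:
            conn.close()
```

The `conn = None` initialization is what makes the finally block safe: if `sqlite3.connect()` raises, there is no connection object to close, and the guard skips the `close()` call instead of raising `NameError`.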
64-83: LGTM! Proper connection cleanup added. The try/finally block ensures the connection is always closed, even on exceptions.
86-97: LGTM! Connection leak fixed. The finally block ensures `conn.close()` is called even if an exception occurs during query execution.
100-108: LGTM! Connection leak fixed. The finally block ensures proper cleanup on all code paths.
111-119: LGTM! Connection leak fixed. The finally block ensures proper cleanup on all code paths.
122-146: LGTM! Connection leak fixed. The try/finally block ensures the connection is always closed, even if password hashing or insertion fails.
149-184: LGTM! Password update logic corrected. The conditional logic now correctly preserves the existing password when `password` is None, addressing the bug from the previous review.
187-190: LGTM! Context manager ensures foreign key enforcement. Using `get_db_connection()` ensures `PRAGMA foreign_keys = ON` is set, enabling CASCADE deletion to `album_images`. This addresses the concern from the previous review.
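A small self-contained demonstration of why the pragma matters: SQLite ignores declared foreign key constraints unless `PRAGMA foreign_keys = ON` is set on each connection. The table names below are simplified stand-ins for the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
try:
    # Foreign keys are OFF by default in SQLite; enable per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE faces (id INTEGER PRIMARY KEY, "
        "image_id INTEGER REFERENCES images(id))"
    )
    try:
        # This references a nonexistent images row.
        conn.execute("INSERT INTO faces (image_id) VALUES (999)")
        orphan_allowed = True
    except sqlite3.IntegrityError:
        orphan_allowed = False
finally:
    conn.close()
```

With the pragma enabled, the orphan insert raises `sqlite3.IntegrityError`; without it, the same insert silently succeeds and leaves a dangling `image_id`.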
193-203: LGTM! Connection leak fixed. The finally block ensures proper cleanup on all code paths.
206-222: LGTM! Proper validation and context manager usage. The function validates that at least one of the provided image IDs exists before performing insertions, raising a clear error otherwise. The context manager ensures proper resource cleanup and foreign key enforcement.
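The validate-then-insert flow can be sketched like this. `add_images_to_album` is a simplified, hypothetical stand-in for `db_add_images_to_album`; the real function also manages its own connection via the context manager.

```python
import sqlite3


def add_images_to_album(conn, album_id, image_ids):
    """Sketch of validate-then-insert: only IDs present in `images` are linked."""
    if not image_ids:
        raise ValueError("No image IDs provided")
    placeholders = ",".join("?" * len(image_ids))
    rows = conn.execute(
        f"SELECT id FROM images WHERE id IN ({placeholders})", image_ids
    ).fetchall()
    valid_ids = [r[0] for r in rows]
    if not valid_ids:
        raise ValueError("None of the provided image IDs exist")
    conn.executemany(
        "INSERT INTO album_images (album_id, image_id) VALUES (?, ?)",
        [(album_id, i) for i in valid_ids],
    )
    return len(valid_ids)
```

Filtering against an existence query up front means unknown IDs are dropped rather than triggering constraint failures mid-batch, and an all-invalid request surfaces as a clear `ValueError` instead of a silent no-op.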
225-241: LGTM! Proper validation and context manager usage. The function validates that the image exists in the album before deletion, providing clear error feedback. The context manager ensures proper resource cleanup.
244-254: LGTM! Connection leak fixed. The finally block ensures proper cleanup on all code paths.
257-269: LGTM! Connection leak fixed. The finally block ensures proper cleanup even if password verification fails.
backend/app/database/faces.py (8)
29-52: LGTM! Foreign key enforcement and connection leak fixed. The function now enables foreign key constraints via `PRAGMA foreign_keys = ON` and ensures the connection is always closed via the finally block. This addresses the concerns from the previous review.
55-94: LGTM! Connection leak fixed. The try/finally block ensures the connection is always closed, even if JSON serialization or insertion fails.
217-241: LGTM! Connection leak fixed. The finally block ensures proper cleanup on all code paths.
244-246: LGTM! Return type annotation formatting. The extra parentheses around the return type are valid Python syntax for multi-line return type annotations, improving readability.
256-283: LGTM! Connection leak fixed. The finally block ensures proper cleanup on all code paths.
286-288: LGTM! Formatting improvement. The trailing comma follows Python style guidelines and makes future parameter additions cleaner.
309-328: LGTM! Connection leak fixed. The finally block ensures proper cleanup on all code paths.
342-381: LGTM! Connection leak fixed. The finally block ensures proper cleanup on all code paths.
Actionable comments posted: 0
🧹 Nitpick comments (3)
backend/app/database/faces.py (3)
73-94: Standardize resource management pattern across all database functions. `db_create_faces_table` uses the defensive pattern (`conn = None` + `if conn is not None` check in finally), while these six functions create the connection before the try block with an unconditional `conn.close()` in finally. Although technically safe (if `connect()` fails, the try block is never reached), this inconsistency makes the codebase harder to maintain and reason about. Adopt the defensive pattern consistently:

```diff
+conn = None
+try:
-    conn = sqlite3.connect(DATABASE_PATH)
-    cursor = conn.cursor()
-
-    try:
+    conn = sqlite3.connect(DATABASE_PATH)
+    conn.execute("PRAGMA foreign_keys = ON")
+    cursor = conn.cursor()
     ...
 finally:
+    if conn is not None:
-        conn.close()
+        conn.close()
```

Also applies to: 145-146, 226-227, 255-256, 308-309, 341-342
246-248: Remove extraneous parentheses from return type annotation. The extra parentheses wrapping the return type serve no semantic purpose and appear to be an accidental formatting change rather than a meaningful API modification.

```diff
-def db_get_all_faces_with_cluster_names() -> (
-    List[Dict[str, Union[FaceId, FaceEmbedding, Optional[str]]]]
-):
+def db_get_all_faces_with_cluster_names() -> List[Dict[str, Union[FaceId, FaceEmbedding, Optional[str]]]]:
```

289-289: Remove trailing comma from single-parameter function signature. The trailing comma after a single parameter is unusual and typically reserved for cases where additional parameters are expected. This appears to be an accidental formatting change.

```diff
 def db_update_face_cluster_ids_batch(
-    face_cluster_mapping: List[Dict[str, Union[FaceId, ClusterId]]],
+    face_cluster_mapping: List[Dict[str, Union[FaceId, ClusterId]]]
 ) -> None:
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- backend/app/database/faces.py (6 hunks)
- backend/app/utils/images.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests
🔇 Additional comments (3)
backend/app/utils/images.py (2)
101-103: LGTM! Clean type annotation improvement. The trailing comma and explicit `-> None` return type follow Python best practices and improve code maintainability without changing behavior.
264-266: LGTM! Consistent formatting improvement. The trailing comma addition maintains consistency with Python conventions and the other function signature updates in this file.
backend/app/database/faces.py (1)
29-52: The script couldn't access the file, but the earlier `rg` output confirms 7 `sqlite3.connect()` calls at lines 32, 73, 145, 226, 255, 308, and 341. The snippet shows line 32 (in `db_create_faces_table`) has the PRAGMA at line 33. To verify the review comment's claim that the remaining 6 functions lack this pragma, please manually inspect or provide the code context for the following lines in `backend/app/database/faces.py`:
- Lines 71-76 (around line 73 connect)
- Lines 143-148 (around line 145 connect)
- Lines 224-229 (around line 226 connect)
- Lines 253-258 (around line 255 connect)
- Lines 306-311 (around line 308 connect)
- Lines 339-344 (around line 341 connect)
Confirm whether each of these functions has `conn.execute("PRAGMA foreign_keys = ON")` immediately after the `sqlite3.connect()` call.
rahulharpal1603
left a comment
@Aditya30ag
Thanks a lot for the PR!
Sorry for a lot of delay though :)
Summary
Fixed critical database connection leaks across all database modules by adding proper finally blocks.
Changes
Files Modified
Testing
All database operations now properly clean up connections, resolving the critical connection leak issue.
Closes #464
Summary by CodeRabbit
Bug Fixes
Refactor
Chores