perf: reduce memory consumption across services #1047

Open
nledez wants to merge 2 commits into cgwire:main from nledez:perf/reduce-memory-consumption

nledez commented Apr 7, 2026

Problem
Several SQLAlchemy patterns cause excessive memory usage: Task.assignees is eagerly loaded via selectin on every query (70-80% of queries don't need it); get_comments() triggers N+1 queries by fetching persons one by one; backup_service loads all preview files into memory at once; and playlists_service fetches full ORM objects when only 2 fields are needed.
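
To make the N+1 pattern concrete, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for SQLAlchemy. The schema, the CountingConnection wrapper, and both function names are hypothetical and are not taken from the Zou codebase; the point is only the difference in the number of queries issued.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that counts how many statements are executed."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)


def make_db():
    """Build a tiny in-memory database with persons and comments."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE comment (id INTEGER PRIMARY KEY, person_id INTEGER, text TEXT)"
    )
    conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ana"), (2, "Bo")])
    conn.executemany(
        "INSERT INTO comment VALUES (?, ?, ?)",
        [(1, 1, "first"), (2, 2, "second"), (3, 1, "third")],
    )
    return conn


def authors_n_plus_one(conn):
    """One query for the comments, then one extra query per comment."""
    comments = conn.execute("SELECT id, person_id FROM comment ORDER BY id").fetchall()
    names = {}
    for comment_id, person_id in comments:
        row = conn.execute(
            "SELECT name FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        names[comment_id] = row[0]
    return names


def authors_batched(conn):
    """One query for the comments, then a single IN query for all persons."""
    comments = conn.execute("SELECT id, person_id FROM comment ORDER BY id").fetchall()
    person_ids = sorted({person_id for _, person_id in comments})
    placeholders = ",".join("?" for _ in person_ids)
    rows = conn.execute(
        f"SELECT id, name FROM person WHERE id IN ({placeholders})", person_ids
    ).fetchall()
    by_id = dict(rows)  # person id -> name
    return {comment_id: by_id[person_id] for comment_id, person_id in comments}
```

With three comments, the first function issues four statements and the second issues two, and the gap grows linearly with the number of comments.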

Solution

  • Fix N+1 in get_comments() by batch-fetching persons with get_persons_by_ids()
  • Stream preview files in backup_service with yield_per(500)
  • Use with_entities(id, extension) in playlists_service instead of full PreviewFile objects
  • Replace manual if key not in dict patterns with defaultdict in projects_service and time_spents_service
  • Flush index documents in batches in index_service instead of accumulating indefinitely
  • Change Task.assignees from lazy="selectin" to default lazy loading, add explicit selectinload only in the 3 bulk access sites (CSV export, schedule service, deletion service)
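
Two of the items above (defaultdict and batched flushing) are plain Python patterns and can be sketched without any database. The function names, the flush callback, and the batch size below are illustrative assumptions, not the actual code in projects_service, time_spents_service, or index_service.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Group (key, value) pairs without a manual `if key not in result` check."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)  # the missing list is created on first access
    return dict(grouped)


def index_in_batches(documents, flush, batch_size=500):
    """Hand documents to `flush` in fixed-size batches instead of one big list."""
    batch = []
    for document in documents:
        batch.append(document)
        if len(batch) >= batch_size:
            flush(batch)
            batch = []  # start a fresh list so memory stays bounded
    if batch:
        flush(batch)  # flush the final, partially filled batch
```

For example, `index_in_batches(range(5), print, batch_size=2)` would call `print` three times, with batches of 2, 2, and 1 items, so at most `batch_size` documents are held at once.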

- assignees = db.relationship(
-     "Person", secondary=TaskPersonLink.__table__, lazy="selectin"
- )
+ assignees = db.relationship("Person", secondary=TaskPersonLink.__table__)

Please keep the selectin setting. It is better suited to our case, where most consuming queries need it.

frankrousseau left a comment

See comments

nledez added 2 commits April 14, 2026 14:32
  - Fix N+1 queries in get_comments() by batch-fetching persons
  - Stream preview files in backup_service with yield_per(500)
  - Use with_entities for preview file lookup in playlists_service
  - Replace manual dict init patterns with defaultdict
  - Flush index documents in batches instead of accumulating
  - Change Task.assignees from eager selectin to lazy loading, add explicit selectinload only where assignees are accessed

  Use yield_per(500) and a Flask streaming response to avoid loading entire query results and building the full CSV string in memory.
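
The streaming idea in the second commit message can be sketched with a generator that emits one CSV line at a time. This is a standard-library illustration only: iter_csv_lines and its arguments are hypothetical names, and the PR's actual export code is not shown here.

```python
import csv
import io
from itertools import chain


def iter_csv_lines(header, rows):
    """Yield CSV text line by line instead of building one large string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in chain([header], rows):
        writer.writerow(row)
        yield buffer.getvalue()
        # Reset the buffer so it never holds more than one line.
        buffer.seek(0)
        buffer.truncate(0)
```

In a Flask view, a generator like this could presumably be passed straight to `flask.Response(..., mimetype="text/csv")`, which accepts an iterable body, while `rows` would come from a query iterated with yield_per(500). Both of those wiring details are assumptions about how the pieces fit together.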
nledez force-pushed the perf/reduce-memory-consumption branch from bb5f96b to 0340b9f April 14, 2026 12:32