Add Optional Batched Yielding to QuerySet.iterator() and QuerySet.aiterator() #103

@JohananOppongAmoateng

Description

Code of Conduct

  • I agree to follow Django's Code of Conduct

Feature Description

This proposal suggests extending Django’s QuerySet.iterator() and QuerySet.aiterator() methods with an optional parameter (e.g. batch_yield_size) that allows them to yield lists of model instances in batches, rather than yielding one object at a time.

The goal is to combine the memory efficiency of streaming querysets with the convenience of grouped (batched) processing.

Problem

Currently, QuerySet.iterator() and QuerySet.aiterator() always yield one model instance per iteration. While this is ideal for memory efficiency, it becomes inconvenient or inefficient for common batch-oriented workflows, such as:

  • Bulk writes to external systems (APIs, message queues, file exports)
  • Batch validation or transformation pipelines
  • Chunked background processing
  • Network or I/O bound operations where per-object overhead is significant

Developers often work around this by manually buffering results into lists, for example:

buffer = []
for obj in queryset.iterator():
    buffer.append(obj)
    if len(buffer) == 1000:
        process(buffer)
        buffer.clear()
if buffer:  # easy to forget: the final partial batch
    process(buffer)

This pattern is repetitive, error-prone, and obscures intent. Django already supports database-level chunking via chunk_size, but there is no built-in way to express logical batch processing at the iteration level.
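The manual buffering above can be factored into a small reusable helper. A minimal sketch, assuming nothing beyond the standard library; `batched_iter`, `queryset`, and `process` are illustrative names, not part of Django's API:

```python
# Hedged sketch: a generic batching helper over any iterable.
# `batched_iter` is an illustrative name, not a Django API.
from itertools import islice

def batched_iter(iterable, batch_size):
    """Yield lists of up to batch_size items; the last list may be shorter."""
    it = iter(iterable)
    while batch := list(islice(it, batch_size)):
        yield batch

# Hypothetical usage with a streaming queryset:
# for batch in batched_iter(queryset.iterator(chunk_size=2000), 500):
#     process(batch)
```

Having this live in user code is exactly the repetition the proposal aims to remove.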

Request or proposal

proposal

Additional Details

This feature is orthogonal to chunk_size, which controls how many rows are fetched from the database cursor at a time.

It differs from Paginator in that:

  • It does not require counting the full queryset.
  • It supports true streaming over large datasets.

The feature is especially useful for async workflows, where per-object await overhead is high and batching significantly improves throughput.

The default behavior would remain unchanged; batching would be opt-in.

Implementation Suggestions

Synchronous Example

for batch in MyModel.objects.iterator(
    chunk_size=2000,
    batch_yield_size=500,
):
    send_to_api(batch)

Proposed behavior:

  • Internally fetch rows using chunk_size
  • Yield lists of up to batch_yield_size model instances
  • The final batch may be smaller

Asynchronous Example

async for batch in MyModel.objects.aiterator(
    chunk_size=2000,
    batch_yield_size=500,
):
    await send_to_api(batch)
