Legacy sws_scale() prevents multi-threaded color conversion #1303

@mscheltienne

🚀 The feature

Following up on #1215, the current use of sws_scale() is a bottleneck when decoding 4K videos. The plot below shows that decoding time plateaus at ~20 ms regardless of the number of threads or the seek mode of the decoder (CPU decoding).

[Image: plot of decoding time for different thread counts and seek modes]

After some investigation, it turns out that the YUV→RGB conversion dominates the decoding time, and its cost is very consistent across runs. torchcodec uses the legacy sws_scale() FFmpeg function for this conversion, and that function is always single-threaded. FFmpeg 7+ introduced sws_scale_frame() with a threads option on SwsContext that parallelizes the conversion across CPU cores. PyAV and the ffmpeg CLI already use this newer API, achieving roughly 2x faster conversion on 4K content.
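To make the idea of parallelizing the conversion across cores concrete, here is a toy sketch of slice-based threading: the frame is cut into horizontal bands and each band is converted independently. This is not FFmpeg's or torchcodec's code. The names `yuv_to_rgb_pixel`, `convert_rows`, and `convert_frame` are hypothetical, and it assumes full-range BT.601 (JFIF) coefficients on unsubsampled pixels, whereas real decoded frames usually use chroma subsampling and often limited range.

```python
from concurrent.futures import ThreadPoolExecutor


def _clamp(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb_pixel(y, cb, cr):
    """Convert one pixel (assumption: full-range BT.601 / JFIF YCbCr)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return _clamp(r), _clamp(g), _clamp(b)


def convert_rows(rows):
    """Convert a band of rows; each band is independent of the others."""
    return [[yuv_to_rgb_pixel(*pixel) for pixel in row] for row in rows]


def convert_frame(frame, num_threads=1):
    """Split the frame into horizontal slices and convert them in a pool.

    `frame` is a list of rows, each row a list of (Y, Cb, Cr) tuples.
    """
    height = len(frame)
    if height == 0:
        return []
    # Never create more slices than there are rows.
    num_slices = max(1, min(num_threads, height))
    # Slice boundaries that cover every row exactly once.
    bounds = [height * i // num_slices for i in range(num_slices + 1)]
    slices = [frame[bounds[i]:bounds[i + 1]] for i in range(num_slices)]
    with ThreadPoolExecutor(max_workers=num_slices) as pool:
        # map() yields results in submission order, so rows stay in order.
        converted = list(pool.map(convert_rows, slices))
    return [row for band in converted for row in band]


if __name__ == "__main__":
    gray_frame = [[(128, 128, 128)] * 4 for _ in range(6)]
    single = convert_frame(gray_frame, num_threads=1)
    sliced = convert_frame(gray_frame, num_threads=3)
    print(single == sliced)
```

Because each output row depends only on the matching input rows, the slices need no synchronization beyond the final join, which is what makes this conversion a good fit for threading. In pure Python the GIL prevents any real speedup, so the sketch only illustrates the partitioning. A native implementation would run the slices on separate cores.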

Motivation, pitch

No response
