RFDETR CrowdHuman fine tuning #674

@frittatelle

Hi!
I'm trying to fine-tune an RFDETRSmall model on the CrowdHuman dataset, which contains many overlapping boxes and a high density of detections per image.
The dataset is fairly large (15000 training images, ~3000 val, ~3000 test), and I'm trying to find the best parameters for fine-tuning the model.
This is my current setup, running on an RTX 3090.

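    from rfdetr import RFDETRSmall

    # dataset_dir and output_dir are paths defined elsewhere in the script

    # model: start from a local RFDETRSmall checkpoint
    model = RFDETRSmall(
        pretrain_weights="checkpoints/rf-detr-small-topdown-hearty-armadillo-5.pth"
    )

    # fine-tuning run, logged to Weights & Biases
    model.train(
        dataset_dir=dataset_dir,
        epochs=30,
        batch_size=8,
        grad_accum_steps=4,
        lr=5e-5,
        patience=10,
        early_stopping=True,
        output_dir=output_dir,
        wandb=True,
    )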

The end goal is to tell apart people who are hugging, for a security application. The pretrained RFDETRSmall often fails at this and detects hugging people as a single blob, so I'm trying to get better results by training on a high-density detection dataset.
The problem I'm facing is that the loss is basically flat and the cardinality error, which is my most important metric here, never goes down.
Below is a snippet of the logs from the last training run.

Epoch: [6]  [130/375]  eta: 0:02:57  lr: 0.000020  class_error: 0.00  loss: 5.8841 (6.2072)  loss_ce: 0.6394 (0.6561)  loss_bbox: 0.1946 (0.2135)  loss_giou: 0.5716 (0.6327)  loss_ce_0: 0.6761 (0.6853)  loss_bbox_0: 0.2103 (0.2254)  loss_giou_0: 0.5967 (0.6541)  loss_ce_1: 0.6477 (0.6630)  loss_bbox_1: 0.1962 (0.2170)  loss_giou_1: 0.5819 (0.6404)  loss_ce_enc: 0.6746 (0.6771)  loss_bbox_enc: 0.2334 (0.2524)  loss_giou_enc: 0.6250 (0.6903)  loss_ce_unscaled: 0.6394 (0.6561)  class_error_unscaled: 0.0000 (0.0000)  loss_bbox_unscaled: 0.0389 (0.0427)  loss_giou_unscaled: 0.2858 (0.3163)  cardinality_error_unscaled: 24.5000 (27.5401)  loss_ce_0_unscaled: 0.6761 (0.6853)  loss_bbox_0_unscaled: 0.0421 (0.0451)  loss_giou_0_unscaled: 0.2983 (0.3271)  cardinality_error_0_unscaled: 24.5000 (27.5401)  loss_ce_1_unscaled: 0.6477 (0.6630)  loss_bbox_1_unscaled: 0.0392 (0.0434)  loss_giou_1_unscaled: 0.2909 (0.3202)  cardinality_error_1_unscaled: 24.5000 (27.5401)  loss_ce_enc_unscaled: 0.6746 (0.6771)  loss_bbox_enc_unscaled: 0.0467 (0.0505)  loss_giou_enc_unscaled: 0.3125 (0.3451)  cardinality_error_enc_unscaled: 24.5000 (27.5401)  time: 0.7002  data: 0.0175  max mem: 18269
2026-02-12 11:36:35

Epoch: [6]  [140/375]  eta: 0:02:47  lr: 0.000020  class_error: 0.00  loss: 5.8714 (6.1737)  loss_ce: 0.6487 (0.6544)  loss_bbox: 0.1979 (0.2141)  loss_giou: 0.5658 (0.6250)  loss_ce_0: 0.6803 (0.6841)  loss_bbox_0: 0.2095 (0.2260)  loss_giou_0: 0.5844 (0.6460)  loss_ce_1: 0.6527 (0.6615)  loss_bbox_1: 0.2055 (0.2178)  loss_giou_1: 0.5665 (0.6326)  loss_ce_enc: 0.6733 (0.6764)  loss_bbox_enc: 0.2305 (0.2533)  loss_giou_enc: 0.6226 (0.6825)  loss_ce_unscaled: 0.6487 (0.6544)  class_error_unscaled: 0.0000 (0.0000)  loss_bbox_unscaled: 0.0396 (0.0428)  loss_giou_unscaled: 0.2829 (0.3125)  cardinality_error_unscaled: 22.5000 (26.9663)  loss_ce_0_unscaled: 0.6803 (0.6841)  loss_bbox_0_unscaled: 0.0419 (0.0452)  loss_giou_0_unscaled: 0.2922 (0.3230)  cardinality_error_0_unscaled: 22.5000 (26.9663)  loss_ce_1_unscaled: 0.6527 (0.6615)  loss_bbox_1_unscaled: 0.0411 (0.0436)  loss_giou_1_unscaled: 0.2833 (0.3163)  cardinality_error_1_unscaled: 22.5000 (26.9663)  loss_ce_enc_unscaled: 0.6733 (0.6764)  loss_bbox_enc_unscaled: 0.0461 (0.0507)  loss_giou_enc_unscaled: 0.3113 (0.3413)  cardinality_error_enc_unscaled: 22.5000 (26.9663)  time: 0.6730  data: 0.0167  max mem: 18269
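
To follow a single metric across lines like these, the logs can be parsed rather than read by eye. The snippet below is a minimal sketch, not part of RF-DETR: `METRIC_RE` and `parse_metrics` are hypothetical helpers, and the reading of each pair as "windowed median (running average)" assumes RF-DETR keeps the DETR-style metric logger, which should be checked against its source.

```python
import re

# Matches "name: 1.2345 (6.7890)" pairs; fields without a parenthesised
# second value (lr, class_error, time, data, ...) are skipped on purpose.
METRIC_RE = re.compile(r"(\w+): (-?\d+(?:\.\d+)?) \((-?\d+(?:\.\d+)?)\)")


def parse_metrics(line):
    """Return {metric_name: (first_value, value_in_parentheses)} for one log line."""
    return {name: (float(a), float(b)) for name, a, b in METRIC_RE.findall(line)}


# Shortened copy of the first log line above.
sample = (
    "Epoch: [6]  [130/375]  lr: 0.000020  loss: 5.8841 (6.2072)  "
    "cardinality_error_unscaled: 24.5000 (27.5401)  time: 0.7002"
)
metrics = parse_metrics(sample)
print(metrics["cardinality_error_unscaled"])  # (24.5, 27.5401)
```

Applied to every line of a run, this gives one series per metric that can be plotted per epoch instead of compared by eye.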

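For context on the metric itself: in the original DETR reference code, the cardinality error is the mean absolute difference between the number of queries whose top class is not "no object" and the number of ground-truth boxes. It is computed for logging only and carries no gradient. The sketch below reproduces that definition in plain Python. Whether RF-DETR counts predictions the same way is an assumption to verify against its source, and the function name and input layout are illustrative.

```python
def cardinality_error(pred_logits, gt_counts):
    """Mean |#predicted objects - #ground-truth objects| over a batch.

    pred_logits: one entry per image; each entry is a list of per-query
                 score lists whose LAST index is the "no object" class.
    gt_counts:   number of ground-truth boxes per image.
    """
    errors = []
    for queries, n_gt in zip(pred_logits, gt_counts):
        n_pred = 0
        for scores in queries:
            top_class = max(range(len(scores)), key=scores.__getitem__)
            if top_class != len(scores) - 1:  # top class is a real object
                n_pred += 1
        errors.append(abs(n_pred - n_gt))
    return sum(errors) / len(errors)


# Toy batch: 2 images, 1 object class + "no object".
batch_logits = [
    [[2.0, 0.0], [0.0, 1.0], [3.0, 1.0]],  # 2 queries predict an object
    [[0.0, 1.0], [0.0, 1.0]],              # no query predicts an object
]
# (|2 - 5| + |0 - 2|) / 2 = 2.5
print(cardinality_error(batch_logits, gt_counts=[5, 2]))
```

If RF-DETR follows this definition, the running average of about 27 in the logs above would mean the predicted object count is off by roughly 27 boxes per image, independent of how well the matched boxes are localized.
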
Do you have any suggestions on how to proceed?
Thanks in advance!
