Hi!
I'm trying to fine-tune an RFDETRSmall model on the CrowdHuman dataset, which contains many overlapping boxes and a high density of detections per image.
The dataset is fairly large (15000 training images, ~3000 val, ~3000 test), and I'm trying to find the best parameters for fine-tuning the model.
This is my current setup, running on an RTX 3090:
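# model
from rfdetr import RFDETRSmall

# dataset_dir and output_dir are paths assumed to be defined earlier in the script
model = RFDETRSmall(
    pretrain_weights="checkpoints/rf-detr-small-topdown-hearty-armadillo-5.pth"
)
model.train(
    dataset_dir=dataset_dir,
    epochs=30,
    batch_size=8,
    grad_accum_steps=4,  # effective batch size: 8 * 4 = 32
    lr=5e-5,
    patience=10,
    early_stopping=True,
    output_dir=output_dir,
    wandb=True,  # log metrics to Weights & Biases
)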
The end goal is to separate people who are hugging into individual detections for a security application. The pretrained RFDETRSmall often fails at this and detects hugging people as a single blob, so I'm trying to get better results by training on a high-density detection dataset.
The problem I'm facing is that the loss is basically flat and the cardinality error, which is my most important metric here, never goes down.
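For context, my understanding is that the cardinality error in DETR-style criteria is the absolute difference between the number of boxes the model predicts as real objects and the number of ground-truth boxes, averaged over the batch. Below is a minimal pure-Python sketch of that idea. The function name and the score threshold are hypothetical stand-ins for illustration, not RF-DETR's actual code, which may decide what counts as a detection differently.

```python
def cardinality_error(pred_scores, gt_counts, threshold=0.5):
    """Mean absolute difference between predicted and true box counts.

    pred_scores: one list of per-query confidence scores per image.
    gt_counts:   ground-truth box count per image.
    threshold:   hypothetical cutoff for counting a query as a detection.
    """
    if len(pred_scores) != len(gt_counts):
        raise ValueError("need one score list per image")
    errors = []
    for scores, n_gt in zip(pred_scores, gt_counts):
        # Count the queries that the model considers real objects.
        n_pred = sum(1 for s in scores if s > threshold)
        errors.append(abs(n_pred - n_gt))
    return sum(errors) / len(errors)


# Two images: the model finds 2 of 5 people and 1 of 3 people.
batch_scores = [[0.9, 0.8, 0.1, 0.2, 0.3], [0.7, 0.4, 0.2]]
print(cardinality_error(batch_scores, gt_counts=[5, 3]))  # 2.5
```

Read this way, a value that stays in the mid-20s would mean the predicted box count is off by roughly that many boxes per image on average.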
The following is a snippet of the logs of the last training experiment.
Epoch: [6] [130/375] eta: 0:02:57 lr: 0.000020 class_error: 0.00 loss: 5.8841 (6.2072) loss_ce: 0.6394 (0.6561) loss_bbox: 0.1946 (0.2135) loss_giou: 0.5716 (0.6327) loss_ce_0: 0.6761 (0.6853) loss_bbox_0: 0.2103 (0.2254) loss_giou_0: 0.5967 (0.6541) loss_ce_1: 0.6477 (0.6630) loss_bbox_1: 0.1962 (0.2170) loss_giou_1: 0.5819 (0.6404) loss_ce_enc: 0.6746 (0.6771) loss_bbox_enc: 0.2334 (0.2524) loss_giou_enc: 0.6250 (0.6903) loss_ce_unscaled: 0.6394 (0.6561) class_error_unscaled: 0.0000 (0.0000) loss_bbox_unscaled: 0.0389 (0.0427) loss_giou_unscaled: 0.2858 (0.3163) cardinality_error_unscaled: 24.5000 (27.5401) loss_ce_0_unscaled: 0.6761 (0.6853) loss_bbox_0_unscaled: 0.0421 (0.0451) loss_giou_0_unscaled: 0.2983 (0.3271) cardinality_error_0_unscaled: 24.5000 (27.5401) loss_ce_1_unscaled: 0.6477 (0.6630) loss_bbox_1_unscaled: 0.0392 (0.0434) loss_giou_1_unscaled: 0.2909 (0.3202) cardinality_error_1_unscaled: 24.5000 (27.5401) loss_ce_enc_unscaled: 0.6746 (0.6771) loss_bbox_enc_unscaled: 0.0467 (0.0505) loss_giou_enc_unscaled: 0.3125 (0.3451) cardinality_error_enc_unscaled: 24.5000 (27.5401) time: 0.7002 data: 0.0175 max mem: 18269
2026-02-12 11:36:35
Epoch: [6] [140/375] eta: 0:02:47 lr: 0.000020 class_error: 0.00 loss: 5.8714 (6.1737) loss_ce: 0.6487 (0.6544) loss_bbox: 0.1979 (0.2141) loss_giou: 0.5658 (0.6250) loss_ce_0: 0.6803 (0.6841) loss_bbox_0: 0.2095 (0.2260) loss_giou_0: 0.5844 (0.6460) loss_ce_1: 0.6527 (0.6615) loss_bbox_1: 0.2055 (0.2178) loss_giou_1: 0.5665 (0.6326) loss_ce_enc: 0.6733 (0.6764) loss_bbox_enc: 0.2305 (0.2533) loss_giou_enc: 0.6226 (0.6825) loss_ce_unscaled: 0.6487 (0.6544) class_error_unscaled: 0.0000 (0.0000) loss_bbox_unscaled: 0.0396 (0.0428) loss_giou_unscaled: 0.2829 (0.3125) cardinality_error_unscaled: 22.5000 (26.9663) loss_ce_0_unscaled: 0.6803 (0.6841) loss_bbox_0_unscaled: 0.0419 (0.0452) loss_giou_0_unscaled: 0.2922 (0.3230) cardinality_error_0_unscaled: 22.5000 (26.9663) loss_ce_1_unscaled: 0.6527 (0.6615) loss_bbox_1_unscaled: 0.0411 (0.0436) loss_giou_1_unscaled: 0.2833 (0.3163) cardinality_error_1_unscaled: 22.5000 (26.9663) loss_ce_enc_unscaled: 0.6733 (0.6764) loss_bbox_enc_unscaled: 0.0461 (0.0507) loss_giou_enc_unscaled: 0.3113 (0.3413) cardinality_error_enc_unscaled: 22.5000 (26.9663) time: 0.6730 data: 0.0167 max mem: 18269
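To follow the trend across iterations rather than eyeballing these lines, the `name: value (value)` pairs can be pulled out with a regular expression. This is a small sketch that only assumes the log format shown above; `parse_metrics` is a hypothetical helper name. As far as I know, DETR-style loggers print a smoothed recent value first and a global average in parentheses, but I have not verified that for RF-DETR.

```python
import re

# Matches "name: value (value)" pairs as they appear in the log lines above.
# Fields without a parenthesised second value (lr, time, data) are skipped.
METRIC_RE = re.compile(r"(\w+): (-?\d+\.\d+) \((-?\d+\.\d+)\)")


def parse_metrics(line):
    """Return {metric_name: (first_value, parenthesised_value)} for one log line."""
    return {
        name: (float(first), float(second))
        for name, first, second in METRIC_RE.findall(line)
    }


# Shortened copy of the first log line above.
sample = (
    "Epoch: [6] [130/375] eta: 0:02:57 lr: 0.000020 class_error: 0.00 "
    "loss: 5.8841 (6.2072) loss_ce: 0.6394 (0.6561) "
    "cardinality_error_unscaled: 24.5000 (27.5401) time: 0.7002 data: 0.0175"
)
metrics = parse_metrics(sample)
print(metrics["loss"])                        # (5.8841, 6.2072)
print(metrics["cardinality_error_unscaled"])  # (24.5, 27.5401)
```

Applying this to every line of the log file gives per-iteration series for `loss` and `cardinality_error_unscaled` that can be plotted or compared across runs.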
Do you have any suggestions on how to proceed?
Thanks in advance!