While reading `mmdet/distillation/losses/fgd.py`, there is something I can't understand:
- In the `__init__` function of `FeatureLoss`, at the very end, after all the other initialization, why do they reset the parameters (line 57)? This reset sets the last layers' weights of `self.channel_add_conv_t` and `self.channel_add_conv_s` to zero, so don't the outputs of these layers become zero as well?
- Are the GcBlocks trained together with the student? I think the GcBlocks are discarded when we run the model for inference, so they don't need to be trained, do they?
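To make the first question concrete, here is a minimal sketch (not the actual FGD code; the layer shapes are made up for illustration) of the "last zero init" pattern used in GC-style blocks: zeroing the final conv makes the branch output zero at initialization, yet the zeroed layer still receives gradients, so it is trained normally afterwards.

```python
import torch
import torch.nn as nn

# Hypothetical channel-add branch, shaped like a GC block's transform
# (1x1 conv -> LayerNorm -> ReLU -> 1x1 conv); dimensions are illustrative.
channel_add_conv = nn.Sequential(
    nn.Conv2d(16, 4, kernel_size=1),
    nn.LayerNorm([4, 1, 1]),
    nn.ReLU(inplace=True),
    nn.Conv2d(4, 16, kernel_size=1),
)

# "last zero init": zero the final conv's weight and bias only.
nn.init.constant_(channel_add_conv[-1].weight, 0)
nn.init.constant_(channel_add_conv[-1].bias, 0)

x = torch.randn(2, 16, 1, 1)
out = channel_add_conv(x)
# At initialization the branch contributes nothing (x + out == x),
# so the block starts as an identity mapping.
print(out.abs().sum().item())  # 0.0

# The zeroed layer still gets gradients, so it learns during training.
(x + out).sum().backward()
print(channel_add_conv[-1].weight.grad is not None)  # True
```

So, if this reading is right, the reset does not freeze the layers; it only makes the attention branch start from an identity mapping, which is a common trick for stable training.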