No way to freeze fused BN stats

When fine-tuning networks trained with batch normalization (BN), we sometimes want to freeze the accumulated moving averages and normalize with them, while still allowing gradients to be backpropagated through the BN layer. Currently there is no way to do this with fused BN, because when is_training = False the layer gives erroneous gradients. Of course, we could instead keep accumulating the statistics from the batches of the new task, but that isn't possible when batch_size = 1.
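To make the requested behaviour concrete, here is a minimal pure-Python sketch of BN with frozen statistics and the gradients it should produce. It is not the fused kernel or any TensorFlow API; the function names `frozen_bn_forward` and `frozen_bn_backward`, the `eps` value, and the sample numbers are all assumptions chosen for illustration.

```python
import math

def frozen_bn_forward(x, gamma, beta, moving_mean, moving_var, eps=1e-3):
    """Normalize each row of x with fixed moving statistics (no batch stats)."""
    inv_std = [1.0 / math.sqrt(v + eps) for v in moving_var]
    return [
        [g * (xi - m) * s + b
         for xi, g, b, m, s in zip(row, gamma, beta, moving_mean, inv_std)]
        for row in x
    ]

def frozen_bn_backward(dy, x, gamma, moving_mean, moving_var, eps=1e-3):
    """Gradients w.r.t. x, gamma and beta when the statistics are constants."""
    inv_std = [1.0 / math.sqrt(v + eps) for v in moving_var]
    # The statistics do not depend on x, so dx is a per-channel rescaling of dy.
    dx = [[d * g * s for d, g, s in zip(row, gamma, inv_std)] for row in dy]
    n_channels = len(gamma)
    dgamma = [0.0] * n_channels
    dbeta = [0.0] * n_channels
    for dy_row, x_row in zip(dy, x):
        for c in range(n_channels):
            x_hat = (x_row[c] - moving_mean[c]) * inv_std[c]
            dgamma[c] += dy_row[c] * x_hat
            dbeta[c] += dy_row[c]
    return dx, dgamma, dbeta

# Works with batch_size = 1 because no batch statistics are computed.
x = [[0.5, -1.0]]
gamma, beta = [2.0, 0.5], [0.1, -0.2]
moving_mean, moving_var = [0.0, 1.0], [4.0, 0.25]
y = frozen_bn_forward(x, gamma, beta, moving_mean, moving_var)
dx, dgamma, dbeta = frozen_bn_backward([[1.0, 1.0]], x, gamma, moving_mean, moving_var)
```

Because the moving averages are treated as constants, the gradient with respect to the input is just the upstream gradient scaled by gamma / sqrt(moving_var + eps) per channel, with none of the extra batch-mean and batch-variance terms that appear in training mode.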

I understand that, due to the nature of the cuDNN kernel, it might be hard to implement such a feature, but a fused Batch Renorm layer could be a decent compromise, as it uses the moving averages during training as well as during inference.
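For reference, the training-mode forward pass of Batch Renorm can be sketched as follows. This is a rough pure-Python illustration, not a fused implementation: the name `batch_renorm_forward`, the `r_max`, `d_max` and `eps` defaults, and the use of a moving variance (rather than a tracked moving standard deviation) are assumptions, and the moving-average update is omitted.

```python
import math

def clip(value, low, high):
    return max(low, min(high, value))

def batch_renorm_forward(x, gamma, beta, moving_mean, moving_var,
                         r_max=3.0, d_max=5.0, eps=1e-3):
    """Training-mode Batch Renorm over a list of rows (batch x channels)."""
    n, n_channels = len(x), len(gamma)
    out = [[0.0] * n_channels for _ in range(n)]
    for c in range(n_channels):
        col = [row[c] for row in x]
        batch_mean = sum(col) / n
        batch_var = sum((v - batch_mean) ** 2 for v in col) / n
        sigma_b = math.sqrt(batch_var + eps)
        sigma = math.sqrt(moving_var[c] + eps)
        # r and d correct the batch statistics towards the moving averages;
        # they are treated as constants in the backward pass.
        r = clip(sigma_b / sigma, 1.0 / r_max, r_max)
        d = clip((batch_mean - moving_mean[c]) / sigma, -d_max, d_max)
        for i, v in enumerate(col):
            x_hat = (v - batch_mean) / sigma_b * r + d
            out[i][c] = gamma[c] * x_hat + beta[c]
    return out

x = [[1.0, 2.0], [3.0, -2.0]]
gamma, beta = [1.0, 2.0], [0.0, 0.5]
moving_mean, moving_var = [1.5, 0.5], [2.0, 3.0]
out = batch_renorm_forward(x, gamma, beta, moving_mean, moving_var)
```

When neither r nor d hits its clipping bound, the correction cancels the batch statistics exactly, so the output equals normalization with the moving averages, which is the property the proposal relies on.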

No way to freeze fused BN stats #10857
