
Why is ReLU applied after residual connection in ResNet?

In the ResNet architecture, why is the ReLU activation applied after the element-wise addition with the residual in a residual block, instead of before it?

Shen Zhuoran asked Mar 07 '23 18:03



1 Answer

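Mainly because that is how the original ResNet was proposed. The placement of the activation in residual blocks was investigated in a follow-up paper, https://arxiv.org/pdf/1603.05027.pdf, which found that the ordering Skip -> BN -> ReLU -> Conv -> BN -> ReLU -> Conv -> Add works best. In that ordering the skip branches off first and both activations sit inside the residual branch, so nothing is applied after the addition.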

However, the differences in performance are small, so the original ResNet formulation prevailed. The paper is still worth reading if you want to know which orderings work and which do not.
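To make the two orderings concrete, here is a toy sketch in plain Python. It is not a real ResNet block: `branch` is a hypothetical stand-in for the Conv/BN stack (just an elementwise scaling), batch norm is left out, and the weight `-0.5` is an arbitrary choice. It only shows where the ReLU sits relative to the addition.

```python
# Toy comparison of ReLU placement around the residual addition.
# `branch` and its weight are hypothetical; they stand in for Conv + BN.

def relu(v):
    return [max(0.0, a) for a in v]

def branch(v, w=-0.5):
    # Hypothetical residual branch: elementwise scaling instead of Conv + BN.
    return [w * a for a in v]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def post_activation_block(x):
    # Original ResNet ordering: ReLU is applied after the addition,
    # so the skip path also passes through the ReLU.
    return relu(add(x, branch(x)))

def pre_activation_block(x):
    # Ordering from arXiv:1603.05027: the activation moves inside the branch,
    # so the skip path carries x to the output unchanged.
    return add(x, branch(relu(x)))

x = [-2.0, 1.0]
post = post_activation_block(x)  # [0.0, 0.5]
pre = pre_activation_block(x)    # [-2.0, 0.5]
print(post, pre)
```

In the first variant the negative input is clipped after the addition, while in the second it travels along the skip path untouched, which is the identity-mapping property the paper argues for.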

Thomas Pinetz answered Apr 27 '23 03:04
