YOLOv5 4.0 activations.py Source Code Walkthrough (Activation Functions)





Introduction to YOLOv5

YOLOv5 is an object detection algorithm that balances speed and performance. I will be publishing a series of YOLOv5 code walkthrough posts in the near future. The code covered here is YOLOv5 version 4.0, released on January 5, 2021.


YOLOv5 open-source project on GitHub



Index of the source code walkthrough series


This post walks through activations.py, which lives in the utils folder.

Last updated: January 7, 2021.



Activation Functions: Mathematical Definitions

First, here are the concrete forms of the functions involved in YOLOv5's activation functions.

Sigmoid activation function:





Sigmoid(x) = \frac{1}{1+e^{-x}} = \frac{e^{x}}{e^{x}+1}
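
As a quick sanity check, both algebraic forms above agree with PyTorch's built-in sigmoid. This is a minimal sketch with arbitrary sample values, not part of activations.py:

import torch

x = torch.linspace(-5.0, 5.0, steps=11)
form1 = 1 / (1 + torch.exp(-x))             # 1 / (1 + e^{-x})
form2 = torch.exp(x) / (torch.exp(x) + 1)   # e^{x} / (e^{x} + 1)
print(torch.allclose(form1, torch.sigmoid(x)))  # True
print(torch.allclose(form2, torch.sigmoid(x)))  # True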

































Swish activation function:





Swish(x) = \frac{x}{1+e^{-\beta x}}


ReLU activation function:





ReLU(x) = \begin{cases} x, & \text{if } x > 0 \\ 0, & \text{if } x \leq 0 \end{cases} = max(0, x)

ReLU6 clips the output of the above function to at most 6:





ReLU6(x) = min(6, max(0, x))
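
PyTorch exposes ReLU6 directly. The following sketch (sample values chosen arbitrarily, not from the original post) checks that it matches min(6, max(0, x)) and the F.hardtanh(x, 0, 6) form that the Hardswish code below relies on:

import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, 0.0, 3.0, 7.0])
manual = torch.minimum(torch.tensor(6.0), torch.clamp(x, min=0.0))  # min(6, max(0, x))
print(torch.allclose(F.relu6(x), manual))               # True
print(torch.allclose(F.hardtanh(x, 0.0, 6.0), manual))  # True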







Hard-swish activation function:





Hard-swish(x) = \frac{x \ast ReLU6(x+3)}{6}

Derivation of the Swish activation function's derivative (taking β = 1):





Swish'(x) = \frac{1+e^{-x}+xe^{-x}}{(1+e^{-x})^2} = \frac{1}{1+e^{-x}} + \frac{xe^{-x}}{(1+e^{-x})^2}

= \frac{1}{1+e^{-x}} + \frac{x}{1+e^{-x}} \ast (1 - \frac{1}{1+e^{-x}})
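
The derived expression can be checked numerically against autograd. This is a small illustrative sketch (sample points chosen arbitrarily), not part of activations.py; the closed form is exactly what MemoryEfficientSwish.backward returns later in the file:

import torch

x = torch.linspace(-4.0, 4.0, steps=9, requires_grad=True)
swish = x * torch.sigmoid(x)              # Swish with beta = 1
swish.sum().backward()                    # autograd writes dSwish/dx into x.grad

with torch.no_grad():
    sx = torch.sigmoid(x)
    manual = sx * (1 + x * (1 - sx))      # the closed form derived above
print(torch.allclose(x.grad, manual))     # True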







tanh function:





tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}

softplus function (a smooth approximation of ReLU):





softplus(x) = ln(1+e^{x})

Mish activation function:





Mish(x) = x \ast tanh(softplus(x))

Derivation of the Mish activation function's derivative:





Mish'(x) = tanh(softplus(x)) + x \ast (tanh(softplus(x)))'

Let fx = tanh(softplus(x)). Then

Mish'(x) = fx + x \ast softplus'(x) \ast tanh'(softplus(x))

where

softplus'(x) = (ln(1+e^{x}))' = \frac{e^{x}}{e^{x}+1} = sigmoid(x)

tanh'(x) = 1 - tanh^{2}(x)

Therefore:




Mish'(x) = fx + x \ast sigmoid(x) \ast (1 - (fx)^{2})
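
As with Swish, this result can be checked against autograd; it is also the expression that MemoryEfficientMish.backward returns in the code below. A small sketch with arbitrary sample points, not part of activations.py:

import torch
import torch.nn.functional as F

x = torch.linspace(-4.0, 4.0, steps=9, requires_grad=True)
mish = x * torch.tanh(F.softplus(x))      # Mish(x) = x * tanh(softplus(x))
mish.sum().backward()                     # autograd writes dMish/dx into x.grad

with torch.no_grad():
    sx = torch.sigmoid(x)
    fx = torch.tanh(F.softplus(x))
    manual = fx + x * sx * (1 - fx * fx)  # the closed form derived above
print(torch.allclose(x.grad, manual))     # True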







activations.py

This file defines the activation functions used in YOLOv5. The mathematical form and the derivative derivation of each activation function are summarized above.

# Activation functions
# Defines the activation functions used in the network architecture

import torch
import torch.nn as nn
import torch.nn.functional as F

The Swish class defines the Swish activation function:

# Swish https://arxiv.org/pdf/1905.02244.pdf ---
class Swish(nn.Module):  # Swish activation function
    @staticmethod
    def forward(x):  # beta is fixed to 1 here
        return x * torch.sigmoid(x)
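
The module is stateless, so it can replace nn.ReLU anywhere in a model. A quick usage sketch (the input shape is just an example, and it assumes the imports and the Swish class above are in scope):

act = Swish()
x = torch.randn(1, 3, 4, 4)  # batch, channels, height, width
y = act(x)
print(y.shape)               # torch.Size([1, 3, 4, 4]); the shape is unchanged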

The Hardswish class defines the Hard-swish activation function:

class Hardswish(nn.Module):  # export-friendly version of nn.Hardswish()
    @staticmethod
    def forward(x):  # mathematical form given above
        # return x * F.hardsigmoid(x)  # for torchscript and CoreML
        return x * F.hardtanh(x + 3, 0., 6.) / 6.  # for torchscript, CoreML and ONNX
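
A brief check (illustrative only, assuming the imports and the Hardswish class above are in scope) that this export-friendly form matches PyTorch's built-in nn.Hardswish:

x = torch.randn(2, 8, 16, 16)
print(torch.allclose(Hardswish()(x), nn.Hardswish()(x), atol=1e-6))  # True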

A memory-efficient class for applying the Swish activation function:

class MemoryEfficientSwish(nn.Module):  # memory-saving Swish; the gradient is written by hand instead of relying on autograd
    class F(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            # save_for_backward stores x for use in backward (as a tensor tracked by this custom autograd Function)
            # and protects against the input being modified in place before backward runs.
            # An in-place operation modifies a tensor directly rather than through an intermediate variable.
            ctx.save_for_backward(x)
            return x * torch.sigmoid(x)

        @staticmethod
        def backward(ctx, grad_output):
            # saved_tensors[0] retrieves the tensor stored by save_for_backward above
            x = ctx.saved_tensors[0]
            sx = torch.sigmoid(x)
            # return the gradient of this activation; the derivation is given above
            return grad_output * (sx * (1 + x * (1 - sx))) 

    def forward(self, x):  # apply the custom autograd Function defined above
        return self.F.apply(x)
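
The hand-written backward can be verified with torch.autograd.gradcheck, which compares it against numerical gradients. A sketch (gradcheck expects double-precision inputs; it assumes the class above is in scope):

x = torch.randn(4, dtype=torch.double, requires_grad=True)
print(torch.allclose(MemoryEfficientSwish()(x), x * torch.sigmoid(x)))  # same forward result as Swish
print(torch.autograd.gradcheck(MemoryEfficientSwish(), (x,)))           # True if the backward is correct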

The Mish class defines the Mish activation function:

# Mish https://github.com/digantamisra98/Mish ----
class Mish(nn.Module):  # the Mish activation function is defined here
    @staticmethod  # mathematical formula given above
    def forward(x):
        return x * F.softplus(x).tanh()

A memory-efficient class for applying the Mish activation function:

class MemoryEfficientMish(nn.Module):  # memory-saving version
    class F(torch.autograd.Function):  # same structure as the class above
        @staticmethod
        def forward(ctx, x):
            ctx.save_for_backward(x)
            return x.mul(torch.tanh(F.softplus(x)))  # x * tanh(ln(1 + exp(x)))

        @staticmethod
        def backward(ctx, grad_output):
            x = ctx.saved_tensors[0]
            sx = torch.sigmoid(x)    # sx = sigmoid(x)
            fx = F.softplus(x).tanh() # fx = tanh(softplus(x))
            return grad_output * (fx + x * sx * (1 - fx * fx))

    def forward(self, x):
        return self.F.apply(x)
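
A short consistency check (illustrative, assuming the Mish and MemoryEfficientMish classes above are in scope) that the two versions agree in the forward pass and that the custom backward is correct:

x = torch.randn(4, dtype=torch.double, requires_grad=True)
print(torch.allclose(MemoryEfficientMish()(x), Mish()(x)))    # identical forward output
print(torch.autograd.gradcheck(MemoryEfficientMish(), (x,)))  # True if the backward is correct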

The FReLU class defines the FReLU activation function:

# FReLU https://arxiv.org/abs/2007.11824 ---
# activation function proposed by Megvii Research at ECCV 2020 (remarkably simple in form)
class FReLU(nn.Module):
    def __init__(self, c1, k=3):  # ch_in, kernel
        super().__init__()
        self.conv = nn.Conv2d(c1, c1, k, 1, 1, groups=c1)
        self.bn = nn.BatchNorm2d(c1)

    def forward(self, x):
        return torch.max(x, self.bn(self.conv(x)))
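
Unlike the other activations in this file, FReLU has learnable parameters (a depthwise 3x3 convolution plus batch norm), so it must be constructed with the channel count of its input. A quick usage sketch (the shapes are just an example, assuming the imports and the FReLU class above are in scope):

act = FReLU(c1=64)               # c1 must equal the number of input channels
x = torch.randn(2, 64, 32, 32)
y = act(x)
print(y.shape)                   # torch.Size([2, 64, 32, 32]); padding=1 keeps the spatial size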



Copyright notice: This is an original article by weixin_42716570, released under the CC 4.0 BY-SA license. Please include a link to the original article and this notice when reposting.