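Taking the fully connected network in the figure below as an example (two inputs x1 and x2, three hidden neurons z1, z2, z3, and one output y'), we derive how backpropagation updates the parameters. Here w_{ij}^{l} denotes the weight in layer l that connects node i of the previous layer to node j of the next layer.

For simplicity, assume the neurons have no bias terms and no activation functions.

Suppose one training sample has inputs x1 = 40, x2 = 80 and target output y = 60. Every first-layer weight is initialized to 0.5, every second-layer weight to 1, and the learning rate is 1e-5. The loss function is: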

Loss = \frac{1}{2}(y-y^{'})^{2}

Step 1: Forward pass

Starting from the inputs x1 and x2, compute the hidden neurons z and then the predicted output y':

z_{1} = w_{11}^{1} x_{1} + w_{21}^{1} x_{2} = 0.5 * 40 + 0.5 * 80 = 60

z_{2} = w_{12}^{1} x_{1} + w_{22}^{1} x_{2} = 0.5 * 40 + 0.5 * 80 = 60 

z_{3} = w_{13}^{1} x_{1} + w_{23}^{1} x_{2} = 0.5 * 40 + 0.5 * 80 = 60 

y^{'} = w_{11}^{2} z_{1} + w_{21}^{2} z_{2} + w_{31}^{2} z_{3}= 1 * 60 + 1 * 60 + 1 * 60 = 180 
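The forward pass above can be reproduced with a few lines of plain Python. This is only a minimal sketch: the names `w1`, `w2`, and `forward` are illustrative choices, not part of the original derivation, and the network is hard-coded to the 2-3-1 shape with no bias and no activation.

```python
# Forward pass for the 2-3-1 network (no bias, no activation).
x = [40.0, 80.0]  # inputs x1, x2

# w1[i][j]: weight from input i+1 to hidden neuron j+1 (w_{ij}^{1} in the text)
w1 = [[0.5, 0.5, 0.5],
      [0.5, 0.5, 0.5]]

# w2[j]: weight from hidden neuron j+1 to the output (w_{j1}^{2} in the text)
w2 = [1.0, 1.0, 1.0]


def forward(x, w1, w2):
    """Return the hidden values z and the prediction y'."""
    z = [sum(x[i] * w1[i][j] for i in range(len(x))) for j in range(len(w2))]
    y_pred = sum(w2[j] * z[j] for j in range(len(w2)))
    return z, y_pred


z, y_pred = forward(x, w1, w2)
print(z, y_pred)  # [60.0, 60.0, 60.0] 180.0
```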

Step 2: Compute the loss

Evaluate the loss function defined above:

Loss = \frac{1}{2}(y-y^{'})^{2} = \frac{1}{2}(60-180)^{2} = 7200
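As a quick check of the arithmetic, the loss can be evaluated directly. The function name `half_squared_error` is a hypothetical helper introduced here for illustration.

```python
def half_squared_error(y, y_pred):
    """Loss = 1/2 * (y - y')^2, as defined in the text."""
    return 0.5 * (y - y_pred) ** 2


loss = half_squared_error(60.0, 180.0)
print(loss)  # 7200.0
```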

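Step 3: Backpropagation

3.1 Compute the gradients

The goal is to reduce the loss step by step. To do that, we use the chain rule to compute the gradient of the loss with respect to every parameter. Start with the derivative with respect to y^{'}:

\partial L / \partial y^{'} = \frac{1}{2} \cdot 2 \cdot (y - y^{'})\cdot (-1) = y^{'} - y = 180 - 60 = 120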

Gradients of the second-layer weights:

\partial L / \partial w_{11}^{2} = \partial L / \partial y^{'}\cdot \partial y^{'} / \partial w_{11}^{2} = 120 \cdot z_{1} = 120 * 60 = 7200

\partial L / \partial w_{21}^{2} = \partial L / \partial y^{'}\cdot \partial y^{'} / \partial w_{21}^{2} = 120 \cdot z_{2} = 120 * 60 = 7200

\partial L / \partial w_{31}^{2} = \partial L / \partial y^{'}\cdot \partial y^{'} / \partial w_{31}^{2} = 120 \cdot z_{3} = 120 * 60 = 7200

Gradients of the first-layer weights attached to input x1:

\frac{\partial L}{\partial w_{11}^{1}} = \frac{\partial L}{\partial y^{'}} \cdot \frac{\partial y^{'}}{\partial z_{1}} \cdot \frac{\partial z_{1}}{\partial w_{11}^{1}} = 120 * w_{11}^{2} * x_{1} = 120 * 1 * 40 = 4800 

\frac{\partial L}{\partial w_{12}^{1}} = \frac{\partial L}{\partial y^{'}} \cdot \frac{\partial y^{'}}{\partial z_{2}} \cdot \frac{\partial z_{2}}{\partial w_{12}^{1}} = 120 * w_{21}^{2} * x_{1} = 120 * 1 * 40 = 4800 

\frac{\partial L}{\partial w_{13}^{1}} = \frac{\partial L}{\partial y^{'}} \cdot \frac{\partial y^{'}}{\partial z_{3}} \cdot \frac{\partial z_{3}}{\partial w_{13}^{1}} = 120 * w_{31}^{2} * x_{1} = 120 * 1 * 40 = 4800 

Gradients of the first-layer weights attached to input x2:

\frac{\partial L}{\partial w_{21}^{1}} = \frac{\partial L}{\partial y^{'}} \cdot \frac{\partial y^{'}}{\partial z_{1}} \cdot \frac{\partial z_{1}}{\partial w_{21}^{1}} = 120 * w_{11}^{2} * x_{2} = 120 * 1 * 80 = 9600

\frac{\partial L}{\partial w_{22}^{1}} = \frac{\partial L}{\partial y^{'}} \cdot \frac{\partial y^{'}}{\partial z_{2}} \cdot \frac{\partial z_{2}}{\partial w_{22}^{1}} = 120 * w_{21}^{2} * x_{2} = 120 * 1 * 80 = 9600 

\frac{\partial L}{\partial w_{23}^{1}} = \frac{\partial L}{\partial y^{'}} \cdot \frac{\partial y^{'}}{\partial z_{3}} \cdot \frac{\partial z_{3}}{\partial w_{23}^{1}} = 120 * w_{31}^{2} * x_{2} = 120 * 1 * 80 = 9600 
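The chain-rule results can be sanity-checked in code. The sketch below recomputes the analytic gradients and compares one of them against a central finite difference. The variable names and the step `eps = 1e-4` are assumptions made for this illustration only.

```python
x = [40.0, 80.0]
y = 60.0
w1 = [[0.5] * 3 for _ in range(2)]  # w1[i][j] ~ w_{(i+1)(j+1)}^{1}
w2 = [1.0] * 3                      # w2[j]    ~ w_{(j+1)1}^{2}


def loss_at(w1, w2):
    """Forward pass followed by the half squared error."""
    z = [sum(x[i] * w1[i][j] for i in range(2)) for j in range(3)]
    y_pred = sum(w2[j] * z[j] for j in range(3))
    return 0.5 * (y - y_pred) ** 2


# Analytic gradients via the chain rule.
z = [sum(x[i] * w1[i][j] for i in range(2)) for j in range(3)]
y_pred = sum(w2[j] * z[j] for j in range(3))
dL_dy = y_pred - y                                   # 120
grad_w2 = [dL_dy * z[j] for j in range(3)]           # dL/dw_{j1}^{2} = dL/dy' * z_j
grad_w1 = [[dL_dy * w2[j] * x[i] for j in range(3)]  # dL/dw_{ij}^{1} = dL/dy' * w_{j1}^{2} * x_i
           for i in range(2)]

# Central finite difference for w_{11}^{1} as an independent check.
eps = 1e-4
w1_plus = [row[:] for row in w1]
w1_minus = [row[:] for row in w1]
w1_plus[0][0] += eps
w1_minus[0][0] -= eps
numeric = (loss_at(w1_plus, w2) - loss_at(w1_minus, w2)) / (2 * eps)

print(grad_w2)     # [7200.0, 7200.0, 7200.0]
print(grad_w1)     # [[4800.0, 4800.0, 4800.0], [9600.0, 9600.0, 9600.0]]
print(numeric)     # approximately 4800
```

Because the loss is quadratic in any single weight, the central difference agrees with the analytic value up to floating-point rounding.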

3.2 Update the parameters

Gradient descent updates each weight with the rule:

w = w - \eta \frac{\partial L}{\partial w}

Update the second-layer weights:

w_{11}^{2} = w_{11}^{2} - \eta \frac{\partial L}{\partial w_{11}^{2}} = 1 - 10^{-5} * 7200 = 0.928

w_{21}^{2} = w_{21}^{2} - \eta \frac{\partial L}{\partial w_{21}^{2}} = 1 - 10^{-5} * 7200 = 0.928 

w_{31}^{2} = w_{31}^{2} - \eta \frac{\partial L}{\partial w_{31}^{2}} = 1 - 10^{-5} * 7200 = 0.928

Update the first-layer weights:

w_{11}^{1} = w_{11}^{1} - \eta \frac{\partial L}{\partial w_{11}^{1}} = 0.5 - 10^{-5} * 4800 = 0.452 

w_{12}^{1} = w_{12}^{1} - \eta \frac{\partial L}{\partial w_{12}^{1}} = 0.5 - 10^{-5} * 4800 = 0.452 

w_{13}^{1} = w_{13}^{1} - \eta \frac{\partial L}{\partial w_{13}^{1}} = 0.5 - 10^{-5} * 4800 = 0.452

w_{21}^{1} = w_{21}^{1} - \eta \frac{\partial L}{\partial w_{21}^{1}} = 0.5 - 10^{-5} * 9600 = 0.404

w_{22}^{1} = w_{22}^{1} - \eta \frac{\partial L}{\partial w_{22}^{1}} = 0.5 - 10^{-5} * 9600 = 0.404 

w_{23}^{1} = w_{23}^{1} - \eta \frac{\partial L}{\partial w_{23}^{1}} = 0.5 - 10^{-5} * 9600 = 0.404 
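The update step is a single element-wise operation per weight. A minimal sketch, assuming the gradient values computed in 3.1 are plugged in directly (`eta`, `w1_new`, and `w2_new` are names chosen for this example):

```python
eta = 1e-5  # learning rate

w2 = [1.0, 1.0, 1.0]
w1 = [[0.5, 0.5, 0.5],
      [0.5, 0.5, 0.5]]

# Gradients taken from section 3.1.
grad_w2 = [7200.0, 7200.0, 7200.0]
grad_w1 = [[4800.0, 4800.0, 4800.0],
           [9600.0, 9600.0, 9600.0]]

# w = w - eta * dL/dw, applied to every weight.
w2_new = [w - eta * g for w, g in zip(w2, grad_w2)]
w1_new = [[w - eta * g for w, g in zip(w_row, g_row)]
          for w_row, g_row in zip(w1, grad_w1)]

print([round(w, 3) for w in w2_new])                   # [0.928, 0.928, 0.928]
print([[round(w, 3) for w in row] for row in w1_new])  # [[0.452, 0.452, 0.452], [0.404, 0.404, 0.404]]
```

Note that all gradients are computed from the old weights before any weight is changed, which matches the derivation above.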

Step 4: Run the forward pass again

z_{1} = w_{11}^{1} x_{1} + w_{21}^{1} x_{2} = 0.452 * 40 + 0.404 * 80 = 50.4

z_{2} = w_{12}^{1} x_{1} + w_{22}^{1} x_{2} = 0.452 * 40 + 0.404 * 80 = 50.4

z_{3} = w_{13}^{1} x_{1} + w_{23}^{1} x_{2} = 0.452 * 40 + 0.404 * 80 = 50.4

y^{'} = w_{11}^{2} z_{1} + w_{21}^{2} z_{2} + w_{31}^{2} z_{3}= 0.928 * 50.4 * 3 = 140.3136

Step 5: Recompute the loss

Loss = \frac{1}{2}(y-y^{'})^{2} = \frac{1}{2}(60-140.3136)^{2} = 3225.137 < 7200

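The loss is now smaller than before. Repeating the steps above (forward pass, loss, gradients, update) keeps reducing it until the model converges.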
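Putting the steps together gives a small training loop. The sketch below is one possible arrangement, not a definitive implementation: the function name `train` and the choice of 10 iterations are assumptions for illustration. The first two recorded losses correspond to the values worked out by hand (7200 and about 3225.137).

```python
def train(x, y, w1, w2, eta=1e-5, steps=10):
    """Repeat forward pass, loss, gradients, and update; return the loss history."""
    w1 = [row[:] for row in w1]  # copy so the caller's lists are not modified
    w2 = w2[:]
    n_in, n_hidden = len(x), len(w2)
    losses = []
    for _ in range(steps):
        # Step 1: forward pass
        z = [sum(x[i] * w1[i][j] for i in range(n_in)) for j in range(n_hidden)]
        y_pred = sum(w2[j] * z[j] for j in range(n_hidden))
        # Step 2: loss
        losses.append(0.5 * (y - y_pred) ** 2)
        # Step 3.1: gradients, all computed from the current (old) weights
        dL_dy = y_pred - y
        grad_w2 = [dL_dy * z[j] for j in range(n_hidden)]
        grad_w1 = [[dL_dy * w2[j] * x[i] for j in range(n_hidden)]
                   for i in range(n_in)]
        # Step 3.2: gradient descent update
        w2 = [w2[j] - eta * grad_w2[j] for j in range(n_hidden)]
        w1 = [[w1[i][j] - eta * grad_w1[i][j] for j in range(n_hidden)]
              for i in range(n_in)]
    return losses, w1, w2


losses, w1_final, w2_final = train(
    x=[40.0, 80.0], y=60.0,
    w1=[[0.5] * 3, [0.5] * 3], w2=[1.0] * 3,
)
for step, value in enumerate(losses):
    print(step, round(value, 3))
```

With this learning rate the printed loss should shrink at every iteration. A much larger learning rate could overshoot and make the loss grow instead.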
