# Cross-Entropy Loss Function
Reference:
https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html#cross-entropy
Cross-entropy is a function; it is one of the loss functions used in machine learning.
The cross-entropy loss is also called the log loss. It measures the performance of a classification model whose output is a probability value between 0 and 1.
Cross-entropy loss increases as the predicted probability diverges from the actual label. For example, a predicted probability of 0.012 when the actual label is 1 would be bad and result in a high loss value. A perfect model would have a log loss of 0.
Cross-entropy and log loss are slightly different depending on the context, but in machine learning, when calculating error rates between 0 and 1, they resolve to the same thing.
In code, the binary case looks like this:

```python
from math import log

def CrossEntropy(yHat, y):
    # y is the true label (0 or 1); yHat is the predicted probability that y == 1
    if y == 1:
        return -log(yHat)
    else:
        return -log(1 - yHat)
```

Here log is the natural logarithm.
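As a quick check (the 0.012 figure comes from the example above; the 0.99 prediction is just an illustrative number I added), a prediction far from the true label yields a large loss, while a confident correct prediction yields a loss close to 0:

```python
print(CrossEntropy(0.012, 1))  # true label 1, predicted 0.012 -> about 4.42 (high loss)
print(CrossEntropy(0.99, 1))   # true label 1, predicted 0.99  -> about 0.01 (close to 0)
```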
So, for a binary classification problem, the cross-entropy of a single sample can be computed as:
$-(y \log(p) + (1-y) \log(1-p))$
where y is the sample's true label, so:
when y = 1, cross-entropy = -log(p)
when y = 0, cross-entropy = -log(1 - p)
For a multi-class problem, let the number of classes be M; then the cross-entropy of a single sample is:
$-\sum_{c=1}^{M} y_{o,c} \log(p_{o,c})$
- M: number of classes (dog, cat, fish)
- log: the natural log
- $y_{o,c}$: binary indicator (0 or 1) of whether class label c is the correct classification for observation o
- $p_{o,c}$: predicted probability that observation o is of class c
So, in essence, the cross-entropy of a sample is the negative of the natural logarithm of the probability predicted for the sample's true class, because $y_{o,c} = 0$ for every class c that is not the true one.
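Below is a minimal sketch of the multi-class case for a single sample (the function name, variable names, and the three-class probabilities are my own illustrative choices, not from the original text). With a one-hot label vector, the sum collapses to the negative log of the probability assigned to the true class:

```python
from math import log

def multiclass_cross_entropy(p, y):
    """Cross-entropy of one sample: p is the list of predicted class probabilities,
    y is a one-hot list marking the true class."""
    # Terms with y_c == 0 contribute nothing, so skip them (this also avoids log(0) for other classes).
    return -sum(y_c * log(p_c) for y_c, p_c in zip(y, p) if y_c > 0)

# Example with M = 3 classes (dog, cat, fish), true class "cat":
p = [0.1, 0.7, 0.2]  # predicted probabilities for the sample
y = [0, 1, 0]        # one-hot encoding of the true class
print(multiclass_cross_entropy(p, y))  # equals -log(0.7), about 0.357
```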
I suddenly realized that if every score between 0 and 1 is interpreted as a probability, everything seems to become much easier to understand.
## Summary
Cross-entropy is a function of a single sample; its inputs are:
- the predicted probability of each class for that sample
- the sample's true class

The cross-entropy of a sample is the negative of the natural logarithm of the probability predicted for its true class.