KL Divergence
- Full name: Kullback-Leibler divergence
- Also known as: relative entropy
- Mathematical essence: a measure of the relative difference between two probability distributions defined over the same event space
- Definition:
  $$D(p||q) = \sum_{x \in X} p(x) \log \frac{p(x)}{q(x)}$$
  where $p(x)$ and $q(x)$ are two probability distributions.
  The definition adopts the conventions $0 \log(0/q) = 0$ and $p \log(p/0) = \infty$.
- Equivalent form:
  $$D(p||q) = E_{p}\left[\log \frac{p(X)}{q(X)}\right]$$
- Notes:
  - The greater the difference between the two distributions, the larger the KL divergence.
  - When the two distributions are identical, the KL divergence is 0 (see the first sketch after this list for a numerical check).
- Corollaries (both are checked numerically in the second sketch after this list):
  - Mutual information measures how far a joint distribution is from independence:
    $$\begin{aligned} I(X;Y) &= H(X) - H(X|Y) \\ &= -\sum_{x \in X} p(x)\log p(x) + \sum_{x \in X}\sum_{y \in Y} p(x,y)\log p(x|y) \\ &= \sum_{x \in X}\sum_{y \in Y} p(x,y)\log\frac{p(x|y)}{p(x)} \\ &= \sum_{x \in X}\sum_{y \in Y} p(x,y)\log\frac{p(x,y)}{p(x)p(y)} \\ &= D[p(x,y)\,||\,p(x)p(y)] \end{aligned}$$
  - Conditional relative entropy:
    $$D[p(y|x)||q(y|x)] = \sum_{x} p(x) \sum_{y} p(y|x) \log\frac{p(y|x)}{q(y|x)}$$
  - Chain rule for relative entropy:
    $$D[p(x,y)||q(x,y)] = D[p(x)||q(x)] + D[p(y|x)||q(y|x)]$$
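
As a quick numerical illustration of the definition and notes above, here is a minimal sketch (not from the original post; the helper name `kl_divergence` and the sample distributions are my own, and it uses the natural logarithm) that applies the convention $0\log(0/q)=0$ and shows the divergence growing as $q$ moves away from $p$:

```python
# A minimal sketch, not from the original post: discrete KL divergence with numpy,
# using the natural logarithm and the convention 0*log(0/q) = 0 from the definition.
import numpy as np

def kl_divergence(p, q):
    """D(p || q) = sum_x p(x) * log(p(x) / q(x)) for two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0                      # terms with p(x) = 0 contribute 0 by convention
    with np.errstate(divide="ignore"):
        terms = p[mask] * np.log(p[mask] / q[mask])   # p(x) > 0 with q(x) = 0 gives +inf
    return float(np.sum(terms))

p = [0.5, 0.3, 0.2]
print(kl_divergence(p, p))                 # 0.0    -- identical distributions
print(kl_divergence(p, [0.4, 0.4, 0.2]))   # ~0.025 -- q close to p
print(kl_divergence(p, [0.1, 0.1, 0.8]))   # ~0.857 -- q far from p, larger divergence
```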
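
A second sketch, with toy 2x2 joint distributions of my own choosing, numerically checks the two corollaries: mutual information equals the KL divergence between the joint distribution and the product of its marginals, and the chain rule splits the joint divergence into a marginal term plus a conditional term:

```python
# A second minimal sketch with toy numbers (not from the post): checking
# I(X;Y) = D[p(x,y) || p(x)p(y)] and the chain rule of relative entropy numerically.
import numpy as np

def kl(p, q):
    """Discrete KL divergence; flattens its inputs so joint tables can be passed directly."""
    p, q = np.asarray(p, float).ravel(), np.asarray(q, float).ravel()
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

# Toy joint distribution p(x, y) over a 2x2 event space (rows index x, columns index y).
pxy = np.array([[0.30, 0.20],
                [0.10, 0.40]])
px = pxy.sum(axis=1)                      # marginal p(x)
py = pxy.sum(axis=0)                      # marginal p(y)

# Corollary 1: mutual information as the divergence of the joint from independence.
mutual_info = kl(pxy, np.outer(px, py))
print(mutual_info)                        # ~0.086 nats

# Corollary 2 (chain rule): D[p(x,y)||q(x,y)] = D[p(x)||q(x)] + D[p(y|x)||q(y|x)],
# where the conditional term is sum_x p(x) * D[p(y|x) || q(y|x)].
qxy = np.array([[0.25, 0.25],
                [0.25, 0.25]])            # a second joint distribution q(x, y)
qx = qxy.sum(axis=1)
lhs = kl(pxy, qxy)
conditional = sum(px[i] * kl(pxy[i] / px[i], qxy[i] / qx[i]) for i in range(len(px)))
rhs = kl(px, qx) + conditional
print(lhs, rhs)                           # the two sides agree up to floating point
```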