Vector Approximate Message Passing, VAMP

向量序列的收敛

x ( N ) \boldsymbol x(N) x(N)是一个 r N rN rN维的向量,其形式为
x ( N ) = ( x 1 ( N ) , … , x N ( N ) ) ∈ R r N × 1 (28) \boldsymbol x(N) = (\boldsymbol x_1(N),\ldots,\boldsymbol x_N(N)) \in \mathbb R^{rN \times 1} \tag{28} x(N)=(x1(N),,xN(N))RrN×1(28)
其中的子向量 x n ( N ) ∈ R r , ∀ n \boldsymbol x_n(N) \in \mathbb R^{r}, \forall n xn(N)Rr,n r r r可能等于1,表示块向量为标量)。我们称 x ( N ) \boldsymbol x(N) x(N)是一个块向量序列(block vector sequence),该块向量序列含 N N N块。这样的向量序列可以是确定的,也可以是随机的。多数情况下,我们会省去符号 N N N而简写为 x \boldsymbol x x

p p p阶伪Lipschitz函数的定义:给定 p ≥ 1 p \geq 1 p1 f : R s → R r \boldsymbol f: \mathbb R^s \rightarrow \mathbb R^r f:RsRr,若
∃ C > 0 ,  such that  ∥ f ( x 1 ) − f ( x 2 ) ∥ ≤ C ∥ x 1 − x 2 ∥ [ 1 + ∥ x 1 ∥ p − 1 + ∥ x 2 ∥ p − 1 ] (29) \exist C >0,\text{ such that } {\Vert \boldsymbol f(\boldsymbol x_1) - \boldsymbol f(\boldsymbol x_2) \Vert} \leq C{\Vert \boldsymbol x_1 - \boldsymbol x_2 \Vert} \left[ 1+{\Vert \boldsymbol x_1 \Vert}^{p-1} +{\Vert \boldsymbol x_2 \Vert}^{p-1} \right ] \tag{29} C>0, such that f(x1)f(x2)Cx1x2[1+x1p1+x2p1](29)

f \boldsymbol f f p p p阶的伪Lipschitz函数。若 p = 1 p=1 p=1,则伪Lipschitz连续就是标准的Lipschitz连续。

现在假设 x = x ( N ) \boldsymbol x = \boldsymbol x(N) x=x(N)是一个块序列向量。给定 p ≥ 1 p \geq 1 p1,如果存在一个随机向量 X ∈ R r \boldsymbol X \in \mathbb R^r XRr满足以下两个条件
(1) E ∣ X ∣ p < ∞ \mathbb E{|\boldsymbol X|}^p < \infty EXp<
(2)对于任意的 p p p标量伪Lipschitz连续函数,有
lim ⁡ N → ∞ 1 N ∑ n = 1 N f ( x n ( N ) ) = E [ f ( X ) ]  a.s.. (30) \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N f(x_n(N)) = \mathbb E[f(X)] \text{ a.s..} \tag{30} NlimN1n=1Nf(xn(N))=E[f(X)] a.s..(30)
那么我们说 x = x ( N ) \boldsymbol x = \boldsymbol x(N) x=x(N)满足 p p p阶矩经验收敛,可简写为
lim ⁡ N → ∞ { x n } n = 1 N = P L ( p ) X (31) \lim_{N \rightarrow \infty} {\{ x_n \}}_{n=1}^N \overset{PL\left( p \right)}{=} X \tag{31} Nlim{xn}n=1N=PL(p)X(31)
x n x_n xn即是 x n ( N ) x_n(N) xn(N)的缩略形式。重要的是,条件(2),即式(30)对于标量也成立。

均匀Lipschitz连续的定义:给定函数 ϕ ( r , γ ) \boldsymbol \phi (\boldsymbol r, \gamma) ϕ(r,γ),如果存在 L 1 , L 2 ≥ 0 L_1, L_2 \geq 0 L1,L20,以及 γ ‾ \overline{\gamma } γ的一个邻域 U U U,使得

∥ ϕ ( r 1 , γ ) − ϕ ( r 2 , γ ) ∥ ≤ L 1 ∥ r 1 − r 2 ∥   ∀ r 1 , r 1 ∈ R s ,   γ ∈ U (32) \Vert \boldsymbol \phi (\boldsymbol r_1, \gamma) - \boldsymbol \phi (\boldsymbol r_2, \gamma) \Vert \leq L_1 \Vert \boldsymbol r_1 - \boldsymbol r_2 \Vert \ \forall \boldsymbol r_1,\boldsymbol r_1 \in \mathbb R^s, \ \gamma \in U \tag{32} ϕ(r1,γ)ϕ(r2,γ)L1r1r2 r1,r1Rs, γU(32)

∥ ϕ ( r , γ 1 ) − ϕ ( r , γ 2 ) ∥ ≤ L 2 ( 1 + ∥ r ∥ ) ∣ γ 1 − γ 2 ∣   ∀ r ∈ R s , γ 1 , γ 2 ∈ U (33) \Vert \boldsymbol \phi (\boldsymbol r, \gamma_1) - \boldsymbol \phi (\boldsymbol r, \gamma_2) \Vert \leq L_2(1+\Vert \boldsymbol r \Vert)|\gamma_1 - \gamma_2| \ \forall \boldsymbol r \in \mathbb R^s, \gamma_1,\gamma_2 \in U \tag{33} ϕ(r,γ1)ϕ(r,γ2)L2(1+r)γ1γ2 rRs,γ1,γ2U(33)

则认为函数 ϕ ( r , γ ) \boldsymbol \phi (\boldsymbol r, \gamma) ϕ(r,γ) γ ‾ \overline{\gamma } γ满足均匀Lipschitz连续(Uniform Lipschitz Continuous)。

状态演进分析

大系统分析

  1. 线性测量模型
    重写系统模型为:
    y = A x 0 + w ∈ R M   w ∼ N ( 0 , γ ω 0 − 1 I N ) (34) \boldsymbol{y}=\boldsymbol{Ax}^0+\boldsymbol{w}\in \mathbb{R} ^M \ \boldsymbol{w} \sim \mathcal N(\boldsymbol 0, \gamma^{-1}_{\omega 0} \pmb I_N) \tag{34} y=Ax0+wRM wN(0,γω01IIIN)(34)
    其中 A ∈ R N × N \boldsymbol A \in \mathbb R^{N\times N} ARN×N A \boldsymbol A A已知且为方阵【为了便于分析】),高斯噪声 w \boldsymbol w w的精度为 γ ω 0 \gamma^{}_{\omega 0} γω0(注意: γ ω 0 \gamma^{}_{\omega 0} γω0表示的是真实噪声精度, γ ω \gamma^{}_{\omega } γω表示的是估计噪声精度)。
    关于矩阵 A \boldsymbol A A的假设:我们假设矩阵 A \boldsymbol A A是一个大的右旋转不变(Right-orthogonally invariant)矩阵,它的奇异值分解为
    A = U S V T ,   S = Diag ( s ) (35) \boldsymbol A = \boldsymbol U \boldsymbol S \boldsymbol V^T, \ \boldsymbol S = \text{Diag}(\boldsymbol s) \tag{35} A=USVT, S=Diag(s)(35)

其中, U , V ∈ R N × N \boldsymbol U, \boldsymbol V \in \mathbb R^{N \times N} U,VRN×N是酉阵,并且 U \boldsymbol U U是确定的, V \boldsymbol V V服从Haar分布(在正交阵集合中服从均匀分布)。

右旋转不变的解释:
p ( A ) = p ( A V 0 )  for any fixed orthogonal matrix  V 0 (36) p(\boldsymbol A) = p(\boldsymbol {AV}_0) \ \text{for any fixed orthogonal matrix } \boldsymbol V_0 \tag{36} p(A)=p(AV0) for any fixed orthogonal matrix V0(36)

如何理解长方矩阵 A \boldsymbol A A
A = U S V T = [ U 0 0 0 I ] D i a g ( [ s 0 0 ] ) V T = U 0 S 0 V T (37) \boldsymbol{A}=\boldsymbol{USV}^T=\left[ \begin{matrix} \boldsymbol{U}_{\boldsymbol{0}}& \boldsymbol{0}\\ \boldsymbol{0}& \boldsymbol{I}\\ \end{matrix} \right] \mathrm{Diag}\left( \left[ \begin{array}{c} \boldsymbol{s}_0\\ \boldsymbol{0}\\ \end{array} \right] \right) \boldsymbol{V}^T=\boldsymbol{U}_{\boldsymbol{0}}\boldsymbol{S}_0\boldsymbol{V}^T \tag{37} A=USVT=[U000I]Diag([s00])VT=U0S0VT(37)

  1. 滤波(去噪)
    在分析当中,我们认为滤波函数 g 1 ( ⋅ , γ 1 k ) \pmb {\mathrm{g}}_1 (\cdot,\gamma_{1k}) ggg1(,γ1k)一般的,不针对特定的准则,假设元素可分 g 1 ( ⋅ , γ 1 k ) {\mathrm{g}}_1 (\cdot,\gamma_{1k}) g1(,γ1k),并且 g 1 ( ⋅ , γ 1 k ) {\mathrm{g}}_1 (\cdot,\gamma_{1k}) g1(,γ1k)满足特定的Lipschitz连续条件。
  2. 逼近分布
    对于式(34)给出的模型,噪声分布已知,对矩阵 A \boldsymbol A A的奇异值分解形式(37),我们已知矩阵 U \boldsymbol U U是确定的,矩阵 V \boldsymbol V V服从Haar分布,现在只剩真实向量 x 0 \boldsymbol x^0 x0和奇异值向量 s \boldsymbol s s的分布未知。这里不去深究个中原因,直接给出以下结论:
    (1)我们假设奇异值向量 s ∈ R N \boldsymbol s \in \mathbb R^N sRN的元素,以二阶矩经验收敛于随机变量 S S S,结合式(31),得到
    lim ⁡ N → ∞ { s n } n = 1 N = P L ( 2 ) S (38) \lim_{N \rightarrow \infty} {\{ s_n \}}_{n=1}^N \overset{PL\left( 2 \right)}{=} S \tag{38} Nlim{sn}n=1N=PL(2)S(38)

并假设 E [ S ] > 0 \mathbb E[S] > 0 E[S]>0 S ∈ [ 0 , S max ] S \in [0,S_\text{max}] S[0,Smax]
(2)我们假设滤波器的初始输入 r 10 \boldsymbol r_{10} r10和真实信号 x 0 \boldsymbol x^{0} x0经验收敛如下:
lim ⁡ N → ∞ { r 10 , n , x n 0 } n = 1 N = P L ( 2 ) ( R 10 , X 0 ) (39) \lim_{N \rightarrow \infty} {\{ r_{10,n},x^0_{n} \}}_{n=1}^N \overset{PL\left( 2 \right)}{=} (R_{10},X^0) \tag{39} Nlim{r10,n,xn0}n=1N=PL(2)(R10,X0)(39)

我们考虑式(9)给出的MMSE滤波器,在考虑 x 0 \boldsymbol x^0 x0的逼近分布下,即式(39),MMSE滤波器 g 1 ( r 1 , γ 1 ) \mathrm g_1(r_1,\gamma_1) g1(r1,γ1)及其一阶导可以表示为:
g 1 ( r 1 , γ 1 ) = E [ X 0 ∣ R 1 = r 1 ] g 1 ′ ( r 1 , γ 1 ) = γ 1 var [ X 0 ∣ R 1 = r 1 ] R 1 = X 0 + P ,   P ∼ N ( 0 , γ 1 − 1 ) (40) \mathrm g_1(r_1,\gamma_1) = \mathbb E \left[ X^0| R_1=r_1\right] \\ \mathrm g^{\prime}_1(r_1,\gamma_1) = \gamma_1 \text{var} \left[ X^0| R_1=r_1\right] \\ R_1 = X^0 + P, \ P \sim \mathcal N(0,\gamma^{-1}_1) \tag{40} g1(r1,γ1)=E[X0R1=r1]g1(r1,γ1)=γ1var[X0R1=r1]R1=X0+P, PN(0,γ11)(40)

因此,MMSE滤波器及其一阶导可以通过计算后验分布 p ( X 0 ∣ R 1 = r 1 ) p(X^0|R_1 = r_1) p(X0R1=r1)的均值和方差得到。

误差函数(Error Function)

(1)对从测量值 R R R的估计 X ^ = g 1 ( R 1 , γ 1 ) \hat {X} = \mathrm{g_1}(R_1, \gamma_1) X^=g1(R1,γ1)的MSE E 1 ( γ 1 , τ 1 ) \mathcal E_1(\gamma_1, \tau_1) E1(γ1,τ1)
E 1 ( γ 1 , τ 1 ) ≔ E [ ( g 1 ( R 1 , γ 1 ) − X 0 ) 2 ] R 1 = X 0 + P ,   P ∼ N ( 0 , τ 1 ) (41) \mathcal E_1(\gamma_1, \tau_1) \coloneqq \mathbb E \left[ {(\mathrm g_1 (R_1, \gamma_1) - X^0)}^2\right] \\ R_1 = X^0 + P, \ P \sim \mathcal N(0, \tau_1) \tag{41} E1(γ1,τ1):=E[(g1(R1,γ1)X0)2]R1=X0+P, PN(0,τ1)(41)
(2)LMMSE估计器(回忆式(23))的误差函数:
E 2 ( γ 2 , τ 2 ) ≔ lim ⁡ N → ∞ 1 N E [ ∥ g 2 ( r 2 , γ 2 ) − x 0 ∥ 2 ] , r 2 = x 0 + q ,   q ∼ N ( 0 , τ 2 I ) y = A x 0 + w , w ∼ N ( 0 , γ ω 0 − 1 ) (42) \mathcal E_2(\gamma_2, \tau_2) \coloneqq \lim_{N \rightarrow \infty} \frac{1}{N} \mathbb E \left[ {\Vert \pmb {\mathrm g}_2(\boldsymbol r_2, \gamma_2 )- \boldsymbol x^0 \Vert}^2\right], \\ \boldsymbol r_2 = \boldsymbol x^0 + \boldsymbol q, \ \boldsymbol q \sim \mathcal N(0, \tau_2 \pmb I) \\ \boldsymbol y = \boldsymbol{Ax}^0+\boldsymbol w, \boldsymbol w \sim \mathcal N(0, \gamma^{-1}_{\omega 0}) \tag{42} E2(γ2,τ2):=NlimN1E[ggg2(r2,γ2)x02],r2=x0+q, qN(0,τ2III)y=Ax0+w,wN(0,γω01)(42)
τ 1 = γ 1 − 1 , τ 2 = γ 2 − 1 , γ ω = γ ω 0 \tau_1=\gamma^{-1}_1,\tau_2=\gamma^{-1}_2,\gamma_\omega = \gamma_{\omega 0} τ1=γ11,τ2=γ21,γω=γω0时,我们认为两个估计器 g 1 \pmb {\mathrm g}_1 ggg1 g 2 \pmb {\mathrm g}_2 ggg2都是“匹配的”,在该匹配条件下,使用更加简略的符号:
E 1 ( γ 1 ) ≔ E 1 ( γ 1 , γ 1 − 1 ) ,   E 2 ( γ 2 ) ≔ E 2 ( γ 2 , γ 2 − 1 ) (43) \mathcal E_1(\gamma_1) \coloneqq \mathcal E_1(\gamma^{}_1,\gamma^{-1}_1), \ \mathcal E_2(\gamma_2) \coloneqq \mathcal E_2(\gamma^{}_2,\gamma^{-1}_2) \tag{43} E1(γ1):=E1(γ1,γ11), E2(γ2):=E2(γ2,γ21)(43)

引理1:对以上两个误差函数 E 1 , E 2 \mathcal E_1, \mathcal E_2 E1,E2
(a)对式(40)所描述的MMSE滤波器,在“匹配”条件下,即 τ 1 = γ 1 − 1 \tau_1 = \gamma^{-1}_1 τ1=γ11时,误差函数就是条件方差,即,
E 1 ( γ 1 ) = var [ X 0 ∣ R 1 = X 0 + P ] ,   P ∼ N ( 0 , γ 1 − 1 ) (44) \mathcal E_1(\gamma_1) = \text{var} \left[ X^0|R_1=X^0+P\right], \ P \sim \mathcal N(0,\gamma^{-1}_1) \tag{44} E1(γ1)=var[X0R1=X0+P], PN(0,γ11)(44)

(直接由式(40)得到)
(b)LMMSE误差函数可以被表示为:
E 2 ( γ 2 , τ 2 ) = lim ⁡ N → ∞ 1 N tr [ Q − 2 Q ~ ] (45) \mathcal E_2 (\gamma_2, \tau_2) = \lim_{ N \rightarrow \infty} \frac{1}{N}\text{tr}\left[ \boldsymbol Q^{-2} \tilde {\boldsymbol Q}\right] \tag{45} E2(γ2,τ2)=NlimN1tr[Q2Q~](45)

其中,
Q ≔ γ ω A T A + γ 2 I ,   Q ~ ≔ γ ω 2 γ ω 0 A T A + τ 2 γ 2 2 I (46) \boldsymbol Q \coloneqq \gamma_{\omega} \boldsymbol A^T \boldsymbol A + \gamma_2 \pmb I, \ \tilde {\boldsymbol Q} \coloneqq \frac{\gamma^2_{\omega}}{\gamma_{\omega 0}} \boldsymbol A^T \boldsymbol A + \tau_2 \gamma^2_2 \pmb I \tag{46} Q:=γωATA+γ2III, Q~:=γω0γω2ATA+τ2γ22III(46)

当满足匹配条件 τ 2 = γ 2 − 1 \tau_2 = \gamma^{-1}_2 τ2=γ21 γ ω = γ ω 0 \gamma_\omega = \gamma_{\omega 0} γω=γω0时,
E 2 ( γ 2 ) = lim ⁡ N → ∞ 1 N tr [ Q − 1 ] (47) \mathcal E_2(\gamma_2) = \lim_{N \rightarrow \infty} \frac{1}{N} \text{tr}\left [ \boldsymbol Q^{-1} \right ] \tag{47} E2(γ2)=NlimN1tr[Q1](47)

证明(b):已知 y = A x 0 + w \boldsymbol{y}=\boldsymbol{Ax}^0+\boldsymbol{w} y=Ax0+w r 2 = x 0 + q \boldsymbol r_2 = \boldsymbol x^0 + \boldsymbol q r2=x0+q
LMMSE估计的误差:
g 2 ( r 2 , γ 2 ) − x 0 = ( γ ω A T A + γ 2 k I ) − 1 ( γ ω A T ( A x 0 + ω ) + γ 2 r 2 ) − x 0 = ( γ ω A T A + γ 2 k I ) − 1 ( γ 2 q + γ ω A T w ) = Q − 1 ( γ 2 q + γ ω A T w ) (48) \begin{aligned} \pmb {\mathrm{g}}_2(\boldsymbol r_2, \gamma_2) - \boldsymbol x^0 &= {\left( \gamma_\omega \boldsymbol A^T \boldsymbol A + \gamma_{2k} \pmb I \right)}^{-1} \left( \gamma_\omega \boldsymbol A^T (\boldsymbol {Ax}^0+\boldsymbol {\omega}) + \gamma_{2}\boldsymbol r_{2} \right) - \boldsymbol x^0 \\ &={\left( \gamma_\omega \boldsymbol A^T \boldsymbol A + \gamma_{2k} \pmb I \right)}^{-1} \left( \gamma_2 \boldsymbol q + \gamma_{\omega} \boldsymbol A^T \boldsymbol w \right) \\ &= \boldsymbol Q^{-1} \left( \gamma_2 \boldsymbol q + \gamma_{\omega} \boldsymbol A^T \boldsymbol w \right) \end{aligned} \tag{48} ggg2(r2,γ2)x0=(γωATA+γ2kIII)1(γωAT(Ax0+ω)+γ2r2)x0=(γωATA+γ2kIII)1(γ2q+γωATw)=Q1(γ2q+γωATw)(48)

g 2 ( r 2 , γ 2 ) − x 0 \pmb {\mathrm{g}}_2(\boldsymbol r_2, \gamma_2) - \boldsymbol x^0 ggg2(r2,γ2)x0的协方差矩阵为:
E [ ( g 2 ( r 2 , γ 2 ) − x 0 ) ( g 2 ( r 2 , γ 2 ) − x 0 ) T ] = Q − 1 [ γ 2 2 E [ q q T ] + γ ω 2 A E [ w w T ] A T ] Q − 1 = Q − 1 Q ~ Q − 1 (49) \begin{aligned} & \mathbb E \left [ { (\pmb {\mathrm{g}}_2(\boldsymbol r_2, \gamma_2) - \boldsymbol x^0)(\pmb {\mathrm{g}}_2(\boldsymbol r_2, \gamma_2) - \boldsymbol x^0)^T }\right ] \\ &=\boldsymbol Q^{-1} \left [ \gamma^2_2 \mathbb E [\boldsymbol q \boldsymbol q^T] + \gamma^2_{\omega} \boldsymbol A \mathbb E[\boldsymbol {ww}^T] \boldsymbol A^T \right ] \boldsymbol Q^{-1} \\ &= \boldsymbol Q^{-1} \tilde {\boldsymbol Q} \boldsymbol Q^{-1} \end{aligned} \tag{49} E[(ggg2(r2,γ2)x0)(ggg2(r2,γ2)x0)T]=Q1[γ22E[qqT]+γω2AE[wwT]AT]Q1=Q1Q~Q1(49)

将式(49)求迹代入到式(42)即可得到式(47)
证毕!
(c)LMMSE误差函数也可以被表示为(将 A \boldsymbol A A的奇异值分解代入到(b)中 i):
E 2 ( γ 2 , τ 2 ) = E [ γ ω 2 S 2 / γ ω 0 + τ 2 γ 2 2 ( γ ω S 2 + γ 2 ) 2 ] (50) \mathcal E_2(\gamma_2, \tau_2) = \mathbb E \left[ \frac{\gamma^2_{\omega} S^2/\gamma_{\omega 0} + \tau_2 \gamma^2_2}{(\gamma_{\omega} S^2 + \gamma_2)^2} \right] \tag{50} E2(γ2,τ2)=E[(γωS2+γ2)2γω2S2/γω0+τ2γ22](50)
其中 S S S是一个A的奇异值所构成的随机变量(如式(38)),对于“匹配”条件: τ 2 = γ 2 − 1 \tau_2 = \gamma^{-1}_2 τ2=γ21 γ ω = γ ω 0 \gamma_\omega = \gamma_{\omega 0} γω=γω0,有
E 2 ( γ 2 ) = E [ 1 γ ω S 2 + γ 2 ] (51) \mathcal E_2(\gamma_2) = \mathbb E \left[ \frac{1}{\gamma_\omega S^2 + \gamma_2}\right ] \tag{51} E2(γ2)=E[γωS2+γ21](51)

敏感函数(Sensitivity Function)

敏感函数描述的是估计器(滤波器)的期望散度。
(1)对于滤波器 g 1 \pmb {\mathrm g_1} g1g1g1,其敏感函数定义为:
A 1 ( γ 1 , τ 1 ) ≔ E [ g 1 ′ ( R 1 , γ 1 ) ] , R 1 = X 0 + P ,   P ∼ N ( 0 , τ 1 ) (52) A_1(\gamma_1, \tau_1) \coloneqq \mathbb E[\mathrm {g^{\prime}_1(R_1, \gamma_1)}], \\ R_1 = X^0 + P, \ P \sim \mathcal N(0, \tau_1) \tag{52} A1(γ1,τ1):=E[g1(R1,γ1)],R1=X0+P, PN(0,τ1)(52)
(2)对于LMMSE估计器,其敏感函数定义为:
A 2 ( γ 2 ) ≔ lim ⁡ N → ∞ 1 N tr [ ∂ g 2 ( r 2 , γ 2 ) ∂ r 2 ] (53) A_2(\gamma_2) \coloneqq \lim_{N \rightarrow \infty} \frac{1}{N} \text{tr} \left [ \frac{\partial \pmb {\mathrm g_2(\boldsymbol r_2, \gamma_2)}}{\partial \boldsymbol r_2} \right ] \tag{53} A2(γ2):=NlimN1tr[r2g2(r2,γ2)g2(r2,γ2)g2(r2,γ2)](53)

引理2:对于上述敏感函数:
(a) “匹配”条件下 τ 1 = γ 1 − 1 \tau_1 = \gamma^{-1}_1 τ1=γ11,的MMSE滤波器,敏感函数为
A 1 ( γ 1 , γ 1 − 1 ) = γ 1 var [ X 0 ∣ R 1 = X 0 + N ( 0 , γ 1 − 1 ) ] (54) A_1(\gamma_1, \gamma^{-1}_1) = \gamma_1 \text{var} \left [ X^0|R_1 = X^0+\mathcal N(0,\gamma^{-1}_1) \right ] \tag{54} A1(γ1,γ11)=γ1var[X0R1=X0+N(0,γ11)](54)
(2)LMMSE估计器的敏感函数
A 2 ( γ 2 ) = lim ⁡ N → ∞ 1 N γ 2 tr { Q − 1 } (55) A_2(\gamma_2 ) = \lim_{N \rightarrow \infty} \frac{1}{N} \gamma_2 \text{tr} \{ \boldsymbol Q^{-1} \} \tag{55} A2(γ2)=NlimN1γ2tr{Q1}(55)
(c)LMMSE估计器的敏感函数也可写作:
A 2 ( γ 2 ) = E [ γ 2 γ ω S 2 + γ 2 ] (56) A_2(\gamma_2) = \mathbb E \left[ \frac{\gamma_2}{ \gamma_{\omega}S^2 + \gamma_2} \right ] \tag{56} A2(γ2)=E[γωS2+γ2γ2](56)

线性约束下的正交阵

(服务于理论1的证明)
假设 V ∈ R N × N \boldsymbol V \in \mathbb R^{N \times N} VRN×N是正交阵,且满足线性约束
A = V B (57) \boldsymbol A = \boldsymbol {VB} \tag{57} A=VB(57)
其中 A , B ∈ R N × s \boldsymbol A, \boldsymbol B \in \mathbb R^{N \times s} A,BRN×s,假设矩阵 A , B \boldsymbol A, \boldsymbol B A,B列满秩,因此 s ≤ N s \leq N sN。让
U A = A ( A T A ) − 1 / 2 ,   U B = B ( B T B ) − 1 / 2 (67) \boldsymbol {U_A}=\boldsymbol A (\boldsymbol A^T \boldsymbol A)^{-1/2}, \ \boldsymbol {U_B}=\boldsymbol B (\boldsymbol B^T \boldsymbol B)^{-1/2} \tag{67} UA=A(ATA)1/2, UB=B(BTB)1/2(67)
我们记矩阵 A \boldsymbol A A的SVD分解形式为 A = E A Σ A F A T \boldsymbol A = \boldsymbol {E_A} \boldsymbol {\Sigma_A} \boldsymbol {F^T_A} A=EAΣAFAT,矩阵 B \boldsymbol B B类似,为 B = E B Σ B F B T \boldsymbol B = \boldsymbol {E_B} \boldsymbol {\Sigma_B} \boldsymbol {F^T_B} B=EBΣBFBT,其中 E A , E B ∈ R N × s \boldsymbol {E_A}, \boldsymbol {E_B} \in \mathbb R^{N \times s} EA,EBRN×s,则
U A = E A F A T ,   U B = E B F B T (68) \boldsymbol {U_A}= \boldsymbol {E_A} \boldsymbol {F^T_A}, \ \boldsymbol {U_B}= \boldsymbol {E_B} \boldsymbol {F^T_B} \tag{68} UA=EAFAT, UB=EBFBT(68)
U A ⊥ ∈ R N × ( N − s ) \boldsymbol U_{\boldsymbol A^{\bot}} \in \mathbb R^{N \times (N-s)} UARN×(Ns)表征 range ( A ) ⊥ \text{range}(\boldsymbol A)^{\bot} range(A)的正交基底所构成的矩阵(列向量正交),令 U B ⊥ ∈ R N × ( N − s ) \boldsymbol U_{\boldsymbol B^{\bot}} \in \mathbb R^{N \times (N-s)} UBRN×(Ns)表征 range ( B ) ⊥ \text{range}(\boldsymbol B)^{\bot} range(B)的正交基底所构成的矩阵(列向量正交)。注意到
range ( A ) ⊥ = ker A T (69) \text{range}(\boldsymbol A)^{\bot}=\text{ker} \boldsymbol A^T \tag{69} range(A)=kerAT(69)
A T \boldsymbol A^T AT的SVD分解为 A T = F A Σ A E A T \boldsymbol A^T=\boldsymbol F_{\boldsymbol A} \boldsymbol {\Sigma_A} \boldsymbol {E}^T_{\boldsymbol A} AT=FAΣAEAT,假设 [ E A , E A ⊥ ] ∈ R N × N [\boldsymbol E_{\boldsymbol A}, \boldsymbol E^{\bot}_{\boldsymbol A}] \in R^{N \times N} [EA,EA]RN×N构成一个正交阵,其中 E A ⊥ \boldsymbol E^{\bot}_{\boldsymbol A} EA表征 range ( E A ) ⊥ \text{range}(\boldsymbol E_{\boldsymbol A})^{\bot} range(EA)。所以有,
ker A T = range ( E A ) ⊥ = ker E A T ( range A ) ⊥ = ker ( E A T ) = ( range E A ) ⊥ (70) \text{ker} \boldsymbol A^T=\text{range} \boldsymbol (E_{\boldsymbol A})^{\bot}=\text{ker} \boldsymbol E^T_{\boldsymbol A} \\ (\text{range} \boldsymbol A)^{\bot}=\text{ker} \boldsymbol (E^T_{\boldsymbol A})^{}=(\text{range} \boldsymbol E_{\boldsymbol A} )^{\bot}\tag{70} kerAT=range(EA)=kerEAT(rangeA)=ker(EAT)=(rangeEA)(70)
同理, B \boldsymbol B B也有类似的结论,这里不做赘述。因此,可令
U A ⊥ = E A ⊥ ,   U B ⊥ = E B ⊥ (71) \boldsymbol U_{\boldsymbol A^{\bot}} = \boldsymbol E^{\bot}_{\boldsymbol A}, \ \boldsymbol U_{\boldsymbol B^{\bot}} = \boldsymbol E^{\bot}_{\boldsymbol B} \tag{71} UA=EA, UB=EB(71)
此外,定义
V ~ ≔ U A ⊥ T V U B ⊥ ∈ R ( N − s ) × ( N − s ) (72) \tilde {\boldsymbol V} \coloneqq \boldsymbol U^T_{\boldsymbol A^{\bot}} \boldsymbol V \boldsymbol U_{\boldsymbol B^{\bot}} \in \mathbb R^{(N-s) \times (N-s)} \tag{72} V~:=UATVUBR(Ns)×(Ns)(72)

引理3:在上述定义下,有如下关系式
V = A ( A T A ) − 1 B T + U A ⊥ V ~ U B ⊥ T (73) \boldsymbol V = \boldsymbol A (\boldsymbol A^T \boldsymbol A)^{-1} \boldsymbol B^T + \boldsymbol U_{\boldsymbol A^{\bot}} \tilde {\boldsymbol V} \boldsymbol U^T_{\boldsymbol B^{\bot}} \tag{73} V=A(ATA)1BT+UAV~UBT(73)
证明:令 P A ≔ U A U A T = E A E A T \boldsymbol P_{\boldsymbol A} \coloneqq \boldsymbol {U_A} \boldsymbol U^T_{\boldsymbol A}=\boldsymbol {E_A} \boldsymbol E^T_{\boldsymbol A} PA:=UAUAT=EAEAT P A ⊥ ≔ U A ⊥ U A ⊥ T = E A ⊥ ( E A ⊥ ) T \boldsymbol P^{\bot}_{\boldsymbol A} \coloneqq \boldsymbol U_{\boldsymbol A^{\bot}} \boldsymbol U^T_{\boldsymbol A^{\bot}}=\boldsymbol E^{\bot}_{\boldsymbol A} (\boldsymbol E^{\bot}_{\boldsymbol A})^T PA:=UAUAT=EA(EA)T,类似地定义 P B \boldsymbol P_{\boldsymbol B} PB P B ⊥ \boldsymbol P^{\bot}_{\boldsymbol B} PB,则
( P A + P A ⊥ ) = ∑ i ( u A , i u A , i T + u A , i ⊥ ( u A , i ⊥ ) T ) = [ E A , E A ⊥ ] ⋅ [ E A , E A ⊥ ] T = I N (74) (\boldsymbol P_{\boldsymbol A} +\boldsymbol P^{\bot}_{\boldsymbol A})=\sum_{i} \left( \boldsymbol u_{\boldsymbol A,i} \boldsymbol u^T_{\boldsymbol A,i} + \boldsymbol u^{\bot}_{\boldsymbol A,i} (\boldsymbol u^{\bot}_{\boldsymbol A,i})^T \right )=[\boldsymbol E_{\boldsymbol A}, \boldsymbol E^{\bot}_{\boldsymbol A}] \cdot [\boldsymbol E_{\boldsymbol A}, \boldsymbol E^{\bot}_{\boldsymbol A}]^T = \pmb I_N \tag{74} (PA+PA)=i(uA,iuA,iT+uA,i(uA,i)T)=[EA,EA][EA,EA]T=IIIN(74)

又因为式(57) A = V B \boldsymbol A = \boldsymbol {VB} A=VB
P B  are the projections onto range B ⟹ ∀ x ∈ R N ,   ∃ y ∈ R N  such that  P B x = B y ⟹ V P B x = V B y = A y ; range ( P A ⊥ ) = range ( A ⊥ ) ⊥ range ( A ) ⟹ P A ⊥ A y = P A ⊥ V P B x = 0 ⟹ P A ⊥ V P B = 0 \begin{aligned} & P_{\boldsymbol B} \text{ are the projections onto } \text{range} {\boldsymbol B} \\ & \Longrightarrow \forall \boldsymbol x \in \mathbb R^{N}, \ \exist \boldsymbol y \in \mathbb R^{N} \text{ such that } \boldsymbol P_{\boldsymbol B} \boldsymbol x = \boldsymbol B \boldsymbol y \\ & \Longrightarrow \boldsymbol V \boldsymbol P_{\boldsymbol B} \boldsymbol x = \boldsymbol V \boldsymbol B \boldsymbol y=\boldsymbol A \boldsymbol y; \\ & \text{range}(\boldsymbol P^{\bot}_{\boldsymbol A})=\text{range}(\boldsymbol A^{\bot}) \bot \text{range}(\boldsymbol A) \\ & \Longrightarrow \boldsymbol P^{\bot}_{\boldsymbol A} \boldsymbol A \boldsymbol y= \boldsymbol P^{\bot}_{\boldsymbol A} \boldsymbol V \boldsymbol P_{\boldsymbol B} \boldsymbol x= \boldsymbol 0 \\ & \Longrightarrow \boldsymbol P^{\bot}_{\boldsymbol A} \boldsymbol V \boldsymbol P_{\boldsymbol B}= \boldsymbol 0 \end{aligned} PB are the projections onto rangeBxRN, yRN such that PBx=ByVPBx=VBy=Ay;range(PA)=range(A)range(A)PAAy=PAVPBx=0PAVPB=0

类似地, A = V B ⟹ B = V T A \boldsymbol A = \boldsymbol {VB} \Longrightarrow \boldsymbol B=\boldsymbol V^T \boldsymbol A A=VBB=VTA,同理可得 P B ⊥ V T P A = 0 ⟹ P A V P B ⊥ = 0 \boldsymbol P^{\bot}_{\boldsymbol B} \boldsymbol V^T \boldsymbol P_{\boldsymbol A}= \boldsymbol 0 \Longrightarrow \boldsymbol P^{}_{\boldsymbol A} \boldsymbol V \boldsymbol P^{\bot}_{\boldsymbol B}= \boldsymbol 0 PBVTPA=0PAVPB=0
{ P A ⊥ V P B = 0 P A V P B ⊥ = 0 (75) \begin{cases} \boldsymbol P^{\bot}_{\boldsymbol A} \boldsymbol V \boldsymbol P_{\boldsymbol B}= \boldsymbol 0 \\ \boldsymbol P^{}_{\boldsymbol A} \boldsymbol V \boldsymbol P^{\bot}_{\boldsymbol B}= \boldsymbol 0 \\ \end{cases} \tag{75} {PAVPB=0PAVPB=0(75)

因此
V = ( P A + P A ⊥ ) V ( P B + P B ⊥ ) = P A V P B + P A ⊥ V P B ⊥ (76) \begin{aligned} \boldsymbol V &= (\boldsymbol P_{\boldsymbol A} +\boldsymbol P^{\bot}_{\boldsymbol A}) \boldsymbol V (\boldsymbol P_{\boldsymbol B} +\boldsymbol P^{\bot}_{\boldsymbol B}) \\ &= \boldsymbol P^{}_{\boldsymbol A} \boldsymbol V \boldsymbol P^{}_{\boldsymbol B}+\boldsymbol P^{\bot}_{\boldsymbol A} \boldsymbol V \boldsymbol P^{\bot}_{\boldsymbol B} \end{aligned} \tag{76} V=(PA+PA)V(PB+PB)=PAVPB+PAVPB(76)

式(76)的第一项:
P A V P B = P A V B ( B T B ) − 1 B T = P A A ( B T B ) − 1 B T = E A E A T E A Σ A F A T ( B T B ) − 1 B T = E A Σ A F A T ( B T B ) − 1 B T = A ( B T B ) − 1 B T = A ( A T A ) − 1 B T ( ⟸ A T A = B T V T V B = B T B ) (77) \begin{aligned} \boldsymbol P^{}_{\boldsymbol A} \boldsymbol V \boldsymbol P^{}_{\boldsymbol B}&=\boldsymbol P^{}_{\boldsymbol A} \boldsymbol V \boldsymbol B (\boldsymbol B^T \boldsymbol B)^{-1} \boldsymbol B^T \\ &=\boldsymbol P^{}_{\boldsymbol A} \boldsymbol A (\boldsymbol B^T \boldsymbol B)^{-1} \boldsymbol B^T \\ &=\boldsymbol {E_A} \boldsymbol E^T_{\boldsymbol A} \boldsymbol {E_A} \boldsymbol {\Sigma_A} \boldsymbol {F^T_A} (\boldsymbol B^T \boldsymbol B)^{-1} \boldsymbol B^T \\ &= \boldsymbol {E_A} \boldsymbol {\Sigma_A} \boldsymbol {F^T_A} (\boldsymbol B^T \boldsymbol B)^{-1} \boldsymbol B^T \\ &= \boldsymbol A (\boldsymbol B^T \boldsymbol B)^{-1} \boldsymbol B^T \\ &= \boldsymbol A (\boldsymbol A^T \boldsymbol A)^{-1} \boldsymbol B^T (\Longleftarrow \boldsymbol A^T \boldsymbol A=\boldsymbol B^T \boldsymbol V^T \boldsymbol V \boldsymbol B=\boldsymbol B^T \boldsymbol B) \tag{77} \end{aligned} PAVPB=PAVB(BTB)1BT=PAA(BTB)1BT=EAEATEAΣAFAT(BTB)1BT=EAΣAFAT(BTB)1BT=A(BTB)1BT=A(ATA)1BT(ATA=BTVTVB=BTB)(77)

第二项:
P A ⊥ V P B ⊥ = U A ⊥ V ~ U B ⊥ T (78) \boldsymbol P^{\bot}_{\boldsymbol A} \boldsymbol V \boldsymbol P^{\bot}_{\boldsymbol B}=\boldsymbol U_{\boldsymbol A^{\bot}} \tilde {\boldsymbol V} \boldsymbol U^T_{\boldsymbol B^{\bot}} \tag{78} PAVPB=UAV~UBT(78)

另外,可以得到矩阵 V ~ = U A ⊥ T V U B ⊥ ∈ R ( N − s ) × ( N − s ) \tilde {\boldsymbol V} = \boldsymbol U^T_{\boldsymbol A^{\bot}} \boldsymbol V \boldsymbol U_{\boldsymbol B^{\bot}} \in \mathbb R^{(N-s) \times (N-s)} V~=UATVUBR(Ns)×(Ns)是正交阵,如下所述:
V ~ T V ~ = U B ⊥ T V T P A ⊥ V U B ⊥ = U B ⊥ T V T ( I − P A ) V U B ⊥ = U B ⊥ T V T I V U B ⊥ − U B ⊥ T V T P A V U B ⊥ = I − U B ⊥ T V T P A V U B ⊥ (79) \begin{aligned} {\tilde {\boldsymbol V}}^T \tilde {\boldsymbol V}&=\boldsymbol U^T_{\boldsymbol B^{\bot}} \boldsymbol V^T \boldsymbol P^{\bot}_{\boldsymbol A} \boldsymbol V \boldsymbol U_{\boldsymbol B^{\bot}} \\ &= \boldsymbol U^T_{\boldsymbol B^{\bot}} \boldsymbol V^T (\pmb I - \boldsymbol P^{}_{\boldsymbol A}) \boldsymbol V \boldsymbol U_{\boldsymbol B^{\bot}} \\ &= \boldsymbol U^T_{\boldsymbol B^{\bot}} \boldsymbol V^T \pmb I \boldsymbol V \boldsymbol U_{\boldsymbol B^{\bot}} - \boldsymbol U^T_{\boldsymbol B^{\bot}} \boldsymbol V^T \boldsymbol P^{}_{\boldsymbol A} \boldsymbol V \boldsymbol U_{\boldsymbol B^{\bot}} \\ &= \pmb I - \boldsymbol U^T_{\boldsymbol B^{\bot}} \boldsymbol V^T \boldsymbol P^{}_{\boldsymbol A} \boldsymbol V \boldsymbol U_{\boldsymbol B^{\bot}} \end{aligned} \tag{79} V~TV~=UBTVTPAVUB=UBTVT(IIIPA)VUB=UBTVTIIIVUBUBTVTPAVUB=IIIUBTVTPAVUB(79)

要证 V ~ \tilde {\boldsymbol V} V~正交阵,只需证:
U B ⊥ T V T P A V U B ⊥ = 0 (80) \boldsymbol U^T_{\boldsymbol B^{\bot}} \boldsymbol V^T \boldsymbol P^{}_{\boldsymbol A} \boldsymbol V \boldsymbol U_{\boldsymbol B^{\bot}} = \pmb 0 \tag{80} UBTVTPAVUB=000(80)
因为  range ( B ⊥ ) ⊥ = ker ( B ⊥ ) T \text{ range}(\boldsymbol B^{\bot})^{\bot}=\text{ker}{\boldsymbol (B^{\bot}})^T  range(B)=ker(B)T,即  range ( B ) = ker ( B ⊥ ) T \text{ range}(\boldsymbol B)=\text{ker}{\boldsymbol (B^{\bot}})^T  range(B)=ker(B)T
所以只需证
range ( B ) = range ( V T P A ) (81) \text{range}(\boldsymbol B)=\text{range}(\boldsymbol V^T \boldsymbol {P_A}) \tag{81} range(B)=range(VTPA)(81)
因为 range ( A ) = range ( P A ) \text{range}(A)=\text{range}( \boldsymbol {P_A}) range(A)=range(PA),且 B = V T A \boldsymbol B=\boldsymbol V^T \boldsymbol A B=VTA
range ( B ) = range ( V T A ) = ? range ( V T P A ) ?的充分必要性证明: ⊆ : ∀ x , ∃ y , such that A x = P A y ⊇ : ∀ y , ∃ x , such that P A y = A x \text{range}(\boldsymbol B)=\text{range}(\boldsymbol V^T \boldsymbol A) \overset{?}{=} \text{range}(\boldsymbol V^T \boldsymbol {P_A}) \\ \text{?的充分必要性证明:} \\ \subseteq : \forall \boldsymbol x, \exist \boldsymbol y, \text{such that} \boldsymbol {Ax} = \boldsymbol {P_A y} \\ \supseteq : \forall \boldsymbol y, \exist \boldsymbol x, \text{such that} \boldsymbol {P_A y} = \boldsymbol {Ax} range(B)=range(VTA)=?range(VTPA)?的充分必要性证明::x,y,such thatAx=PAy:y,x,such thatPAy=Ax

因此式(81)成立,因此式(80)成立,因此式(79)成立, V ~ \tilde {\boldsymbol V} V~为正交阵。

引理4:让 V ∈ R N × N \boldsymbol V \in \mathbb R^{N \times N} VRN×N满足Haar分布。假设上面所定义的矩阵 A , B \boldsymbol A, \boldsymbol B A,B是确定的, G G G代表事件:矩阵 V \boldsymbol V V满足约束条件 A = V B \boldsymbol A = \boldsymbol {VB} A=VB,即式(57)。则给定事件 G G G下, V ~ \tilde {\boldsymbol V} V~服从Haar分布且独立于 G G G
证明:令 O N O_N ON表示 N × N N \times N N×N的正交阵的集合,令 L = { V ∈ O N  and  A = V B } \mathcal L=\{ \boldsymbol V \in O_N \text{ and } \boldsymbol A = \boldsymbol {VB}\} L={VON and A=VB}。不考虑约束条件(57),若 p V ( V ) p_{\boldsymbol V}(\boldsymbol V) pV(V)为在 O N O_N ON上的均匀概率密度函数(Haar),那么给定事件 G G G下, V \boldsymbol V V的条件概率密度函数为:
p V ∣ G ( V ∣ G ) = 1 Z p V ( V ) I { V ∈ L } (82) p_{\boldsymbol V|G}(\boldsymbol V|G)=\frac{1}{Z}p_{\boldsymbol V}(\boldsymbol V) \mathbb I_{\{ \boldsymbol V \in \mathcal L\}} \tag{82} pVG(VG)=Z1pV(V)I{VL}(82)
其中 I ( ⋅ ) \mathbb I(\cdot) I()表示指示函数, Z Z Z是归一化因子。
结合式(72)和式(73),可以得到
p V ~ ∣ G ( V ~ ∣ G ) ∝ p V ∣ G ( ϕ ( V ~ ) ∣ G ) ∝ p V ( ϕ ( V ~ ) ) I { ϕ ( V ~ ) ∈ L } = p V ( ϕ ( V ~ ) ) (83) p_{\tilde {\boldsymbol V}|G}(\tilde {\boldsymbol V}|G) \propto p_{\boldsymbol V| G}(\phi(\tilde {\boldsymbol V})|G) \propto p_{\boldsymbol V}(\phi( \tilde {\boldsymbol V})) \mathbb I_{\{ \phi( \tilde {\boldsymbol V}) \in \mathcal L\}}=p_{\boldsymbol V}(\phi( \tilde {\boldsymbol V})) \tag{83} pV~G(V~G)pVG(ϕ(V~)G)pV(ϕ(V~))I{ϕ(V~)L}=pV(ϕ(V~))(83)

式(83)最后一步是因为 ϕ ( V ~ ) ∈ L \phi( \tilde {\boldsymbol V}) \in \mathcal L ϕ(V~)L意味着约束条件(式(57))已经满足,所以去掉了指示函数。
ϕ : V ~ → V \phi: \tilde {\boldsymbol V} \rightarrow \boldsymbol V ϕ:V~V表示从 O N − s O_{N-s} ONs L \mathcal L L的映射(该映射可逆)。要证明 V ~ \tilde {\boldsymbol V} V~条件服从Haar分布,只需证
p V ~ ∣ G ( W 0 V ~ ∣ G ) = p V ~ ∣ G ( V ~ ∣ G )  for any  W 0 ∈ O N − s (84) p_{\tilde {\boldsymbol V}|G}(\boldsymbol W_0\tilde {\boldsymbol V}|G)= p_{\tilde {\boldsymbol V}|G}(\tilde {\boldsymbol V}|G) \ \text{for any } \boldsymbol W_0 \in O_{N-s} \tag{84} pV~G(W0V~G)=pV~G(V~G) for any W0ONs(84)
要证式(83),首先定义矩阵
W = U A U A T + U A ⊥ W 0 U A ⊥ T (85) \boldsymbol W = \boldsymbol U_{\boldsymbol A} \boldsymbol U^T_{\boldsymbol A} + \boldsymbol U_{\boldsymbol A^{\bot}} \boldsymbol W_0 \boldsymbol U^T_{\boldsymbol A^{\bot}} \tag{85} W=UAUAT+UAW0UAT(85)

可以证明 W \boldsymbol W W为正交阵(通过计算 W T W = I \boldsymbol W^T \boldsymbol W=\pmb I WTW=III,过程中有一个步骤需要类比式(74)),且可以验证:
ϕ ( W 0 V ~ ) = W ϕ ( V ~ ) (86) \phi(\boldsymbol W_0 \tilde {\boldsymbol V})=\boldsymbol W \phi(\tilde {\boldsymbol V}) \tag{86} ϕ(W0V~)=Wϕ(V~)(86)

(将式(67)代入式(84))。因此
p V ~ ∣ G ( W 0 V ~ ∣ G ) ∝ a p V ( ϕ ( W 0 V ~ ) ) ∝ b p V ( W ϕ ( V ~ ) ) ∝ c p V ( ϕ ( V ~ ) ) (87) p_{\tilde {\boldsymbol V}|G}(\boldsymbol W_0\tilde {\boldsymbol V}|G) \overset{a} {\propto}p_{\boldsymbol V}\left(\phi (\boldsymbol W_0 \tilde {\boldsymbol V})\right) \overset{b}{\propto}p_{\boldsymbol V}\left(\boldsymbol W \phi(\tilde {\boldsymbol V})\right) \overset{c}{\propto} p_{\boldsymbol V}\left(\phi (\tilde {\boldsymbol V})\right) \tag{87} pV~G(W0V~G)apV(ϕ(W0V~))bpV(Wϕ(V~))cpV(ϕ(V~))(87)

其中(a)由式(83)可得,(b)由式(86)可得,(c)是因为 V \boldsymbol V V的正交不变性(参考式(36))。
因此,
{ eq.(83) ⟹ p V ~ ∣ G ( V ~ ∣ G ) ∝ p V ( ϕ ( V ~ ) ) eq.(87) ⟹ p V ~ ∣ G ( W 0 V ~ ∣ G ) ∝ p V ( ϕ ( V ~ ) ) (88) \begin{cases} \text{eq.(83)} \Longrightarrow p_{\tilde {\boldsymbol V}|G}(\tilde {\boldsymbol V}|G) \propto p_{\boldsymbol V}(\phi( \tilde {\boldsymbol V})) \\ \text{eq.(87)} \Longrightarrow p_{\tilde {\boldsymbol V}|G}(\boldsymbol W_0\tilde {\boldsymbol V}|G) \propto p_{\boldsymbol V}\left(\phi (\tilde {\boldsymbol V})\right) \\ \end{cases} \tag{88} {eq.(83)pV~G(V~G)pV(ϕ(V~))eq.(87)pV~G(W0V~G)pV(ϕ(V~))(88)

因此式(84)成立,因此 V ~ \tilde {\boldsymbol V} V~服从Haar分布。

引理5:给定维度 s ≥ 0 s \geq 0 s0,若 x ( N ) \boldsymbol x(N) x(N) U ( N ) \boldsymbol U(N) U(N)是序列向量和序列矩阵(关于 N N N),那么,对任意 N N N,给定以下三个条件:
(1) U = U ( N ) ∈ R N × ( N − s ) \boldsymbol U = \boldsymbol U(N) \in \mathbb R^{N \times (N-s)} U=U(N)RN×(Ns)是一个确定的矩阵,且 U T U = I \boldsymbol U^T \boldsymbol U= \pmb I UTU=III
(2) x = x ( N ) ∈ R ( N − s ) \boldsymbol x = \boldsymbol x(N) \in \mathbb R^{ (N-s)} x=x(N)R(Ns)是一个随机向量,且分布满足各向同性(isotropically distributed)。
(3) lim ⁡ N → ∞ 1 N ∥ x ∥ 2 = τ \lim_{N \rightarrow \infty} \frac{1}{N} {\Vert \boldsymbol x \Vert}^2 = \tau limNN1x2=τ
那么,若定义 y = U x \boldsymbol y = \boldsymbol {Ux} y=Ux,则 y \boldsymbol y y的每一个元素经验收敛于(Converge empirically)一个高斯随机变量,即
lim ⁡ N → ∞ { y n } = P L ( 2 ) Y ∼ N ( 0 , τ ) (89) \lim_{N \rightarrow \infty} \{ y_n \}\overset{PL(2)}{=}Y \sim \mathcal N(0, \tau) \tag{89} Nlim{yn}=PL(2)YN(0,τ)(89)

证明:因为 x \boldsymbol x x的分布各向同性,则我们可以用一个归一化的高斯随机变量来表征,即
x = d ∥ x ∥ ∥ w 0 ∥ w 0 ,   w 0 ∼ N ( 0 , I N − s ) \boldsymbol x \overset{d}{=} \frac{\Vert \boldsymbol x \Vert}{\Vert \boldsymbol w_0 \Vert} \boldsymbol w_0, \ \boldsymbol w_0 \sim \mathcal N(\boldsymbol 0, \pmb I_{N-s}) x=dw0xw0, w0N(0,IIINs)

对任意 N N N,构建正交阵 S = [ U   U ⊥ ] \boldsymbol S=[\boldsymbol U \ \boldsymbol U_{\bot}] S=[U U]。令 w 1 ∼ N ( 0 , I s ) \boldsymbol w_1 \sim \mathcal N(\boldsymbol 0, \pmb I_s) w1N(0,IIIs),独立于 w 0 \boldsymbol w_0 w0,定义
w = [ w 0 w 1 ] ∼ N ( 0 , I N ) \boldsymbol{w}=\left[ \begin{array}{c} \boldsymbol{w}_0\\ \boldsymbol{w}_1\\ \end{array} \right] \sim \mathcal N(\boldsymbol 0, \boldsymbol I_N) w=[w0w1]N(0,IN)
根据定义,写出 y \boldsymbol y y:
y = U x = d ∥ x ∥ ∥ w 0 ∥ [ S w − U ⊥ w 1 ] (90) \boldsymbol y = \boldsymbol {Ux} \overset{d}{=} \frac{\Vert \boldsymbol x \Vert}{\Vert \boldsymbol w_0 \Vert} [\boldsymbol S \boldsymbol w - \boldsymbol U_{\bot} \boldsymbol w_1] \tag{90} y=Ux=dw0x[SwUw1](90)

不难得到
lim ⁡ N → ∞ ∥ x ∥ ∥ w 0 ∥ = lim ⁡ N → ∞ ∥ x ∥ ( N − s ) = τ \lim_{N \rightarrow \infty} \frac{\Vert \boldsymbol x \Vert}{\Vert \boldsymbol w_0 \Vert} = \lim_{N \rightarrow \infty} \frac{\Vert \boldsymbol x \Vert}{\sqrt {(N-s)}} = \sqrt \tau Nlimw0x=Nlim(Ns) x=τ

lim ⁡ N → ∞ 1 N − s ∥ U ⊥ w 1 ∥ 2 = lim ⁡ N → ∞ 1 N ∥ w 1 ∥ 2 = 0 \lim_{N \rightarrow \infty} \frac{1}{N-s}{\Vert \boldsymbol U_{\bot} \boldsymbol w_1 \Vert}^2 = \lim_{N \rightarrow \infty} \frac{1}{N} {\Vert \boldsymbol w_1 \Vert}^2=0 NlimNs1Uw12=NlimN1w12=0

Logo

有“AI”的1024 = 2048,欢迎大家加入2048 AI社区

更多推荐