F范数 低秩矩阵近似高秩矩阵

  • Post author:
  • Post category:其他




F范数定义





A

=

m

,

n

=

1

,

1

M

,

N

a

m

,

n

2

\|A\|=\sqrt{\sum_{m,n=1,1}^{M, N}a_{m,n}^2}









A







=

























m


,


n


=


1


,


1



















M


,


N





















a











m


,


n









2













































SVD-singular value decomposition




A

A






A





is a



m

×

n

m\times n






m




×








n





matrix.





A

=

U

Σ

V

=

U

[

Σ

k

0

0

0

]

V

A=U\Sigma V^*=U\begin{bmatrix} \Sigma_k&\mathbf{0} \\\mathbf{0}&\mathbf{0} \end{bmatrix}V^*






A




=








U


Σ



V






















=








U






[














Σ










k

























0































0










0





















]







V




























k

k






k





is the rank of



A

A






A





.

Then we have:





A

=

U

Σ

V

=

T

r

(

(

U

Σ

V

)

(

U

Σ

V

)

T

)

=

T

r

(

U

Σ

V

V

Σ

T

U

T

)

=

m

=

1

k

σ

m

2

=

Σ

\begin{aligned} \|A\| &=\|U\Sigma V^*\|\\ &=\sqrt{Tr((U\Sigma V^*)(U\Sigma V^*)^T)}\\ &= \sqrt{Tr(U\Sigma V^*V\Sigma^TU^T)}\\&=\sqrt{\sum_{m=1}^k\sigma_m^2}=\|\Sigma\| \end{aligned}



















A


















































=







U


Σ



V

































=












T


r


(


(


U


Σ



V




















)


(


U


Σ



V





















)










T









)


































=












T


r


(


U


Σ



V




















V



Σ










T










U










T









)


































=





















m


=


1


















k




















σ










m








2










































=







Σ



























Now we can form a rank



r

k

r \le k






r













k





matrix



A

^

\hat{A}














A







^










by setting the value



δ

r

+

1

,


,

δ

k

\delta_{r+1},\cdots, \delta_{k}







δ











r


+


1



















,











,





δ











k






















in



Σ

k

\Sigma_k







Σ










k





















to be zero, namely:





A

^

=

U

[

Σ

r

0

0

0

]

V

\hat{A}=U\begin{bmatrix} \Sigma_r&\mathbf{0} \\\mathbf{0}&\mathbf{0} \end{bmatrix}V^*














A







^









=








U






[














Σ










r

























0































0










0





















]







V

























Then the Frobenius norm



ε

r

\varepsilon_r







ε










r





















of the error matrix



(

A

A

^

)

(A-\hat{A})






(


A





















A







^







)





is given by:





ε

r

=

(

A

A

^

)

=

U

(

Σ

Σ

^

)

V

)

=

Σ

Σ

^

=

m

=

r

+

1

k

σ

m

2

\begin{aligned} \varepsilon_r&=\|(A-\hat{A})\| \\ &=\| U(\Sigma-\hat{\Sigma})V^*)\| \\ &= \| \Sigma-\hat{\Sigma}\| \\&=\sqrt{\sum_{m=r+1}^k\sigma_m^2} \end{aligned}

















ε










r































































=







(


A

















A







^







)















=







U


(


Σ

















Σ







^







)



V




















)















=







Σ

















Σ







^




















=





















m


=


r


+


1


















k




















σ










m








2






























































Assume



B

B






B





is already a matrix giving the minimum value of



ε

B

\varepsilon_B







ε










B





















and its singular value decomposition is given by:





B

=

U

b

Σ

b

V

b

=

U

b

[

Σ

b

0

0

0

]

V

b

B=U_b\Sigma_b V^*_b=U_b\begin{bmatrix} \Sigma_b&\mathbf{0} \\\mathbf{0}&\mathbf{0} \end{bmatrix}V^*_b






B




=









U










b



















Σ










b



















V










b





























=









U










b






















[














Σ










b

























0































0










0





















]







V










b
































Now we define a new matrix



C

C






C





which is given by:





C

=

U

b

H

A

V

b

=

[

C

11

C

12

C

21

C

22

]

\boldsymbol{C}=\boldsymbol{U}_{b}^{H} \boldsymbol{A} \boldsymbol{V}_{b}=\left[\begin{array}{cc} \boldsymbol{C}_{11} & \boldsymbol{C}_{12} \\ \boldsymbol{C}_{21} & \boldsymbol{C}_{22} \end{array}\right]








C






=











U













b










H





















A







V













b





















=










[


















C













1


1




























C













2


1

















































C













1


2




























C













2


2







































]







Then we have:





ε

B

=

A

B

=

U

b

H

(

A

B

)

V

b

=

C

Σ

b

=

C

11

Σ

^

b

+

C

12

+

C

21

+

C

22

\begin{aligned} \varepsilon_{B} &=\|\boldsymbol{A}-\boldsymbol{B}\| \\ &=\left\|\boldsymbol{U}_{b}^{H}(\boldsymbol{A}-\boldsymbol{B}) \boldsymbol{V}_{b}\right\| \\ &=\left\|\boldsymbol{C}-\boldsymbol{\Sigma}_{b}\right\| \\ &=\left\|\boldsymbol{C}_{11}-\hat{\mathbf{\Sigma}}_{b}\right\|+\left\|\boldsymbol{C}_{12}\right\|+\left\|\boldsymbol{C}_{21}\right\|+\left\|\boldsymbol{C}_{22}\right\| \end{aligned}

















ε











B
































































=









A













B

















=






















































U













b










H



















(




A













B




)





V













b












































































=










C














Σ













b

































=






















































C













1


1




































Σ








^
















b




































































+











C













1


2

























+











C













2


1

























+











C













2


2











































since



B

\boldsymbol{B}








B







is already a matrix giving the minimum value of



ε

B

,

\varepsilon_{B},







ε











B



















,





we must have



C

12

=

0

\boldsymbol{C}_{12}=\mathbf{0}









C













1


2





















=









0






Otherwise, we will be able to construct a new rank



r

r






r





matrix



B

^

\hat{\boldsymbol{B}}
















B









^










, given by:





B

^

=

U

b

[

Σ

^

b

C

12

0

0

]

V

b

H

\hat{\boldsymbol{B}}=\boldsymbol{U}_{b}\left[\begin{array}{cc} \hat{\boldsymbol{\Sigma}}_{b} & \boldsymbol{C}_{12} \\ \boldsymbol{0} & \boldsymbol{0} \end{array}\right] \boldsymbol{V}_{b}^{H}
















B









^









=











U













b























[


























Σ









^
















b



























0


































C













1


2



























0
























]









V













b










H
























so that the new Frobenius norm:





ε

B

^

=

A

B

^

=

U

b

H

(

A

B

^

)

V

b

=

C

11

Σ

^

b

+

C

21

+

C

22

\begin{aligned} \varepsilon_{\hat{B}} &=\|\boldsymbol{A}-\hat{\boldsymbol{B}}\| \\ &=\left\|\boldsymbol{U}_{b}^{H}(\boldsymbol{A}-\hat{\boldsymbol{B}}) \boldsymbol{V}_{b}\right\| \\ &=\left\|\boldsymbol{C}_{11}-\hat{\mathbf{\Sigma}}_{b}\right\|+\left\|\boldsymbol{C}_{21}\right\|+\left\|\boldsymbol{C}_{22}\right\| \end{aligned}

















ε



















B








^
































































=









A





















B









^




















=






















































U













b










H



















(




A





















B









^







)





V













b












































































=






















































C













1


1




































Σ








^
















b




































































+











C













2


1

























+











C













2


2













































will be smaller than



ε

B

,

\varepsilon_{B},







ε











B



















,





which contradicts the assumption that



B

B






B





gives the minimum value. In the same way, we have



C

21

=

0

C_{21}=0







C











2


1





















=








0





and



C

11

=

Σ

^

b

.

C_{11}=\hat{\mathbf{\Sigma}}_{b} .







C











1


1





















=


















Σ








^
















b



















.





Then we have:





C

=

U

b

H

A

V

b

=

[

Σ

^

b

0

0

C

22

]

\boldsymbol{C}=\boldsymbol{U}_{b}^{H} \boldsymbol{A} \boldsymbol{V}_{b}=\left[\begin{array}{cc} \hat{\boldsymbol{\Sigma}}_{b} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{C}_{22} \end{array}\right]








C






=











U













b










H





















A







V













b





















=










[


























Σ









^
















b



























0

































0













C













2


2







































]









since



Σ

^

b

\hat{\mathbf{\Sigma}}_{b}
















Σ








^
















b






















is diagonal, it consists of



r

r






r





singular values of



A

.

\boldsymbol{A} .








A




.





We can get:





ε

B

=

C

Σ

b

=

C

22

\begin{aligned} \varepsilon_{B} &=\left\|\boldsymbol{C}-\boldsymbol{\Sigma}_{b}\right\| \\ &=\left\|\boldsymbol{C}_{22}\right\| \end{aligned}

















ε











B




















































=










C














Σ













b

































=











C













2


2













































since both



U

b

U_{b}







U











b






















and



V

b

V_{b}







V











b






















are unitary matrices, we have:





A

2

=

C

2

=

Σ

^

b

2

+

C

22

2

\|\boldsymbol{A}\|^{2}=\|\boldsymbol{C}\|^{2}=\left\|\hat{\boldsymbol{\Sigma}}_{b}\right\|^{2}+\left\|\boldsymbol{C}_{22}\right\|^{2}











A

















2












=













C

















2












=



































































Σ









^
















b











































































2












+
















C













2


2
































2















Then:





c

x

2

=

A

2

Σ

^

b

2

=

m

=

1

k

σ

m

2

Σ

^

b

2

\begin{aligned} \left\|c_{x}\right\|^{2} &=\|A\|^{2}-\left\|\hat{\boldsymbol{\Sigma}}_{b}\right\|^{2}\\ &=\sum_{m=1}^{k} \sigma_{m}^{2}-\left\|\hat{\boldsymbol{\Sigma}}_{b}\right\|^{2} \end{aligned}






















c











x
































2











































=







A















2












































































Σ









^
















b











































































2




















=













m


=


1



















k





















σ











m










2





















































































Σ









^
















b











































































2
































Obviously, when



Σ

^

b

\hat{\mathbf{\Sigma}}_{b}
















Σ








^
















b






















holds the



r

r






r





largest singular values



σ

1

,

,

σ

r

\sigma_{1}, \ldots, \sigma_{r}







σ











1



















,









,





σ











r






















of the matrix



A

\boldsymbol{A}








A










C

22

2

\left\|\boldsymbol{C}_{22}\right\|^{2}














C













2


2
































2













and then



ε

B

\varepsilon_{B}







ε











B






















reaches its minimum value:





ε

B

2

=

C

22

2

=

m

=

r

+

1

k

σ

m

2

=

ε

r

2

\varepsilon_{B}^{2}=\left.\left|\boldsymbol{C}_{22}\right|\right|^{2}=\sum_{m=r+1}^{k} \sigma_{m}^{2}=\varepsilon_{r}^{2}







ε











B










2





















=



















C













2


2




































2












=

















m


=


r


+


1



















k





















σ











m










2





















=









ε











r










2
























Therefore, we can draw the conclusion that



A

^

\hat{A}














A







^










is the best rank



r

r






r





approximation to



A

A






A





based on minimisation of the error matrix’ Frobenius norm



A

B

\|\boldsymbol{A}-\boldsymbol{B}\|











A

















B











版权声明:本文为qq_40379678原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。