Canonical Correlation Analysis

데먕 2020. 1. 25. 17:57

1. Overview

Canonical Correlation Analysis (CCA) is a useful basis for prediction models because it captures the dependency between input and output data, which can help reduce prediction error. CCA finds pairs of basis vectors, one for each of two variables x and y, such that the projections of x and y onto them are maximally correlated. When regression is performed in this reduced space, the fitting error stays small because the two projected variables are highly correlated. $\bar{X}$ and $\bar{Y}$ denote the reduced variables x and y, respectively. After some substitution, the problem takes the form of the CCA objective given in Section 2.1, and the pairs of basis vectors that solve it are obtained by singular value decomposition (SVD).

2. Description

2.1 CCA Formulation

$$\underset{u\in \mathbb{R}^{p},\,v\in \mathbb{R}^{q}}{\arg\max}\ \frac{u^{T}X^{T}Yv}{\sqrt{(u^{T}X^{T}Xu)(v^{T}Y^{T}Yv)}}$$

X is $n\times p$: n samples in p-dimensional space

Y is $n\times q$: n samples in q-dimensional space

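The n samples are paired: row i of X and row i of Y belong to the same observation. u and v are the basis (weight) vectors for X and Y, respectively.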
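The substitution mentioned in the overview can be made explicit. Assuming the columns of X and Y are centered and that $X^{T}X$ and $Y^{T}Y$ are invertible, set $a=(X^{T}X)^{1/2}u$ and $b=(Y^{T}Y)^{1/2}v$. The objective then becomes

$$\frac{a^{T}Kb}{\lVert a\rVert\,\lVert b\rVert},\qquad K=(X^{T}X)^{-1/2}\,X^{T}Y\,(Y^{T}Y)^{-1/2}$$

which is maximized by the leading left and right singular vectors of K, and the singular values are the canonical correlations. The following is a minimal NumPy sketch of that idea, not a reference implementation. The function names `inv_sqrt` and `cca`, the small ridge term `reg`, and the synthetic data are illustrative assumptions that do not come from the original post.

```python
import numpy as np

def inv_sqrt(S, eps=1e-10):
    """Inverse square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(S)
    w = np.maximum(w, eps)          # guard against tiny or negative eigenvalues
    return (V / np.sqrt(w)) @ V.T   # V diag(1/sqrt(w)) V^T

def cca(X, Y, reg=1e-8):
    """Return (U, V, s): basis vectors as columns and canonical correlations."""
    X = X - X.mean(axis=0)          # correlation assumes centered columns
    Y = Y - Y.mean(axis=0)
    # reg is a small ridge term (an assumption) to keep the matrices invertible
    Sxx = X.T @ X + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y
    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    K = Sxx_is @ Sxy @ Syy_is       # whitened cross-covariance
    A, s, Bt = np.linalg.svd(K, full_matrices=False)
    U = Sxx_is @ A                  # undo the substitution: u = Sxx^{-1/2} a
    V = Syy_is @ Bt.T               # v = Syy^{-1/2} b
    return U, V, s

# Synthetic paired data: X and Y share one latent signal z
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(n, 1)), rng.normal(size=(n, 2))])
Y = np.hstack([rng.normal(size=(n, 1)), z + 0.1 * rng.normal(size=(n, 1))])

U, V, s = cca(X, Y)
x_bar = (X - X.mean(axis=0)) @ U[:, 0]   # reduced variable for X
y_bar = (Y - Y.mean(axis=0)) @ V[:, 0]   # reduced variable for Y
r = np.corrcoef(x_bar, y_bar)[0, 1]
print("first canonical correlation:", s[0])
print("correlation of projections: ", r)
```

The two printed numbers should agree closely, since the first singular value of K is the correlation between the first pair of projections. Here X is $500\times 3$ and Y is $500\times 2$, so at most two canonical pairs are returned.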

3. References

https://en.wikipedia.org/wiki/Canonical_correlation

https://www.youtube.com/watch?v=hL1pyyq8-Y4

http://users.stat.umn.edu/~helwig/notes/cancor-Notes.pdf

https://stats.idre.ucla.edu/r/dae/canonical-correlation-analysis/

https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Canonical_Correlation.pdf

https://www.slideserve.com/wauna/canonical-correlation-analysis-for-feature-reduction