How is the complexity of PCA O(min(p^3,n^3))?

I've been reading a paper on Sparse PCA, which is: http://stats.stanford.edu/~imj/WEBLIST/AsYetUnpub/sparse.pdf

It states that if you have n data points, each represented with p features, then the complexity of PCA is O(min(p^3, n^3)).

Can someone please explain how/why?

asked Dec 10 '13 by GrowinMan

People also ask

What is the complexity of PCA?

So, the complexity of PCA is O(p^2 n + p^3).

Is PCA computationally intensive?

Principal Component Analysis (PCA) is an unsupervised linear dimensionality-reduction and data-visualization technique for very high-dimensional data. High-dimensional data is hard to gain insights from, and, on top of that, it is very computationally intensive to process.

Which error is minimal in case of PCA?

In the minimum-error formulation, PCA is defined as the linear projection that minimises the average projection cost (mean squared error) between the data points and their projections. This formulation and the maximum-variance formulation lead to the same solution.
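As a quick numeric check (a minimal NumPy sketch; the data and variable names are invented for illustration), the mean squared reconstruction error of the top-k PCA projection equals the total variance in the discarded directions, i.e. the sum of the dropped eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 500, 10, 3
X = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))  # correlated features
X -= X.mean(axis=0)                      # center

# Principal axes from the covariance eigendecomposition (1/n convention)
eigvals, eigvecs = np.linalg.eigh(X.T @ X / n)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
W = eigvecs[:, :k]                       # top-k principal axes

# Mean squared reconstruction error of the rank-k projection...
mse = np.mean(np.sum((X - X @ W @ W.T) ** 2, axis=1))

# ...equals the variance in the discarded directions
print(mse, eigvals[k:].sum())            # the two numbers agree
```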

Is PCA good for time series?

PCA, as a data transformation, dimensionality reduction, exploration, and visualization tool, does not make any assumptions. You can run it on any data whatsoever, including time series data. In fact, PCA is very often applied for time series data (sometimes it is called "functional PCA", sometimes not).


2 Answers

Covariance matrix computation is O(p^2 n); its eigenvalue decomposition is O(p^3). So, the complexity of PCA is O(p^2 n + p^3).

O(min(p^3, n^3)) would imply that you could analyze a two-dimensional dataset of any size in fixed time, which is patently false.
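To make the two terms concrete, here is a minimal NumPy sketch of that pipeline (array shapes and names are my own); the comments mark the O(p^2 n) and O(p^3) steps:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 50                      # n samples, p features
X = rng.standard_normal((n, p))

# Center the data and form the p x p covariance matrix: O(p^2 n)
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / (n - 1)

# Eigendecomposition of the symmetric covariance matrix: O(p^3)
eigvals, eigvecs = np.linalg.eigh(C)

# Sort by decreasing explained variance; project the data
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order]       # principal axes as columns
scores = Xc @ components             # PCA scores
```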

answered Sep 21 '22 by Don Reba


Assuming your dataset is $X \in \mathbb{R}^{n \times p}$, where $n$ is the number of samples and $p$ is the dimensionality of a sample, the main computational cost of PCA is the eigenanalysis of $X^T X$. Now, the matrices $X^T X \in \mathbb{R}^{p \times p}$ and $X X^T \in \mathbb{R}^{n \times n}$ have the same $\min(n, p)$ non-negative eigenvalues, and the eigenvectors of one can be obtained from the eigenvalues and eigenvectors of the other. If $p < n$, you can solve the eigenanalysis in $O(p^3)$. If $p > n$ (for example, in computer vision the dimensionality of a sample, i.e. the number of pixels, often exceeds the number of samples available), you can perform the eigenanalysis in $O(n^3)$ time. Either way, converting between the two decompositions costs only matrix products, so the whole eigenanalysis takes $O(\min(p, n)^3)$ time via the relations below (sketched in code afterwards):

$$X^TX = V \Lambda V^T$$

$$XX^T = U \Lambda U^T$$

$$U = XV\Lambda^{-1/2}$$
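A minimal NumPy sketch of this trick (names are my own) for the $p > n$ case: eigendecompose the small $n \times n$ Gram matrix, then map its eigenvectors back through $X^T$, which is the relation above read in the other direction, $V = X^T U \Lambda^{-1/2}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 500                       # p > n: the n x n Gram matrix is the small one
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)                  # center

# Eigenanalysis of the small n x n matrix X X^T: O(n^3)
lam, U = np.linalg.eigh(X @ X.T)

# Keep the nonzero eigenvalues (at most min(n, p); centering removes one)
keep = lam > 1e-10 * lam.max()
lam, U = lam[keep], U[:, keep]

# Recover the eigenvectors of the p x p matrix X^T X: V = X^T U Lambda^{-1/2}
V = X.T @ U / np.sqrt(lam)

# Sanity checks: columns of V are unit-norm eigenvectors of X^T X
assert np.allclose((X.T @ X) @ V, V * lam)
assert np.allclose(np.linalg.norm(V, axis=0), 1.0)
```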

answered Sep 19 '22 by michaelt