I'm trying to figure out how to use PCA to decorrelate an RGB image in Python. I'm using the code found in the O'Reilly computer vision book:
from PIL import Image
from numpy import *

def pca(X):
    # Principal Component Analysis
    # input: X, matrix with training data as flattened arrays in rows
    # return: projection matrix (with important dimensions first),
    #         variance and mean

    # get dimensions
    num_data, dim = X.shape

    # center data
    mean_X = X.mean(axis=0)
    for i in range(num_data):
        X[i] -= mean_X

    if dim > 100:
        print('PCA - compact trick used')
        M = dot(X, X.T)         # covariance matrix
        e, EV = linalg.eigh(M)  # eigenvalues and eigenvectors
        tmp = dot(X.T, EV).T    # this is the compact trick
        V = tmp[::-1]           # reverse since last eigenvectors are the ones we want
        S = sqrt(e)[::-1]       # reverse since eigenvalues are in increasing order
    else:
        print('PCA - SVD used')
        U, S, V = linalg.svd(X)
        V = V[:num_data]        # only makes sense to return the first num_data

    # return the projection matrix, the variance and the mean
    return V, S, mean_X
I know I need to flatten my image, but its shape is 512x512x3. Will the dimension of 3 throw off my result? How do I truncate this? And how do I get a quantitative measure of how much information is retained?
If there are three bands (which is the case for an RGB image), you need to reshape your image like
X = X.reshape(-1, 3)
In your case of a 512x512 image, the new X will have shape (262144, 3). The dimension of 3 will not throw off your result; that dimension represents the features in the image data space. Each row of X is a sample/observation and each column is a variable/feature.
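As a minimal sketch of that step (the file name 'lena.png' is just a placeholder, and the data is centred into a copy rather than in place), loading and reshaping might look like this. Note that np.linalg.svd defaults to full_matrices=True, which for a (262144, 3) input would try to build a 262144x262144 U matrix, so an economy-size SVD is used here; the S and V it returns are the same ones the pca() function above would compute in its SVD branch.

from PIL import Image
import numpy as np

# Load the RGB image; 'lena.png' is a placeholder for your own 512x512 file.
im = np.asarray(Image.open('lena.png'), dtype=np.float64)   # shape (512, 512, 3)

# One row per pixel, one column per colour channel.
X = im.reshape(-1, 3)                                        # shape (262144, 3)

# Center the data, then take an economy-size SVD (full_matrices=False) so
# NumPy does not allocate a 262144x262144 U matrix.
mean_X = X.mean(axis=0)
Xc = X - mean_X
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)            # Vt has shape (3, 3)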
The S returned by the function holds singular values (in the compact-trick branch it is the square root of the eigenvalues), and the variance along each principal component is proportional to the square of the corresponding singular value. The total variance in the image is therefore proportional to np.sum(S**2). How much variance you retain depends on which eigenvalues/eigenvectors you keep: if you keep only the first one, the fraction of image variance retained is

f = S[0]**2 / np.sum(S**2)
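Continuing the sketch above (with S, Vt, Xc, and mean_X from the economy-size SVD of the centred pixel matrix), the retained-variance fractions and the actual decorrelation step might look like this:

# Fraction of the total variance carried by each principal component.
explained = S**2 / np.sum(S**2)      # three numbers summing to 1
print(explained)

# Decorrelate: project the centred pixels onto the principal axes.
Y = Xc @ Vt.T                        # shape (262144, 3)
print(np.cov(Y, rowvar=False))       # off-diagonal entries should be ~0

# Reconstruct from only the first k components to see how much of the
# image that fraction of the variance actually retains.
k = 1
approx = Y[:, :k] @ Vt[:k, :] + mean_X
im_k = approx.reshape(512, 512, 3).clip(0, 255).astype(np.uint8)
Image.fromarray(im_k).save('reconstructed_k1.png')   # placeholder output name

The columns of Y are the decorrelated channels you are after, and the closer explained[0] is to 1, the more of the image a single component captures.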