I'm trying to figure out how to use PCA to decorrelate an RGB image in Python. I'm using the code found in the O'Reilly computer vision book:
from PIL import Image
from numpy import *

def pca(X):
    # Principal Component Analysis
    # input: X, matrix with training data as flattened arrays in rows
    # return: projection matrix (with important dimensions first),
    #         variance and mean

    # get dimensions
    num_data, dim = X.shape

    # center data
    mean_X = X.mean(axis=0)
    for i in range(num_data):
        X[i] -= mean_X

    if dim > 100:
        print('PCA - compact trick used')
        M = dot(X, X.T)         # covariance matrix
        e, EV = linalg.eigh(M)  # eigenvalues and eigenvectors
        tmp = dot(X.T, EV).T    # this is the compact trick
        V = tmp[::-1]           # reverse since last eigenvectors are the ones we want
        S = sqrt(e)[::-1]       # reverse since eigenvalues are in increasing order
    else:
        print('PCA - SVD used')
        U, S, V = linalg.svd(X)
        V = V[:num_data]        # only makes sense to return the first num_data

    # return the projection matrix, the variance and the mean
    return V, S, mean_X
I know I need to flatten my image, but its shape is 512x512x3. Will the dimension of 3 throw off my result? How do I truncate this? And how do I get a quantitative measure of how much information is retained?
If there are three bands (which is the case for an RGB image), you need to reshape your image like
X = X.reshape(-1, 3)
In your case of a 512x512 image, the new X will have shape (262144, 3). The dimension of 3 will not throw off your result; that dimension represents the features in the image data space. Each row of X is a sample/observation and each column is a variable/feature.
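As a minimal sketch of that step (the file name 'lena.png' is just a placeholder, and the data is centred into a copy rather than in place), loading and reshaping might look like this. Note that np.linalg.svd defaults to full_matrices=True, which for a (262144, 3) input would try to build a 262144x262144 U matrix, so an economy-size SVD is used here; the S and V it returns are the same ones the pca() function above would compute in its SVD branch.

from PIL import Image
import numpy as np

# Load the RGB image; 'lena.png' is a placeholder for your own 512x512 file.
im = np.asarray(Image.open('lena.png'), dtype=np.float64)   # shape (512, 512, 3)

# One row per pixel, one column per colour channel.
X = im.reshape(-1, 3)                                        # shape (262144, 3)

# Center the data, then take an economy-size SVD (full_matrices=False) so
# NumPy does not allocate a 262144x262144 U matrix.
mean_X = X.mean(axis=0)
Xc = X - mean_X
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)            # Vt has shape (3, 3)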
The S returned by the function holds singular values (in the compact-trick branch it is the square root of the eigenvalues), and the variance along each principal component is proportional to the square of the corresponding singular value. The total variance in the image is therefore proportional to np.sum(S**2). How much variance you retain depends on which eigenvalues/eigenvectors you keep: if you keep only the first one, the fraction of image variance retained is

f = S[0]**2 / np.sum(S**2)
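Continuing the sketch above (with S, Vt, Xc, and mean_X from the economy-size SVD of the centred pixel matrix), the retained-variance fractions and the actual decorrelation step might look like this:

# Fraction of the total variance carried by each principal component.
explained = S**2 / np.sum(S**2)      # three numbers summing to 1
print(explained)

# Decorrelate: project the centred pixels onto the principal axes.
Y = Xc @ Vt.T                        # shape (262144, 3)
print(np.cov(Y, rowvar=False))       # off-diagonal entries should be ~0

# Reconstruct from only the first k components to see how much of the
# image that fraction of the variance actually retains.
k = 1
approx = Y[:, :k] @ Vt[:k, :] + mean_X
im_k = approx.reshape(512, 512, 3).clip(0, 255).astype(np.uint8)
Image.fromarray(im_k).save('reconstructed_k1.png')   # placeholder output name

The columns of Y are the decorrelated channels you are after, and the closer explained[0] is to 1, the more of the image a single component captures.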