How to compute the "EMD" between 2 numpy arrays, i.e. histograms, using OpenCV?

Since I'm new to OpenCV, I don't know how to use the cv.CalcEMD2 function with numpy arrays.
I have two arrays:

a=[1,2,3,4,5]  
b=[1,2,3,4]

How can I convert a numpy array to a CvHistogram, and from a CvHistogram to the signature parameter of the function?

I would like anyone who answers the question to explain any OpenCV functions used in the provided solution.

"EMD" == earth mover's distance.

Update:
It would also be helpful if anyone could tell me how to set the cv.CalcEMD2 "signature" parameter using a numpy array.

Note:
* For those who may be interested in this question, this answer needs more testing.

Someone Someoneelse asked Mar 29 '13 15:03


3 Answers

I know the OP wanted to measure Earth Mover's Distance using OpenCV, but if you'd like to do so using Scipy, you can use the following (Wasserstein Distance is also known as Earth Mover's Distance):

import cv2
import numpy as np
from scipy.stats import wasserstein_distance

def get_histogram(img):
  '''
  Get the histogram of an image. For an 8-bit, grayscale image, the
  histogram will be a 256 unit vector in which the nth value indicates
  the fraction of the pixels in the image with the given darkness level.
  The histogram's values sum to 1.
  '''
  h, w = img.shape
  hist = [0.0] * 256
  for i in range(h):
    for j in range(w):
      hist[img[i, j]] += 1
  return np.array(hist) / (h * w)

# scipy.ndimage.imread was removed from SciPy, so read the images as
# 8-bit grayscale with OpenCV instead
a = cv2.imread('a.jpg', cv2.IMREAD_GRAYSCALE)
b = cv2.imread('b.jpg', cv2.IMREAD_GRAYSCALE)
a_hist = get_histogram(a)
b_hist = get_histogram(b)
# The first two arguments are the values (gray levels 0..255); the
# histograms are passed as the weights of those values
levels = np.arange(256)
dist = wasserstein_distance(levels, levels, a_hist, b_hist)
print(dist)
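
As a quick sanity check of how wasserstein_distance interprets its arguments, here is a small sketch with two made-up three-bin histograms (bins, p and q are illustrative names, not part of the answer). The first two positional arguments are treated as sample values, so histogram counts belong in the u_weights and v_weights arguments:

```python
import numpy as np
from scipy.stats import wasserstein_distance

bins = np.arange(3)            # bin positions 0, 1, 2
p = np.array([1.0, 0.0, 0.0])  # all mass in bin 0
q = np.array([0.0, 0.0, 1.0])  # all mass in bin 2

# Histograms passed as weights of the bin positions: the mass moves 2 bins
d_weighted = wasserstein_distance(bins, bins, p, q)

# Histograms passed as values: both are the multiset {0, 0, 1},
# so the two "samples" look identical
d_values = wasserstein_distance(p, q)

print(d_weighted, d_values)
```

The two calls disagree because the second one compares the distributions of bin heights rather than the distributions over bin positions.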
duhaime answered Nov 03 '22 04:11


You have to define your arrays in terms of weights and coordinates. If you have two arrays a = [1,1,0,0,1] and b = [0,1,0,1] that represent one-dimensional histograms, then the numpy arrays should look like this:

a = [[1 1]
     [1 2]
     [0 3]
     [0 4]
     [1 5]]

b = [[0 1]
     [1 2]
     [0 3]
     [1 4]]

Notice that the number of rows can be different. The number of columns should be the dimensions + 1. The first column contains the weights, and the second column contains the coordinates.
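
A minimal sketch of building such an array from a one-dimensional histogram with numpy. The helper name histogram_to_signature is hypothetical, and it assumes the coordinates are simply the bin indices 1..n, as in the arrays above:

```python
import numpy as np

def histogram_to_signature(hist):
    """Return an (n, 2) float32 array whose rows are [weight, coordinate].

    Hypothetical helper: the coordinates are assumed to be the bin
    indices 1..n, matching the layout of the arrays shown above.
    """
    hist = np.asarray(hist, dtype=np.float32)
    coords = np.arange(1, hist.size + 1, dtype=np.float32)
    return np.column_stack((hist, coords))

sig_a = histogram_to_signature([1, 1, 0, 0, 1])
sig_b = histogram_to_signature([0, 1, 0, 1])
print(sig_a)
print(sig_b)
```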

The next step is to convert your arrays to a CV_32FC1 Mat before you input the numpy array as a signature to the CalcEMD2 function. The code would look like this:

import cv2.cv as cv  # legacy OpenCV 2.x Python API
import numpy as np

# Initialize a and b numpy arrays; each row is [weight, coordinate]
a = np.array([[1, 1], [1, 2], [0, 3], [0, 4], [1, 5]], dtype=np.float64)
b = np.array([[0, 1], [1, 2], [0, 3], [1, 4]], dtype=np.float64)

# Convert from numpy array to CV_32FC1 Mat
a64 = cv.fromarray(a)
a32 = cv.CreateMat(a64.rows, a64.cols, cv.CV_32FC1)
cv.Convert(a64, a32)

b64 = cv.fromarray(b)
b32 = cv.CreateMat(b64.rows, b64.cols, cv.CV_32FC1)
cv.Convert(b64, b32)

# Calculate Earth Mover's
print cv.CalcEMD2(a32,b32,cv.CV_DIST_L2)

# Wait for key
cv.WaitKey(0)

Notice that the third parameter of CalcEMD2 is the Euclidean Distance CV_DIST_L2. Another option for the third parameter is the Manhattan Distance CV_DIST_L1.

I would also like to mention that I wrote the code to calculate the Earth Mover's distance of two 2D histograms in Python. You can find this code here.

Jaime Ivan Cervantes answered Nov 03 '22 03:11


cv.CalcEMD2 expects arrays that also include the weight for each signal, according to the documentation.

I would suggest defining your arrays with a weight of 1 for each value, with the weight in the first column, like so:

from numpy import array
a=array([[1,1],[1,2],[1,3],[1,4],[1,5]])
b=array([[1,1],[1,2],[1,3],[1,4]])
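
For longer inputs the unit-weight column can be generated rather than typed out. A small sketch, assuming the weight-first row layout that the OpenCV documentation describes for signatures (samples_to_signature is a hypothetical helper name):

```python
import numpy as np

def samples_to_signature(values):
    """Hypothetical helper: give every value a weight of 1."""
    values = np.asarray(values, dtype=np.float32)
    weights = np.ones_like(values)
    # Rows of [weight, coordinate]
    return np.column_stack((weights, values))

a = samples_to_signature([1, 2, 3, 4, 5])
b = samples_to_signature([1, 2, 3, 4])
print(a.shape, b.shape)
```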
Bas Jansen answered Nov 03 '22 05:11