Digital image processing of corn kernels

Question

I am trying to distinguish insect-infested corn kernels from good (healthy) ones and count them. So far I have thresholded the image and drawn contours around all the corn kernels in it.

insect infested (with hole and fading yellow color) and good corn kernels

FYI, the insect-infested kernels have holes and fading yellow color. How should I get the percentage of infested kernels from an image with the infested and good kernels? I am also open to other suggestions.

Tomer Geva · Accepted Answer

I will offer a solution that implements one of the most fundamental ideas in image processing: feature representation of objects. In the following example I will show how we can:

Remove the background of the corn kernels
Extract the centroid location of each corn kernel using Green's theorem
Convert each corn kernel from an RGB Region of interest to a histogram
Allocate similar labels to similar kernels using the histogram representation of each kernel and the k-means algorithm.

I will walk through the stages of the algorithm along with the results of each stage; the full code is attached at the end.

Project infrastructure

Our little project will be conveniently organized under the CornClassifier class. The first stage is to import the needed libraries and set up the __init__() method.

Each of the parameters defined in the __init__() method will be used during the implementation.

To get things started, we will first read the image and store it in the CornClassifier attributes for convenience, both in color and in grayscale. To do so we write the load_image function, which makes our class infrastructure look as follows:

class CornClassifier:
    def __init__(self, image):
        self.path  = image
        # Image
        self.image           = None
        self.image_grayscale = None
        # Masking parameters
        self.ret  = None
        self.mask = None
        self.masked_image     = None
        self.masked_image_lab = None
        # Corn centroid parameters
        self.centroid_tuples = []
        self.centroid_x      = []
        self.centroid_y      = []
        self.contours        = []  # Saving the contours for the histogram computations
        # Corn histograms
        self.corn_histograms = []

    def load_image(self, show=False):
        """
        :param show: Plotting the image to screen
        :return: loading the image from the path to the attribute `image`
        """
        self.image           = cv2.imread(self.path, cv2.IMREAD_COLOR)
        self.image_grayscale = cv2.imread(self.path, cv2.IMREAD_GRAYSCALE)
        if show:
            plt.imshow(self.image[:,:,[2,1,0]])  # cv2.imread returns BGR; reorder the channels to RGB for matplotlib
            plt.show()

Background removal

In this section we will remove the background, which allows better separation between "good" and "bad" corn kernels. We do this by exploiting the fact that the background is black whereas the corn kernels are not. The main steps of the section are:

Apply a Gaussian blur to the grayscale image. This blurs the corn kernels a little while making the white shimmer of the black surface darker, which helps separate the background from the corn kernels.
Perform thresholding on the blurred image using Otsu's method, which picks the threshold automatically and is a good fit when we have a dark background and a bright foreground, as is the case in the grayscale image (you can read more on this here).
Assuming that the corn kernels are clearly separated, find all the contours in the binary image produced by step 2 and fill the interior of each contour, which gives a better mask for each corn kernel.
After creating the mask, apply it and change the color-space from RGB (BGR in OpenCV) to a color-space which allows better separation of the "good" kernels from the "bad" ones. After playing around with some color-spaces, the best one I found is the LAB space, which consists of:
- Lightness (intensity)
- A - color component ranging from Green to Magenta
- B - color component ranging from Blue to Yellow

You can read more on the color-spaces available here

This is implemented in the remove_background function (see code below). The result of this background removal is the following mask: [Figure: binary mask after background removal]

And the resulting masked image (in RGB) is: [Figure: masked image after background removal]

Let us note that there are still small artifacts which will be dealt with in the following functions.
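
To make the thresholding step more concrete, here is a minimal, dependency-free sketch of the idea behind Otsu's method: try every possible threshold and keep the one that maximizes the between-class variance of the two pixel groups. This is not OpenCV's implementation (the class below simply calls cv2.threshold with cv2.THRESH_OTSU); the function name otsu_threshold and the toy pixel values are assumptions made only for illustration.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Pixels with value <= t are treated as background, the rest as foreground.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]            # number of pixels in the background class
        sum_bg += t * hist[t]      # sum of their intensities
        w_fg = total - w_bg        # number of pixels in the foreground class
        if w_bg == 0:
            continue               # no background pixels yet
        if w_fg == 0:
            break                  # everything is background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy "image": 100 dark background pixels and 60 bright kernel pixels (made-up values)
pixels = [10] * 50 + [12] * 50 + [200] * 30 + [210] * 30
t = otsu_threshold(pixels)
mask = [255 if p > t else 0 for p in pixels]  # same convention as THRESH_BINARY
```

On this toy input the threshold lands between the dark and the bright group, so the mask keeps exactly the 60 bright pixels. On a real photo the Gaussian blur applied beforehand matters, because it suppresses the bright speckles of the background before the histogram is built.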

Isolating the corn from the remaining artifacts

In this section we will remove any residual artifacts. We rely on the observation that the masked image after the background removal is almost perfect, and that any remaining artifacts are small and can be represented by polygons with a small number of vertices (corners). Therefore, we will create the isolating_corn function to do just that. The function iterates over all the contours in the mask and discards any contour with fewer than 20 points in its representing polygon. The polygons which pass the test are saved in the CornClassifier contours attribute, and the centroid of each corn kernel is computed from the moments of its contour (using Green's theorem; the theory behind this is a bit complicated but understandable, and you can read more here).

After applying this function we can see that all the artifacts have been discarded, as seen in the figure below. If there were any artifacts remaining, we would see a centroid where there is no corn kernel. [Figure: masked image with the centroid of each detected kernel]
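
For readers curious how moments give a centroid, the following sketch computes the area moment m00 and the first-order moments m10 and m01 of a closed polygon from its boundary points only, which is what Green's theorem makes possible. It is a plain-Python illustration that assumes a simple (non self-intersecting) polygon, not the code path of cv2.moments; polygon_moments and centroid are hypothetical helper names.

```python
def polygon_moments(points):
    """Return (m00, m10, m01) of a closed polygon given as [(x, y), ...].

    Each edge contributes a term that depends only on its two end points,
    so the area integrals are evaluated by walking along the boundary.
    """
    m00 = m10 = m01 = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]      # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        m00 += cross / 2.0                # signed area
        m10 += (x0 + x1) * cross / 6.0    # integral of x over the area
        m01 += (y0 + y1) * cross / 6.0    # integral of y over the area
    return m00, m10, m01


def centroid(points):
    """Centroid (m10 / m00, m01 / m00), or None for a zero-area contour."""
    m00, m10, m01 = polygon_moments(points)
    if m00 == 0:
        return None                       # degenerate contour, skip it
    return m10 / m00, m01 / m00


# A 4 x 3 axis-aligned rectangle with corners (2, 1) and (6, 4)
cx, cy = centroid([(2, 1), (6, 1), (6, 4), (2, 4)])
# Three collinear points enclose no area
degenerate = centroid([(0, 0), (1, 1), (2, 2)])
```

For the rectangle the centroid is its geometric center, (4, 2.5). The zero-area case plays the same role as the ZeroDivisionError guard in isolating_corn: contours that enclose no area are skipped. Because the sign of m00 cancels in the ratios, the orientation of the contour does not matter.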

Corn kernel representation

In this section the most important part of the project happens: we represent each corn kernel as an equivalent pixel histogram. Since the LAB color-space has 3 channels, a naïve approach represents each corn kernel as a (255*3)x1 = 765x1 vector (the zero-valued bin of each LAB channel is left out so that the black background is ignored). An example of a few of the histograms is given below. We can see that the green and blue histograms are somewhat similar and the red histogram is different from the other two. [Figure: full-resolution histograms of three corn kernels, drawn in green, blue and red]

Nevertheless, we can do better. The corn kernels are not pure Lambertian surfaces (you can read more about Lambertian reflection here), but we will assume that they are. Under this assumption, the rotation and the shape of each kernel change how it is lit, resulting in a slightly different reflection and a slightly different color. Therefore, we will group close colors together and reduce the number of bins in each channel from 256 to 16, resulting in a (15*3)x1 = 45x1 histogram vector once the first bin of each channel is dropped. The same corn kernels are now represented by the following histograms: [Figure: 16-bin histograms of the same corn kernels]

Each histogram is saved for later use in the clustering algorithm. The implementation is in the compute_histograms function (see code below). The histogram representation of the corn kernels can be improved further, since some bins are zero across all corn kernels, but for now we can leave this be.
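
As a rough illustration of where the 45 numbers come from, the sketch below bins three lists of 8-bit channel values into 16 bins each, drops the first bin of every channel (where the zeroed-out background lands), concatenates the rest and normalizes the result. It mirrors the logic of compute_histograms without OpenCV; channel_histogram, kernel_feature and the sample values are assumptions for illustration only.

```python
def channel_histogram(values, bins=16, value_range=256):
    """Count 8-bit values into equally wide bins (16 levels per bin by default)."""
    width = value_range // bins
    hist = [0] * bins
    for v in values:
        hist[v // width] += 1
    return hist


def kernel_feature(l_vals, a_vals, b_vals, bins=16):
    """Concatenate the per-channel histograms, skipping bin 0 of each channel."""
    feature = []
    for channel in (l_vals, a_vals, b_vals):
        feature.extend(channel_histogram(channel, bins)[1:])  # drop the background bin
    total = sum(feature)
    # Normalize so that kernels of different sizes remain comparable
    return [count / total for count in feature] if total else feature


# Five pixels per channel; the two zeros stand for masked background pixels
l_vals = [0, 0, 200, 210, 220]
a_vals = [0, 0, 130, 131, 140]
b_vals = [0, 0, 180, 190, 70]
feature = kernel_feature(l_vals, a_vals, b_vals)
```

The resulting vector has 3 * 15 = 45 entries that sum to 1. Note that dropping bin 0 also discards genuine kernel pixels whose channel value is below 16, which is the price paid for ignoring the background this way.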

Clustering

Up to this point we have been setting the stage for the main event! Now that we have our representation of the corn kernels, we can split them into distinct groups. Since we know the number of groups we want (2, for "good" and "bad"), we can use the K-means algorithm with K=2. There are numerous explanations of this algorithm, so I will not leave a reference here. The implementation is as follows: we fit the k-means model to our corn_histograms attribute with n_clusters=2, extract the label matching each histogram, and scatter the kernel centroids over the original picture, using a different color for each cluster. This is implemented in the classify_corn function (see code below). The result is seen below: [Figure: original image with red and blue centroids marking the two clusters]

We can see that the corn kernels have been divided into two clusters, where one cluster (red centroids) shows the good corn kernels and the other cluster (blue centroids) shows the bad kernels. After the clustering we have a labels vector that assigns each kernel to one of the two clusters discovered by the K-means algorithm. The percentage of each of the two groups can be computed as follows:

print(f'Total corn kernels detected: {len(labels)}')
print(f'Number of "Blue" group kernels: {np.sum(labels == 1)} ; Percentage: {np.around(100 * np.sum(labels == 1) / len(labels), 2)} %')
print(f'Number of  "Red" group kernels: {np.sum(labels == 0)} ; Percentage: {100 - np.around(100 * np.sum(labels == 1) / len(labels), 2)} % ')

Resulting in:

Total corn kernels detected: 70
Number of "Blue" group kernels: 27 ; Percentage: 38.57 %
Number of  "Red" group kernels: 43 ; Percentage: 61.43 %
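
For intuition about what the clustering does with the histogram vectors, here is a small plain-Python sketch of Lloyd's algorithm with K=2, followed by the percentage computation. It is not what sklearn.cluster.KMeans does internally (that uses k-means++ seeding, among other things); the function two_means, its deterministic seeding rule and the two-dimensional toy features are all assumptions made for this illustration.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def two_means(features, iterations=100):
    """Lloyd's algorithm with K=2; returns one label (0 or 1) per feature."""
    # Seeding rule (an assumption of this sketch): the first vector and
    # the vector farthest away from it.
    first = features[0]
    farthest = max(features, key=lambda f: squared_distance(f, first))
    centers = [list(first), list(farthest)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each feature goes to its nearest center
        new_labels = [
            0 if squared_distance(f, centers[0]) <= squared_distance(f, centers[1]) else 1
            for f in features
        ]
        if new_labels == labels:
            break                          # assignments are stable, we are done
        labels = new_labels
        # Update step: move each center to the mean of its members
        for k in (0, 1):
            members = [f for f, lab in zip(features, labels) if lab == k]
            if members:
                centers[k] = mean_vector(members)
    return labels


# Toy 2-bin "histograms": three kernels of one kind, two of another
features = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.2, 0.8], [0.1, 0.9]]
labels = two_means(features)
share_of_group_1 = 100 * labels.count(1) / len(labels)
```

Keep in mind that k-means only separates the kernels into two groups; which label corresponds to the infested kernels is arbitrary and has to be decided by looking at the result, as was done with the red and blue centroids above.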

Summary

This project sums up two very important aspects in computer vision:

Feature representation of objects, which in this case was the histogram representation of the corn kernels
Clustering objects via their feature representation using the k-means algorithm

For convenience, the full CornClassifier class is given below, along with the calls to its functions:

import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

class CornClassifier:
    def __init__(self, image):
        self.path  = image
        # Image
        self.image           = None
        self.image_grayscale = None
        # Masking parameters
        self.ret  = None
        self.mask = None
        self.masked_image     = None
        self.masked_image_lab = None
        # Corn centroid parameters
        self.centroid_tuples = []
        self.centroid_x      = []
        self.centroid_y      = []
        self.contours        = []  # Saving the contours for the histogram computations
        # Corn histograms
        self.corn_histograms = []

    def load_image(self, show=False):
        """
        :param show: Plotting the image to screen
        :return: loading the image from the path to the attribute `image`
        """
        self.image           = cv2.imread(self.path, cv2.IMREAD_COLOR)
        self.image_grayscale = cv2.imread(self.path, cv2.IMREAD_GRAYSCALE)
        if show:
            plt.imshow(self.image[:,:,[2,1,0]])  # cv2.imread returns BGR; reorder the channels to RGB for matplotlib
            plt.show()

    def remove_background(self, show=False):
        """
        :param show: Plotting the mask to screen
        :return:
        1. Performing gaussian filtering to blur the noise of the black background
        2. Performing Otsu's thresholding - practical example is given in:
         https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html#otsus-binarization
        3. Fill contour to better mask the image
        4. Mask the image and change the colorspace
        """
        image         = self.image_grayscale.copy()
        # Step 1
        blurred_image = cv2.GaussianBlur(image, (5,5), 0)
        # Step 2
        self.ret, self.mask = cv2.threshold(blurred_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # Step 3 - Filling holes in the corn kernel
        contours, hierarchies = cv2.findContours(self.mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        for c in contours:
            cv2.fillPoly(self.mask, pts=[c], color=(255, 255, 255))
        # Step 4
        self.masked_image     = cv2.bitwise_and(self.image, self.image, mask=self.mask)
        self.masked_image_lab = cv2.cvtColor(self.masked_image, cv2.COLOR_BGR2LAB)
        if show:
            plt.figure()
            plt.imshow(self.mask, cmap='gray')  # single-channel binary mask
            plt.figure()
            plt.imshow(self.masked_image[:,:,[2,1,0]])
            plt.show()

    def isolating_corn(self, show=False):
        """
        :param show:
        :return: Extracting the coordinates of each corn object, assuming all corn kernels
        are separated from each other. We compute the centroids by computing the moments of each
        corn kernel (Green's theorem)
        https://learnopencv.com/find-center-of-blob-centroid-using-opencv-cpp-python/
        """
        mask = self.mask.copy()
        # Finding the different contours
        contours, hierarchies  = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        for c in contours:
            # removing small contours
            if c.shape[0] < 20:
                continue
            # calculate moments for each contour
            M = cv2.moments(c)
            # calculate x,y coordinate of center
            try:
                cX = int(M["m10"] / M["m00"])
                cY = int(M["m01"] / M["m00"])
                self.centroid_tuples.append((cX, cY))
                self.centroid_x.append(cX)
                self.centroid_y.append(cY)
                self.contours.append(c)
            except ZeroDivisionError:
                pass
        if show:
            plt.figure()
            plt.imshow(self.masked_image[:,:,[2,1,0]])
            plt.scatter(self.centroid_x, self.centroid_y)
            plt.show()

    def compute_histograms(self, show=False):
        """
        :param show:
        :return: Computing the histogram for each corn kernel
        """
        for c in self.contours:
            # Creating an image with just that filled contour
            temp_mask = np.zeros_like(self.image)
            cv2.fillPoly(temp_mask, pts=[c], color=(255, 255, 255))
            single_corn = cv2.bitwise_and(self.masked_image_lab, temp_mask)
            # Generating histograms, avoiding the 0 values
            hist0 = cv2.calcHist([single_corn],[0],None,[16],[0,256])
            hist1 = cv2.calcHist([single_corn],[1],None,[16],[0,256])
            hist2 = cv2.calcHist([single_corn],[2],None,[16],[0,256])
            total_hist = np.squeeze(np.vstack((hist0[1:], hist1[1:], hist2[1:])))
            self.corn_histograms.append(total_hist / sum(total_hist))
        if show:
            plt.figure()
            plt.stem(self.corn_histograms[10], markerfmt='b', basefmt='b')
            plt.stem(self.corn_histograms[1], markerfmt='r', basefmt='r')
            plt.stem(self.corn_histograms[-1], markerfmt='g', basefmt='g')
            plt.show()

    def classify_corn(self):
        kmeans = KMeans(n_clusters=2, init='k-means++', random_state=0).fit(self.corn_histograms)
        labels = kmeans.labels_
        print(f'Total corn kernels detected: {len(labels)}')
        print(f'Number of "Blue" group kernels: {np.sum(labels == 1)} ; Percentage: {np.around(100 * np.sum(labels == 1) / len(labels), 2)} %')
        print(f'Number of  "Red" group kernels: {np.sum(labels == 0)} ; Percentage: {100 - np.around(100 * np.sum(labels == 1) / len(labels), 2)} % ')
        plt.imshow(self.image[:,:,[2,1,0]])
        plt.scatter(np.array(self.centroid_x)[labels.astype(bool)], np.array(self.centroid_y)[labels.astype(bool)], c='b')
        plt.scatter(np.array(self.centroid_x)[~labels.astype(bool)], np.array(self.centroid_y)[~labels.astype(bool)], c='r')
        plt.show()

if __name__ == '__main__':
    corn = CornClassifier('./corn.jpg')
    corn.load_image(False)
    corn.remove_background(False)
    corn.isolating_corn(False)
    corn.compute_histograms(True)
    corn.classify_corn()
