Better way to add label data to convolutional neural network?

Question

I am working on an image classification CNN to practice understanding machine learning, and I want to be as vanilla as possible to really get a concrete understanding of what is happening while also remaining somewhat efficient.

I have a directory structure that goes like this:

training folder 
     3 folders named 0, 1, 2
          0 contains only pictures of cats
          1 contains only pictures of dogs
          2 contains only pictures of ducks
testing folder 
     3 folders named 0, 1, 2
          0 contains only pictures of cats
          1 contains only pictures of dogs
          2 contains only pictures of ducks

I wrote this snippet to go through folder 0 and convert all images (of cats) to image arrays, then do the same for folder 1 (dogs) and folder 2 (ducks). I then convert the returned list into a NumPy array, x_train, to feed into the model.

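import os
from tensorflow.keras.preprocessing import image  # assumed import; not shown in the original snippet

def get_img_array(dir):
    processed_list = []  # was never initialized in the original snippet
    for num in range(0, 3):
        folder = os.path.join(dir, str(num))
        image_list = [img for img in os.listdir(folder) if img.endswith('.jpg')]
        for img_name in image_list:
            loaded_image = image.load_img(os.path.join(folder, img_name), color_mode='rgb')
            process_img = image.img_to_array(loaded_image)
            processed_list.append(process_img / 255)  # scale pixel values to [0, 1]
    return processed_list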

But I'm not sure how to move forward and give them labels, y_train and y_test.

I'm aware I could create a CSV file with the name of each image and a corresponding label ("0", "1", or "2", depending on the picture) in the next column and import the labels that way, but I'm curious whether there is a better, more efficient way to add labels with the structure I currently have.

I've tried to research this and looked over GitHub repos, guides, and SO questions (Convolutional Neural Networks labels unfortunately doesn't have a useful answer), but I've only come across datasets that were hardly explained or that were imported from a database, pre-labeled in some way unknown to me, so an in-depth explanation would be great!

Rajith Thennakoon · Accepted Answer

You can build the label array at the same time as the pixel array. Assume your categories are cat = 0, dog = 1, duck = 2. Initialize an empty NumPy array, create a label array for each folder, and concatenate these arrays to get the final labels.

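import os
import numpy as np
from tensorflow.keras.preprocessing import image  # assumed import; not shown in the original snippet

def get_img_array(dir):
    processed_list = []  # was never initialized in the original snippet
    labels_arr = np.empty(shape=[0, 1], dtype=int)  # integer class ids, one row per image
    for num in range(0, 3):
        folder = os.path.join(dir, str(num))
        image_list = [img for img in os.listdir(folder) if img.endswith('.jpg')]
        for img_name in image_list:
            loaded_image = image.load_img(os.path.join(folder, img_name), color_mode='rgb')
            process_img = image.img_to_array(loaded_image)
            processed_list.append(process_img / 255)  # scale pixel values to [0, 1]
        # one label per image in this folder; the folder name is the class id
        labels = np.full((len(image_list), 1), num)
        labels_arr = np.concatenate((labels_arr, labels))
    return processed_list, labels_arr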
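
To see the label-building step on its own, here is a minimal NumPy-only sketch. The per-folder image counts are made-up values for illustration, and `build_labels` is a hypothetical helper name, not part of the function above. It uses `np.repeat` instead of growing the array with `np.concatenate` inside the loop; both approaches should give the same labels.

```python
import numpy as np

def build_labels(counts):
    """Return an (N, 1) integer array with one class id per image.

    counts[i] is the number of images found in folder i (assumed input).
    """
    class_ids = np.arange(len(counts))
    # Repeat each class id once per image in its folder, then make it a column.
    return np.repeat(class_ids, counts).reshape(-1, 1)

# Hypothetical counts: 2 cat images, 3 dog images, 1 duck image.
y_train = build_labels([2, 3, 1])
print(y_train.ravel())
print(y_train.shape)
```

Once the image list has been stacked into x_train, len(x_train) should equal len(y_train). Integer labels like these pair with a loss such as Keras's sparse_categorical_crossentropy; categorical_crossentropy would need them one-hot encoded first.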

For more intuition, see this answer as well: How to prepare training data for image classification
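
Because each subfolder name already is the class id, the labels can also be derived from the file paths before any image is loaded. The sketch below is only an illustration under stated assumptions, not the method from the answer above: `list_labeled_images` is a hypothetical helper, it uses only the standard library, and it assumes every class folder has an integer name and that the images end in `.jpg`.

```python
from pathlib import Path

def list_labeled_images(root, extension=".jpg"):
    """Return (paths, labels) for images stored as root/<class id>/<file>."""
    paths, labels = [], []
    # Sort folders numerically so the order is 0, 1, 2, ... and reproducible.
    class_dirs = sorted(
        (d for d in Path(root).iterdir() if d.is_dir() and d.name.isdigit()),
        key=lambda d: int(d.name),
    )
    for class_dir in class_dirs:
        for img_path in sorted(class_dir.glob("*" + extension)):
            paths.append(str(img_path))
            labels.append(int(class_dir.name))  # the folder name is the label
    return paths, labels
```

Calling it once on the training folder and once on the testing folder gives the file lists together with the matching y_train and y_test values; the paths can then be loaded with the same load_img and img_to_array steps as in get_img_array.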

Tags:

python

tensorflow

dataset

keras

conv-neural-network

Entroyp
