I am working on an image classification CNN to practice understanding machine learning, and I want to be as vanilla as possible to really get a concrete understanding of what is happening while also remaining somewhat efficient.
I have a directory structure that goes like this:
training folder
    3 folders named 0, 1, 2
        0 contains only pictures of cats
        1 contains only pictures of dogs
        2 contains only pictures of ducks
testing folder
    3 folders named 0, 1, 2
        0 contains only pictures of cats
        1 contains only pictures of dogs
        2 contains only pictures of ducks
I wrote this snippet to go through folder 0 and convert every image (of cats) to an image array, then do the same for folder 1 (dogs) and folder 2 (ducks). I then convert the returned list into a NumPy array, x_train, to feed into the model.
import os
from keras.preprocessing import image  # assumed source of the `image` module

def get_img_array(dir):
    processed_list = []  # one pixel array per image
    for num in range(0, 3):
        image_list = [img for img in os.listdir(dir + str(num)) if img.endswith('.jpg')]
        for img_name in range(0, len(image_list)):
            loaded_image = image.load_img(dir + str(num) + '\\' + str(image_list[img_name]), grayscale = False)
            process_img = image.img_to_array(loaded_image)
            processed_list.append(process_img/255)  # scale pixel values to [0, 1]
    return processed_list
However, I'm not sure how to move forward and give them the labels y_train and y_test.
I'm aware I could create a csv file with the name of each image and a corresponding label such as "0", "1", and "2" in the next column depending on the picture and import them that way, but I'm curious to see if there is a better and more efficient way to add labels with the structure I currently have?
I've searched GitHub repos, guides, and SO questions (Convolutional Neural Networks labels unfortunately doesn't have a useful answer), but I've only come across data sets that were hardly explained or were imported pre-labeled from a database in some way unknown to me, so an in-depth explanation would be great!
You can create the label array at the same time as you create the pixel array. Let's assume your categories are cat=0, dog=1, duck=2. Initialize an empty NumPy array, create a label array for each folder (one label per image in that folder), and concatenate these arrays to get the final labels.
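import os
import numpy as np
from keras.preprocessing import image  # assumed source of the `image` module

def get_img_array(dir):
    processed_list = []  # one pixel array per image
    # dtype=int keeps the labels as integers (np.empty defaults to float)
    labels_arr = np.empty(shape=[0, 1], dtype=int)
    for num in range(0, 3):
        image_list = [img for img in os.listdir(dir + str(num)) if img.endswith('.jpg')]
        for img_name in range(0, len(image_list)):
            loaded_image = image.load_img(dir + str(num) + '\\' + str(image_list[img_name]), grayscale = False)
            process_img = image.img_to_array(loaded_image)
            processed_list.append(process_img/255)
        # every image in folder `num` gets the label `num`
        labels = np.full((len(image_list), 1), num)
        labels_arr = np.concatenate((labels_arr, labels))
    return processed_list, labels_arr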
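If it helps to see the labelling step on its own, here is a minimal sketch that only walks the folders and pairs each file path with its folder number, without loading any images. The function name collect_paths_and_labels and its parameters are made up for this illustration; it assumes the same layout as in the question (sub-folders named 0, 1, 2 under the given root) and uses os.path.join rather than a hard-coded '\\' so it is not tied to Windows.

```python
import os

def collect_paths_and_labels(root, num_classes=3, ext=".jpg"):
    """Return two parallel lists: image paths and their integer labels.

    Assumes `root` contains sub-folders named "0", "1", ..., one per class.
    """
    paths, labels = [], []
    for label in range(num_classes):
        class_dir = os.path.join(root, str(label))
        # sorted() gives a reproducible order; os.listdir order is arbitrary
        for name in sorted(os.listdir(class_dir)):
            if name.lower().endswith(ext):
                paths.append(os.path.join(class_dir, name))
                labels.append(label)  # the folder name is the label
    return paths, labels
```

Calling it once on the training folder and once on the testing folder gives the label lists for both; wrapping each list in np.array(...) should give something usable as y_train and y_test, in the same order as the images loaded from the returned paths.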
For a more intuitive explanation, see also this answer: How to prepare training data for image classification