I imagine this is a broadly applicable question, but I'm trying to create a dataset for a particular competition that involves flying a UAV over a field with cardboard geometric shapes with alphanumeric characters painted on. The objective is to detect and classify the shapes and characters.
Currently, I'm using SURF to detect the shape, K-means to segment the shape and character, and a convolutional neural network to classify each. However, my bottleneck is training data: I can't find or generate a dataset that produces a model that performs well on real images.
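To make the segmentation step concrete, here is a rough sketch of K-means color clustering on one crop. It is an illustration under assumptions, not my actual code: it uses plain NumPy, assumes k=3 (background, shape, character), and the function name `kmeans_segment` and the synthetic crop are made up for the example.

```python
import numpy as np


def kmeans_segment(image, k=3, iters=10, seed=0):
    """Cluster the pixels of an H x W x 3 image by color; return (label map, centers)."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    rng = np.random.default_rng(seed)
    # Start from k distinct colors that occur in the image
    # (this raises if the image has fewer than k distinct colors)
    unique = np.unique(pixels, axis=0)
    centers = unique[rng.choice(len(unique), size=k, replace=False)]
    for _ in range(iters):
        # Assign each pixel to its nearest center
        dist = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its pixels; empty clusters stay put
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels.reshape(image.shape[:2]), centers


# Synthetic crop: green "grass", a red square "shape", a white bar "character"
crop = np.zeros((40, 40, 3), dtype=np.uint8)
crop[:] = (40, 120, 40)
crop[8:32, 8:32] = (200, 30, 30)
crop[16:24, 18:22] = (255, 255, 255)

label_map, centers = kmeans_segment(crop, k=3)
print(np.unique(label_map).size)  # number of distinct regions found
```

On real frames the colors are far noisier than in this toy crop, so the cluster boundaries will not be this clean.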
What I've Tried
Generating a dataset with Keras' ImageDataGenerator by applying random rotations, scalings, and skewings to a template image of each geometric shape and of each alphanumeric character in a typewritten font: works fine with data from the dataset (go figure) and with some outside data, but gets confused when the characters deviate too much from the templates
Using the MNIST dataset: no complaints, but only contains numbers
Using the EMNIST ByClass dataset (which is different from the MNIST dataset; contains letters as well): doesn't train easily because of size, and doesn't perform well even when trained to a decently high accuracy. In the dataset itself, many images bear little resemblance to the purported class, and some classes are at different rotations than others
Using Tesseract OCR for the characters. This hasn't had great results
What I Haven't Tried
Doing several flyovers with real cardboard cutouts that we create and using several frames from each video for the dataset. Cons: this would require quite a lot of flights and cardboard cutouts and wouldn't offer much data variation.
Using the ImageDataGenerator, but on several different fonts instead of one.
Does anyone have any advice on how to create a custom dataset for a task like this?
This is my dataSetGenerator; maybe it will help you generate your own dataset.
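import os
from glob import glob

import cv2
import numpy as np


def dataSetGenerator(path, resize=False, resize_to=224, percentage=100):
    """
    Load a folder-per-class image dataset.

    DataSetsFolder
    |
    |----------class-1
    |        .   |-------image-1
    |        .   |         .
    |        .   |         .
    |        .   |         .
    |        .   |-------image-n
    |        .
    |-------class-n

    :param path: <path>/DataSetsFolder
    :param resize: if True, resize every image to resize_to x resize_to
                   (set this when the images differ in size, otherwise
                   they cannot be stacked into one regular array)
    :param resize_to: side length in pixels used when resize is True
    :param percentage: percentage of the shuffled dataset to return
    :return: images, labels (one-hot), classes
    """
    # Keep only sub-directories, sorted so the class-to-index mapping is stable
    classes = sorted(d for d in os.listdir(path)
                     if os.path.isdir(os.path.join(path, d)))
    image_list = []
    labels = []
    for class_index, classe in enumerate(classes):
        # Only .tif files are picked up; change the pattern for other formats
        for filename in glob(os.path.join(path, classe, '*.tif')):
            image = cv2.imread(filename)
            if image is None:  # cv2.imread returns None for unreadable files
                continue
            if resize:
                image = cv2.resize(image, (resize_to, resize_to))
            image_list.append(image)
            # One-hot label for this class
            label = np.zeros(len(classes))
            label[class_index] = 1
            labels.append(label)
    # Shuffle, then keep the requested percentage of the samples
    indice = np.random.permutation(len(image_list))[:int(len(image_list) * percentage / 100)]
    return (np.array([image_list[x] for x in indice]),
            np.array([labels[x] for x in indice]),
            np.array(classes))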
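Since the loader above already shuffles the samples, the returned arrays can be split into training and validation parts by simple slicing. A minimal sketch, assuming dummy arrays in place of a real dataset; `split_dataset` and the 20% validation fraction are hypothetical choices for illustration.

```python
import numpy as np


def split_dataset(images, labels, validation_fraction=0.2):
    """Split already-shuffled arrays into (training, validation) parts."""
    n_val = int(len(images) * validation_fraction)
    return (images[n_val:], labels[n_val:]), (images[:n_val], labels[:n_val])


# Dummy stand-ins with the shapes the loader would return for
# 10 BGR images of 224 x 224 pixels spread over 2 classes
images = np.zeros((10, 224, 224, 3), dtype=np.uint8)
labels = np.eye(2)[np.arange(10) % 2]  # one-hot rows

(train_x, train_y), (val_x, val_y) = split_dataset(images, labels)

# Scale pixel values to [0, 1] before feeding them to a network
train_x = train_x.astype(np.float32) / 255.0
val_x = val_x.astype(np.float32) / 255.0
print(train_x.shape, val_x.shape)
```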