
How to create a CIFAR-10 subset?

I would like to train a deep neural network on fewer training samples to reduce the time it takes to test my code. How can I take a subset of the CIFAR-10 dataset using Keras/TensorFlow? I have the following code, which trains on the complete CIFAR-10 dataset.

import numpy as np
import tensorflow

# Load and prepare the data
if WhichDataSet == 'CIFAR10':
    (x_train, y_train), (x_test, y_test) = tensorflow.keras.datasets.cifar10.load_data()
else:
    (x_train, y_train), (x_test, y_test) = tensorflow.keras.datasets.cifar100.load_data()
num_classes = np.unique(y_train).shape[0]
K_train = x_train.shape[0]
input_shape = x_train.shape[1:]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
# One-hot encode the integer labels
y_train = tensorflow.keras.utils.to_categorical(y_train, num_classes)
y_test = tensorflow.keras.utils.to_categorical(y_test, num_classes)
prb_cm asked Oct 20 '25 10:10



1 Answer

Create a subset based on labels

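Create a subset of the dataset that keeps only some of the labels. For example, to create a new training set containing only the first five class labels, you can use the code below. Apply it to the raw integer labels, i.e. before calling to_categorical.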

# Boolean mask over the raw integer labels (before to_categorical)
mask = np.isin(y_train, [0, 1, 2, 3, 4]).flatten()
subset_x_train = x_train[mask]
subset_y_train = y_train[mask]
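
As a quick sanity check, the label-based selection can be wrapped in a small helper and tried on synthetic data shaped like CIFAR-10, so nothing has to be downloaded. This is only a sketch: the helper name subset_by_labels and the x_demo / y_demo arrays are made up for illustration, and it assumes the labels are still raw integers of shape (N, 1), not one-hot encoded.

```python
import numpy as np

def subset_by_labels(x, y, keep_labels):
    """Return the samples of (x, y) whose integer label is in keep_labels.

    Assumes y holds raw integer labels of shape (N,) or (N, 1),
    i.e. it has NOT been one-hot encoded yet.
    """
    mask = np.isin(y, keep_labels).reshape(-1)
    return x[mask], y[mask]

# Synthetic stand-in for CIFAR-10 (hypothetical data):
# 100 RGB images of 32x32 pixels, labels 0..9 repeated in order.
rng = np.random.default_rng(0)
x_demo = rng.integers(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)
y_demo = (np.arange(100) % 10).reshape(-1, 1)

x_sub, y_sub = subset_by_labels(x_demo, y_demo, [0, 1, 2, 3, 4])
print(x_sub.shape, y_sub.shape)
```

If you one-hot encode the subset afterwards, num_classes should be computed from the subset labels rather than from the full dataset.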

Create a subset irrespective of labels

To create a 10% subset of the training data, you can use the code below.

# Shuffle the indices first (optional)
idx = np.arange(len(x_train))
np.random.shuffle(idx)

# Take the first 10% of the (shuffled) indices
n_subset = int(0.10 * len(idx))
subset_x_train = x_train[idx[:n_subset]]
subset_y_train = y_train[idx[:n_subset]]

Repeat the same for x_test and y_test to get a subset of test data.
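
If you want the same random subset on every run, one option is a small seeded helper like the one below. It is a minimal sketch, not part of Keras: the name random_subset, its fraction and seed parameters, and the synthetic x_demo / y_demo arrays are all assumptions made for illustration.

```python
import numpy as np

def random_subset(x, y, fraction, seed=None):
    """Return a random subset of (x, y) containing `fraction` of the samples.

    x and y are indexed with the same permutation, so image/label
    pairs stay aligned. Passing a seed makes the choice repeatable.
    """
    rng = np.random.default_rng(seed)
    n_subset = int(fraction * len(x))
    idx = rng.permutation(len(x))[:n_subset]
    return x[idx], y[idx]

# Synthetic stand-in for the training set (hypothetical data).
x_demo = np.zeros((200, 32, 32, 3), dtype=np.float32)
y_demo = (np.arange(200) % 10).reshape(-1, 1)

x_small, y_small = random_subset(x_demo, y_demo, 0.10, seed=42)
print(x_small.shape, y_small.shape)
```

Because the indices are drawn at random, the class balance of a small subset is only approximately that of the full dataset.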

mujjiga answered Oct 22 '25 00:10



