I am trying to train an autoencoder using TensorFlow and Keras. My training data consists of more than 200K unlabeled 512x128 images. If I load all of the data into a single matrix, its shape will be (200000, 512, 128, 3), which takes well over 100 GB of RAM in float32. I know I can reduce the batch size while training, but that only limits memory usage on the GPU/CPU.
Is there a workaround to this problem?
You can use the tf.data API to load the images lazily from disk instead of building the whole matrix in memory; the official tf.data guide goes into the details.
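As a minimal sketch of such a pipeline (the glob pattern, image format, and target size below are placeholders; adjust them to your data):

```python
import tensorflow as tf

IMG_HEIGHT, IMG_WIDTH = 512, 128  # adjust to your actual image dimensions

def load_image(path):
    # Read and decode a single image file only when it is actually needed
    raw = tf.io.read_file(path)
    img = tf.io.decode_image(raw, channels=3, expand_animations=False)
    img = tf.image.resize(img, [IMG_HEIGHT, IMG_WIDTH])
    img = tf.cast(img, tf.float32) / 255.0
    return img

# "images/*.png" is an illustrative glob; point it at your own files
paths = tf.data.Dataset.list_files("images/*.png", shuffle=True)
dataset = paths.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)

# For an autoencoder the input is also the target
dataset = dataset.map(lambda x: (x, x))
```

This way only the file paths live in memory; the images themselves are decoded batch by batch during training.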
Also look into the tf.data.Dataset.prefetch, tf.data.Dataset.batch, and tf.data.Dataset.cache methods to optimize the performance of the input pipeline.
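A rough sketch of how those methods could be chained onto the `dataset` built above (the batch size and cache file name are just example values):

```python
BATCH_SIZE = 32  # example value; tune for your GPU memory

dataset = (
    dataset
    .cache("cache.tf-data")       # cache decoded images to a local file after the first epoch
    .shuffle(buffer_size=1000)    # shuffle within a bounded buffer, not the whole dataset
    .batch(BATCH_SIZE)
    .prefetch(tf.data.AUTOTUNE)   # overlap preprocessing with training on the GPU
)

# model.fit(dataset, epochs=10)   # the autoencoder can train directly on the dataset
```

Caching to a file (rather than calling `.cache()` with no argument) avoids pulling all 200K decoded images back into RAM, which is the problem you started with.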
You can also preprocess the data into TFRecord files ahead of time so that they can be read more efficiently in your training pipeline.
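A hedged sketch of what writing and then reading a TFRecord file could look like, assuming your images are stored as encoded PNG/JPEG bytes and `image_paths` is your own list of file paths (the file name "images.tfrecord" is illustrative):

```python
import tensorflow as tf

# Writing: serialize each image once, ahead of training
def serialize_example(image_bytes):
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()

with tf.io.TFRecordWriter("images.tfrecord") as writer:
    for path in image_paths:  # image_paths: your own list of image file paths
        with open(path, "rb") as f:
            writer.write(serialize_example(f.read()))

# Reading: parse the records back inside the tf.data pipeline
feature_spec = {"image": tf.io.FixedLenFeature([], tf.string)}

def parse_example(record):
    parsed = tf.io.parse_single_example(record, feature_spec)
    img = tf.io.decode_image(parsed["image"], channels=3, expand_animations=False)
    img = tf.cast(img, tf.float32) / 255.0
    return img, img  # autoencoder: input == target

dataset = tf.data.TFRecordDataset("images.tfrecord").map(
    parse_example, num_parallel_calls=tf.data.AUTOTUNE
)
```

In practice you would usually shard the data across several TFRecord files so they can be read in parallel, but a single file is enough to show the idea.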