
tf.data.Dataset: how to get the dataset size (number of elements in an epoch)?

Let's say I have defined a dataset in this way:

filename_dataset = tf.data.Dataset.list_files("{}/*.png".format(dataset)) 

How can I get the number of elements in the dataset (that is, the number of single elements that make up one epoch)?

I know that tf.data.Dataset already knows the size of the dataset, because the repeat() method allows repeating the input pipeline for a specified number of epochs. So there must be a way to get this information.

nessuno asked Jun 07 '18 09:06




2 Answers

len(list(dataset)) works in eager mode, although it is not a good general solution: it iterates over the entire dataset and holds every element in memory just to count them.
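
A minimal sketch of the counting approach, assuming TensorFlow 2.x with eager execution. The `Dataset.range(10).filter(...)` pipeline is only a stand-in for a real one and does not come from the question. The second variant uses `Dataset.reduce` to keep a running counter instead of building a Python list; both still make one full pass over the data.

```python
import tensorflow as tf

# Stand-in pipeline (an assumption for illustration): the even numbers in 0..9.
dataset = tf.data.Dataset.range(10).filter(lambda x: x % 2 == 0)

# Variant from the answer: materialize every element, then count.
count_by_list = len(list(dataset))

# Variant that keeps only a running counter in memory.
count_by_reduce = int(
    dataset.reduce(
        tf.constant(0, dtype=tf.int64),   # initial counter value
        lambda count, _: count + 1,       # ignore the element, bump the counter
    ).numpy()
)

print(count_by_list, count_by_reduce)
```

Either variant is probably acceptable for a list of file names, but both get expensive when each element is costly to produce (for example, decoded images).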

markemus answered Sep 22 '22 02:09


Take a look here: https://github.com/tensorflow/tensorflow/issues/26966

tf.data.experimental.cardinality doesn't work for TFRecord datasets (the size is reported as unknown), but it works fine for other types.

TL;DR:

num_elements = tf.data.experimental.cardinality(dataset).numpy()
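
A short sketch of how the result might be checked before use, assuming TensorFlow 2.x in eager mode. `known_size` is a hypothetical helper name introduced here, and the example datasets are stand-ins. `cardinality` can return two sentinel values, `tf.data.experimental.UNKNOWN_CARDINALITY` and `tf.data.experimental.INFINITE_CARDINALITY`, rather than a real count, so it seems safer to test for them.

```python
import tensorflow as tf


def known_size(dataset):
    """Hypothetical helper: return the element count as an int,
    or None when the size is unknown or infinite."""
    n = int(tf.data.experimental.cardinality(dataset).numpy())
    if n in (
        tf.data.experimental.UNKNOWN_CARDINALITY,
        tf.data.experimental.INFINITE_CARDINALITY,
    ):
        return None
    return n


finite = tf.data.Dataset.range(10)               # size is known statically
filtered = finite.filter(lambda x: x % 2 == 0)   # size depends on the data
endless = finite.repeat()                        # repeats forever

print(known_size(finite), known_size(filtered), known_size(endless))
```

When the helper returns None for a finite dataset, counting by iteration remains the fallback.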

Jacob Høxbroe Jeppesen answered Sep 18 '22 02:09