 

How to input a list of lists with different sizes in tf.data.Dataset


I have a long list of lists of integers (each inner list represents a sentence, so the lists have different sizes) that I want to feed using the tf.data library. Because the inner lists have different lengths, I get an error, which I can reproduce here:

```python
t = [[4, 2], [3, 4, 5]]
dataset = tf.data.Dataset.from_tensor_slices(t)
```

The error I get is:

```
ValueError: Argument must be a dense tensor: [[4, 2], [3, 4, 5]] - got shape [2], but wanted [2, 2].
```

Is there a way to do this?

EDIT 1: Just to be clear, I don't want to pad the input list of lists (it's a list of over a million sentences of varying lengths); I want to use the tf.data library to feed a list of lists with varying lengths in a proper way.

Asked Nov 30 '17 by Escachator



2 Answers

You can use tf.data.Dataset.from_generator() to convert any iterable Python object (like a list of lists) into a Dataset:

```python
t = [[4, 2], [3, 4, 5]]

dataset = tf.data.Dataset.from_generator(lambda: t, tf.int32, output_shapes=[None])

iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    print(sess.run(next_element))  # ==> [4, 2]
    print(sess.run(next_element))  # ==> [3, 4, 5]
```
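If you're on TensorFlow 2, the same generator approach works without sessions: as of TF 2.4, `from_generator` accepts an `output_signature`, and eager iteration replaces the one-shot iterator. A sketch (the `TensorSpec` shape `[None]` marks the variable-length dimension):

```python
import tensorflow as tf

t = [[4, 2], [3, 4, 5]]

# TF 2.x equivalent: describe each element with a TensorSpec; shape [None]
# allows a different length per element.
dataset = tf.data.Dataset.from_generator(
    lambda: t,
    output_signature=tf.TensorSpec(shape=[None], dtype=tf.int32))

for element in dataset:
    print(element.numpy())  # [4 2] then [3 4 5]

# Variable-length elements can still be batched: padded_batch pads each
# batch only to its own longest element, not across the whole dataset.
batched = dataset.padded_batch(2, padded_shapes=[None])
```

Note that `padded_batch` sidesteps the concern about pre-padding a million sentences: padding happens lazily, per batch, as the pipeline runs.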
Answered Sep 18 '22 by mrry


For those working with TensorFlow 2 and looking for an answer, I found the following to work directly with ragged tensors. It should be much faster than a generator, as long as the entire dataset fits in memory.

```python
t = [[[4, 2]],
     [[3, 4, 5]]]

rt = tf.ragged.constant(t)
dataset = tf.data.Dataset.from_tensor_slices(rt)

for x in dataset:
    print(x)
```

produces

```
<tf.RaggedTensor [[4, 2]]>
<tf.RaggedTensor [[3, 4, 5]]>
```

For some reason, it's very particular about the individual arrays having at least two dimensions, hence the extra level of nesting in `t`.
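That said, in recent TF 2.x releases the extra nesting doesn't appear to be strictly required: slicing a rank-2 `RaggedTensor` yields its rows, which are ordinary dense tensors of varying length. A sketch of the flatter variant (worth verifying against your TF version):

```python
import tensorflow as tf

t = [[4, 2], [3, 4, 5]]

# Rows of a rank-2 RaggedTensor are dense, so slicing along the first
# dimension yields plain variable-length tf.Tensor elements rather than
# RaggedTensors.
dataset = tf.data.Dataset.from_tensor_slices(tf.ragged.constant(t))

for x in dataset:
    print(x.numpy())  # [4 2] then [3 4 5]
```

Dense elements can be more convenient downstream, since many ops and layers accept plain tensors without a ragged-to-dense conversion step.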

Answered Sep 18 '22 by FlashDD