Feeding .npy (numpy files) into tensorflow data pipeline

TensorFlow seems to lack a reader for ".npy" files. How can I read my data files into the new tensorflow.data.Dataset pipeline? My data doesn't fit in memory.

Each object is saved in a separate ".npy" file. Each file contains two different ndarrays as features and a scalar as their label.

Sluggish Crow asked Feb 20 '18 16:02


People also ask

Can you use NumPy with TensorFlow?

TensorFlow implements a subset of the NumPy API, available as tf.experimental.numpy. This allows running NumPy code, accelerated by TensorFlow, while also allowing access to all of TensorFlow's APIs.


1 Answer

You can do it with tf.py_func, see the example here. The parse function would simply decode the filename from bytes to string and call np.load.

Update: something like this:

    import numpy as np
    import tensorflow as tf

    def read_npy_file(item):
        # py_func passes the filename as bytes, so decode it before loading.
        data = np.load(item.decode())
        return data.astype(np.float32)

    file_list = ['/foo/bar.npy', '/foo/baz.npy']

    dataset = tf.data.Dataset.from_tensor_slices(file_list)

    dataset = dataset.map(
        lambda item: tuple(tf.py_func(read_npy_file, [item], [tf.float32,])))
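For the case in the question (two feature arrays plus a scalar label per file), the same idea might look like the sketch below. It rests on assumptions that are not in the original post: each ".npy" file is assumed to hold a pickled dict with the hypothetical keys "a", "b" and "label", and tf.numpy_function is used in place of tf.py_func, which is deprecated in TensorFlow 2.x. Adjust the loader to however the files were actually written.

```python
import numpy as np
import tensorflow as tf


def read_npy_file(path):
    # tf.numpy_function hands the filename over as bytes.
    # ASSUMPTION: the file stores a pickled dict with keys "a", "b", "label"
    # (hypothetical names, the question does not say how the arrays are packed).
    record = np.load(path.decode(), allow_pickle=True).item()
    return (
        np.asarray(record["a"], dtype=np.float32),
        np.asarray(record["b"], dtype=np.float32),
        np.asarray(record["label"], dtype=np.float32),
    )


def make_dataset(file_list):
    # Only the filenames live in memory; each file is loaded lazily in map().
    dataset = tf.data.Dataset.from_tensor_slices(file_list)
    return dataset.map(
        lambda path: tuple(
            tf.numpy_function(
                read_npy_file, [path], [tf.float32, tf.float32, tf.float32]
            )
        )
    )
```

Calling make_dataset(file_list) with a list of paths yields (a, b, label) tuples. Tensors coming out of tf.numpy_function have unknown static shape, so if the array shapes are fixed it may help to restore them with set_shape in a further map step before batching.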
George answered Sep 28 '22 03:09