 

How to read data from numpy files in TensorFlow? [duplicate]

I have read the TensorFlow CNN tutorial and I am trying to use the same model for my project. The problem now is reading the data. I have around 25000 images for training and around 5000 each for testing and validation. The files are in PNG format, and I can read them and convert them into numpy.ndarray objects.

The CNN example in the tutorial uses a queue to fetch records from the file list provided. I tried to create my own binary file of that kind by reshaping each image into a 1-D array and prepending its label value. So my data looks like this:

[[1,12,34,24,53,...,105,234,102],
 [12,112,43,24,52,...,115,244,98],
....
]

Each row of the above array has length 22501, where the first element is the label.

I dumped the array to a file using pickle and then tried to read it back using tf.FixedLengthRecordReader, as demonstrated in the example.

I am doing the same thing as cifar10_input.py to read the binary file and put the values into the record object.

Now when I read from the file, the labels and the image values are different from what I wrote. I think the reason is that pickle also dumps extra information, such as braces and brackets, into the binary file, and this changes the fixed-length record size.

The above example passes the filenames to a queue, and a reader then uses that queue to read one record at a time from the files.

I want to know if I can pass the numpy array defined above, instead of the filenames, to some reader that fetches records one by one from the array rather than from files.

t0mkaka asked Nov 20 '22 01:11



1 Answer

Probably the easiest way to make your data work with the CNN example code is to make a modified version of read_cifar10() and use it instead:

  1. Write out a binary file containing the contents of your numpy array.

    import numpy as np
    images_and_labels_array = np.array([[...], ...],  # [[1,12,34,24,53,...,102],
                                                      #  [12,112,43,24,52,...,98],
                                                      #  ...]
                                       dtype=np.uint8)
    
    images_and_labels_array.tofile("/tmp/images.bin")
    

    This file is similar to the format used in CIFAR10 datafiles. You might want to generate multiple files in order to get read parallelism. Note that ndarray.tofile() writes binary data in row-major order with no other metadata; pickling the array will add Python-specific metadata that TensorFlow's parsing routines do not understand.

  2. Write a modified version of read_cifar10() that handles your record format.

    import tensorflow as tf

    def read_my_data(filename_queue):
    
      class ImageRecord(object):
        pass
      result = ImageRecord()
    
      # Dimensions of the images in the dataset.
      label_bytes = 1
      # Set the following constants as appropriate.
      result.height = IMAGE_HEIGHT
      result.width = IMAGE_WIDTH
      result.depth = IMAGE_DEPTH
      image_bytes = result.height * result.width * result.depth
      # Every record consists of a label followed by the image, with a
      # fixed number of bytes for each.
      record_bytes = label_bytes + image_bytes
    
      assert record_bytes == 22501  # Based on your question.
    
      # Read a record, getting filenames from the filename_queue.  No
      # header or footer in the binary, so we leave header_bytes
      # and footer_bytes at their default of 0.
      reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
      result.key, value = reader.read(filename_queue)
    
      # Convert from a string to a vector of uint8 that is record_bytes long.
      record_bytes = tf.decode_raw(value, tf.uint8)
    
      # The first bytes represent the label, which we convert from uint8->int32.
      result.label = tf.cast(
          tf.slice(record_bytes, [0], [label_bytes]), tf.int32)
    
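      # The remaining bytes after the label represent the image, which we reshape
      # from [depth * height * width] to [depth, height, width]. This assumes each
      # image was flattened depth-major, as in CIFAR-10. If you flattened arrays
      # of shape [height, width, depth] instead, reshape directly to that shape
      # and skip the transpose below.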
      depth_major = tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]),
                               [result.depth, result.height, result.width])
      # Convert from [depth, height, width] to [height, width, depth].
      result.uint8image = tf.transpose(depth_major, [1, 2, 0])
    
      return result
    
  3. Modify distorted_inputs() to use your new dataset:

    def distorted_inputs(data_dir, batch_size):
      """[...]"""
      filenames = ["/tmp/images.bin"]  # Or a list of filenames if you
                                       # generated multiple files in step 1.
      for f in filenames:
        if not tf.gfile.Exists(f):  # tf.gfile is the TF 1.x file API.
          raise ValueError('Failed to find file: ' + f)
    
      # Create a queue that produces the filenames to read.
      filename_queue = tf.train.string_input_producer(filenames)
    
      # Read examples from files in the filename queue.
      read_input = read_my_data(filename_queue)
      reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    
      # [...] (Maybe modify other parameters in here depending on your problem.)
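
Steps 1 and 3 both mention generating multiple files. A minimal sketch of one way to do that split is below, assuming the records already sit in a single [num_records, record_bytes] uint8 array. The helper name write_shards, the images_%d.bin file names, and the tiny stand-in array are hypothetical and only for illustration.

```python
import os
import tempfile

import numpy as np


def write_shards(records, out_dir, num_shards):
    """Write a [num_records, record_bytes] array as several raw binary files.

    Splitting along axis 0 keeps whole rows together, so every file size
    stays a multiple of the record length. (Hypothetical helper.)
    """
    # Values outside 0..255 would wrap silently in this cast.
    records = np.ascontiguousarray(records, dtype=np.uint8)
    filenames = []
    for i, shard in enumerate(np.array_split(records, num_shards, axis=0)):
        path = os.path.join(out_dir, "images_%d.bin" % i)
        shard.tofile(path)  # Raw bytes in row-major order, no header.
        filenames.append(path)
    return filenames


# Tiny stand-in data: 10 records of 1 label byte + 4 "pixel" bytes each.
demo_records = np.arange(50, dtype=np.uint8).reshape(10, 5)
demo_dir = tempfile.mkdtemp()
shard_files = write_shards(demo_records, demo_dir, num_shards=3)
shard_sizes = [os.path.getsize(f) for f in shard_files]
```

The returned list of paths is what would go into filenames in distorted_inputs().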
    

This is intended to be a minimal set of steps, given your starting point. It may be more efficient to do the PNG decoding using TensorFlow ops, but that would be a larger change.
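
Before wiring the file into the input pipeline, it may be worth checking the record layout with NumPy alone. The sketch below uses random stand-in data and temporary file paths (both assumptions, not your real dataset). It writes the same array with tofile() and with pickle, compares the file sizes, and reads the raw file back in fixed-length rows.

```python
import os
import pickle
import tempfile

import numpy as np

RECORD_BYTES = 22501  # 1 label byte + 22500 image bytes, per the question.

# Random stand-in data; a real array would hold the actual labels and pixels.
rng = np.random.RandomState(0)
records = rng.randint(0, 256, size=(4, RECORD_BYTES)).astype(np.uint8)

tmp_dir = tempfile.mkdtemp()
raw_path = os.path.join(tmp_dir, "images.bin")
pkl_path = os.path.join(tmp_dir, "images.pkl")

# Raw dump: exactly num_records * RECORD_BYTES bytes.
records.tofile(raw_path)

# Pickle dump: the same data plus Python-specific metadata.
with open(pkl_path, "wb") as f:
    pickle.dump(records, f)

raw_size = os.path.getsize(raw_path)
pkl_size = os.path.getsize(pkl_path)

# Read the raw file back as fixed-length records.
restored = np.fromfile(raw_path, dtype=np.uint8).reshape(-1, RECORD_BYTES)
labels = restored[:, 0]   # First byte of each record.
pixels = restored[:, 1:]  # Remaining 22500 bytes of each record.
```

Here raw_size is an exact multiple of RECORD_BYTES while pkl_size is not, which is why a fixed-length reader returns shifted labels and pixels when pointed at a pickled file.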

mrry answered Jun 26 '23 20:06
