
What am I missing from this csv reader for TensorFlow?

Tags:

tensorflow

This is mostly copied and pasted from the tutorial on the website, but I am getting an error:

Invalid argument: ConcatOp : Expected concatenating dimensions in the range [0, 0), but got 0 [[Node: concat = Concat[N=4, T=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](concat/concat_dim, DecodeCSV, DecodeCSV:1, DecodeCSV:2, DecodeCSV:3)]]

The contents of my CSV file are:

3,4,1,8,4

import tensorflow as tf


filename_queue = tf.train.string_input_producer(["test2.csv"])

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)

# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1], [1], [1], [1], [1]]
col1, col2, col3, col4, col5 = tf.decode_csv(
    value, record_defaults=record_defaults)
# print tf.shape(col1)

features = tf.concat(0, [col1, col2, col3, col4])
with tf.Session() as sess:
  # Start populating the filename queue.
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)

  for i in range(1200):
    # Retrieve a single instance:
    example, label = sess.run([features, col5])

  coord.request_stop()
  coord.join(threads)

Cristian F asked Nov 13 '15 05:11


2 Answers

The issue arises from the shapes of the tensors in your program. TL;DR: instead of tf.concat(), you should use tf.pack(), which transforms the four scalar col tensors into a 1-D tensor of length 4.

Before we start, note that you can use the get_shape() method on any Tensor object to get static shape information about that tensor. For example, the commented-out line in your code could be:

print col1.get_shape()
# ==> 'TensorShape([])' - i.e. `col1` is a scalar.

The value tensor returned by reader.read() is a scalar string. tf.decode_csv(value, record_defaults=[...]) produces, for each element of record_defaults, a tensor of the same shape as value, i.e. a scalar in this case. A scalar is a 0-dimensional tensor with a single element. tf.concat(i, xs) is not defined on scalars: it concatenates a list of N-dimensional tensors (xs) into a new N-dimensional tensor, along dimension i, where 0 <= i < N, and there is no valid i if N = 0.
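
The rank rule can be sketched without TensorFlow. The helper below, concat_shape, is a hypothetical illustration (not a TensorFlow function) that works on plain shape tuples. It only mimics the check described above, and it assumes the inputs must agree on every dimension other than the one being concatenated:

```python
def concat_shape(shapes, axis):
    """Return the shape produced by concatenating tensors with these shapes.

    Hypothetical helper for illustration only; it mimics the rank check
    described in the text, not TensorFlow's actual implementation.
    """
    rank = len(shapes[0])
    if any(len(s) != rank for s in shapes):
        raise ValueError("all inputs must have the same rank")
    # Valid axes satisfy 0 <= axis < rank, so a scalar (rank 0) has none.
    if not 0 <= axis < rank:
        raise ValueError(
            "expected concatenating dimension in the range [0, %d), but got %d"
            % (rank, axis))
    # Every dimension except the concatenation axis must match.
    for s in shapes:
        for d in range(rank):
            if d != axis and s[d] != shapes[0][d]:
                raise ValueError("shapes must match except along the axis")
    result = list(shapes[0])
    result[axis] = sum(s[axis] for s in shapes)
    return tuple(result)


# Four 1-D tensors of length 1 concatenate into one 1-D tensor of length 4.
print(concat_shape([(1,), (1,), (1,), (1,)], 0))  # (4,)

# Four scalars have shape (), so no axis is valid and the check fails.
try:
    concat_shape([(), (), (), ()], 0)
except ValueError as err:
    print("error:", err)
```

With scalar inputs the range of valid axes is empty, which is the same [0, 0) range reported in the error message from the question.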

The tf.pack(xs) operator is designed to solve this problem simply. It takes a list of k N-dimensional tensors (with the same shape) and packs them into an N+1-dimensional tensor with size k in the 0th dimension. If you replace the tf.concat() with tf.pack(), your program will work:

# features = tf.concat(0, [col1, col2, col3, col4])
features = tf.pack([col1, col2, col3, col4])

with tf.Session() as sess:
  # Start populating the filename queue.
  # ...
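
The same kind of shape-only sketch shows why packing works where concatenation does not. pack_shape is a hypothetical helper on plain tuples, not part of TensorFlow. Note also that in TensorFlow 1.0 and later, tf.pack() is named tf.stack(), and tf.concat() takes the list of tensors first and the axis second.

```python
def pack_shape(shapes):
    """Return the shape produced by packing k tensors of identical shape.

    Hypothetical helper for illustration only; it models the rule that
    packing adds a new leading dimension of size k.
    """
    first = tuple(shapes[0])
    if any(tuple(s) != first for s in shapes):
        raise ValueError("all inputs must have the same shape")
    # New 0th dimension of size k; the original dimensions follow unchanged.
    return (len(shapes),) + first


# Four scalars (shape ()) pack into a 1-D tensor of length 4.
print(pack_shape([(), (), (), ()]))  # (4,)

# Two vectors of length 3 pack into a 2 x 3 matrix.
print(pack_shape([(3,), (3,)]))  # (2, 3)
```

Because packing adds a dimension instead of extending an existing one, it is defined for scalars as well.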

mrry answered Oct 04 '22 02:10


I am also stuck on this tutorial. I was able to exchange one problem for another when I replaced your with tf.Session() block with:

sess = tf.Session()
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)

for i in range(2):
    #print i
    example, label = sess.run([features, col5])

coord.request_stop()
coord.join(threads)

sess.close()

The error disappeared and TF started to run, but it looks like it is stuck. If you uncomment the #print i line, you will see that only one iteration runs. Most probably this is not really helpful, because I traded an error for a program that never finishes.

Salvador Dali answered Oct 04 '22 01:10