Tensorflow Error: "Cannot parse tensor from proto"

I am building a deep CNN with TensorFlow. The architecture is already defined, and I am now training it. When I start training the model, I run:

sess.run(tf.global_variables_initializer())

Running this command produces the error below. My intuition is that the tensor shape may be too large to parse or initialize. I have researched this error, and there seems to be little documentation online. Does the error give enough information to tell what the problem is? Thank you.

2017-10-25 15:07:54.252194: W C:\tf_jenkins\home\workspace\rel-
win\M\windows\PY\35\tensorflow\core\framework\op_kernel.cc:1182] Invalid 
argument: Cannot parse tensor from proto: dtype: DT_FLOAT
tensor_shape {
  dim {
    size: 16
  }
  dim {
    size: 16
  }
  dim {
    size: 7
  }
  dim {
    size: 3298
  }
  dim {
    size: 3298
  }
}
float_val: 0

2017-10-25 15:07:54.252767: E C:\tf_jenkins\home\workspace\rel-
win\M\windows\PY\35\tensorflow\core\common_runtime\executor.cc:644] Executor 
failed to create kernel. Invalid argument: Cannot parse tensor from proto: 
dtype: DT_FLOAT
tensor_shape {
  dim {
    size: 16
  }
  dim {
    size: 16
  }
  dim {
    size: 7
  }
  dim {
    size: 3298
  }
  dim {
    size: 3298
  }
}
float_val: 0

         [[Node: Variable_737/Adam_1/Initializer/zeros = Const[_class=
["loc:@Variable_737"], dtype=DT_FLOAT, value=<Invalid TensorProto: dtype: 
DT_FLOAT tensor_shape { dim { size: 16 } dim { size: 16 } dim { size: 7 } 
dim { size: 3298 } dim { size: 3298 } } float_val: 0>, 
_device="/job:localhost/replica:0/task:0/cpu:0"]()]]
2017-10-25 15:07:54.320979: W C:\tf_jenkins\home\workspace\rel-
win\M\windows\PY\35\tensorflow\core\framework\op_kernel.cc:1182] Invalid 
argument: Cannot parse tensor from proto: dtype: DT_FLOAT
tensor_shape {
  dim {
    size: 16
  }
  dim {
    size: 16
  }
  dim {
    size: 7
  }
  dim {
    size: 3298
  }
  dim {
    size: 3298
  }
}
float_val: 0
Asked Oct 25 '17 at 20:10 by Devin Haslam


2 Answers

As @Tarun Wadhwa said, TensorFlow doesn't allow tensors larger than 2 GB on a single device. Your tensor has 16 x 16 x 7 x 3298 x 3298 entries, which is about 19.5 x 10^9; at 4 bytes each that comes to roughly 78 GB if you're using dtype=tf.float32.
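To make the arithmetic concrete, here is a minimal sketch that recomputes the size from the shape printed in the error message. It assumes 4 bytes per float32 element and treats the limit as 2**31 bytes; both constants are assumptions for illustration, not values read from TensorFlow.

```python
from functools import reduce
from operator import mul

# Shape copied from the tensor_shape block in the error message.
shape = (16, 16, 7, 3298, 3298)

# Assumptions: float32 takes 4 bytes, and the limit is 2**31 bytes (2 GiB).
BYTES_PER_FLOAT32 = 4
LIMIT_BYTES = 2 ** 31

# Multiply all dimensions together to get the number of entries.
num_elements = reduce(mul, shape, 1)
num_bytes = num_elements * BYTES_PER_FLOAT32

print("entries:", num_elements)
print("size in GB:", num_bytes / 1e9)
print("over the limit:", num_bytes > LIMIT_BYTES)
```

The same check can be applied to any variable shape in the model before building the graph, to find which layer is responsible.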

First, you can try tf.float16. This would halve the size of your tensor in RAM. (It will also add some noise to the weights, which may have a regularizing effect.) You can also try increasing the stride parameter in your convolutional layers, which shrinks the feature maps they output.
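As a rough check on how far the smaller dtype gets you, the sketch below compares the two element sizes for the shape in the question. It uses Python's struct module only to look up the byte widths of 32-bit and 16-bit floats; the 2**31-byte limit and the helper name tensor_bytes are assumptions made for illustration.

```python
import struct

# Shape copied from the error message in the question.
shape = (16, 16, 7, 3298, 3298)
LIMIT_BYTES = 2 ** 31  # assumed limit of 2 GiB

num_elements = 1
for dim in shape:
    num_elements *= dim

# Standard-size struct codes: "<f" is a 32-bit float, "<e" is a 16-bit float.
element_sizes = {
    "float32": struct.calcsize("<f"),
    "float16": struct.calcsize("<e"),
}

def tensor_bytes(dtype_name):
    """Hypothetical helper: total bytes for the shape above at the given dtype."""
    return num_elements * element_sizes[dtype_name]

for name in ("float32", "float16"):
    size = tensor_bytes(name)
    print(name, round(size / 1e9, 1), "GB, fits:", size < LIMIT_BYTES)
```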

Even so, you won't get under the 2 GB limit, since half of 78 GB is still about 39 GB. In that case, you should distribute your computational graph across multiple GPUs and train the model there. You'll have to restructure your code using with tf.device(...) blocks, which is a whole new ballgame. AWS provides 8 and 16 GPU p2 instances on EC2.

Why do you need to work with such humongous tensors?

Answered Nov 15 '22 at 02:11 by Safak Ozkan


You can't create tensors larger than 2 GB. This is not a TensorFlow limit but a limit of Google's protobuf. One way to solve this problem is to break a large tensor into smaller tensors.
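One way to plan such a split is sketched below in plain Python. The function plan_split is a hypothetical helper, not a TensorFlow API: it works out how many slices along one axis fit under the limit and returns the resulting chunk lengths. The 4-byte element size and the 2**31-byte limit are assumptions.

```python
def plan_split(shape, axis, bytes_per_element, limit_bytes):
    """Return chunk lengths along `axis` so each chunk stays below limit_bytes."""
    # Number of elements in a single slice taken along `axis`.
    slice_elements = 1
    for i, dim in enumerate(shape):
        if i != axis:
            slice_elements *= dim
    slice_bytes = slice_elements * bytes_per_element

    # Largest number of slices whose total size is strictly below the limit.
    max_len = (limit_bytes - 1) // slice_bytes
    if max_len < 1:
        raise ValueError("a single slice along this axis already exceeds the limit")

    full_chunks, remainder = divmod(shape[axis], max_len)
    return [max_len] * full_chunks + ([remainder] if remainder else [])


# Shape from the question, split along the last axis, float32 assumed.
shape = (16, 16, 7, 3298, 3298)
chunks = plan_split(shape, axis=4, bytes_per_element=4, limit_bytes=2 ** 31)
print(len(chunks), "chunks; first:", chunks[0], "last:", chunks[-1])
```

Each chunk would then be created as its own variable. Splitting only sidesteps the per-tensor limit: the chunks together still occupy the same total memory, and chunks sized this close to the limit leave no headroom.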

Answered Nov 15 '22 at 03:11 by Tarun Wadhwa