 

Make TensorFlow use training data generated on-the-fly by custom CUDA routine

Assume we generate our own training data (by sampling from some diffusion process and computing some quantities of interest on it, for example) and that we have our own CUDA routine called generate_data which generates labels in GPU memory for a given set of inputs.

Hence, we are in a special setting where we can generate as many batches of training data as we want in an "online" fashion (at each batch iteration we call that generate_data routine to generate a new batch and discard the old batch).

Since the data is generated on the GPU, is there a way to make TensorFlow (the Python API) use it directly during training, for example to fill a placeholder? That way, such a pipeline would be efficient.

My understanding is that in such a setup you would currently need to copy your data from GPU to CPU, and then let TensorFlow copy it again from CPU to GPU, which is rather wasteful since unnecessary copies are being performed.

EDIT: if it helps, we can assume that the CUDA routine is implemented using Numba's CUDA JIT compiler.
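To make the setup concrete, here is a minimal CPU-side sketch of that online pattern, with NumPy standing in for the CUDA generate_data routine and a plain linear model standing in for the network (the toy label function and all shapes are illustrative, not part of the actual setup):

```python
import numpy as np

np.random.seed(0)

def generate_data(inputs):
    # Stand-in for the CUDA routine: computes labels for a batch of inputs.
    # The real routine would produce these directly in GPU memory.
    return np.sin(inputs).sum(axis=1, keepdims=True)

def training_loop(n_batches, batch_size, dim):
    w = np.zeros((dim, 1))
    losses = []
    for _ in range(n_batches):
        # A fresh batch is generated at every iteration and discarded afterwards.
        x = np.random.randn(batch_size, dim)
        y = generate_data(x)
        pred = x @ w
        grad = x.T @ (pred - y) / batch_size   # least-squares gradient
        w -= 0.1 * grad
        losses.append(float(np.mean((pred - y) ** 2)))
    return losses

losses = training_loop(n_batches=50, batch_size=32, dim=4)
```

The point of the sketch is only the structure of the loop: generate, train, discard. The question is whether the generate step can hand its output to TensorFlow without a round trip through host memory.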

Asked Jun 26 '19 by BS.



1 Answer

This is definitely not a complete answer, but hopefully it can help.

  • You can integrate your CUDA routine into TensorFlow by writing a custom op. There is currently no other way for TensorFlow to interact with external CUDA routines.

  • As for keeping the training loop entirely on the GPU, you can express it with tf.while_loop, in a very similar way to this SO question:

    i = tf.Variable(0, name='loop_i')
    n = 10000  # total number of training iterations

    def cond(i):
        return i < n

    def body(i):
        # Build the graph for the custom routine and our model
        x, ground_truth = CustomCUDARoutine(random_seed, ...)
        predictions = MyModel(x, ...)

        # Define the loss and optimizer
        loss = loss_func(ground_truth, predictions)
        optim = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)

        # Loop body: increment i only after the optimizer step has run
        return tf.tuple([tf.add(i, 1)], control_inputs=[optim])

    loop = tf.while_loop(cond, body, [i])

    # Run the whole training loop in a single session call
    tf.get_default_session().run(loop)
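For completeness: in TensorFlow 2.x the same fused-loop idea is usually expressed by wrapping the training step in tf.function, which compiles generation and update into one graph so the generated batch never has to leave the device. A minimal sketch, with a stand-in generator in place of the real CUDA routine (the toy label function, shapes, and learning rate are assumptions):

```python
import tensorflow as tf

tf.random.set_seed(0)
DIM = 4
w = tf.Variable(tf.zeros([DIM, 1]))
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

def generate_data(batch_size):
    # Stand-in for the on-GPU CUDA routine: produces inputs and labels
    # directly as tensors, so nothing is staged through host memory.
    x = tf.random.normal([batch_size, DIM])
    y = tf.reduce_sum(tf.sin(x), axis=1, keepdims=True)
    return x, y

@tf.function  # compiles the whole step (generation + update) into one graph
def train_step(batch_size):
    x, y = generate_data(batch_size)
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(x @ w - y))
    grads = tape.gradient(loss, [w])
    opt.apply_gradients(zip(grads, [w]))
    return loss

losses = [float(train_step(32)) for _ in range(20)]
```

This only works end to end on the GPU once generate_data itself is replaced by a real device-side op (e.g. the custom op from the first bullet).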
    
Answered Oct 04 '22 by FalconUA
