TensorFlow provides ragged tensors (https://www.tensorflow.org/guide/ragged_tensor). PyTorch, however, doesn't provide such a data structure out of the box. Is there a workaround to construct something similar in PyTorch?
import numpy as np
x = np.array([[0], [0, 1]], dtype=object)  # dtype=object is required for ragged input in NumPy >= 1.24
print(x) # [list([0]) list([0, 1])]
import tensorflow as tf
x = tf.ragged.constant([[0], [0, 1]])
print(x) # <tf.RaggedTensor [[0], [0, 1]]>
import torch
# x = torch.Tensor([[0], [0, 1]])
# ValueError: expected sequence of length 1 at dim 1 (got 2)
Ragged tensors are the TensorFlow equivalent of nested variable-length lists. They make it easy to store and process data with non-uniform shapes, including:

- Variable-length features, such as the set of actors in a movie.
- Batches of variable-length sequential inputs, such as sentences or video clips.
NestedTensor allows the user to pack a list of Tensors into a single, efficient data structure. The only constraint on the input Tensors is that their number of dimensions must match. This enables more efficient metadata representations and access to purpose-built kernels.
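As of recent PyTorch releases, a prototype of this lives under torch.nested. A minimal sketch, assuming PyTorch >= 1.13 (the API is still a prototype and may change):

import torch

# Pack two variable-length 1-D tensors into a single nested tensor
# (prototype API, so exact behaviour may differ between releases).
nt = torch.nested.nested_tensor([torch.tensor([0]), torch.tensor([0, 1])])
print(nt.is_nested)  # True

# Convert to a regular rectangular tensor by padding the shorter rows.
print(nt.to_padded_tensor(0))  # tensor([[0, 0], [0, 1]])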
Tensors are specialized data structures that are very similar to arrays and matrices. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model's parameters. Tensors are similar to NumPy's ndarrays, except that tensors can run on GPUs or other hardware accelerators.
PyTorch is implementing something called NestedTensors, which seems to have pretty much the same purpose as RaggedTensors in TensorFlow. You can follow the RFC and progress here.
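In the meantime, a common workaround is to keep the ragged data as a plain Python list of tensors and pad it into a rectangular batch only when one is needed, e.g. with torch.nn.utils.rnn.pad_sequence. A minimal sketch:

import torch
from torch.nn.utils.rnn import pad_sequence

# Store the ragged data as a plain list of 1-D tensors ...
seqs = [torch.tensor([0]), torch.tensor([0, 1])]

# ... and pad to a rectangular batch only when required.
padded = pad_sequence(seqs, batch_first=True, padding_value=-1)
print(padded)  # tensor([[ 0, -1], [ 0,  1]])

# A boolean mask records which entries are real data vs. padding.
lengths = torch.tensor([len(s) for s in seqs])
mask = torch.arange(padded.size(1)) < lengths.unsqueeze(1)
print(mask)  # tensor([[ True, False], [ True,  True]])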