
What is the difference between tensors and sparse tensors?


I am having trouble understanding the meaning and usage of TensorFlow Tensors and Sparse Tensors.

According to the documentation:

Tensor

Tensor is a typed multi-dimensional array. For example, you can represent a mini-batch of images as a 4-D array of floating point numbers with dimensions [batch, height, width, channels].
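For instance (a minimal sketch; the batch and image sizes here are arbitrary):

import tensorflow as tf

# A mini-batch of 32 RGB images, each 224x224: [batch, height, width, channels]
images = tf.zeros([32, 224, 224, 3], dtype=tf.float32)
print(images.shape)  # (32, 224, 224, 3)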

Sparse Tensor

TensorFlow represents a sparse tensor as three separate dense tensors: indices, values, and shape. In Python, the three tensors are collected into a SparseTensor class for ease of use. If you have separate indices, values, and shape tensors, wrap them in a SparseTensor object before passing to the ops below.
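For instance, the wrapping described above might look like this (a minimal sketch with made-up values):

import tensorflow as tf

indices = tf.constant([[0, 0], [1, 2]], dtype=tf.int64)  # coordinates of non-zeros
values = tf.constant([10, 20], dtype=tf.int32)           # the non-zero entries
shape = tf.constant([3, 4], dtype=tf.int64)              # logical (dense) shape
sp = tf.SparseTensor(indices=indices, values=values, dense_shape=shape)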

My understanding is that Tensors are used for operations, input, and output, while a Sparse Tensor is just another representation of a (dense?) Tensor. I hope someone can further explain the differences and the use cases for them.

asked Dec 05 '17 by LYu

People also ask

What is a sparse tensor?

A sparse tensor is a tensor in which most of the entries are zero; one such example is a large diagonal matrix, which has many zero elements. Rather than storing every value of the tensor object, it stores only the non-zero values and their coordinates.
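For instance, a minimal sketch in TensorFlow (using tf.sparse.from_dense for illustration):

import tensorflow as tf

dense = tf.linalg.diag([1.0, 2.0, 3.0])  # 3x3 matrix, zeros off the diagonal
sparse = tf.sparse.from_dense(dense)     # keeps only the non-zero entries
print(sparse.values)   # [1. 2. 3.]
print(sparse.indices)  # [[0 0] [1 1] [2 2]]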

How many types of tensors are there?

There are four main tensor types you can create: tf.Variable, tf.constant, tf.placeholder, and tf.SparseTensor.
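A quick sketch of those types (tf.placeholder exists only in the TF 1.x graph API, so it is shown as a comment):

import tensorflow as tf

v = tf.Variable([1.0, 2.0])   # mutable, trainable tensor
c = tf.constant([1.0, 2.0])   # immutable tensor
s = tf.SparseTensor(indices=[[0]], values=[1.0], dense_shape=[2])
# p = tf.compat.v1.placeholder(tf.float32, [2])  # graph-mode input (TF 1.x only)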

What are dense tensors?

Dense tensors store values in a contiguous sequential block of memory where all values are represented. Tensors or multi-dimensional arrays are used in a diverse set of multi-dimensional data analysis applications.
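A rough illustration of the difference (a sketch; the 1000x1000 size is arbitrary):

import tensorflow as tf

dense = tf.zeros([1000, 1000], dtype=tf.float32)
print(dense.numpy().nbytes)           # 4000000 bytes: every value occupies memory
sparse = tf.sparse.from_dense(dense)  # an all-zero tensor has no entries to keep
print(tf.shape(sparse.values))        # 0 values stored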

Does PyTorch support sparse tensors?

PyTorch implements an extension of sparse tensors with scalar values to sparse tensors with (contiguous) tensor values. Such tensors are called hybrid tensors.
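A minimal PyTorch sketch of such a hybrid tensor (the values here are made up): each non-zero entry is itself a dense vector rather than a scalar:

import torch

# 2 non-zero entries along one sparse dimension; each entry is a length-3 vector.
indices = torch.tensor([[0, 2]])       # shape (sparse_dims=1, nnz=2)
values = torch.tensor([[1., 2., 3.],
                       [4., 5., 6.]])  # shape (nnz=2, dense_dim=3)
hybrid = torch.sparse_coo_tensor(indices, values, size=(4, 3))
print(hybrid)  # one sparse dim, one dense dim: a "hybrid" tensor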


2 Answers

Matthew did a great job, but I would love to shed more light on sparse tensors with an example.

If a tensor has lots of values that are zero, it can be called sparse.

Let's consider a sparse 1-D tensor:

[0, 7, 0, 0, 8, 0, 0, 0, 0]

A sparse representation of the same tensor will focus only on the non-zero values:

values = [7,8]

We also have to remember where those values occur, by their indices:

indices = [1,4]

The one-dimensional form of indices will work with some methods for this one-dimensional example, but in general indices have multiple dimensions, so it is more consistent (and works everywhere) to represent them like this:

indices = [[1], [4]]

With values and indices, we still don't have quite enough information: how many zeros are there? To answer that, we also record the dense shape of the tensor:

dense_shape = [9]

These three things together, values, indices, and dense_shape, form a sparse representation of the tensor.

In TensorFlow 2.0 it can be constructed as:

import tensorflow as tf

x = tf.SparseTensor(values=[7, 8], indices=[[1], [4]], dense_shape=[9])
x
#o/p: <tensorflow.python.framework.sparse_tensor.SparseTensor at 0x7ff04a58c4a8>

print(x.values)
print(x.dense_shape)
print(x.indices)
#o/p:
tf.Tensor([7 8], shape=(2,), dtype=int32)
tf.Tensor([9], shape=(1,), dtype=int64)
tf.Tensor(
[[1]
 [4]], shape=(2, 1), dtype=int64)

EDITED to correct indices as pointed out in the comments.
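If you need the ordinary dense tensor back, tf.sparse.to_dense converts the sparse representation (a brief sketch using the x defined above):

print(tf.sparse.to_dense(x))
#o/p: tf.Tensor([0 7 0 0 8 0 0 0 0], shape=(9,), dtype=int32)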

answered Oct 26 '22 by Tensorflow Support


The difference involves computational speed. If a large tensor has many, many zeroes, it's faster to perform computation by iterating through the non-zero elements. Therefore, you should store the data in a SparseTensor and use the special operations for SparseTensors.

The relationship is similar for matrices and sparse matrices. Sparse matrices are common in dynamic systems, and mathematicians have developed many special methods for operating on them.
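For instance (a minimal sketch; shapes and values are arbitrary), multiplying a sparse matrix by a dense one uses a dedicated op that skips the zeros:

import tensorflow as tf

sp = tf.SparseTensor(indices=[[0, 0], [2, 1]], values=[3.0, 5.0],
                     dense_shape=[3, 2])
dense = tf.ones([2, 4])
result = tf.sparse.sparse_dense_matmul(sp, dense)  # iterates non-zeros only
print(result.shape)  # (3, 4)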

answered Oct 26 '22 by MatthewScarpino