My data can be viewed as a matrix of 10B entries (100M x 100), which is very sparse (fewer than 1 in 10,000 entries are non-zero). I would like to feed the data into a Keras neural network model which I have built, using a TensorFlow backend.
My first thought was to expand the data to be dense, that is, to write out all 10B entries into a series of CSVs, with most entries zero. However, this quickly overwhelms my resources (even the ETL overwhelmed pandas and caused postgres to struggle). So I need to use true sparse matrices.
How can I do that with Keras (and TensorFlow)? While numpy doesn't support sparse matrices, scipy and TensorFlow both do. There's lots of discussion of this idea (e.g. https://github.com/fchollet/keras/pull/1886 https://github.com/fchollet/keras/pull/3695/files https://github.com/pplonski/keras-sparse-check https://groups.google.com/forum/#!topic/keras-users/odsQBcNCdZg ) - either using scipy's sparse matrices or going directly to TensorFlow's sparse tensors. But I can't find a clear conclusion, and I haven't been able to get anything to work (or even to know clearly which way to go!).
How can I do this?
I believe there are two possible approaches:

1. Keep it as a scipy sparse matrix, then, when giving Keras a minibatch, make it dense
2. Keep it sparse all the way through, and use TensorFlow sparse tensors
I also think #2 is preferred, because you'll get much better performance all the way through (I believe), but #1 is probably easier and will be adequate. I'll be happy with either.
How can either be implemented?
You can pass sparse tensors between Keras layers, and also have Keras models return them as outputs. To accept a sparse tensor as model input, set sparse=True when calling tf.keras.Input (or tf.keras.layers.InputLayer).
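For instance, a minimal sketch in recent TF 2.x (the layer sizes are made up for illustration; older versions may need the input densified before a Dense layer):

import tensorflow as tf

# Declaring the input as sparse lets the model accept tf.sparse.SparseTensor
# inputs (and, I believe, scipy sparse matrices passed to fit, which Keras
# converts on the fly).
inputs = tf.keras.Input(shape=(100,), sparse=True)
hidden = tf.keras.layers.Dense(64, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")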
A sparse tensor is a tensor in which most of the entries are zero; a large diagonal matrix is one example. Rather than storing every value of the tensor, it stores only the non-zero values and their corresponding coordinates.
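Concretely, TensorFlow's tf.sparse.SparseTensor holds exactly those pieces: the coordinates of the non-zeros, their values, and the overall dense shape. A minimal sketch:

import tensorflow as tf

# A 3x4 matrix with only two non-zero entries, stored in coordinate (COO)
# form: where the non-zeros are, what they are, and the full shape.
st = tf.sparse.SparseTensor(indices=[[0, 1], [2, 3]],
                            values=[10.0, 20.0],
                            dense_shape=[3, 4])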
Dense tensors, by contrast, store all values in a contiguous, sequential block of memory, zeros included; that is exactly what makes densifying a matrix like yours (10B entries, almost all zero) so wasteful.
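When you do need to move between the two representations, TF 2.x has direct converters (a small sketch):

import tensorflow as tf

dense = tf.constant([[0.0, 1.0], [2.0, 0.0]])
sparse = tf.sparse.from_dense(dense)  # keeps only the non-zero entries
back = tf.sparse.to_dense(sparse)     # fills the missing entries back with zeros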
tf.where will return the indices of condition that are non-zero, in the form of a 2-D tensor with shape [n, d], where n is the number of non-zero elements in condition (tf.count_nonzero(condition)), and d is the number of axes of condition (tf.rank(condition)).
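That is also the manual recipe for building a SparseTensor from a dense tensor (a sketch; note that in TF 2.x tf.count_nonzero lives at tf.math.count_nonzero):

import tensorflow as tf

dense = tf.constant([[0.0, 1.0], [2.0, 0.0]])
indices = tf.where(tf.not_equal(dense, 0.0))  # [n, d] indices of the non-zeros
values = tf.gather_nd(dense, indices)         # the n non-zero values themselves
sparse = tf.sparse.SparseTensor(indices, values,
                                tf.shape(dense, out_type=tf.int64))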
Sorry, I don't have the reputation to comment, but I think you should take a look at the answer here: Keras, sparse matrix issue. I have tried it and it works correctly. One note though: at least in my case, the shuffling led to really bad results, so I used this slightly modified non-shuffled alternative:
import numpy as np

def nn_batch_generator(X_data, y_data, batch_size):
    # Number of batches needed for one full pass over the data
    number_of_batches = int(np.ceil(X_data.shape[0] / batch_size))
    index = np.arange(np.shape(y_data)[0])
    counter = 0
    while True:
        # Next contiguous (deliberately non-shuffled) slice of row indices
        index_batch = index[batch_size * counter:batch_size * (counter + 1)]
        # Densify only the current minibatch, never the whole sparse matrix
        X_batch = np.asarray(X_data[index_batch, :].todense())
        y_batch = y_data[index_batch]
        counter += 1
        yield X_batch, y_batch
        if counter >= number_of_batches:
            counter = 0
It produces accuracies comparable to those achieved by Keras's shuffled implementation (setting shuffle=True in fit).
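A minimal usage sketch, assuming X_train is a scipy.sparse matrix, y_train a numpy array, and model an already-compiled Keras model (these names are placeholders, not from the original answer):

import numpy as np

# X_train, y_train, and model are assumed to already exist (placeholders).
batch_size = 32
steps = int(np.ceil(X_train.shape[0] / batch_size))
# Older Keras used model.fit_generator; newer tf.keras accepts a Python
# generator directly in fit, with steps_per_epoch giving the epoch length.
model.fit(nn_batch_generator(X_train, y_train, batch_size),
          steps_per_epoch=steps, epochs=10)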