convert dictionary to sparse matrix

Tags: python, dictionary, matrix

I have a dictionary whose keys are user_ids and whose values are lists of the movie_ids liked by that user, with 573,000 unique users and 16,000 unique movies.

{1: [51, 379, 552, 2333, 2335, 4089, 4484], 2: [51, 379, 552, 1674, 1688, 2333, 3650, 4089, 4296, 4484], 5: [783, 909, 1052, 1138, 1147, 2676], 7: [171, 321, 959], 9: [3193], 10: [959], 11: [131,567,897,923],..........}

Now I want to convert this into a matrix with user_ids as rows and movie_ids as columns, holding the value 1 for each movie a user has liked, i.e. a 573000 x 16000 matrix.

Ultimately I have to multiply this matrix by its transpose to get a co-occurrence matrix with dimensions (#unique_movies, #unique_movies).

Also, what is the time complexity of the X'*X operation when X has shape (500000, 12000)?

asked Jun 16 '16 14:06

chirag yadav

1 Answer

I think you can construct an empty dok_matrix and fill in the values, then transpose it and convert it to csr_matrix for efficient matrix multiplication.

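import numpy as np
import scipy.sparse as sp

# Sample of the question's dictionary: user_id -> list of liked movie_ids
d = {1: [51, 379, 552, 2333, 2335, 4089, 4484], 2: [51, 379, 552, 1674, 1688, 2333, 3650, 4089, 4296, 4484], 5: [783, 909, 1052, 1138, 1147, 2676], 7: [171, 321, 959], 9: [3193], 10: [959], 11: [131, 567, 897, 923]}

# dok_matrix allows incremental assignment; each dimension must be larger than the largest id used as an index.
# int8 keeps memory low, but cast to a wider dtype before multiplying, since counts above 127 would overflow.
mat = sp.dok_matrix((573000, 16000), dtype=np.int8)

for user_id, movie_ids in d.items():
    mat[user_id, movie_ids] = 1  # mark every movie this user liked

# Transpose to (movies, users) and convert to CSR for fast products
mat = mat.transpose().tocsr()
print(mat.shape)  # (16000, 573000)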
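As a possible alternative to filling a dok_matrix row by row, the matrix can be built in one step from (row, column) index arrays and then multiplied by its transpose. This is only a sketch on toy data: the names likes, user_pos and movie_pos are made up for illustration, and it assumes no user lists the same movie twice (duplicate entries would be summed rather than set to 1). Remapping the raw ids to compact 0-based positions also avoids having to size the matrix by the largest id.

```python
import numpy as np
import scipy.sparse as sp

# Toy data in the same shape as the question's dictionary (user_id -> liked movie_ids).
likes = {1: [51, 379, 552], 2: [51, 379, 1674], 5: [783, 909], 7: [379]}

# Map raw ids to compact 0-based row/column positions (hypothetical helper dicts).
user_ids = sorted(likes)
movie_ids = sorted({m for movies in likes.values() for m in movies})
user_pos = {u: i for i, u in enumerate(user_ids)}
movie_pos = {m: j for j, m in enumerate(movie_ids)}

# One (row, col) pair per like; both comprehensions walk the dict in the same order.
rows = np.array([user_pos[u] for u, movies in likes.items() for _ in movies])
cols = np.array([movie_pos[m] for movies in likes.values() for m in movies])
data = np.ones(len(rows), dtype=np.int32)  # int32 so co-occurrence counts do not overflow

# Users x movies indicator matrix, built directly in CSR form.
X = sp.csr_matrix((data, (rows, cols)), shape=(len(user_ids), len(movie_ids)))

# Movies x movies co-occurrence counts; the diagonal holds each movie's like count.
cooc = (X.T @ X).tocsr()
```

On the complexity question: with sparse storage the cost of X'X should depend on the nonzeros rather than on the full dimensions. Each user who liked k movies contributes about k^2 multiply-adds, so the total work is roughly the sum of k^2 over all users, not 500000 * 12000^2 as in the dense case.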

answered Sep 28 '22 00:09

Zichen Wang

