Transporting Sparse Matrix from Python to R

Question

I am doing some text analysis work in Python. Unfortunately, I need to switch to R in order to use a particular package that cannot easily be replicated in Python.

Currently the text is parsed into bigram counts, reduced to a vocabulary of about 11,000 bigrams, and then stored as a dictionary:

{id1: {'bigrams': [(bigram1, count), (bigram2, count), ...]},
 id2: {'bigrams': ...},
 ...}

I need to get this into a dgCMatrix in R, where the rows are id1, id2, ... and the columns are the different bigrams, so that each cell holds the count for that id-bigram pair.

Any suggestions? I thought about expanding it into one massive CSV, but that seems very inefficient and is probably infeasible given memory constraints.

earino · Accepted Answer

Could you write out the matrix in MatrixMarket format using scipy's mmwrite (scipy.io.mmwrite) and then read it into R using readMM from the Matrix package?
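As a rough sketch of the Python side of that suggestion, something like the following might work. It assumes the dictionary has the shape shown in the question and that the bigrams are strings; the function names, file names, and sample data are hypothetical and only for illustration. The row ids and vocabulary are written to separate text files because the MatrixMarket file itself carries only the numbers, not the row or column labels.

```python
from scipy.io import mmwrite
from scipy.sparse import coo_matrix


def build_count_matrix(docs):
    """Build a sparse id-by-bigram count matrix.

    docs is assumed to look like {id: {'bigrams': [(bigram, count), ...]}}.
    Returns (matrix, row_ids, vocab); rows follow row_ids, columns follow vocab.
    """
    row_ids = sorted(docs)
    vocab = sorted({bigram for doc in docs.values() for bigram, _ in doc['bigrams']})
    col_index = {bigram: j for j, bigram in enumerate(vocab)}

    rows, cols, vals = [], [], []
    for i, doc_id in enumerate(row_ids):
        for bigram, count in docs[doc_id]['bigrams']:
            rows.append(i)
            cols.append(col_index[bigram])
            vals.append(count)

    # Only the non-zero entries are stored, so memory scales with the number
    # of (id, bigram) pairs rather than with n_ids * n_bigrams.
    matrix = coo_matrix((vals, (rows, cols)), shape=(len(row_ids), len(vocab)))
    return matrix, row_ids, vocab


def export_for_r(docs, mtx_path, rows_path, cols_path):
    """Write the matrix in MatrixMarket format plus one label per line for rows and columns."""
    matrix, row_ids, vocab = build_count_matrix(docs)
    mmwrite(mtx_path, matrix)
    with open(rows_path, 'w', encoding='utf-8') as f:
        f.write('\n'.join(str(r) for r in row_ids))
    with open(cols_path, 'w', encoding='utf-8') as f:
        f.write('\n'.join(str(c) for c in vocab))
    return matrix


if __name__ == '__main__':
    # Hypothetical sample data in the shape described in the question.
    sample = {
        'id1': {'bigrams': [('new york', 3), ('york city', 1)]},
        'id2': {'bigrams': [('new york', 2)]},
    }
    export_for_r(sample, 'bigram_counts.mtx', 'row_ids.txt', 'bigrams.txt')
```

On the R side, readMM("bigram_counts.mtx") from the Matrix package should then load the file. As far as I know it returns the matrix in triplet form (a dgTMatrix), so a conversion such as as(m, "CsparseMatrix") is probably still needed to end up with a dgCMatrix, and the two label files can be read with readLines to set the rownames and colnames.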
