I am trying to use the Matrix package to bind two sparse matrices of different sizes. The binding is by rows, with columns matched by name.
Table A:
ID | AAAA | BBBB |
------ | ------ | ------ |
XXXX | 1 | 2 |
Table B:
ID | BBBB | CCCC |
------ | ------ | ------ |
YYYY | 3 | 4 |
Result of binding tables A and B:
ID | AAAA | BBBB | CCCC |
------ | ------ | ------ | ------ |
XXXX | 1 | 2 | |
YYYY | | 3 | 4 |
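To make the expected result concrete, here is a minimal sketch in R that builds tables A and B as sparse matrices and binds them by padding each one to the union of the column names. The helper name `pad_columns` is made up for illustration and is not part of the Matrix package; it is the straightforward approach, not necessarily a fast one.

```r
library(Matrix)

# Tables A and B from the example, as sparse matrices with dimnames
A <- sparseMatrix(i = c(1, 1), j = c(1, 2), x = c(1, 2),
                  dimnames = list("XXXX", c("AAAA", "BBBB")))
B <- sparseMatrix(i = c(1, 1), j = c(1, 2), x = c(3, 4),
                  dimnames = list("YYYY", c("BBBB", "CCCC")))

# Hypothetical helper: append empty columns for the names a matrix lacks,
# then reorder the columns to the common layout
pad_columns <- function(m, all_cols) {
  missing_cols <- setdiff(all_cols, colnames(m))
  if (length(missing_cols) > 0) {
    filler <- sparseMatrix(i = integer(0), j = integer(0), x = numeric(0),
                           dims = c(nrow(m), length(missing_cols)),
                           dimnames = list(rownames(m), missing_cols))
    m <- cbind(m, filler)
  }
  m[, all_cols, drop = FALSE]
}

all_cols <- union(colnames(A), colnames(B))
AB <- rbind(pad_columns(A, all_cols), pad_columns(B, all_cols))
print(AB)
```

The empty cells in the table above correspond to entries that are simply not stored in the sparse result.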
The intention is to insert a large number of small matrices into a single large matrix, to enable continuous querying and update/inserts.
As far as I can tell, neither the Matrix nor the slam package has functionality to handle this.
Similar questions have been asked in the past, but it seems no solution has been found:
Post 1: in-r-when-using-named-rows-can-a-sparse-matrix-column-be-added-concatenated
Post 2: bind-together-sparse-model-matrices-by-row-names
Any ideas on how to solve this would be highly appreciated.
Best regards,
Frederik
For my purposes (a very sparse matrix with millions of rows and tens of thousands of columns, with more than 99.9% of the values empty) this was still much too slow. What worked was the code below, which might be helpful to others as well:
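    library(Matrix)

    merge.sparse <- function(listMatrixes) {
      # Takes a list of sparse matrices with different columns and binds them row-wise,
      # matching columns by name. Assumes general numeric sparse matrices (e.g. dgCMatrix).
      allColnames <- sort(unique(unlist(lapply(listMatrixes, colnames))))
      matrixToReturn <- NULL
      for (currentMatrix in listMatrixes) {
        # Position of each column of this matrix in the combined column set
        newColLocations <- match(colnames(currentMatrix), allColnames)
        # Triplet form keeps the 0-based i/j indices aligned with every stored value,
        # negative values included
        triplets <- as(currentMatrix, "TsparseMatrix")
        newMatrix <- sparseMatrix(i = triplets@i + 1L,
                                  j = newColLocations[triplets@j + 1L],
                                  x = triplets@x,
                                  # nrow() so that trailing all-zero rows are kept
                                  dims = c(nrow(currentMatrix), length(allColnames)),
                                  dimnames = list(rownames(currentMatrix), allColnames))
        # Start from NULL rather than testing exists(), which also searches
        # enclosing environments
        if (is.null(matrixToReturn)) {
          matrixToReturn <- newMatrix
        } else {
          matrixToReturn <- rbind2(matrixToReturn, newMatrix)
        }
      }
      matrixToReturn
    }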
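A possible variation on the same idea, as a sketch only: collect the triplets of every matrix, shift the row indices by the number of rows that come before, and build the result with a single `sparseMatrix()` call instead of growing it with repeated `rbind2()` calls. The name `bind_rows_by_colname` is hypothetical (it is not a Matrix function), the sketch assumes general numeric sparse matrices such as `dgCMatrix` with column names set, and it has not been benchmarked against the loop above.

```r
library(Matrix)

# Hypothetical helper (not part of the Matrix package): stack sparse matrices
# row-wise, aligning columns by name, with one sparseMatrix() call
bind_rows_by_colname <- function(mats) {
  all_cols <- sort(unique(unlist(lapply(mats, colnames))))
  # Number of rows that precede each matrix in the stacked result
  row_offsets <- cumsum(c(0L, vapply(mats, nrow, integer(1))))
  i <- j <- x <- vector("list", length(mats))
  for (k in seq_along(mats)) {
    trip <- as(mats[[k]], "TsparseMatrix")  # 0-based i/j aligned with x
    i[[k]] <- trip@i + 1L + row_offsets[k]
    j[[k]] <- match(colnames(mats[[k]]), all_cols)[trip@j + 1L]
    x[[k]] <- trip@x
  }
  # Keep row names only when every input matrix has them
  rn <- lapply(mats, rownames)
  row_names <- if (any(vapply(rn, is.null, logical(1)))) NULL else unlist(rn, use.names = FALSE)
  sparseMatrix(i = unlist(i), j = unlist(j), x = unlist(x),
               dims = c(row_offsets[length(row_offsets)], length(all_cols)),
               dimnames = list(row_names, all_cols))
}

# Usage with the two tables from the question
A <- sparseMatrix(i = c(1, 1), j = c(1, 2), x = c(1, 2),
                  dimnames = list("XXXX", c("AAAA", "BBBB")))
B <- sparseMatrix(i = c(1, 1), j = c(1, 2), x = c(3, 4),
                  dimnames = list("YYYY", c("BBBB", "CCCC")))
combined <- bind_rows_by_colname(list(A, B))
print(combined)
```

Building everything in one call trades a little extra memory for the index vectors against not having to copy the accumulated result on every iteration.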