Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Testing a conditional over every element of a matrix

I need to evaluate a conditional over every element in a matrix, where the test involves the dimnames of each index. Specifically, if the rowname and colname of an element share a particular feature, I want to set that element equal to zero.

m <- matrix(rnorm(81),nrow=9)
colnames(m) <- paste(c(rep("A",3),rep("B",3),rep("C",3)),1:ncol(m),sep = "_")
rownames(m) <- paste(c(rep("A",3),rep("B",3),rep("C",3)),1:nrow(m),sep = "_")
m

            A_1        A_2         A_3          B_4        B_5         B_6        C_7        C_8         C_9
A_1 -0.03201198 -2.2241923 -0.65584334 -0.346745371 -0.9263060 -1.99181830  0.9138187 -2.4959751 -0.96723090
A_2 -1.44319826 -0.2225057 -1.35327091 -0.009194619  0.5798469  2.42753826 -1.4574564 -0.8858597 -1.41595891
A_3 -0.05863965 -0.2177708 -0.39131739  0.729532751  1.4106448  0.15899085 -1.7521345  0.5398222 -0.05073061
B_4  1.11006840  1.0315201  0.10434758 -0.508430234 -1.7095192  0.90913528  1.7367210 -0.9006098 -1.41698688
B_5  0.21405173  0.4735690  0.42655214 -0.367748304  0.9820261 -0.77933908  1.1326391  0.5316226  2.24951820
B_6  0.27153476 -0.3506076  0.16943749 -0.666135969  0.2962018  1.12236640 -1.3103133 -1.9494454 -0.57526358
C_7  1.69732641 -1.1439368 -0.02734925 -0.814635435  0.6658583  0.68069434  0.3330596 -1.2564933  0.15807742
C_8  0.35194835 -0.7075880 -0.45814046  0.773997223 -0.6530986  0.01295098  0.2557955  1.4658751 -3.33651509
C_9  0.58610083  0.7908394 -1.38909037 -0.742739398 -0.3745243  2.80990368  0.2172529 -0.3672324  0.56309688

My desired result can be easily achieved through a for loop but this is, of course, prohibitively slow when working with large matrices.

for (i in 1:nrow(m)) {
  for (j in 1:ncol(m)) {
    m[i,j] <- ifelse(sub("_.*", "\\1", rownames(m)[i])==sub("_.*", "\\1", colnames(m)[j]),0,m[i,j])
  }
}
m

          A_1        A_2         A_3          B_4        B_5         B_6        C_7        C_8         C_9
A_1 0.0000000  0.0000000  0.00000000 -0.346745371 -0.9263060 -1.99181830  0.9138187 -2.4959751 -0.96723090
A_2 0.0000000  0.0000000  0.00000000 -0.009194619  0.5798469  2.42753826 -1.4574564 -0.8858597 -1.41595891
A_3 0.0000000  0.0000000  0.00000000  0.729532751  1.4106448  0.15899085 -1.7521345  0.5398222 -0.05073061
B_4 1.1100684  1.0315201  0.10434758  0.000000000  0.0000000  0.00000000  1.7367210 -0.9006098 -1.41698688
B_5 0.2140517  0.4735690  0.42655214  0.000000000  0.0000000  0.00000000  1.1326391  0.5316226  2.24951820
B_6 0.2715348 -0.3506076  0.16943749  0.000000000  0.0000000  0.00000000 -1.3103133 -1.9494454 -0.57526358
C_7 1.6973264 -1.1439368 -0.02734925 -0.814635435  0.6658583  0.68069434  0.0000000  0.0000000  0.00000000
C_8 0.3519484 -0.7075880 -0.45814046  0.773997223 -0.6530986  0.01295098  0.0000000  0.0000000  0.00000000
C_9 0.5861008  0.7908394 -1.38909037 -0.742739398 -0.3745243  2.80990368  0.0000000  0.0000000  0.00000000

Conceptually similar to this question, but I cannot figure out how to evaluate the conditional on dimnames within the apply family. Any suggestions? Thanks!

like image 515
Aaron Avatar asked May 13 '21 20:05

Aaron


2 Answers

The sub is vectorized. Only change required is to replicate the name attributes so that it will be same as the number of elements of the matrix, extract those elements based on the logical vector and assign it to 0

rnm <- sub("_.*", "\\1", rownames(m))
cnm <- sub("_.*", "\\1", colnames(m))
m[rnm[row(m)] == cnm[col(m)]] <- 0

-output

 m
           A_1         A_2         A_3         B_4        B_5         B_6        C_7        C_8         C_9
A_1  0.0000000  0.00000000  0.00000000 -2.84612856  0.1611173  0.29004403 -0.8844186  0.8363131 -0.57395543
A_2  0.0000000  0.00000000  0.00000000  1.92166045  0.2671856 -0.50424582  0.8672366 -2.2496354  0.04046654
A_3  0.0000000  0.00000000  0.00000000 -0.01515657  0.6775447 -0.03862803  1.7764642  0.7146040  0.33652933
B_4  1.0842025  0.32981334 -0.61961179  0.00000000  0.0000000  0.00000000  0.3366096  0.5196087  1.67367867
B_5 -1.2897726  0.04082185  1.62661008  0.00000000  0.0000000  0.00000000 -1.6750488  0.5464289  0.98246881
B_6 -0.4213025 -0.70439232  0.09091241  0.00000000  0.0000000  0.00000000  0.7050487 -0.4151445 -0.04604658
C_7  1.0594241 -1.43071282 -0.75394573  1.08360040  1.3646551 -0.88687658  0.0000000  0.0000000  0.00000000
C_8  1.3908082  1.12093371  1.73690687  1.05202987  0.6152715  0.45601621  0.0000000  0.0000000  0.00000000
C_9 -0.3026523  1.14507291 -1.04714611 -2.38087279 -0.6976168 -0.96394113  0.0000000  0.0000000  0.00000000

Or another option is to reshape to 'long' format, assign the 'Freq' column to 0 based on the substr condition and reshape back to wide with xtabs

xtabs(Freq ~ Var1 + Var2, transform(as.data.frame.table(m), 
     Freq = Freq * (substr(Var1, 1, 1) != substr(Var2, 1, 1))))
like image 60
akrun Avatar answered Oct 21 '22 06:10

akrun


Another base R option using outer

> m * outer(gsub("_.*", "", row.names(m)), gsub("_.*", "", colnames(m)), "!=")
            A_1         A_2        A_3        B_4          B_5        B_6
A_1  0.00000000  0.00000000  0.0000000 -0.9406041 -0.636132786 -1.6351854
A_2  0.00000000  0.00000000  0.0000000  2.5036185  1.097232977  1.0347660
A_3  0.00000000  0.00000000  0.0000000  0.9006587  0.923166200  0.3491107
B_4  1.02027130  0.83789740  0.7906466  0.0000000  0.000000000  0.0000000
B_5  0.07069471 -0.86514897 -1.2658111  0.0000000  0.000000000  0.0000000
B_6  1.19529695 -1.21846895 -0.1679168  0.0000000  0.000000000  0.0000000
C_7  0.14903965  0.05210262 -0.2998190 -0.6323805 -0.631320984 -0.5492681
C_8 -0.40173446 -0.88410976 -0.1568388 -0.1936692 -0.387362790  1.1529557
C_9 -1.01364291 -0.63857820  0.4172131 -1.1776165 -0.009321079  0.4942712
           C_7        C_8        C_9
A_1  0.2739315 -1.5326245  1.0576526
A_2 -0.7770083  0.8391607  0.9051447
A_3 -0.7469622  0.3777556 -1.6092045
B_4 -0.7780711 -0.7269628  0.1827205
B_5  0.6065771  0.4299673  1.5414200
B_6  1.9553316  0.5131711  0.6540521
C_7  0.0000000  0.0000000  0.0000000
C_8  0.0000000  0.0000000  0.0000000
C_9  0.0000000  0.0000000  0.0000000
like image 32
ThomasIsCoding Avatar answered Oct 21 '22 06:10

ThomasIsCoding