I need to evaluate a conditional over every element in a matrix, where the test involves the dimnames
of each index. Specifically, if the rowname
and colname
of an element share a particular feature, I want to set that element equal to zero.
m <- matrix(rnorm(81),nrow=9)
colnames(m) <- paste(c(rep("A",3),rep("B",3),rep("C",3)),1:ncol(m),sep = "_")
rownames(m) <- paste(c(rep("A",3),rep("B",3),rep("C",3)),1:nrow(m),sep = "_")
m
A_1 A_2 A_3 B_4 B_5 B_6 C_7 C_8 C_9
A_1 -0.03201198 -2.2241923 -0.65584334 -0.346745371 -0.9263060 -1.99181830 0.9138187 -2.4959751 -0.96723090
A_2 -1.44319826 -0.2225057 -1.35327091 -0.009194619 0.5798469 2.42753826 -1.4574564 -0.8858597 -1.41595891
A_3 -0.05863965 -0.2177708 -0.39131739 0.729532751 1.4106448 0.15899085 -1.7521345 0.5398222 -0.05073061
B_4 1.11006840 1.0315201 0.10434758 -0.508430234 -1.7095192 0.90913528 1.7367210 -0.9006098 -1.41698688
B_5 0.21405173 0.4735690 0.42655214 -0.367748304 0.9820261 -0.77933908 1.1326391 0.5316226 2.24951820
B_6 0.27153476 -0.3506076 0.16943749 -0.666135969 0.2962018 1.12236640 -1.3103133 -1.9494454 -0.57526358
C_7 1.69732641 -1.1439368 -0.02734925 -0.814635435 0.6658583 0.68069434 0.3330596 -1.2564933 0.15807742
C_8 0.35194835 -0.7075880 -0.45814046 0.773997223 -0.6530986 0.01295098 0.2557955 1.4658751 -3.33651509
C_9 0.58610083 0.7908394 -1.38909037 -0.742739398 -0.3745243 2.80990368 0.2172529 -0.3672324 0.56309688
My desired result can be easily achieved through a for loop but this is, of course, prohibitively slow when working with large matrices.
for (i in 1:nrow(m)) {
for (j in 1:ncol(m)) {
m[i,j] <- ifelse(sub("_.*", "\\1", rownames(m)[i])==sub("_.*", "\\1", colnames(m)[j]),0,m[i,j])
}
}
m
A_1 A_2 A_3 B_4 B_5 B_6 C_7 C_8 C_9
A_1 0.0000000 0.0000000 0.00000000 -0.346745371 -0.9263060 -1.99181830 0.9138187 -2.4959751 -0.96723090
A_2 0.0000000 0.0000000 0.00000000 -0.009194619 0.5798469 2.42753826 -1.4574564 -0.8858597 -1.41595891
A_3 0.0000000 0.0000000 0.00000000 0.729532751 1.4106448 0.15899085 -1.7521345 0.5398222 -0.05073061
B_4 1.1100684 1.0315201 0.10434758 0.000000000 0.0000000 0.00000000 1.7367210 -0.9006098 -1.41698688
B_5 0.2140517 0.4735690 0.42655214 0.000000000 0.0000000 0.00000000 1.1326391 0.5316226 2.24951820
B_6 0.2715348 -0.3506076 0.16943749 0.000000000 0.0000000 0.00000000 -1.3103133 -1.9494454 -0.57526358
C_7 1.6973264 -1.1439368 -0.02734925 -0.814635435 0.6658583 0.68069434 0.0000000 0.0000000 0.00000000
C_8 0.3519484 -0.7075880 -0.45814046 0.773997223 -0.6530986 0.01295098 0.0000000 0.0000000 0.00000000
C_9 0.5861008 0.7908394 -1.38909037 -0.742739398 -0.3745243 2.80990368 0.0000000 0.0000000 0.00000000
Conceptually similar to this question, but I cannot figure out how to evaluate the conditional on dimnames
within the apply
family. Any suggestions? Thanks!
The sub
is vectorized. Only change required is to rep
licate the name attributes so that it will be same as the number of elements of the matrix
, extract those elements based on the logical vector and assign it to 0
rnm <- sub("_.*", "\\1", rownames(m))
cnm <- sub("_.*", "\\1", colnames(m))
m[rnm[row(m)] == cnm[col(m)]] <- 0
-output
m
A_1 A_2 A_3 B_4 B_5 B_6 C_7 C_8 C_9
A_1 0.0000000 0.00000000 0.00000000 -2.84612856 0.1611173 0.29004403 -0.8844186 0.8363131 -0.57395543
A_2 0.0000000 0.00000000 0.00000000 1.92166045 0.2671856 -0.50424582 0.8672366 -2.2496354 0.04046654
A_3 0.0000000 0.00000000 0.00000000 -0.01515657 0.6775447 -0.03862803 1.7764642 0.7146040 0.33652933
B_4 1.0842025 0.32981334 -0.61961179 0.00000000 0.0000000 0.00000000 0.3366096 0.5196087 1.67367867
B_5 -1.2897726 0.04082185 1.62661008 0.00000000 0.0000000 0.00000000 -1.6750488 0.5464289 0.98246881
B_6 -0.4213025 -0.70439232 0.09091241 0.00000000 0.0000000 0.00000000 0.7050487 -0.4151445 -0.04604658
C_7 1.0594241 -1.43071282 -0.75394573 1.08360040 1.3646551 -0.88687658 0.0000000 0.0000000 0.00000000
C_8 1.3908082 1.12093371 1.73690687 1.05202987 0.6152715 0.45601621 0.0000000 0.0000000 0.00000000
C_9 -0.3026523 1.14507291 -1.04714611 -2.38087279 -0.6976168 -0.96394113 0.0000000 0.0000000 0.00000000
Or another option is to reshape to 'long' format, assign the 'Freq' column to 0 based on the substr
condition and reshape back to wide with xtabs
xtabs(Freq ~ Var1 + Var2, transform(as.data.frame.table(m),
Freq = Freq * (substr(Var1, 1, 1) != substr(Var2, 1, 1))))
Another base R option using outer
> m * outer(gsub("_.*", "", row.names(m)), gsub("_.*", "", colnames(m)), "!=")
A_1 A_2 A_3 B_4 B_5 B_6
A_1 0.00000000 0.00000000 0.0000000 -0.9406041 -0.636132786 -1.6351854
A_2 0.00000000 0.00000000 0.0000000 2.5036185 1.097232977 1.0347660
A_3 0.00000000 0.00000000 0.0000000 0.9006587 0.923166200 0.3491107
B_4 1.02027130 0.83789740 0.7906466 0.0000000 0.000000000 0.0000000
B_5 0.07069471 -0.86514897 -1.2658111 0.0000000 0.000000000 0.0000000
B_6 1.19529695 -1.21846895 -0.1679168 0.0000000 0.000000000 0.0000000
C_7 0.14903965 0.05210262 -0.2998190 -0.6323805 -0.631320984 -0.5492681
C_8 -0.40173446 -0.88410976 -0.1568388 -0.1936692 -0.387362790 1.1529557
C_9 -1.01364291 -0.63857820 0.4172131 -1.1776165 -0.009321079 0.4942712
C_7 C_8 C_9
A_1 0.2739315 -1.5326245 1.0576526
A_2 -0.7770083 0.8391607 0.9051447
A_3 -0.7469622 0.3777556 -1.6092045
B_4 -0.7780711 -0.7269628 0.1827205
B_5 0.6065771 0.4299673 1.5414200
B_6 1.9553316 0.5131711 0.6540521
C_7 0.0000000 0.0000000 0.0000000
C_8 0.0000000 0.0000000 0.0000000
C_9 0.0000000 0.0000000 0.0000000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With