I have a tab-delimited file with several columns. I want to keep only those rows whose p-values (A_Pval, B_Pval and C_Pval) are all below 0.05.
Probe A_sig A_Pval B_sig B_Pval C_sig C_Pval D_sig D_Pval
ILMN_122 12.31 0.04 23.6 0.4 124.5 0.04 567.4 0.008
ILMN_456 56.12 0 23.89 0.55 567.2 0.214 56.35 0.01
ILMN_198 981.2 0.06 31.56 0.02 12.56 0.4 789.4 0.045
ILMN_980 876.0 0.001 124.7 0.01 167.3 0.12 245.7 0.35
ILMN_542 123.9 0.16 219.8 0.04 567.3 0.03 987.6 0.34
ILMN_567 134.1 0 542.5 0.24 12.56 0.65 5.67 0.56
ILMN_452 213.4 0.98 12.6 0.12 17.89 0.03 467.8 0.003
ILMN_142 543.8 0.04 245.6 0.89 456.34 0.001 12.67 0.002
ILMN_765 187.4 0.05 34.6 0.001 67.8 0.06 78.34 0.02
I need output in the following form:
Probe A_sig A_Pval B_sig B_Pval C_sig C_Pval
ILMN_122 12.31 0.04 32.56 0.004 311.4 0.001
ILMN_980 876.0 0.001 123.4 0.001 678.9 0.02
ILMN_142 543.8 0.04 56.56 0.015 67.8 0.04
Assuming that your data is in a data frame called mydata, you can select the rows you want by writing:
mydata[mydata$A_Pval < 0.05 & mydata$B_Pval < 0.05 & mydata$C_Pval < 0.05, ]
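To try this end to end, here is a minimal sketch that reads tab-delimited text, filters the rows, and keeps only the A, B and C columns as in the requested output. The values in toy are made up for illustration (they are not the data from the question), and "myfile.txt" is only a placeholder name. Wrapping the condition in which() is one way to avoid all-NA rows if any p-value is missing.

```r
# Toy tab-separated data with made-up values, standing in for the real file.
# For an actual file you would use something like read.delim("myfile.txt"),
# where "myfile.txt" is a placeholder name.
toy <- "Probe\tA_sig\tA_Pval\tB_sig\tB_Pval\tC_sig\tC_Pval\tD_sig\tD_Pval
P1\t10.0\t0.04\t20.0\t0.004\t30.0\t0.001\t40.0\t0.50
P2\t11.0\t0.20\t21.0\t0.010\t31.0\t0.020\t41.0\t0.01
P3\t12.0\t0.001\t22.0\t0.001\t32.0\t0.02\t42.0\t0.30
P4\t13.0\t0.05\t23.0\t0.001\t33.0\t0.01\t43.0\t0.02"
mydata <- read.delim(text = toy, stringsAsFactors = FALSE)

# which() turns the logical vector into row indices and silently drops NAs,
# so rows with a missing p-value are excluded instead of showing up as NA rows.
keep <- which(mydata$A_Pval < 0.05 & mydata$B_Pval < 0.05 & mydata$C_Pval < 0.05)

# Keep only the Probe, A, B and C columns (the D columns are dropped).
wanted_cols <- c("Probe", "A_sig", "A_Pval", "B_sig", "B_Pval", "C_sig", "C_Pval")
result <- mydata[keep, wanted_cols]
print(result)
```

Note that the comparison is strict, so a row with a p-value of exactly 0.05 (P4 in the toy data) is not selected.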
It might be easier to understand by doing it in multiple steps:
# logical vector: TRUE where A_Pval is smaller than 0.05
significant_A <- mydata$A_Pval < 0.05
# logical vector: TRUE where B_Pval is smaller than 0.05
significant_B <- mydata$B_Pval < 0.05
# logical vector: TRUE where C_Pval is smaller than 0.05
significant_C <- mydata$C_Pval < 0.05
# combine the results into one logical vector;
# significant_all[i] is TRUE if all three p-values in row i
# are smaller than 0.05
significant_all <- significant_A & significant_B & significant_C
# pick the rows you want
mydata[significant_all, ]
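If there are many p-value columns, writing one comparison per column gets tedious. A possible variant, sketched here on a small made-up data frame, compares all chosen columns at once and counts how many pass per row. The names pval_cols and below are introduced only for this illustration.

```r
# Small made-up data frame; P3 has a missing B_Pval on purpose.
mydata <- data.frame(
  Probe  = c("P1", "P2", "P3"),
  A_Pval = c(0.01, 0.20, 0.03),
  B_Pval = c(0.02, 0.01, NA),
  C_Pval = c(0.04, 0.03, 0.01),
  stringsAsFactors = FALSE
)

# The p-value columns that must all be below the threshold.
pval_cols <- c("A_Pval", "B_Pval", "C_Pval")

# Comparing a data frame with a number gives a logical matrix,
# one TRUE/FALSE (or NA) per cell.
below <- mydata[pval_cols] < 0.05

# A row qualifies when every chosen column is below 0.05.
# na.rm = TRUE makes a missing p-value count as "not significant".
significant_all <- rowSums(below, na.rm = TRUE) == length(pval_cols)

result <- mydata[significant_all, ]
print(result)
```

This gives the same selection as chaining the comparisons with &, except that rows with a missing p-value are dropped rather than returned as NA rows.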