I have two big data frames. One (df1) has this structure:
chr init
1 12 25289552
2 3 180418785
3 3 180434779
The other (df2) has this structure:
V1 V2 V3
10 1 69094 medium
11 1 69094 medium
12 12 25289552 high
13 1 69095 medium
14 3 180418785 medium
15 3 180434779 low
What I'm trying to do is add column V3 of df2 to df1, to get the mutation info:
chr init Mut
1 12 25289552 high
2 3 180418785 medium
3 3 180434779 low
I tried loading both into R and then using match inside a for loop, but it doesn't work. Is there a better way to do this? I am also open to using awk or something similar.
Use merge:
df1 <- read.table(text=' chr init
1 12 25289552
2 3 180418785
3 3 180434779', header=TRUE)
df2 <- read.table(text=' V1 V2 V3
10 1 69094 medium
11 1 69094 medium
12 12 25289552 high
13 1 69095 medium
14 3 180418785 medium
15 3 180434779 low', header=TRUE)
merge(df1, df2, by.x='init', by.y='V2') # this works!
init chr V1 V3
1 25289552 12 12 high
2 180418785 3 3 medium
3 180434779 3 3 low
To get the output in the layout you show:
output <- merge(df1, df2, by.x='init', by.y='V2')[, c(2,1,4)]
colnames(output)[3] <- 'Mut'
output
chr init Mut
1 12 25289552 high
2 3 180418785 medium
3 3 180434779 low
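One caveat with joining on position alone: the same coordinate could occur on two different chromosomes, in which case the merge above would pair rows that do not belong together. The sketch below assumes that V1 in df2 is the chromosome and V2 the position (the question does not say so explicitly), and rebuilds the toy data with data.frame() instead of read.table(). The names df2u, merged, key1 and key2 are made up for illustration. It shows a merge on both columns, and a vectorized match() that avoids the for loop and keeps the row order of df1.

```r
# Toy data copied from the question (integers, as read.table would produce)
df1 <- data.frame(chr  = c(12L, 3L, 3L),
                  init = c(25289552L, 180418785L, 180434779L))
df2 <- data.frame(V1 = c(1L, 1L, 12L, 1L, 3L, 3L),
                  V2 = c(69094L, 69094L, 25289552L, 69095L, 180418785L, 180434779L),
                  V3 = c("medium", "medium", "high", "medium", "medium", "low"),
                  stringsAsFactors = FALSE)

# Option 1: merge on chromosome AND position.
# df2 contains repeated (V1, V2) pairs; keep the first of each so that
# a row of df1 cannot be duplicated by the join.
df2u <- df2[!duplicated(df2[, c("V1", "V2")]), ]
merged <- merge(df1, df2u,
                by.x = c("chr", "init"), by.y = c("V1", "V2"),
                all.x = TRUE)            # keep df1 rows that have no match (Mut = NA)
names(merged)[names(merged) == "V3"] <- "Mut"

# Option 2: vectorized match() on a combined key, no loop needed.
# match() returns the first matching row of df2 for every row of df1,
# so df1 keeps its original row order.
key1 <- paste(df1$chr, df1$init)
key2 <- paste(df2$V1, df2$V2)
df1$Mut <- df2$V3[match(key1, key2)]
df1
```

Note that merge() sorts its result by the key columns by default, so merged may come back in a different row order than df1; pass sort = FALSE or use the match() variant if the original order matters. The paste() keys assume that the chromosome and position columns have the same type in both data frames.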