Changing the first column depending on the remaining ones

Tags: replace, r

I would like to relabel the groups (S1...S5) in the Var1 column so that, within each Position, the largest value in the Freq column is labelled S1, the second largest S2, and so on. Note that the grouping factor here is the Position column. For Position == 26998698, for example, 1587 ends up as S1 in the output instead of S3, 340 as S2 instead of S4, and so on.

df <- 'Var1 Freq Position
S1    1 26998698
S2  125 26998698
S3 1587 26998698
S4  340 26998698
S5    8 26998698
S1   68 27252684
S2  703 27252684
S3  913 27252684
S4  293 27252684
S5   58 27252684
S1    7 27209738
S2  383 27209738
S3 1425 27209738
S4  239 27209738
S5    6 27209738'
df<- read.table(text=df, header=T)

My expected output:

output <- 'Var1 Freq Position
S5    1 26998698
S3  125 26998698
S1 1587 26998698
S2  340 26998698
S4    8 26998698
S4   68 27252684
S2  703 27252684
S1  913 27252684
S3  293 27252684
S5   58 27252684
S4    7 27209738
S2  383 27209738
S1 1425 27209738
S3  239 27209738
S5    6 27209738'
output<- read.table(text=output, header=T)

Any ideas on how to do this?

user2120870 asked Dec 16 '25 14:12

2 Answers

Here's an approach using dplyr:

library(dplyr)
df %>% 
  group_by(Position) %>% 
  mutate(Var1 = Var1[dense_rank(desc(Freq))])
#Source: local data frame [15 x 3]
#Groups: Position [3]
#
#     Var1  Freq Position
#   (fctr) (int)    (int)
#1      S5     1 26998698
#2      S3   125 26998698
#3      S1  1587 26998698
#4      S2   340 26998698
#5      S4     8 26998698
#6      S4    68 27252684
#...

After grouping the data by Position, we compute the dense_rank of Freq (a rank in which tied values share a rank and no rank is skipped) and use it to index Var1. Because the largest Freq should get rank 1, we rank desc(Freq), i.e. in descending order. Note that the indexing relies on Var1 being ordered S1 to S5 within each Position, as it is in the sample data.
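
To make the difference between a dense rank and a minimum rank concrete, here is a minimal base R sketch. It is an illustration only, not code from the answer: the vector `freq` is made up, and the `match(..., sort(unique(...)))` idiom is an assumed base R stand-in for a descending dense rank.

```r
# Hypothetical frequencies with one tie (30 appears twice).
freq <- c(10, 30, 30, 5)

# Descending "min" rank: tied values share the lowest rank,
# and the following rank is skipped (there is no rank 2 here).
min_rank_desc <- rank(-freq, ties.method = "min")

# Descending dense rank: tied values share a rank and no rank is skipped.
dense_rank_desc <- match(-freq, sort(unique(-freq)))

print(min_rank_desc)    # 3 1 1 4
print(dense_rank_desc)  # 2 1 1 3
```

The sample data in the question has no ties within a Position, so both rankings would give the same labels there. With ties, indexing by rank gives the tied rows the same label.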

talat answered Dec 19 '25 06:12

Another option, using data.table:

library(data.table)

setDT(df)[, Var1:= Var1[frank(-Freq, ties.method="dense")], by = Position]

#    Var1 Freq Position
# 1:   S5    1 26998698
# 2:   S3  125 26998698
# 3:   S1 1587 26998698
# 4:   S2  340 26998698
# 5:   S4    8 26998698
# 6:   S4   68 27252684
# 7:   S2  703 27252684
# 8:   S1  913 27252684
# 9:   S3  293 27252684
#10:   S5   58 27252684
#11:   S4    7 27209738
#12:   S2  383 27209738
#13:   S1 1425 27209738
#14:   S3  239 27209738
#15:   S5    6 27209738

Veerendra Gadekar answered Dec 19 '25 07:12
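
For readers without dplyr or data.table, the same relabelling can be sketched in base R. This is not taken from either answer: the helper `dense_desc` is a hypothetical name, and the sketch builds each label from the rank itself instead of indexing the existing Var1, so it does not assume Var1 is ordered S1 to S5 within each Position. Only the first two Position groups from the question are used to keep it short.

```r
# First two Position groups from the question's sample data.
df <- read.table(header = TRUE, text = "
Var1 Freq Position
S1    1 26998698
S2  125 26998698
S3 1587 26998698
S4  340 26998698
S5    8 26998698
S1   68 27252684
S2  703 27252684
S3  913 27252684
S4  293 27252684
S5   58 27252684
")

# Hypothetical helper: dense rank in descending order (largest value gets 1).
dense_desc <- function(x) match(-x, sort(unique(-x)))

# ave() applies the helper separately within each Position group
# and returns the ranks in the original row order.
rnk <- ave(df$Freq, df$Position, FUN = dense_desc)

# Build the new label directly from the rank.
df$Var1 <- paste0("S", rnk)
print(df)
```

On these ten rows the resulting Var1 column is S5, S3, S1, S2, S4, S4, S2, S1, S3, S5, which matches the expected output in the question.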