
Multi replace values according to template

Tags:

r

To create a grouping variable for long-format data, I want to map several values to one new value.

I already have one solution, but I feel there could be a better implementation.

set.seed(1337)
df <- data.frame(coli = sample(rep(1:6,2)), newi = 0 )

replaceList <- list(oneAndTwo=1:2, threeAndFour=3:4, fiveAndSix=5:6)

The data looks like:

> df
   coli newi
1     1    0
2     6    0
3     1    0
4     5    0
5     3    0
6     2    0
7     6    0
8     2    0
9     4    0
10    4    0
11    3    0
12    5    0

The lookup template looks like:

> replaceList
$oneAndTwo
[1] 1 2

$threeAndFour
[1] 3 4

$fiveAndSix
[1] 5 6

Desired result:

   coli         newi
1     1    oneAndTwo
2     6   fiveAndSix
3     1    oneAndTwo
4     5   fiveAndSix
5     3 threeAndFour
6     2    oneAndTwo
7     6   fiveAndSix
8     2    oneAndTwo
9     4 threeAndFour
10    4 threeAndFour
11    3 threeAndFour
12    5   fiveAndSix 

My working attempt:

mapply(function(fnd,rplc){IND=df$coli %in% fnd;df$newi[IND]<<-rplc},fnd=replaceList,rplc=names(replaceList))

If there is a better practice, including a better way to set up replaceList, I'm happy to learn.

How would you tackle/approach such a problem?

Andre Elrico asked Mar 13 '18 10:03



2 Answers

We can stack the list into a key/value data frame ('df2'), then match the 'coli' column of 'df' against the 'values' column of 'df2' to get the row index, use that index to pick the corresponding entry of 'ind', and assign the result to 'newi'.

df2 <- stack(replaceList)
df$newi <- df2$ind[match(df$coli, df2$values)]
df
#   coli         newi
#1     4 threeAndFour
#2     3 threeAndFour
#3     6   fiveAndSix
#4     1    oneAndTwo
#5     2    oneAndTwo
#6     1    oneAndTwo
#7     5   fiveAndSix
#8     2    oneAndTwo
#9     4 threeAndFour
#10    6   fiveAndSix
#11    3 threeAndFour
#12    5   fiveAndSix
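
One detail worth knowing about the stack() approach: the 'ind' column is a factor, and values that are missing from the template come back as NA. The following is a minimal sketch of that behaviour; the input vector x (including the out-of-template value 7) is a made-up example and not part of the question's data.

```r
# Same lookup template as in the question
replaceList <- list(oneAndTwo = 1:2, threeAndFour = 3:4, fiveAndSix = 5:6)

# Hypothetical input; 7 does not appear anywhere in the template
x <- c(1, 6, 3, 7)

# stack() gives a data frame with columns 'values' and 'ind' (a factor)
df2 <- stack(replaceList)
res <- df2$ind[match(x, df2$values)]

class(res)          # "factor"
as.character(res)   # convert if a plain character column is wanted; 7 maps to NA
```

If a character column is preferred, wrapping the result in as.character() before assigning it to 'newi' should be enough.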
akrun answered Oct 18 '22 04:10


Make a named vector instead of your replaceList list, then match by name:

set.seed(1337)
df <- data.frame(coli = sample(rep(1:6,2)), newi = 0 )

# make a named vector
myLookup <- setNames(c("oneAndTwo","oneAndTwo","threeAndFour","threeAndFour","fiveAndSix","fiveAndSix"),
                   1:6)

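# then match by name; as.character() is needed because indexing with
# numbers selects by position (which only coincides with the names here)
df$newi <- myLookup[ as.character(df$coli) ]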

# check
head(df)
#   coli         newi
# 1    1    oneAndTwo
# 2    6   fiveAndSix
# 3    1    oneAndTwo
# 4    5   fiveAndSix
# 5    3 threeAndFour
# 6    2    oneAndTwo
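
If typing the named vector by hand is too error-prone, it can probably be derived from the original replaceList. This is a sketch, not part of the answer above: it assumes each value appears in exactly one group, and the short coli vector simply repeats the first six values of df$coli from the question.

```r
# Same lookup template as in the question
replaceList <- list(oneAndTwo = 1:2, threeAndFour = 3:4, fiveAndSix = 5:6)

# Repeat each group name once per value, then name the entries by those values
myLookup <- setNames(rep(names(replaceList), lengths(replaceList)),
                     unlist(replaceList))

# First six values of df$coli from the question
coli <- c(1, 6, 1, 5, 3, 2)

# Look up by name (hence as.character) and drop the names from the result
newi <- unname(myLookup[as.character(coli)])
newi
```

Looking up by name rather than by position keeps this working even when the values are not the consecutive integers 1 to n.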

Another (preferred) option is to use cut, which returns a factor column:

# using cut, no need for lookup
df$newiFactor <- cut(df$coli, c(0, 2, 4, 6))

# check
head(df[order(df$coli), ])
#    coli         newi newiFactor
# 1     1    oneAndTwo      (0,2]
# 3     1    oneAndTwo      (0,2]
# 6     2    oneAndTwo      (0,2]
# 8     2    oneAndTwo      (0,2]
# 5     3 threeAndFour      (2,4]
# 11    3 threeAndFour      (2,4]

Note: we could use the labels argument of cut to get your desired names ("oneAndTwo", etc.). In this case, though, I prefer the numeric-looking labels: "(0,2]", etc.
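
For completeness, here is a sketch of the labels variant mentioned in the note. It assumes the groups are contiguous ranges listed in increasing order, so that names(replaceList) lines up with the intervals; the short coli vector repeats the first six values from the question.

```r
# Same lookup template as in the question
replaceList <- list(oneAndTwo = 1:2, threeAndFour = 3:4, fiveAndSix = 5:6)

# First six values of df$coli from the question
coli <- c(1, 6, 1, 5, 3, 2)

# Intervals (0,2], (2,4], (4,6] are labelled with the template names, in order
newiFactor <- cut(coli, breaks = c(0, 2, 4, 6), labels = names(replaceList))
as.character(newiFactor)
```

This only works when the groups can be expressed as ordered intervals; for arbitrary groupings a lookup table is the safer choice.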

zx8754 answered Oct 18 '22 03:10