
Update id to its parent_id unless it has different values

Tags:

r

First of all, apologies for the unclear question title; it is quite a specific question that I didn't know how to word in one line!

Anyway, my problem is as follows. I have a data frame with an id, a parent id, and two values, say a and b. I want to update the id of a row to its parent_id, unless its values a and b are not equal to its parent's.

So for example say I have the table:

id parent_id a b  
1    0       x x
2    1       x x
3    1       x y
4    0       y y
5    4       x x 
6    1       x x
7    4       y y

Which could be generated with the code

 x <- data.frame('id' = c(1,2,3,4,5,6,7),
                 'parent_id' = c(0,1,1,0,4,1,4),
                 'a' = c('x','x','x','y','x','x','y'),
                 'b' = c('x','x','y','y','x','x','y'))

This should become:

id parent_id a b
1    0       x x
1    1       x x
3    1       x y
4    0       y y
5    4       x x
1    1       x x
4    4       y y

So id 2 has become 1, as that was its parent_id and its properties a and b are both equal to x, the same as id 1. However, id 3 has remained the same: although its parent_id is 1, it does not have the same properties.

Any help would be appreciated.

user1165199 asked Dec 27 '25 21:12



2 Answers

Someone else might have a more elegant solution, but this gets what you want:

# list of a-b pairs for parent_id
parent.id.ab <- with(x, lapply(parent_id, FUN=function(y) c(a[id==y], b[id==y])))

# list of a-b pairs for id
id.ab <- with(x, mapply(function(y,z) c(y,z), a, b, SIMPLIFY=FALSE))

# condition is a vector of TRUE/FALSE, TRUE if the parent_id a-b pair equals the id a-b pair.
# When the parent_id is 0, no row matches, so its a-b pair is an empty vector.
# Since all(logical(0)) is TRUE, we only use all(z == y) when z and y have the same length.
condition <- mapply(function(z,y) if (length(z) == length(y)) all(z == y) else FALSE, 
                    parent.id.ab, id.ab)

x$id <- ifelse(condition, x$parent_id, x$id)
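For comparison, the same rule can probably also be written without building lists, by looking up each row's parent with match(). This is only a sketch, assuming ids are unique and a and b are character columns (hence stringsAsFactors = FALSE); the names parent_row and same_as_parent are made up for illustration.

```r
# Example data from the question
x <- data.frame(id        = c(1, 2, 3, 4, 5, 6, 7),
                parent_id = c(0, 1, 1, 0, 4, 1, 4),
                a         = c("x", "x", "x", "y", "x", "x", "y"),
                b         = c("x", "x", "y", "y", "x", "x", "y"),
                stringsAsFactors = FALSE)

# Row index of each row's parent; NA where parent_id (e.g. 0) is not an id
parent_row <- match(x$parent_id, x$id)

# TRUE only when a parent row exists and both a and b agree with it
same_as_parent <- !is.na(parent_row) &
  x$a == x$a[parent_row] &
  x$b == x$b[parent_row]

x$id <- ifelse(same_as_parent, x$parent_id, x$id)
```

Like the code above, this compares against the original ids in a single pass, so it does not follow chains of parents.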
Matthew Plourde answered Dec 30 '25 12:12



Thanks to both of you. thelatemail's answer works for the example I gave, but in reality the ids in my data frame are not 1, 2, 3, 4, etc.; they are more random and do not contain all numbers, so the code falls down when it gets to ds[matches, ]. I think mplourde's answer would have done the trick too, but I ended up using the following code:

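  # Key each row by its own id and values, e.g. "1-x-x"
  betValues <- with(x, paste(id, a, b, sep="-"))

  # Take the parent_id only when a row with that id and the same a, b exists
  x[, 'id'] <- with(x, ifelse(parent_id == 0, id,
          ifelse(!paste(parent_id, a, b, sep="-") %in% betValues, id, parent_id)))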

which I think works and is pretty quick and neat.
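As a quick check that this does not depend on the ids being 1, 2, 3, ..., here is a small sketch on made-up data with sparse, unordered ids. The wrapper name relabel_to_parent and the data frame y are hypothetical and not part of my real data.

```r
# Hypothetical helper wrapping the paste/%in% approach above
relabel_to_parent <- function(df) {
  keys <- with(df, paste(id, a, b, sep = "-"))
  with(df, ifelse(parent_id == 0, id,
           ifelse(paste(parent_id, a, b, sep = "-") %in% keys, parent_id, id)))
}

# Made-up data with sparse, unordered ids
y <- data.frame(id        = c(105, 12, 77, 300),
                parent_id = c(0, 105, 105, 12),
                a         = c("x", "x", "x", "x"),
                b         = c("x", "x", "y", "x"),
                stringsAsFactors = FALSE)

y$id <- relabel_to_parent(y)
```

One caveat with the pasted keys: if a or b can themselves contain "-", two different rows could produce the same key, so a separator that never appears in the data would be safer.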

Thanks

user1165199 answered Dec 30 '25 10:12



