I have an R data frame which looks like:
User |request_id |previous_request_id
-------------------------------------
A    |9          |5
A    |3          |1
A    |5          |NA
A    |1          |9
B    |2          |8
B    |8          |7
B    |7          |NA
B    |4          |2
Each row corresponds to a request a particular user made. Each row has a user ID, a request ID and the ID of their previous request. Where there is no previous request the previous_request_id field is NA.
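For each user I want to order the requests by following the previous request ID, with:
- the request whose previous_request_id is NA getting order 1;
- every other request getting the order of its previous request plus 1.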
The result of the above rules applied to the first table should look like:
User |request_id |previous_request_id |Order
---------------------------------------------
A    |9          |5                   |2
A    |3          |1                   |4
A    |5          |NA                  |1
A    |1          |9                   |3
B    |2          |8                   |3
B    |8          |7                   |2
B    |7          |NA                  |1
B    |4          |2                   |4
Is there a way to do this within R? I suspect a graph database package may be the way to go, but so far my research (centred on Neo4j's Cypher language) hasn't turned up anything.
Any help here would be greatly appreciated!
There are many ways to do this, but here's what I came up with...
df <- read.delim(text="User|request_id|previous_request_id
A|9|5
A|3|1
A|5|NA
A|1|9
B|2|8
B|8|7
B|7|NA
B|4|2", sep="|")
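# Start with every order unknown
df$order <- rep(NA, nrow(df))
# A request with no previous request is the first in its chain
df$order[is.na(df$previous_request_id)] <- 1
# match() finds the row of each previous request; label a row n + 1
# when its previous request already has order n
df$order[df$order[match(df$previous_request_id, df$request_id)] == 1] <- 2
df$order[df$order[match(df$previous_request_id, df$request_id)] == 2] <- 3
df$order[df$order[match(df$previous_request_id, df$request_id)] == 3] <- 4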
But notice that we are repeating the same code (almost) over and over. We can create a loop to shorten the code up a bit...
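# The longest chain any user has bounds the number of passes needed
max_user_len <- max(table(df$User))
df$order <- rep(NA, nrow(df))
df$order[is.na(df$previous_request_id)] <- 1
# Each pass labels the requests whose previous request has order x
for (x in seq_len(max_user_len - 1)) {
  df$order[df$order[match(df$previous_request_id, df$request_id)] == x] <- x + 1
}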
> df$order
[1] 2 4 1 3 3 2 1 4
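Note that the code above matches request IDs across the whole data frame, which is fine as long as request IDs are unique across users. If that isn't guaranteed, here is a rough sketch of the same idea done one user at a time. The helper name assign_order() and the variable df_sorted are made up for illustration, and I'm assuming every chain has exactly one NA start; rows on a broken chain would simply stay NA.

```r
# Same example data as in the question
df <- data.frame(
  User = c("A", "A", "A", "A", "B", "B", "B", "B"),
  request_id = c(9, 3, 5, 1, 2, 8, 7, 4),
  previous_request_id = c(5, 1, NA, 9, 8, 7, NA, 2)
)

# Hypothetical helper (not part of the answer above): order the requests
# of a single user by walking the chain of previous requests.
assign_order <- function(request_id, previous_request_id) {
  ord <- rep(NA_integer_, length(request_id))
  ord[is.na(previous_request_id)] <- 1L             # start of the chain
  parent <- match(previous_request_id, request_id)  # row of each previous request
  repeat {
    # rows still unlabelled whose previous request is already labelled
    todo <- is.na(ord) & !is.na(ord[parent])
    if (!any(todo)) break                           # nothing left to propagate
    ord[todo] <- ord[parent[todo]] + 1L
  }
  ord
}

# Apply the helper separately to each user's rows
df$order <- NA_integer_
for (u in unique(df$User)) {
  rows <- df$User == u
  df$order[rows] <- assign_order(df$request_id[rows],
                                 df$previous_request_id[rows])
}

df$order
# [1] 2 4 1 3 3 2 1 4

# Optionally sort each user's requests into chain order
df_sorted <- df[order(df$User, df$order), ]
```

The repeat loop stops as soon as a pass labels nothing new, so there is no need to work out the longest chain length up front.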