In this data frame:
Item <- c("A","B","A","A","A","A","A","B")
Trial <- c("Fam","Fam","Test","Test","Test","Test","Test","Test")
Condition <-c("apple","cherry","Trash","Trash","Trash","Trash","Trash","Trash")
ID <- c(rep("01",8))
df <- data.frame(cbind(Item,Trial,Condition,ID))
I would like to replace the "Trash" value of df$Condition at df$Trial == "Test". The new value of df$Condition should be a copy of df$Condition at df$Trial == "Fam", based on a match of Fam and Test trials in df$Item.
So my final data frame should look like this:
Item Trial Condition ID
1 A Fam apple 01
2 B Fam cherry 01
3 A Test apple 01
4 A Test apple 01
5 A Test apple 01
6 A Test apple 01
7 A Test apple 01
8 B Test cherry 01
Ultimately I would like to do this for each unique ID in my original data frame, so I guess I will have to apply the function within ddply or something similar later on.
You could do a binary self-join of df against its own rows where Trial != "Test", matching on Item and ID, and update the Condition column by reference using the data.table package, for instance:
library(data.table) ## V 1.9.6+
setDT(df)[df[Trial != "Test"], Condition := i.Condition, on = c("Item", "ID")]
df
# Item Trial Condition ID
# 1: A Fam apple 01
# 2: B Fam cherry 01
# 3: A Test apple 01
# 4: A Test apple 01
# 5: A Test apple 01
# 6: A Test apple 01
# 7: A Test apple 01
# 8: B Test cherry 01
Or (with some modification of @docendodiscimus's suggestion), simply:
setDT(df)[, Condition := Condition[Trial != "Test"], by = .(Item, ID)]
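For readers without data.table, the same lookup can be sketched in base R with match(). This is only an illustration, not part of the answer above: the data frame demo, its second participant "02", and the conditions "pear" and "plum" are made up, and the sketch assumes every Item/ID pair has exactly one Fam row.

```r
# Hypothetical data with two participants; values are illustrative only
demo <- data.frame(
  Item      = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Trial     = c("Fam", "Fam", "Test", "Test", "Fam", "Fam", "Test", "Test"),
  Condition = c("apple", "cherry", "Trash", "Trash", "pear", "plum", "Trash", "Trash"),
  ID        = rep(c("01", "02"), each = 4),
  stringsAsFactors = FALSE
)

# One key per row, built from the join columns Item and ID
# (assumes "|" does not occur in either column)
key  <- paste(demo$Item, demo$ID, sep = "|")
fam  <- demo$Trial == "Fam"
test <- demo$Trial == "Test"

# For every Test row, find the Fam row with the same key and copy its Condition
idx <- match(key[test], key[fam])
demo$Condition[test] <- demo$Condition[fam][idx]

demo
```

A Test row without a matching Fam row would get NA here, which makes missing familiarization trials easy to spot.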
Here is an option using dplyr:
library(dplyr)
distinct(df) %>%
    filter(Trial == 'Fam') %>%
    left_join(df, ., by = c('Item', 'ID')) %>%
    mutate(Condition = ifelse(Condition.x == 'Trash',
                              as.character(Condition.y),
                              as.character(Condition.x))) %>%
    select(c(1, 2, 4, 7))
Or, as suggested by @docendodiscimus:
df %>%
    group_by(ID, Item) %>%
    mutate(Condition = Condition[Condition != "Trash"])
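The grouped idea also works across several IDs without extra packages. Below is a minimal base R sketch using ave(); the data frame demo and the second participant "02" with condition "pear" are hypothetical, and it assumes each ID/Item group contains at least one non-"Trash" value (the first one found is used).

```r
# Hypothetical data: participant "02" only saw item A
demo <- data.frame(
  Item      = c("A", "A", "B", "A", "A"),
  Trial     = c("Fam", "Test", "Fam", "Fam", "Test"),
  Condition = c("apple", "Trash", "cherry", "pear", "Trash"),
  ID        = c("01", "01", "01", "02", "02"),
  stringsAsFactors = FALSE
)

# Within each ID/Item group, spread the first non-"Trash" value over the group
demo$Condition <- ave(
  demo$Condition, demo$ID, demo$Item,
  FUN = function(x) rep(x[x != "Trash"][1], length(x))
)

demo
```

Because the grouping includes ID, item A resolves to a different condition for each participant.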
You could also just write a for loop over all the rows and set the values that need to be changed. This setup makes it easy to add other items and/or change the type of condition later on.
for (i in seq_len(nrow(df))) {
    # Column 1 is Item, column 3 is Condition
    if (df[i, 1] == "A") {
        df[i, 3] <- "apple"
    } else if (df[i, 1] == "B") {
        df[i, 3] <- "cherry"
    }
}
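The same hardcoded mapping can be written without a loop by indexing a named vector. This is a sketch under assumptions: the names lookup, demo and is_test are illustrative, and Item must be a character column (with a factor, the indexing would use the integer codes rather than the labels).

```r
# Hypothetical mapping from Item to Condition; extend it for further items
lookup <- c(A = "apple", B = "cherry")

demo <- data.frame(
  Item      = c("A", "B", "A", "B"),
  Trial     = c("Fam", "Fam", "Test", "Test"),
  Condition = c("apple", "cherry", "Trash", "Trash"),
  stringsAsFactors = FALSE
)

# Only overwrite the Test rows; indexing by name looks up each Item
is_test <- demo$Trial == "Test"
demo$Condition[is_test] <- unname(lookup[demo$Item[is_test]])

demo
```

Unlike the join-based answers, this ignores ID, so it only fits when the Item-to-Condition mapping is the same for every participant.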