I'd like to know what is wrong with my code rather than a solution. I wish to loop over some strings. My data is as follows:
id source transaction
1 a > b 6 > 0
2 J > k 5
3 b > c 4 > 0
I have a list and wish to go over it, find the rows that contain each element, and compute the average.
mylist <- c ("a", "b")
So my desired output for the elements in the list is
source avg
a 6
b 2
I do not know how to loop over the list and send the results to a CSV file. I tried this:
mylist <- c( "a", "b" )
for(i in mylist)
{
KeepData <- df [grepl(i, df$source), ]
KeepData <- cSplit(KeepData, "transaction", ">", "long")
avg<- mean(KeepData$transactions)
result <- list(i,avg )
write.table(result ,file="C:/Users.csv", append=TRUE,sep=",",col.names=FALSE,row.names=FALSE)
}
But it gives me an "NA" result with the following warnings:
Warning messages:
1: In mean.default(KeepData$transactions) :
  argument is not numeric or logical: returning NA
2: In mean.default(KeepData$transactions) :
  argument is not numeric or logical: returning NA
We can use cSplit to split the 'source' column and convert the dataset to 'long' format, then specify the 'i' condition and, grouped by 'source', get the mean of 'transaction' (using data.table methods)
library(splitstackshape)
cSplit(df1, "source", " > ", "long")[source %in% mylist, .(avg = mean(transaction)), source]
# source avg
#1: a 6
#2: b 5
Another option is separate_rows from tidyr to convert to 'long' format, then use the dplyr methods to summarise after grouping by 'source'
library(tidyr)
library(dplyr)
separate_rows(df1, source) %>%
filter(source %in% mylist) %>%
group_by(source) %>%
summarise(avg = mean(transaction))
For the new dataset ('df2'), we need to split both columns to 'long' format, and then get the mean of 'transaction' grouped by 'source'
cSplit(df2, 2:3, " > ", "long")[source %in% mylist, .(avg = mean(transaction)), source]
# source avg
#1: a 6
#2: b 2
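The same "split both columns to long format, then average by source" idea can be sketched in base R without extra packages. This is only an illustration, not how cSplit works internally: the helper `pad` and the intermediate names `src`, `trn`, `long`, `keep`, and `res` are made up here, and padding a shorter 'transaction' cell with NA is an assumption about how the pieces should line up.

```r
# Same data as 'df2', rebuilt so the example is self-contained
df2 <- data.frame(
  id = 1:3,
  source = c("a > b", "J > k", "b > c"),
  transaction = c("6 > 0", "5", "4 > 0"),
  stringsAsFactors = FALSE
)
mylist <- c("a", "b")

# Split every cell on ">" and strip the surrounding spaces
src <- lapply(strsplit(df2$source, ">", fixed = TRUE), trimws)
trn <- lapply(strsplit(df2$transaction, ">", fixed = TRUE),
              function(x) as.numeric(trimws(x)))

# Hypothetical helper: pad a vector with NA up to length k
pad <- function(x, k) c(x, rep(NA, k - length(x)))
n <- pmax(lengths(src), lengths(trn))

# One row per (source, transaction) pair
long <- data.frame(
  source = unlist(Map(pad, src, n)),
  transaction = unlist(Map(pad, trn, n)),
  stringsAsFactors = FALSE
)

# Keep only the sources of interest and average per source
keep <- long[long$source %in% mylist, ]
res <- aggregate(transaction ~ source, data = keep, FUN = mean)
names(res)[2] <- "avg"
res
```

With 'df2' this gives 6 for "a" and 2 for "b", matching the cSplit result above.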
The for loop can be modified to
for(i in mylist) {
KeepData <- cSplit(df2, 2:3, ">", "long")
KeepData <- KeepData[grepl(i, source)]
avg<- mean(KeepData$transaction)
result <- list(i,avg )
print(result)
write.table(result ,file="C:/Users.csv",
append=TRUE,sep=",",col.names=FALSE,row.names=FALSE)
}
#[[1]]
#[1] "a"
#[[2]]
#[1] 6
#[[1]]
#[1] "b"
#[[2]]
#[1] 2
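As for what is wrong with the original loop: the warning points at `KeepData$transactions`, but the column is named `transaction`. `$` returns NULL for a column that does not exist, and `mean(NULL)` returns NA with exactly that warning. In addition, the original loop splits only 'transaction' and not 'source', so even with the name fixed, "a" would be averaged over both 6 and 0. A minimal base R sketch of the first point (the small data frame and the names `missing_col`, `bad`, and `good` are made up for illustration):

```r
# Toy data frame with a column called 'transaction' (no trailing "s")
df <- data.frame(id = 1:3, transaction = c(6L, 5L, 4L))

# Asking for a column that does not exist yields NULL, not an error
missing_col <- df$transactions
is.null(missing_col)   # TRUE

# mean(NULL) is NA; the warning is suppressed here to keep the output clean
bad <- suppressWarnings(mean(missing_col))
bad                    # NA

# With the correct column name the mean is computed as expected
good <- mean(df$transaction)
good                   # 5
```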
df1 <- structure(list(id = 1:3, source = c("a > b", "J > k", "b > c"
), transaction = c(6L, 5L, 4L)), .Names = c("id", "source", "transaction"
), class = "data.frame", row.names = c(NA, -3L))
df2 <- structure(list(id = 1:3, source = c("a > b", "J > k", "b > c"
), transaction = c("6 > 0", "5", "4 > 0")), .Names = c("id",
"source", "transaction"), class = "data.frame", row.names = c(NA,
-3L))