Loop over strings in r

Question

I'd like to know what is wrong with my code rather than get a solution. I wish to loop over some strings. My data is as follows:

id    source    transaction
1     a > b     6 > 0
2     J > k     5
3     b > c     4 > 0

I have a list and wish to go over it, find the rows that contain each element, and compute the average of their transactions.

mylist <- c("a", "b")

So my desired output for the elements in the list is

source  avg
a        6 
b        2

I do not know how to loop over the list and send the results to a CSV file. I tried this:

mylist <- c( "a", "b" )

for (i in mylist) {
  KeepData <- df[grepl(i, df$source), ]
  KeepData <- cSplit(KeepData, "transaction", ">", "long")

  avg <- mean(KeepData$transactions)
  result <- list(i, avg)

  write.table(result, file = "C:/Users.csv", append = TRUE, sep = ",",
              col.names = FALSE, row.names = FALSE)
}

But it gives me an "NA" result with the following warnings:

Warning messages:
1: In mean.default(KeepData$transactions) :
  argument is not numeric or logical: returning NA
2: In mean.default(KeepData$transactions) :
  argument is not numeric or logical: returning NA

akrun · Accepted Answer

We can use cSplit to split 'source' and reshape the dataset to 'long' format, then subset the rows whose 'source' is in 'mylist' (the 'i' argument) and, grouped by 'source', get the mean of 'transaction' (using data.table methods).

library(splitstackshape)
cSplit(df1, "source", " > ", "long")[source %in% mylist, .(avg = mean(transaction)), source]
#   source avg
#1:      a   6
#2:      b   5
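To make the 'long' format concrete, here is a minimal base R sketch of the same idea. It is not the cSplit implementation, only an illustration that assumes every 'source' value is separated by " > ". The names 'parts', 'long', 'keep' and 'avg' are made up for this example.

```r
# Same data as 'df1' in the data section
df1 <- data.frame(id = 1:3,
                  source = c("a > b", "J > k", "b > c"),
                  transaction = c(6L, 5L, 4L),
                  stringsAsFactors = FALSE)
mylist <- c("a", "b")

# Split each 'source' string into its pieces
parts <- strsplit(df1$source, " > ", fixed = TRUE)

# One row per piece; each piece keeps the transaction of its original row
long <- data.frame(source = unlist(parts),
                   transaction = rep(df1$transaction, times = lengths(parts)),
                   stringsAsFactors = FALSE)

# Keep only the requested elements and average per element
keep <- long[long$source %in% mylist, ]
avg <- sapply(split(keep$transaction, keep$source), mean)
avg
# a b
# 6 5
```

Because the filter uses %in% on the split pieces, it matches whole elements rather than substrings.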

Another option is separate_rows from tidyr to convert to 'long' format, followed by the dplyr methods to filter, group by 'source', and summarise.

library(tidyr)
library(dplyr)
separate_rows(df1, source) %>%
  filter(source %in% mylist) %>%
  group_by(source) %>%
  summarise(avg = mean(transaction))

Update

For the new dataset ('df2'), we need to split both columns to 'long' format together, and then get the mean of 'transaction' grouped by 'source'

cSplit(df2, 2:3, " > ", "long")[source %in% mylist, .(avg = mean(transaction)), source]
#   source avg
#1:      a   6
#2:      b   2
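The paired split can also be sketched in base R, which shows why both columns have to be split together: the first piece of 'source' must stay aligned with the first piece of 'transaction'. This is only an illustration under stated assumptions, not how cSplit works internally. The helper 'pad' is hypothetical, and the result is written once to a temporary file (tempfile) rather than to the path used in the question.

```r
# Same data as 'df2' in the data section
df2 <- data.frame(id = 1:3,
                  source = c("a > b", "J > k", "b > c"),
                  transaction = c("6 > 0", "5", "4 > 0"),
                  stringsAsFactors = FALSE)
mylist <- c("a", "b")

src <- strsplit(df2$source, " > ", fixed = TRUE)
trn <- strsplit(df2$transaction, " > ", fixed = TRUE)

# Hypothetical helper: pad each set of pieces with NA up to the wanted length,
# so a row such as "J > k" / "5" still yields two aligned rows
pad <- function(pieces, len) {
  Map(function(p, k) c(p, rep(NA, k - length(p))), pieces, len)
}
n <- pmax(lengths(src), lengths(trn))

long <- data.frame(source = unlist(pad(src, n)),
                   transaction = as.numeric(unlist(pad(trn, n))),
                   stringsAsFactors = FALSE)

keep <- long[long$source %in% mylist, ]
avg <- sapply(split(keep$transaction, keep$source), mean)

# Collect all averages in one data frame and write the file once
result <- data.frame(source = names(avg), avg = unname(avg))
out <- tempfile(fileext = ".csv")
write.csv(result, out, row.names = FALSE)
result
#   source avg
# 1      a   6
# 2      b   2
```

Writing one data frame at the end avoids appending to the file on every iteration, so rerunning the script does not accumulate duplicate rows.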

The for loop can be modified to

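for (i in mylist) {
  # split 'source' and 'transaction' together so each piece keeps its pair
  KeepData <- cSplit(df2, 2:3, ">", "long")
  # keep the rows whose 'source' matches the current element
  KeepData <- KeepData[grepl(i, source)]
  # the column is 'transaction' (not 'transactions')
  avg <- mean(KeepData$transaction)
  result <- list(i, avg)
  print(result)
  write.table(result, file = "C:/Users.csv",
              append = TRUE, sep = ",", col.names = FALSE, row.names = FALSE)
}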
#[[1]]
#[1] "a"

#[[2]]
#[1] 6

#[[1]]
#[1] "b"

#[[2]]
#[1] 2
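As for what was wrong with the original loop, two things can be reproduced in base R without any packages. First, the loop reads KeepData$transactions, but the column is named 'transaction'; `$` only partial-matches when the requested name is a prefix of a column name, so the lookup returns NULL and mean(NULL) gives NA with exactly the warning shown in the question. Second, splitting only 'transaction' pools the values from both sides of each pair, so even with the name corrected the average would not be the desired one. The sketch below uses a plain data frame; the names 'rows_b', 'avg_typo', 'pooled' and 'avg_pooled' are made up for this example.

```r
# Same data as 'df2' in the data section
df2 <- data.frame(id = 1:3,
                  source = c("a > b", "J > k", "b > c"),
                  transaction = c("6 > 0", "5", "4 > 0"),
                  stringsAsFactors = FALSE)

# Rows whose 'source' contains "b" (rows 1 and 3)
rows_b <- df2[grepl("b", df2$source), ]

# Issue 1: misspelled column name, so `$` finds nothing
is.null(rows_b$transactions)
# [1] TRUE
avg_typo <- suppressWarnings(mean(rows_b$transactions))
avg_typo
# [1] NA

# Issue 2: splitting only 'transaction' mixes the values that belong to
# "a", "b" and "c": 6, 0, 4, 0
pooled <- as.numeric(unlist(strsplit(rows_b$transaction, " > ", fixed = TRUE)))
avg_pooled <- mean(pooled)
avg_pooled
# [1] 2.5
```

The desired value for "b" is 2 (the mean of 0 and 4), which is only reachable when 'source' and 'transaction' are split together.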

data

df1 <- structure(list(id = 1:3, source = c("a > b", "J > k", "b > c"
 ), transaction = c(6L, 5L, 4L)), .Names = c("id", "source", "transaction"
), class = "data.frame", row.names = c(NA, -3L))


df2 <- structure(list(id = 1:3, source = c("a > b", "J > k", "b > c"
), transaction = c("6 > 0", "5", "4 > 0")), .Names = c("id", 
"source", "transaction"), class = "data.frame", row.names = c(NA, 
-3L))

Tags: loops, r

MFR
