Average of n rows

Tags:

r

I have a data frame with three columns (Id, Date and Value) and want to downsample it by averaging: take the next 20 rows, compute the mean of Value over those 20 rows, and add it to a new data frame with the same structure. Date should be the first value of each block of 20 rows.

I tried it this way (probably horrible :):

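resample.downsample <- function(data, by=20)
{
  i <- 0
  nmax <- nrow(data)
  means <- c()
  # Walk through the data in blocks of `by` rows, selected by Id
  while(i < nmax)
  {
    # Mean of Value for rows with Id in (i, i+by]; `means` is regrown on every pass
    means <- c(means, mean(subset(data, Id > i & Id <= i+by)$Value))
    i <- i+by
  }
  # Note: startDate is not defined in this function; it is looked up outside it
  return (
    data.frame(
      Id = seq(1, length.out=(nmax/by), by=1),
      Date = seq(startDate, length.out=(nmax/by), by=(1/by)),
      Value = means
    )
  )
}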

This works for small datasets, but runs forever on my real datasets (~4,000,000 rows). Any ideas how to optimize this function?

Sample data (input; the output should have the same structure; classes: integer, numeric, POSIXct/POSIXt):

  Value Id                Date
1   125  1 2011-06-30 22:41:50
2   127  2 2011-06-30 22:41:50
3   126  3 2011-06-30 22:41:50
4   123  4 2011-06-30 22:41:50
5   130  5 2011-06-30 22:41:50
6   131  6 2011-06-30 22:41:50
7   128  7 2011-06-30 22:41:50
Fge asked Aug 01 '11 20:08


1 Answer

See the answer to "How to get the sum of each four rows of a matrix in R" for a method that should work for you. In your case it would be:

colMeans(matrix(data$Value, nrow=20))

Your current method to get the first Date should be fine.
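To put the pieces together, here is a minimal sketch of how the matrix approach could be wrapped into a function. The name downsample_mean and the toy data are made up for illustration and are not from the question. The sketch assumes the rows are already in the desired order, takes Date from the first row of each block, and drops trailing rows that do not fill a complete block (matrix() needs the length to be a multiple of the block size).

```r
# Hypothetical helper (not from the original posts): block means via a matrix.
downsample_mean <- function(data, by = 20) {
  n_blocks <- nrow(data) %/% by   # number of complete blocks
  keep <- n_blocks * by           # rows covered by complete blocks
  # Assumption: leftover rows that do not fill a whole block are dropped.
  first_rows <- seq(1, by = by, length.out = n_blocks)
  # matrix() fills column by column, so each column holds one block of `by` values.
  means <- colMeans(matrix(data$Value[seq_len(keep)], nrow = by))
  data.frame(
    Id = seq_len(n_blocks),
    Date = data$Date[first_rows],  # Date of the first row in each block
    Value = means
  )
}

# Toy data: 45 rows, i.e. two complete blocks of 20 plus 5 leftover rows.
toy <- data.frame(
  Id = 1:45,
  Date = as.POSIXct("2011-06-30 22:41:50", tz = "UTC") + 0:44,
  Value = 1:45
)
result <- downsample_mean(toy, by = 20)
```

Because the averaging is a single vectorized call, this avoids both the repeated subset() scans and the growing means vector in the loop version.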

Aaron left Stack Overflow answered Sep 22 '22 12:09