Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sliding window in R

I have a dataframe DF, with two columns A and B shown below:

A                    B                  
1                    0             
3                    0               
4                    0                   
2                    1                    
6                    0                    
4                    1                     
7                    1                 
8                    1                     
1                    0   

A sliding window approach is performed as shown below. The mean is calulated for column B in a sliding window of size 3 sliding by 1 using: rollapply(DF$B, width=3,by=1). The mean values for each window are shown on the left side.

    A:         1    3    4    2    6    4    7    8    1                                          
    B:         0    0    0    1    0    1    1    1    0                                
              [0    0    0]                                              0
                    [0    0    1]                                        0.33
                          [0    1    0]                                  0.33
                                [1    0    1]                            0.66
                                      [0    1    1]                      0.66
                                            [1    1    1]                1
                                                 [1    1    0]           0.66
output:        0   0.33 0.33 0.66   0.66    1     1    1   0.66

Now, for each row/coordinate in column A, all windows containing the coordinate are considered and should retain the highest mean value which gives the results as shown in column 'output'.

I need to obtain the output as shown above. The output should like:

A                   B                  Output   
1                   0                      0
3                   0                      0.33
4                   0                      0.33
2                   1                      0.66
6                   0                      0.66
4                   1                      1
7                   1                      1
8                   1                      1
1                   0                    0.66

Any help in R?

like image 776
chas Avatar asked Apr 11 '13 07:04

chas


People also ask

What is a moving window in R?

Moving window analysis, sometimes referred to as focal analysis, is the process of calculating a value for a specific neighborhood of cells in a given raster. Typical functions calculated across the neighborhood are sum, mean, min, max, range, etc.

What is sliding window function?

PDF. The WINDOW clause for a sliding windowed query specifies the rows over which analytic functions are computed across a group of rows in relation to the current row. These aggregate functions produce an output row aggregated by the keys in one or more columns for each input row.

What is rolling window?

A Rolling window is expressed relative to the delivery date and automatically shifts forward with the passage of time. For example, a customer with a 5-year Rolling window who gets a delivery on May 4, 2015 would receive data covering the period from May 4, 2015 to May 4, 2020.

What is window size in statistics?

The time interval between two snapshots is referred to as the window size. A given longitudinal network can be analysed from various actor-level perspectives, such as exploring how actors change their degree centrality values or participation statistics over time.


1 Answers

Try this:

# form input data
library(zoo)
B <- c(0, 0, 0, 1, 0, 1, 1, 1, 0)

# calculate
k <- 3
rollapply(B, 2*k-1, function(x) max(rollmean(x, k)), partial = TRUE)

The last line returns:

[1] 0.0000000 0.3333333 0.3333333 0.6666667 0.6666667 1.0000000 1.0000000
[8] 1.0000000 0.6666667

If there are NA values you might want to try this:

k <- 3
B <- c(1, 0, 1, 0, NA, 1)
rollapply(B, 2*k-1, function(x) max(rollapply(x, k, mean, na.rm = TRUE)), partial = TRUE)

where the last line gives this:

[1] 0.6666667 0.6666667 0.6666667 0.5000000 0.5000000 0.5000000

Expanding it out these are formed as:

c(mean(B[1:3], na.rm = TRUE), ##
max(mean(B[1:3], na.rm = TRUE), mean(B[2:4], na.rm = TRUE)), ##
max(mean(B[1:3], na.rm = TRUE), mean(B[2:4], na.rm = TRUE), mean(B[3:5], na.rm = TRUE)),
max(mean(B[2:4], na.rm = TRUE), mean(B[3:5], na.rm = TRUE), mean(B[4:6], na.rm = TRUE)),
max(mean(B[3:5], na.rm = TRUE), mean(B[4:6], na.rm = TRUE)), ##
mean(B[4:6], na.rm = TRUE)) ##

If you don't want the k-1 components at each end (marked with ## above) drop partial = TRUE.

like image 179
G. Grothendieck Avatar answered Sep 20 '22 23:09

G. Grothendieck