Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A faster function to lower the resolution of a raster R

I am using the raster package to lower the resolution of big rasters, using the function aggregate like this

require(raster)    
x <- matrix(rpois(1000000, 2),1000)

a <-raster(x)
plot(a)

agg.fun <- function(x,...) 
    if(sum(x)==0){
        return(NA)
    } else {
        which.max(table(x))
    }

a1<-aggregate(a,fact=10,fun=agg.fun)
plot(a1)

the raster images I have to aggregate are much bigger 34000x34000 so I would like to know if there is a faster way to implement the agg.fun function.

like image 543
Leosar Avatar asked May 14 '16 16:05

Leosar


1 Answers

You can use gdalUtils::gdalwarp for this. For me, it's less efficient than @JosephWood's fasterAgg.Fun for rasters with 1,000,000 cells, but for Joseph's larger example it's much faster. It requires that the raster exists on disk, so factor writing time into the below if your raster is in memory.

Below, I've used the modification of fasterAgg.Fun that returns the most frequent value, rather than its index in the block.

library(raster)
x <- matrix(rpois(10^8, 2), 10000)
a <- raster(x)

fasterAgg.Fun <- function(x,...) {
    myRle.Alt <- function (x1) {
        n1 <- length(x1)
        y1 <- x1[-1L] != x1[-n1]
        i <- c(which(y1), n1)
        x1[i][which.max(diff(c(0L, i)))]
    }

    if (sum(x)==0) {
        return(NA)
    } else {
        myRle.Alt(sort(x, method="quick"))
    }
}
system.time(a2 <- aggregate(a, fact=10, fun=fasterAgg.Fun))

##   user  system elapsed 
##  67.42    8.82   76.38

library(gdalUtils)
writeRaster(a, f <- tempfile(fileext='.tif'), datatype='INT1U')
system.time(a3 <- gdalwarp(f, f2 <- tempfile(fileext='.tif'), r='mode', 
                           multi=TRUE, tr=res(a)*10, output_Raster=TRUE))

##   user  system elapsed 
##   0.00    0.00    2.93

Note that there is a slight difference in the definition of the mode when there are ties: gdalwarp selects the highest value, while the functions passed to aggregate above (via which.max's behaviour) select the lowest (e.g., see which.max(table(c(1, 1, 2, 2, 3, 4)))).

Also, storing the raster data as integer is important (when applicable). If the data are stored as float (the writeRaster default), for example, the gdalwarp operation above takes ~14 sec on my system. See ?dataType for available types.

like image 169
jbaums Avatar answered Sep 28 '22 08:09

jbaums