Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Resample raster

I am trying to resample a forest cover raster with high resolution (25 meters) and categorical data (1 to 13) to a new RasterLayer with a lower resolution (~ 1 km). My idea is to combine the forest cover data with other lower-resolution raster data :

  1. I tried raster::resample(), but since the data is categorical I lost a lot of information:

    summary(as.factor(df$loss_year_mosaic_30m))
      0       1   2   3  4   5   6   7  8   9   10  11   12  13
    3777691  65  101 50 151 145 159 295 291 134 102 126 104  91 
    

    As you can see, the new raster has the desired resolution but have lots of zeros as well. I suppose that is normal since I used the ´ngb´ option in resample.

  2. The second strategy was using raster::aggregate() but I find difficult to define a factor integer since the change of resolution is not straightforward (like the double of the resolution or alike).

    My high-resolution raster has the following resolution, and I want it to aggregate it to a 0.008333333, 0.008333333 (x, y) resolution to the same extent.

    loss_year
    class       : RasterLayer 
    dimensions  : 70503, 59566, 4199581698  (nrow, ncol, ncell)
    resolution  : 0.00025, 0.00025  (x, y)
    extent      : -81.73875, -66.84725, -4.2285, 13.39725  (xmin, xmax, ymin, ymax)
    coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
    data source : /Volumes/LaCie/Deforestacion/Hansen/loss_year_mosaic_30m.tif 
    names       : loss_year_mosaic_30m 
    values      : 0, 13  (min, max)
    

    I have tried a factor of ~33.33 following the description of the aggregate help: "The number of cells is the number of cells of x divided by fact*fact (when fact is a single number)." Nonetheless, the resulting raster data do not seem to have the same number of rows and columns as my other low-resolution rasters.

I have never used this high-resolution data, and I am also computationally limited (some of this commands can be parallelized using clusterR, but sometimes they took the same time than the non-parallelized commands, especially since they do not work for nearest neighboor calculations).

I am short of ideas; maybe I can try layerize to obtain a count raster, but I have to ´aggregate´ and the factor problem arises. Since this processes are taking me days to process, I do want to know the most efficient way to create a lower resolution raster without losing much information

A reproducible example could be the following:

r_hr <- raster(nrow=70, ncol=70) #High resolution raster with categorical data
set.seed(0)
r_hr[] <- round(runif(1:ncell(r_hr), 1, 5))
r_lr <- raster(nrow=6, ncol=6) #Low resolution raster

First strategy: loss of information

r <- resample(r_hr, r_lr, method = "ngb") #The raster data is categorical

Second strategy: difficult to define an aggregate factor

r <- aggregate(r_hr, factor) #How to define a factor to get exactly the same number of cells of h_lr?

Another option: layerize

r_brick <- layerize(r_hr)
aggregate(r_brick, factor) #How to define factor to coincide with the r_lr dimensions? 

Thanks for your help!

like image 300
topcat Avatar asked Jun 21 '16 23:06

topcat


People also ask

What does resampling a raster do?

Changes the spatial resolution of a raster dataset and sets rules for aggregating or interpolating values across the new pixel sizes.

What does it mean to resample a raster graphic?

Raster resampling refers to change of spatial resolution (increasing or decreasing) of the raster dataset. The resampling process calculates the new pixel values from the original digital pixel values in the uncorrected image.

Can you increase the resolution of raster?

We can resample the raster as well to adjust the resolution. If we want a higher resolution raster, we will apply a grid with more pixels within the same extent. If we want a lower resolution raster, we will apply a grid with fewer pixels within the same extent.


2 Answers

r_hr <- raster(nrow=70, ncol=70) #High resolution raster with categorical data
set.seed(0)
r_hr[] <- round(runif(1:ncell(r_hr), 1, 5))
r_lr <- raster(nrow=6, ncol=6)

r_hr
#class       : RasterLayer 
#dimensions  : 70, 70, 4900  (nrow, ncol, ncell)
#resolution  : 5.142857, 2.571429  (x, y)
#extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
#data source : in memory
#names       : layer 
#values      : 1, 5  (min, max)

r_lr
#class       : RasterLayer 
#dimensions  : 6, 6, 36  (nrow, ncol, ncell)
#resolution  : 60, 30  (x, y)
#extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 

Direct aggregate is not possible, because 70/6 is not an integer.

dim(r_hr)[1:2] / dim(r_lr)[1:2]
#[1] 11.66667 11.66667

Nearest neighbor resampling is not a good idea either as the results would be arbitrary.

Here is a by layer approach that you suggested and dww also showed already.

b <- layerize(r_hr)
fact <- round(dim(r_hr)[1:2] / dim(r_lr)[1:2])
a <- aggregate(b, fact)
x <- resample(a, r_lr)

Now you have proportions. If you want a single class you could do

y <- which.max(x)

In that case, another approach would be to aggregate the classes

ag <- aggregate(r_hr, fact, modal) 
agx <- resample(ag, r_lr, method='ngb')

Note that agx and y are the same. But they can both be problematic as you might have cells with 5 classes with each about 20%, making it rather unreasonable to pick one winner.

like image 89
Robert Hijmans Avatar answered Sep 19 '22 10:09

Robert Hijmans


It is pretty standard practice to aggregate land cover maps into layers of %cover. I.e you should aim to produce 13 layers, each being something like %cover in that grid cell. Doing this allows you to reduce the resolution while retaining as much information as possible. N.B if you require a different summary statistic than %, should be easy to adapt the following method to whatever statistic you want, by changing the fun = function in aggregate.

The following method is pretty fast (it takes just a few seconds on my laptop to process raster with 100 million cells):

First, let's create some dummy rasters to use

Nhr <- 1e4 # resolution of high-res raster
Nlr <- 333 # resolution of low-res raster
r.hr <- raster(ncols=Nhr, nrows=Nhr)
r.lr <- raster(ncols=Nlr, nrows=Nlr)
r.hr[] <- sample(1:13, Nhr^2, replace=T)

Now, we begin by aggregating the high res raster to almost the same resolution as the low res one (to nearest integer number of cells). Each resulting layer contains the fraction of area within that cell in which value of original raster is N.

Nratio <- as.integer(Nhr/Nlr) # ratio of high to low resolutions, to nearest integer value for aggregation
layer1 <- aggregate(r.hr, Nratio, fun=function(x, na.rm=T) {mean(x==1, na.rm=na.rm)})
layer2 <- aggregate(r.hr, Nratio, fun=function(x, na.rm=T) {mean(x==2, na.rm=na.rm)})

And finally, resample low res raster to the desired resolution

layer1 <- resample(layer1, r.lr, method = "ngb") 
layer2 <- resample(layer2, r.lr, method = "ngb")

repeat for each layer, and build your layers into a stack or a multi-band raster

like image 38
dww Avatar answered Sep 21 '22 10:09

dww