
Select rasters in stack based on layer partial name match

I have a stack of rasters (one per species) and then I have a data frame with lat/long columns along with a species name.

library(raster)
fls <- list.files(pattern = "median")
s <- stack(fls)
# df is a data frame with the columns "x", "y" and "species name"

I want to select one raster at a time to use with the extract function, choosing it by a partial match on the species name column. I want to do this because the raster names might not match the names in the species list exactly: there might be a lower/upper case mismatch, the raster layer name might be longer (for example "species_name_median"), or it might contain "_" instead of a blank.

# Pseudocode: the partial-match step is the part I don't know how to write
for (i in seq_along(df$species)) {
  # result <- extract(s[[ <partial match to df$species[i]> ]], df[i, c("x", "y")])
}

I hope this makes sense: I just want to use one raster at a time for the extraction. I can easily select a single raster using s[[i]], but there is no guarantee that every species in the list has an equivalent raster.

Herman Toothrot asked May 13 '13 14:05




2 Answers

If the points to query are in a data.frame holding the x and y coordinates and the species name that identifies the layer to query, you can use these two commands to do everything:

#  Find the layer to match on using 'grepl' and 'which' converting all names to lowercase for consistency
df$layer <- lapply( df$species , function(x) which( grepl( tolower(x) , tolower(names(s)) ) ) )


# Extract each value from the appropriate layer in the stack
df$Value <- sapply( seq_len(nrow(df)) , function(x) extract( s[[ df$layer[x] ]] , df[ x , 1:2 ] ) )

How it works

Starting from the first line:

  • First we define a new column df$layer (a list column, because lapply returns a list) which will hold the index of the rasterLayer in the stack that we need to use for that row.
  • lapply iterates along all the elements in the column df$species and applies an anonymous function using each item in df$species as an input variable x in turn. lapply is a loop construct even though it doesn't look like one.
  • On the first iteration we take the first element of df$species, which is now x, and use it in grepl (the variant of grep that returns a logical vector; grep comes from "globally search for a regular expression and print") to find which names of our stack s contain our species pattern. We use tolower() on both the pattern (x) and the strings searched (names(s)) so that we match even when the case differs; without it, "Tiger" won't find "tiger".
  • grepl returns a logical vector marking the elements in which it found the pattern, e.g. grepl( "abc" , c("xyz", "wxy" , "acb" , "zxabcty" ) ) returns F , F , F , T, because only "zxabcty" contains "abc". We use which to get the index of those elements.
  • The idea is that we get one, and only one, match of a layer in the stack to the species name for each row, so the only TRUE index will be the index of the layer in the stack we want.
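
The matching step can be tried on its own with plain character vectors, without any rasters. This is only a minimal sketch: match_layer() is a hypothetical helper (not part of raster), the layer names are made up, and it uses fixed = TRUE so the species name is treated as literal text rather than as a regular expression. It returns NA when zero or several layers match, since the question notes that not every species has a raster.

```r
# Hypothetical helper: index of the single layer whose name contains the
# species name (ignoring case), or NA if there are zero or several matches.
match_layer <- function(species, layer_names) {
  hits <- which(grepl(tolower(species), tolower(layer_names), fixed = TRUE))
  if (length(hits) == 1L) hits else NA_integer_
}

# Layer names as they might come from names(s); invented for illustration
layer_names <- c("LIon_medIan", "PANTHeR_MEAN_AVG", "tiger.Mean.JULY_2012")
species <- c("lion", "Tiger", "zebra")

# One index per species; "zebra" has no layer, so it gets NA
layer_idx <- vapply(species, match_layer, integer(1),
                    layer_names = layer_names, USE.NAMES = FALSE)
layer_idx
```

Rows with an NA index can then be skipped or reported before calling extract.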

On the second line, sapply:

  • sapply is an iterator much like lapply but it returns a vector rather than a list of values. TBH you could use either in this use-case.
  • Now we iterate across a sequence of numbers from 1 to nrow(df).
  • We use the row number in another anonymous function as our input variable x.
  • We want to extract the raster value at the "x" and "y" coordinates (columns 1 and 2 respectively) of the current row (given by x) of the data.frame, using the layer that we got in our previous line.
  • We assign the result of doing all this to another column in our data.frame, which contains the extracted value for that x/y coord from the appropriate layer.
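
A variation on the second step, as a hedged sketch rather than the method above: when some rows have no matching layer, the points can be grouped by layer index so that extract is called once per layer and unmatched rows stay NA. The rasters, the points and the column names layer and value are assumptions made up for this illustration, and the layer column is assumed to be a plain integer vector with NA for species without a raster.

```r
library(raster)

# Two small illustrative rasters covering the unit square
r1 <- raster(matrix(1:100, ncol = 10))
r2 <- raster(matrix(101:200, ncol = 10))
s <- stack(r1, r2)
names(s) <- c("lion_median", "tiger_median")

# Points to query; 'layer' is the matched layer index, NA where no raster exists
pts <- data.frame(x = c(0.05, 0.95, 0.50),
                  y = c(0.95, 0.05, 0.50),
                  layer = c(1L, 2L, NA))

# Start with NA everywhere, then fill in one layer at a time
pts$value <- NA_real_
for (i in unique(na.omit(pts$layer))) {
  rows <- which(pts$layer == i)
  pts$value[rows] <- extract(s[[i]], pts[rows, c("x", "y")])
}
pts
```

Calling extract once per layer instead of once per row also avoids subsetting the stack repeatedly when there are many points.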

I hope that helps!!

And a worked example with some data:

library( raster )
#  Sample rasters - note the scale of values in each layer
# Tens
r1 <- raster( matrix( sample(1:10,100,replace=TRUE) , ncol = 10 ) )
# Hundreds
r2 <- raster( matrix( sample(1e2:1.1e2,100,replace=TRUE) , ncol = 10 ) )
# Thousands
r3 <- raster( matrix( sample(1e3:1.1e3,100,replace=TRUE) , ncol = 10 ) )

#  Stack the rasters
s <- stack( r1,r2,r3 )
#  Name the layers in the stack
names(s) <- c("LIon_medIan" , "PANTHeR_MEAN_AVG" , "tiger.Mean.JULY_2012")


#  Data of points to query on
df <- data.frame( x = runif(10) , y = runif(10) , species = sample( c("lion" , "panther" , "Tiger" ) , 10 , replace = TRUE ) )

#  Run the previous code
df$layer <- lapply( df$species , function(x) which( grepl( tolower(x) , tolower(names(s)) ) ) )
df$Value <- sapply( seq_len(nrow(df)) , function(x) extract( s[[ df$layer[x] ]] , df[ x , 1:2 ] ) )

#  And the result (note the scale of Values is consistent with the scale of values in each rasterLayer in the stack)
df
#          x         y species layer Value
#1  0.4827577 0.7517476    lion     1     1
#2  0.8590993 0.9929104    lion     1     3
#3  0.8987446 0.4465397   tiger     3  1084
#4  0.5935572 0.6591223 panther     2   107
#5  0.6382287 0.1579990 panther     2   103
#6  0.7957626 0.7931233    lion     1     4
#7  0.2836228 0.3689158   tiger     3  1076
#8  0.5213569 0.7156062    lion     1     3
#9  0.6828245 0.1352709 panther     2   103
#10 0.7030304 0.8049597 panther     2   105

Simon O'Hanlon answered Sep 28 '22 18:09


Did you try to subset your RasterStack?

Something like this

for(i in seq_along(df.species.name)) #assuming it is the 'partial species name'
{
  # grep needs the vector to search (the layer names) as its second argument
  result <- subset(s, grep(df.species.name[i], names(s), ignore.case = TRUE, value = TRUE))
}

It would be interesting to know how different the raster and species names may be. This would allow better approaches, such as tuning the regular expression if necessary. You'll find many references to grep here. Try ?grep too.
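
As one example of tuning the pattern, a species name containing a blank can be turned into a regular expression that also accepts "_" or "." as the separator. This is only a sketch: species_pattern() is a hypothetical helper, the layer names are invented, and species names are assumed to contain only letters and blanks (no regex metacharacters).

```r
# Hypothetical helper: replace each run of blanks in the species name with a
# character class that matches a blank, an underscore or a dot.
species_pattern <- function(species) {
  gsub(" +", "[ _.]+", trimws(species))
}

# Layer names as they might come from names(s); invented for illustration
layer_names <- c("Panthera_leo_median", "panthera.tigris.median", "Felis_catus_median")

pat <- species_pattern("Panthera leo")
hit <- grep(pat, layer_names, ignore.case = TRUE, value = TRUE)
hit
```

The matched name in hit can be passed to subset(s, hit) in the same way as in the loop above.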

Paulo E. Cardoso answered Sep 28 '22 19:09