My questions are related to the changes induced by the upgrade from PROJ4 to PROJ6
and the consequences in various R spatial packages (sp
, sf
, raster
).
We receive now a lot of warnings about “Discarded datum” that looks a bit worrying and I’m a little puzzled about what I should do with that. I can see that this can have dire consequences in some situations and that they can be ignored in other circumstances.
It seems that I’m not the only one to be a little lost (see here). I hope that my questions with a specific reproducible example will help us to better understand the topic.
I understand that we can remove the warnings, and I have read the context : r-spatial blog post, Migration to PROJ6/GDAL3 and these workshop notes (mapview seems to handle this differently in more recent versions)
Question 1 :
Probably a naive question :
I understand that there is a need for a new notation/format (WKT) implemented in PROJ6 (eg because of a need of higher precision) but I don’t understand why there is a need to remove the datum part form the old proj4 string notation. Why not just keep it as it has always been (and implement the new features in the new WKT format/notation)
Question 2 :
It seems that we have 3 cases concerning the drop of the datum in the old proj4 format :
sf
défault ?)sp
default ?? )The example below illustrate different cases where we have these warnings.
Why do we have these 3 different cases on the same CRS (here EPSG 31370)?
What are the consequences of the removal of the datum and/or +towgs84
part ?
Should I be less worried by the second warning than by the third ?
Question 3 :
In the reproducible example below I try to extract the values from a raster corresponding to 3 points, the raster and the points having a different CRS. However depending on the approach used I get different results. I have the impression that this is related to this PROJ4 –> PROJ6 changes and datum dropping but I might be wrong. I created this example only because I wanted to understand the consequences of these “datum dropping” in the crs
I use the function raster::extract
and 3 different general approaches (each time for both
sf
and sp
objects for the points) from which I would expect the same output :
raster::extract
do the job to match the crs of the points and rasterWith the 3rd approach, I get 2 different set of values for sp
or sf
objects and
I get yet a third set of values with the second (and first approach) (then the values are the same if
I use sp
or sf
objects).
351.7868 236.4216 309.0073
)Question 4 :
Is it possible to provide general recommendations about what should be done when we have these warning messages ?
For example :
raster
uses)sf
, eg : st_transform(SF, crs = xxxx)
Briefly : the CRS is mainly stored in WKT format.
The old proj4string is available on request and do not drop the datum/towgs84
part
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
library(sp)
library(raster)
# create 3 points
coo <- data.frame(x = c(246000, 247000, 246500), y = c(184000, 186000, 185000))
# create an sf spatial object
SF <- st_as_sf(coo, coords = c("x", "y"), crs = 31370)
# Check the CRS :
# the proj4string includes the datum/+towgs84 information - no warning
st_crs(SF)$proj4string
#> [1] "+proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333 +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs"
# Same value with `raster::crs` but with a
# Warning "Discarded datum (...) but +towgs84= values preserved"
raster::crs(SF)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition,
#> but +towgs84= values preserved
#> CRS arguments:
#> +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl
#> +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs
# WKT
st_crs(SF)
#> Coordinate Reference System:
#> User input: EPSG:31370
#> wkt:
#> BOUNDCRS[
#> SOURCECRS[
#> PROJCRS["Belge 1972 / Belgian Lambert 72",
#> BASEGEOGCRS["Belge 1972",
#> DATUM["Reseau National Belge 1972",
#> ELLIPSOID["International 1924",6378388,297,
#> LENGTHUNIT["metre",1]]],
#> PRIMEM["Greenwich",0,
#> ANGLEUNIT["degree",0.0174532925199433]],
#> ID["EPSG",4313]],
#> CONVERSION["Belgian Lambert 72",
#> METHOD["Lambert Conic Conformal (2SP)",
#>
#> (...)
#>
#> ID["EPSG",1609],
#> REMARK["Scale difference is given by information source as 0.999999. Given in this record in ppm to assist application usage. Very similar parameter values (to slightly less precision) used for BD72 to ETRS89: see code 1652."]]]
cat(raster::wkt(SF)) # does not work with sf
#> Error in raster::wkt(SF): tentative d'obtenir le slot "crs" d'un objet (classe "sf") qui n'est pas un objet S4
Briefly : the CRS is mainly stored in proj4 string format and drops the datum and towgs84
part
unlike sf
).
the new WKT notation is stored as a comment in the CRS object but is different from sf
SP <- coo
coordinates(SP) <- ~x+y
proj4string(SP) <- CRS("+init=epsg:31370")
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Reseau_National_Belge_1972 in CRS definition
# the proj4 string do not contain the `towgs84` part
# Warning "Discarded datum (...)"
CRS("+init=epsg:31370")
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Reseau_National_Belge_1972 in CRS definition
#> CRS arguments:
#> +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +units=m
#> +no_defs
# With `raster::crs` same proj4string but no warning
raster::crs(SP)
#> CRS arguments:
#> +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +units=m
#> +no_defs
# WKT notation + a warning: this WKT is indeed different from the SF one (no datum here ?)
cat(comment(CRS("+init=epsg:31370")))
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Reseau_National_Belge_1972 in CRS definition
#> PROJCRS["Belge 1972 / Belgian Lambert 72",
#> BASEGEOGCRS["Belge 1972",
#> DATUM["Reseau National Belge 1972",
#> ELLIPSOID["International 1924",6378388,297,
#> LENGTHUNIT["metre",1]]],
#>
#> (...)
#>
#> USAGE[
#> SCOPE["unknown"],
#> AREA["Belgium - onshore"],
#> BBOX[49.5,2.5,51.51,6.4]]]
# Same output without warning
cat(raster::wkt(SP))
#> PROJCRS["Belge 1972 / Belgian Lambert 72",
#> BASEGEOGCRS["Belge 1972",
#> DATUM["Reseau National Belge 1972",
#> ELLIPSOID["International 1924",6378388,297,
#> LENGTHUNIT["metre",1]]],
#>
#> (...)
#>
#> USAGE[
#> SCOPE["unknown"],
#> AREA["Belgium - onshore"],
#> BBOX[49.5,2.5,51.51,6.4]]]
In raster
, both the old proj4 notation and the new WKT notation seem to be stored.
r <- raster::raster(system.file("external/test.grd", package="raster"))
raster::crs(r)
#> CRS arguments:
#> +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
#> +k=0.9999079 +x_0=155000 +y_0=463000 +datum=WGS84 +units=m +no_defs
cat(raster::wkt(r))
#> PROJCRS["unknown",
#> BASEGEOGCRS["unknown",
#> DATUM["World Geodetic System 1984",
#> ELLIPSOID["WGS 84",6378137,298.257223563,
#> LENGTHUNIT["metre",1]],
#> ID["EPSG",6326]],
#>
#> (...)
#>
#> AXIS["(N)",north,
#> ORDER[2],
#> LENGTHUNIT["metre",1,
#> ID["EPSG",9001]]]]
This dataset do not contain +towgs84
part in the proj4string. But when you do read
a raster with a +towgs84
part in the proj4string it seems to be dropped.
Non reproducible example :
GISfolder <- "/my/path"
tmp <- raster(paste0(GISfolder, 'my_file.tif'))
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition
raster::crs(tmp)
#> CRS arguments:
#> +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=49.8333339
#> +lat_2=51.1666672333333 +x_0=150000.01256 +y_0=5400088.4378 +ellps=intl
#> +units=m +no_defs
cat(raster::wkt(tmp))
#> PROJCRS["unknown",
#> BASEGEOGCRS["unknown",
#> DATUM["Unknown based on International 1909 (Hayford) ellipsoid",
#> ELLIPSOID["International 1909 (Hayford)",6378388,297,
#> LENGTHUNIT["metre",1,
#> ID["EPSG",9001]]]],
#> PRIMEM["Greenwich",0,
#> ANGLEUNIT["degree",0.0174532925199433],
#> ID["EPSG",8901]]],
#>
#> (...)
#>
#> AXIS["(N)",north,
#> ORDER[2],
#> LENGTHUNIT["metre",1,
#> ID["EPSG",9001]]]]
I should probably also explore what happens when we use the stars
package instead of
raster
but this question
is already quite long (and i have a lot of code build on the raster package)
raster::extract
# extract the values from the raster,
# the function extract reprojects the points
# in the same crs as the raster layer
extract(r, SF)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition,
#> but +towgs84= values preserved
#> Warning in .local(x, y, ...): Transforming SpatialPoints to the CRS of the
#> Raster
#> [1] 351.7868 236.4216 309.0073
extract(r, SP)
#> Warning in .local(x, y, ...): Transforming SpatialPoints to the CRS of the
#> Raster
#> [1] 351.7868 236.4216 309.0073
SF_proj <- st_transform(SF, crs = raster::crs(r))
extract(r, SF_proj)
#> [1] 351.7868 236.4216 309.0073
SF with WKT from the raster
SF_proj <- st_transform(SF, crs = raster::wkt(r))
extract(r, SF_proj)
#> [1] 351.7868 236.4216 309.0073
SP with proj4string from the raster
SP_proj <- spTransform(SP, raster::crs(r))
extract(r, SP_proj)
#> [1] 351.7868 236.4216 309.0073
SP with WKT from the raster
The wkt fromat is not accepted by sp::spTransform
–> does not work
# error
SP_proj <- spTransform(SP, raster::wkt(r))
#> Error in CRS(CRSobj): PROJ4 argument-value pairs must begin with +: PROJCRS["unknown",
#> BASEGEOGCRS["unknown",
#> DATUM["World Geodetic System 1984",
#> ELLIPSOID["WGS 84",6378137,298.257223563,
#> LENGTHUNIT["metre",1]],
#> ID["EPSG",6326]],
#>
#> (...)
#>
#> AXIS["(N)",north,
#> ORDER[2],
#> LENGTHUNIT["metre",1,
#> ID["EPSG",9001]]]]
# extract(r, SP_proj)
–> results different from the previous attempts
# EPSG 31370 proj4 string with the datum:
lambert72 <- sf::st_crs(31370)$proj4string
lambert72
#> [1] "+proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333 +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs"
# there is a warning when we project the raster but the full string seems to be used
r2 <- raster::projectRaster(r, crs = lambert72)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition,
#> but +towgs84= values preserved
raster::crs(r2)
#> CRS arguments:
#> +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl
#> +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs
extract(r2, SP)
#> [1] 341.6399 222.1028 301.2286
Using the WKT of SF
–> does not work because raster::projectRaster
does not accept WKT format
for its crs
argument
lambert72 <- sf::st_crs(31370)
lambert72
#> Coordinate Reference System:
#> User input: EPSG:31370
#> wkt:
#> BOUNDCRS[
#> SOURCECRS[
#> PROJCRS["Belge 1972 / Belgian Lambert 72",
#> BASEGEOGCRS["Belge 1972",
#> DATUM["Reseau National Belge 1972",
#> ELLIPSOID["International 1924",6378388,297,
#> LENGTHUNIT["metre",1]]],
#> PRIMEM["Greenwich",0,
#>
#> (...)
#>
#> AREA["Belgium - onshore"],
#> BBOX[49.5,2.5,51.51,6.4]],
#> ID["EPSG",1609],
#> REMARK["Scale difference is given by information source as 0.999999. Given in this record in ppm to assist application usage. Very similar parameter values (to slightly less precision) used for BD72 to ETRS89: see code 1652."]]]
r2 <- raster::projectRaster(r, crs = lambert72)
#> Error in wkt(projto): tentative d'obtenir le slot "crs" d'un objet (classe "crs") qui n'est pas un objet S4
Using the proj4string of SP (without datum)
–> results different from the previous attempts
# EPSG 31370 proj4 string without the datum:
lambert72 <- sp::CRS("+init=epsg:31370")@projargs
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Reseau_National_Belge_1972 in CRS definition
lambert72
#> [1] "+proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333 +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +units=m +no_defs"
# warning
r3 <- raster::projectRaster(r, crs = lambert72)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition
raster::crs(r3)
#> CRS arguments:
#> +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +units=m
#> +no_defs
extract(r3, coo)
#> [1] 348.5775 329.1199 277.2260
sessionInfo()
#> R version 3.6.2 (2019-12-12)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.4 LTS
#>
#> (...)
#>
#> other attached packages:
#> [1] raster_3.3-13 sp_1.4-2 sf_0.9-5 knitr_1.29
#>
#> (...)
#>
Used with :
GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
Created on 2020-09-03 by the reprex package (v0.3.0)
Some notes given in https://gis.stackexchange.com/questions/372692. Please see there first.
- I understand that there is a need for a new notation/format (WKT) implemented in PROJ6 (eg because of a need of higher precision) but I don’t understand why there is a need to remove the datum part form the old proj4 string notation. Why not just keep it as it has always been (and implement the new features in the new WKT format/notation)
The +datum=
part is deprecated in GDAL's exportToProj4()
from GDAL >= 3. Since sf, rgdal and raster use GDAL to read files, the Proj4 string representation is without all +datum=
perhaps except WGS84, NAD83 and NAD27. The warnings come from checking which nodes are present internally before exportToProj4()
is run, and which are present afterwards. We cannot rely on +datum=
and +towgs84=
when we use PROJ >= 6/GDAL >= 3.
Further comments relate to the examples:
> library(sf)
Linking to GEOS 3.8.1, GDAL 3.1.3, PROJ 7.1.1
> #> Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
> library(sp)
> library(raster)
> packageVersion("sf")
[1] ‘0.9.6’
> packageVersion("sp")
[1] ‘1.4.4’
> packageVersion("raster")
[1] ‘3.3.13’
> library(rgdal)
rgdal: version: 1.5-17, (SVN revision 1060)
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 3.1.3, released 2020/09/01
Path to GDAL shared files: /usr/local/share/gdal
GDAL binary built with GEOS: TRUE
Loaded PROJ runtime: Rel. 7.1.1, September 1st, 2020, [PJ_VERSION: 711]
Path to PROJ shared files: /home/rsb/.local/share/proj:/usr/local/share/proj:/usr/local/share/proj
PROJ CDN enabled: FALSE
Linking to sp version:1.4-4
To mute warnings of possible GDAL/OSR exportToProj4() degradation,
use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.
I'm using development versions, and the most recent releases of PROJ and GDAL.
> coo <- data.frame(x = c(246000, 247000, 246500), y = c(184000, 186000, 185000))
> SF <- st_as_sf(coo, coords = c("x", "y"), crs = 31370)
> st_crs(SF)$proj4string
[1] "+proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333 +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs"
> st_crs(SF)
Coordinate Reference System:
User input: EPSG:31370
wkt:
PROJCRS["Belge 1972 / Belgian Lambert 72",
BASEGEOGCRS["Belge 1972",
DATUM["Reseau National Belge 1972",
ELLIPSOID["International 1924",6378388,297,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4313]],
CONVERSION["Belgian Lambert 72",
METHOD["Lambert Conic Conformal (2SP)",
ID["EPSG",9802]],
PARAMETER["Latitude of false origin",90,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8821]],
PARAMETER["Longitude of false origin",4.36748666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8822]],
PARAMETER["Latitude of 1st standard parallel",51.1666672333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8823]],
PARAMETER["Latitude of 2nd standard parallel",49.8333339,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8824]],
PARAMETER["Easting at false origin",150000.013,
LENGTHUNIT["metre",1],
ID["EPSG",8826]],
PARAMETER["Northing at false origin",5400088.438,
LENGTHUNIT["metre",1],
ID["EPSG",8827]]],
CS[Cartesian,2],
AXIS["easting (X)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["northing (Y)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["unknown"],
AREA["Belgium - onshore"],
BBOX[49.5,2.5,51.51,6.4]],
ID["EPSG",31370]]
Now no +datum=
remains in the Proj4 string, but all of the CRS specification is present in the WKT2_2019 string. There is no $proj4string
in the "crs"
object, it is generated on-the-fly if you ask for it.
We are still working on coercion, but already we have:
> cat(raster::wkt(as(SF, "Spatial")), "\n")
PROJCRS["Belge 1972 / Belgian Lambert 72",
BASEGEOGCRS["Belge 1972",
DATUM["Reseau National Belge 1972",
ELLIPSOID["International 1924",6378388,297,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4313]],
CONVERSION["Belgian Lambert 72",
METHOD["Lambert Conic Conformal (2SP)",
ID["EPSG",9802]],
PARAMETER["Latitude of false origin",90,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8821]],
PARAMETER["Longitude of false origin",4.36748666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8822]],
PARAMETER["Latitude of 1st standard parallel",51.1666672333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8823]],
PARAMETER["Latitude of 2nd standard parallel",49.8333339,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8824]],
PARAMETER["Easting at false origin",150000.013,
LENGTHUNIT["metre",1],
ID["EPSG",8826]],
PARAMETER["Northing at false origin",5400088.438,
LENGTHUNIT["metre",1],
ID["EPSG",8827]]],
CS[Cartesian,2],
AXIS["easting (X)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["northing (Y)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["unknown"],
AREA["Belgium - onshore"],
BBOX[49.5,2.5,51.51,6.4]],
ID["EPSG",31370]]
Next:
> SP <- coo
> coordinates(SP) <- ~x+y
> proj4string(SP) <- CRS("+init=epsg:31370")
Warning message:
In showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj = prefer_proj) :
Discarded datum Reseau_National_Belge_1972 in CRS definition
> cat(wkt(SP), "\n")
PROJCRS["Belge 1972 / Belgian Lambert 72",
BASEGEOGCRS["Belge 1972",
DATUM["Reseau National Belge 1972",
ELLIPSOID["International 1924",6378388,297,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4313]],
CONVERSION["Belgian Lambert 72",
METHOD["Lambert Conic Conformal (2SP)",
ID["EPSG",9802]],
PARAMETER["Latitude of false origin",90,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8821]],
PARAMETER["Longitude of false origin",4.36748666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8822]],
PARAMETER["Latitude of 1st standard parallel",51.1666672333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8823]],
PARAMETER["Latitude of 2nd standard parallel",49.8333339,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8824]],
PARAMETER["Easting at false origin",150000.013,
LENGTHUNIT["metre",1],
ID["EPSG",8826]],
PARAMETER["Northing at false origin",5400088.438,
LENGTHUNIT["metre",1],
ID["EPSG",8827]],
ID["EPSG",19961]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
USAGE[
SCOPE["unknown"],
AREA["Belgium - onshore"],
BBOX[49.5,2.5,51.51,6.4]]]
You note that +towgs84=
has gone, that is because the DATUM in WKT2_2019 is absolutely sufficient to generate coordinate operations when needed. PROJ >= 6/GDAL >= 3 do not need to transform to the WGS84 GEOGCRS hub and onward to the target CRS. The warning comes because sp::CRS()
generates both the WKT2_2019 string, which is fully specified, and the legacy Proj4 string - with bits missing for modern PROJ/GDAL, which we hope nobody relies on any more - if you do, you have been warned.
I'll leave this here now, referring to the reply on the SE thread. If a raster developer could comment, this would be helpful, but as far as we can see from reverse dependency check, raster seems to have transitioned to using WKT2_2019 (like the other packages) in preference to Proj4 when PROJ >= 6/GDAL >= 3. Since some platforms are still PROJ < 6/GDAL < 3, we have to provide for both settings as far as possible.
Partial answer based on what I think I understand now.
NB : I’m relally not totally certain of this. So please provide feedback if I’m wrong…
The general idea is that sf
, sp
tend to use by default the
new WKT notation (which handles correctly the datum) even if they can display (or retrieve on request)
the old and deprecated proj4 string notation (with or without datum).
The situation is less clear so far (to me at least)
concerning raster
which is able to provide the WKT notation (raster::wkt
) as a
character string but seems to still rely heavily on proj4 strings.
So the projections should be fine in most of the cases unless you force the use of the
proj4 notation. But for raster
I’m still puzzled and I miss probably something…
I would be very uncofortable to use raster::projectRaster
for the moment .
We can now try to understand which answers are correct and why :
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
library(sp)
library(raster)
# create a raster
r <- raster::raster(system.file("external/test.grd", package="raster"))
# create an sf spatial object
coo <- data.frame(x = c(246000, 247000, 246500), y = c(184000, 186000, 185000))
SF <- st_as_sf(coo, coords = c("x", "y"), crs = 31370)
# create an equivalent sp object
SP <- coo
coordinates(SP) <- ~x+y
proj4string(SP) <- CRS(SRS_string = "EPSG:31370") # better than CRS("+init=epsg:31370") ??
The following approaches seem to be safe because we extract the WKT notation from the
raster (as a character string provided by raster::wkt
) and we transform the
sf
and sp
points into this new coordinate reference system.
SF_to_r <- st_transform(SF, crs = raster::wkt(r))
raster::extract(r, SF_to_r) # result (correct) : 351.7868 236.4216 309.0073
# note the use of `SRS_string` argument. `CRS(raster::wkt(r))` won't work
SP_to_r <- spTransform(SP, CRS(SRS_string = raster::wkt(r)))
raster::extract(r, SF_to_r) # result (correct) : 351.7868 236.4216 309.0073
class(raster::wkt(r)) # character
The following approaches seem to work also (same results) when we use raster::crs
that returns a CRS
class object from sp
package. I guess this is because
both sf
and sp
use safely the new WKT notation available from this object (even
if the object contains apparently only the proj4 string, the WKT being somewhat “hidden”
in a comment attached to the object)
SF_to_r <- st_transform(SF, crs = raster::crs(r))
raster::extract(r, SF_to_r) # result : 351.7868 236.4216 309.0073
SP_to_r <- spTransform(SP, raster::crs(r))
raster::extract(r, SF_to_r) # result : 351.7868 236.4216 309.0073
class(raster::crs(r)) # CRS class from `sp`
str(raster::crs(r)) # 1 slot with the proj4 string
cat(comment(raster::crs(r))) # this is where the WKT notation is "hidden"
When we project the raster, we can certainly expect failure with the 2
following approaches (one with sp
, one with sf
) because we force the use of the
proj4 string (with $proj4string
and @projargs
that provide a simple character vector).
This should be always avoided…
I’m not sure to understand why these two options provide different results but I’m quite confident now that both results are wrong. Maybe they differ because the datum is droped at different moments in the pipeline (the initial strings provided as a character vectors are different between sp and sf) ?
r_to_sf <- raster::projectRaster(r, crs = sf::st_crs(31370)$proj4string)
raster::extract(r_to_sf, SF) # result (wrong) : 341.6399 222.1028 301.2286
r_to_sp <- raster::projectRaster(r, crs = sp::CRS("+init=epsg:31370")@projargs)
raster::extract(r_to_sp, SF) # result (wrong) : 348.5775 329.1199 277.2260
class(sf::st_crs(31370)$proj4string) # character
class(sp::CRS("+init=epsg:31370")@projargs) # character
We could expect that providing a full CRS object (instead of forcing the use of the
character proj4 string) would solve the problem. But it does not seem to be the case.
Maybe because raster
relies internally on the old proj4 strings ??
However according to Roger Bivand :
as far as we can see from reverse dependency check, raster seems to have transitioned to using WKT2_2019 (like the other packages) in preference to Proj4 when PROJ >= 6/GDAL >= 3
So I’m probably wrong somewhere and I still don’t know how I can safely reproject a raster object …
r_to_sp <- raster::projectRaster(r, crs = sp::CRS("+init=epsg:31370"))
raster::extract(r_to_sp, SP) # result (wrong) : 341.6399 222.1028 301.2286
# same result with a slightly different syntax for CRS
r_to_sp <- raster::projectRaster(r, crs = sp::CRS(SRS_string = "EPSG:31370"))
raster::extract(r_to_sp, SP) # result (wrong) : 341.6399 222.1028 301.2286
With stars
package we can obtain the “correct” results either by reprojecting the points or the
raster. However it seems that the raster::extract
function has some features that are not available
straightaway with stars
(eg computing a weight for each cell when using polygons)
Usefull raster
vs stars
function comparison
library(stars)
STARS <- stars::read_stars(system.file("external/test.grd", package="raster"))
# reproject the points into the same crs as the stars raster
SF_to_STARS <- st_transform(SF, crs = st_crs(STARS))
aggregate(STARS, SF_to_STARS, function(x) x[1], as_points = FALSE)$test.grd # result (correct) = 351.7868 236.4216 309.0073
# reproject the stars raster into the same crs as the points
STARS_to_SF <- st_transform(STARS, crs = st_crs(SF))
aggregate(STARS_to_SF, SF, function(x) x[1], as_points = FALSE)$test.grd # result (correct) = 351.7868 236.4216 309.0073
This might also be useful to the reflexion ?
Recommenadations from Roger Bivand :
If possible, avoid instantiating “CRS” objects with Proj4 strings, rather use
CRS(SRS_string=
For example :
# preferd syntax :
CRS(SRS_string = "OGC:CRS84")
#> Error in if (in_format == 4L) {: valeur manquante là où TRUE / FALSE est requis
# instead of :
CRS("+proj=longlat +datum=WGS84")
So maybe (??) also :
CRS(SRS_string = "EPSG:31370")
instead of :
CRS("+init=epsg:31370")
Avoid
proj4string(x) <- proj4string(y)
and prefer :slot(x, "proj4string") <- slot(y, "proj4string")
Created on 2020-09-09 by the reprex package (v0.3.0)
Here is my response to your Question 1 :
I understand that there is a need for a new notation/format (WKT) implemented in PROJ6 (eg because of a need of higher precision) but I don’t understand why there is a need to remove the datum part form the old proj4 string notation. Why not just keep it as it has always been (and implement the new features in the new WKT format/notation)
I do not know why the PROJ developers took the decision to break backwards compatibility, but I assume that there were very good reasons for it; and that nobody has volunteered to work on this.
As R/spatial developers (and others who build software with PROJ) we have to live with this. The problem is that we need to accommodate for different PROJ versions (especially on linux systems). Trying to move forward while being backwards compatible has created a terrible mess.
Not being able to use the proj4 notation is a real loss in a scripting environment like R. The proj4 notation can be directly understood; EPSG codes are opaque and their use easily leads to error. Also, if there is no EPSG code available you need to figure out how to write your own WKT.
CRS in raster
The CRS object in raster
is the same as in sp
and rgdal
. It stores both the proj4 and the wkt notations. Roger Bivand has explained why the warnings are given.
Extraction
To extract values from a raster, always transform the points (lines, polygons), not the raster. Transforming a raster will lead to newly estimated values that are different from the original values. See the discussion here
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With