Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PROJ4 to PROJ6 upgrade and "Discarded datum" warnings

Tags:

r

sp

raster

rgdal

sf

Context

My questions are related to the changes induced by the upgrade from PROJ4 to PROJ6 and the consequences in various R spatial packages (sp, sf, raster).

We receive now a lot of warnings about “Discarded datum” that looks a bit worrying and I’m a little puzzled about what I should do with that. I can see that this can have dire consequences in some situations and that they can be ignored in other circumstances.

It seems that I’m not the only one to be a little lost (see here). I hope that my questions with a specific reproducible example will help us to better understand the topic.

I understand that we can remove the warnings, and I have read the context : r-spatial blog post, Migration to PROJ6/GDAL3 and these workshop notes (mapview seems to handle this differently in more recent versions)

Questions

Question 1 :

Probably a naive question :

I understand that there is a need for a new notation/format (WKT) implemented in PROJ6 (eg because of a need of higher precision) but I don’t understand why there is a need to remove the datum part form the old proj4 string notation. Why not just keep it as it has always been (and implement the new features in the new WKT format/notation)

Question 2 :

It seems that we have 3 cases concerning the drop of the datum in the old proj4 format :

  1. no warning : the datum remains as in the old proj4string notation (sf défault ?)
  2. we have a warning “Discarded datum (…) but +towgs84= values preserved”
  3. we have a warning “Discarded datum (…)” (in that case the “+towgs84=” part of the string is discarded/dropped –> sp default ?? )

The example below illustrate different cases where we have these warnings.

Why do we have these 3 different cases on the same CRS (here EPSG 31370)?
What are the consequences of the removal of the datum and/or +towgs84 part ?
Should I be less worried by the second warning than by the third ?

Question 3 :

In the reproducible example below I try to extract the values from a raster corresponding to 3 points, the raster and the points having a different CRS. However depending on the approach used I get different results. I have the impression that this is related to this PROJ4 –> PROJ6 changes and datum dropping but I might be wrong. I created this example only because I wanted to understand the consequences of these “datum dropping” in the crs

I use the function raster::extract and 3 different general approaches (each time for both sf and sp objects for the points) from which I would expect the same output :

  1. no manual reprojection, I let raster::extract do the job to match the crs of the points and raster
  2. reprojecting manually the points toward the crs of the raster
  3. reprojecting manually the raster toward the crs of the points

With the 3rd approach, I get 2 different set of values for sp or sf objects and I get yet a third set of values with the second (and first approach) (then the values are the same if I use sp or sf objects).

  • Which values are the “correct” ones ? (probably 351.7868 236.4216 309.0073)
  • Why are some approaches failing (is this related to the warning messages / datum stuff ?) ?
  • When I do a projection of a spatial object and I have these warnings, how can I check that my spatial objects have been “correctly” projected ?

Question 4 :

Is it possible to provide general recommendations about what should be done when we have these warning messages ?

For example :

  • If possible do not use the old proj4 string notation but use the WKT one instead (but this is not always possible. Here for example - if I’m not wrong I’m forced to use the old proj4 notation because this is what raster uses)
  • If I knew the epsg code of the raster used here, I guess the safest appoach would be to reproject the points using that code with sf , eg : st_transform(SF, crs = xxxx)

Reproducible example

Create points and raster spatial data + check how the CRS is handled

SF points spatial object

Briefly : the CRS is mainly stored in WKT format. The old proj4string is available on request and do not drop the datum/towgs84 part

library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
library(sp)
library(raster)

# create 3 points
coo <- data.frame(x = c(246000, 247000, 246500), y = c(184000, 186000, 185000))

# create an sf spatial object
SF <- st_as_sf(coo, coords = c("x", "y"), crs = 31370)

# Check the CRS : 
# the proj4string includes the datum/+towgs84 information - no warning
st_crs(SF)$proj4string
#> [1] "+proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333 +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs"

# Same value with `raster::crs` but with a 
# Warning "Discarded datum (...) but +towgs84= values preserved"
raster::crs(SF)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition,
#>  but +towgs84= values preserved
#> CRS arguments:
#>  +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl
#> +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs

# WKT 
st_crs(SF)
#> Coordinate Reference System:
#>   User input: EPSG:31370 
#>   wkt:
#> BOUNDCRS[
#>     SOURCECRS[
#>         PROJCRS["Belge 1972 / Belgian Lambert 72",
#>             BASEGEOGCRS["Belge 1972",
#>                 DATUM["Reseau National Belge 1972",
#>                     ELLIPSOID["International 1924",6378388,297,
#>                         LENGTHUNIT["metre",1]]],
#>                 PRIMEM["Greenwich",0,
#>                     ANGLEUNIT["degree",0.0174532925199433]],
#>                 ID["EPSG",4313]],
#>             CONVERSION["Belgian Lambert 72",
#>                 METHOD["Lambert Conic Conformal (2SP)",
#>
#> (...)
#> 
#>         ID["EPSG",1609],
#>         REMARK["Scale difference is given by information source as 0.999999. Given in this record in ppm to assist application usage. Very similar parameter values (to slightly less precision) used for BD72 to ETRS89: see code 1652."]]]
cat(raster::wkt(SF)) # does not work with sf
#> Error in raster::wkt(SF): tentative d'obtenir le slot "crs" d'un objet (classe "sf") qui n'est pas un objet S4

SP points spatial object

Briefly : the CRS is mainly stored in proj4 string format and drops the datum and towgs84 part unlike sf). the new WKT notation is stored as a comment in the CRS object but is different from sf

SP <- coo
coordinates(SP) <- ~x+y
proj4string(SP) <- CRS("+init=epsg:31370")
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Reseau_National_Belge_1972 in CRS definition

# the proj4 string do not contain the `towgs84` part
# Warning "Discarded datum (...)"
CRS("+init=epsg:31370")
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Reseau_National_Belge_1972 in CRS definition
#> CRS arguments:
#>  +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +units=m
#> +no_defs
# With `raster::crs` same proj4string but no warning
raster::crs(SP)
#> CRS arguments:
#>  +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +units=m
#> +no_defs

# WKT notation + a warning: this WKT is indeed different from the SF one (no datum here ?)
cat(comment(CRS("+init=epsg:31370")))
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Reseau_National_Belge_1972 in CRS definition
#> PROJCRS["Belge 1972 / Belgian Lambert 72",
#>     BASEGEOGCRS["Belge 1972",
#>         DATUM["Reseau National Belge 1972",
#>             ELLIPSOID["International 1924",6378388,297,
#>                 LENGTHUNIT["metre",1]]],
#>
#> (...)
#> 
#>     USAGE[
#>         SCOPE["unknown"],
#>         AREA["Belgium - onshore"],
#>         BBOX[49.5,2.5,51.51,6.4]]]
# Same output without warning
cat(raster::wkt(SP))
#> PROJCRS["Belge 1972 / Belgian Lambert 72",
#>     BASEGEOGCRS["Belge 1972",
#>         DATUM["Reseau National Belge 1972",
#>             ELLIPSOID["International 1924",6378388,297,
#>                 LENGTHUNIT["metre",1]]],
#>
#> (...)
#> 
#>     USAGE[
#>         SCOPE["unknown"],
#>         AREA["Belgium - onshore"],
#>         BBOX[49.5,2.5,51.51,6.4]]]

Raster

In raster, both the old proj4 notation and the new WKT notation seem to be stored.

r <- raster::raster(system.file("external/test.grd", package="raster"))
raster::crs(r)
#> CRS arguments:
#>  +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
#> +k=0.9999079 +x_0=155000 +y_0=463000 +datum=WGS84 +units=m +no_defs
cat(raster::wkt(r))
#> PROJCRS["unknown",
#>     BASEGEOGCRS["unknown",
#>         DATUM["World Geodetic System 1984",
#>             ELLIPSOID["WGS 84",6378137,298.257223563,
#>                 LENGTHUNIT["metre",1]],
#>             ID["EPSG",6326]],
#>
#> (...)
#> 
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]]]

This dataset do not contain +towgs84 part in the proj4string. But when you do read a raster with a +towgs84 part in the proj4string it seems to be dropped.
Non reproducible example :

GISfolder <- "/my/path"
tmp <- raster(paste0(GISfolder, 'my_file.tif'))
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition
raster::crs(tmp)
#> CRS arguments:
#>  +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=49.8333339
#> +lat_2=51.1666672333333 +x_0=150000.01256 +y_0=5400088.4378 +ellps=intl
#> +units=m +no_defs
cat(raster::wkt(tmp))
#> PROJCRS["unknown",
#>     BASEGEOGCRS["unknown",
#>         DATUM["Unknown based on International 1909 (Hayford) ellipsoid",
#>             ELLIPSOID["International 1909 (Hayford)",6378388,297,
#>                 LENGTHUNIT["metre",1,
#>                     ID["EPSG",9001]]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8901]]],
#>
#> (...)
#> 
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]]]

I should probably also explore what happens when we use the stars package instead of raster but this question is already quite long (and i have a lot of code build on the raster package)

Extract the values from the raster

Using the automatic reprojection from raster::extract

# extract the values from the raster, 
# the function extract reprojects the points
# in the same crs as the raster layer
extract(r, SF)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition,
#>  but +towgs84= values preserved
#> Warning in .local(x, y, ...): Transforming SpatialPoints to the CRS of the
#> Raster
#> [1] 351.7868 236.4216 309.0073
extract(r, SP)
#> Warning in .local(x, y, ...): Transforming SpatialPoints to the CRS of the
#> Raster
#> [1] 351.7868 236.4216 309.0073

Reproject manually the points toward the crs of the raster

SF with proj4string from the raster
SF_proj <- st_transform(SF, crs = raster::crs(r))
extract(r, SF_proj)
#> [1] 351.7868 236.4216 309.0073
SF with WKT from the raster
SF_proj <- st_transform(SF, crs = raster::wkt(r))
extract(r, SF_proj)
#> [1] 351.7868 236.4216 309.0073
SP with proj4string from the raster
SP_proj <- spTransform(SP, raster::crs(r))
extract(r, SP_proj)
#> [1] 351.7868 236.4216 309.0073
SP with WKT from the raster

The wkt fromat is not accepted by sp::spTransform –> does not work

# error
SP_proj <- spTransform(SP, raster::wkt(r))
#> Error in CRS(CRSobj): PROJ4 argument-value pairs must begin with +: PROJCRS["unknown",
#>     BASEGEOGCRS["unknown",
#>         DATUM["World Geodetic System 1984",
#>             ELLIPSOID["WGS 84",6378137,298.257223563,
#>                 LENGTHUNIT["metre",1]],
#>             ID["EPSG",6326]],
#>
#> (...)
#> 
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]]]
# extract(r, SP_proj)

Reproject manually the raster toward the crs of the points

Using the proj4string of SF (with datum)

–> results different from the previous attempts

# EPSG 31370 proj4 string with the datum:
lambert72 <- sf::st_crs(31370)$proj4string
lambert72
#> [1] "+proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333 +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs"
# there is a warning when we project the raster but the full string seems to be used
r2 <- raster::projectRaster(r, crs = lambert72)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition,
#>  but +towgs84= values preserved
raster::crs(r2)
#> CRS arguments:
#>  +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl
#> +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs

extract(r2, SP)
#> [1] 341.6399 222.1028 301.2286
Using the WKT of SF

–> does not work because raster::projectRaster does not accept WKT format for its crs argument

lambert72 <- sf::st_crs(31370)
lambert72
#> Coordinate Reference System:
#>   User input: EPSG:31370 
#>   wkt:
#> BOUNDCRS[
#>     SOURCECRS[
#>         PROJCRS["Belge 1972 / Belgian Lambert 72",
#>             BASEGEOGCRS["Belge 1972",
#>                 DATUM["Reseau National Belge 1972",
#>                     ELLIPSOID["International 1924",6378388,297,
#>                         LENGTHUNIT["metre",1]]],
#>                 PRIMEM["Greenwich",0,
#>
#> (...)
#> 
#>             AREA["Belgium - onshore"],
#>             BBOX[49.5,2.5,51.51,6.4]],
#>         ID["EPSG",1609],
#>         REMARK["Scale difference is given by information source as 0.999999. Given in this record in ppm to assist application usage. Very similar parameter values (to slightly less precision) used for BD72 to ETRS89: see code 1652."]]]
r2 <- raster::projectRaster(r, crs = lambert72)
#> Error in wkt(projto): tentative d'obtenir le slot "crs" d'un objet (classe "crs") qui n'est pas un objet S4
Using the proj4string of SP (without datum)

–> results different from the previous attempts

# EPSG 31370 proj4 string without the datum:
lambert72 <- sp::CRS("+init=epsg:31370")@projargs
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Reseau_National_Belge_1972 in CRS definition
lambert72
#> [1] "+proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333 +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +units=m +no_defs"
# warning
r3 <- raster::projectRaster(r, crs = lambert72)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
#> datum Unknown_based_on_International_1909_Hayford_ellipsoid in CRS definition
raster::crs(r3)
#> CRS arguments:
#>  +proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333
#> +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +units=m
#> +no_defs

extract(r3, coo)
#> [1] 348.5775 329.1199 277.2260

sessionInfo

sessionInfo()
#> R version 3.6.2 (2019-12-12)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.4 LTS
#>
#> (...)
#> 
#> other attached packages:
#> [1] raster_3.3-13 sp_1.4-2      sf_0.9-5      knitr_1.29   
#> 
#> (...)
#> 

Used with :

GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1

Created on 2020-09-03 by the reprex package (v0.3.0)

like image 682
Gilles Avatar asked Sep 03 '20 16:09

Gilles


3 Answers

Some notes given in https://gis.stackexchange.com/questions/372692. Please see there first.

  1. I understand that there is a need for a new notation/format (WKT) implemented in PROJ6 (eg because of a need of higher precision) but I don’t understand why there is a need to remove the datum part form the old proj4 string notation. Why not just keep it as it has always been (and implement the new features in the new WKT format/notation)

The +datum= part is deprecated in GDAL's exportToProj4() from GDAL >= 3. Since sf, rgdal and raster use GDAL to read files, the Proj4 string representation is without all +datum= perhaps except WGS84, NAD83 and NAD27. The warnings come from checking which nodes are present internally before exportToProj4() is run, and which are present afterwards. We cannot rely on +datum= and +towgs84= when we use PROJ >= 6/GDAL >= 3.

Further comments relate to the examples:

> library(sf)
Linking to GEOS 3.8.1, GDAL 3.1.3, PROJ 7.1.1
> #> Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
> library(sp)
> library(raster)
> packageVersion("sf")
[1] ‘0.9.6’
> packageVersion("sp")
[1] ‘1.4.4’
> packageVersion("raster")
[1] ‘3.3.13’
> library(rgdal)
rgdal: version: 1.5-17, (SVN revision 1060)
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 3.1.3, released 2020/09/01
Path to GDAL shared files: /usr/local/share/gdal
GDAL binary built with GEOS: TRUE 
Loaded PROJ runtime: Rel. 7.1.1, September 1st, 2020, [PJ_VERSION: 711]
Path to PROJ shared files: /home/rsb/.local/share/proj:/usr/local/share/proj:/usr/local/share/proj
PROJ CDN enabled: FALSE
Linking to sp version:1.4-4
To mute warnings of possible GDAL/OSR exportToProj4() degradation,
use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.

I'm using development versions, and the most recent releases of PROJ and GDAL.

> coo <- data.frame(x = c(246000, 247000, 246500), y = c(184000, 186000, 185000))
> SF <- st_as_sf(coo, coords = c("x", "y"), crs = 31370)
> st_crs(SF)$proj4string
[1] "+proj=lcc +lat_0=90 +lon_0=4.36748666666667 +lat_1=51.1666672333333 +lat_2=49.8333339 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-99.059,53.322,-112.486,0.419,-0.83,1.885,-1 +units=m +no_defs"
> st_crs(SF)
Coordinate Reference System:
  User input: EPSG:31370 
  wkt:
PROJCRS["Belge 1972 / Belgian Lambert 72",
    BASEGEOGCRS["Belge 1972",
        DATUM["Reseau National Belge 1972",
            ELLIPSOID["International 1924",6378388,297,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4313]],
    CONVERSION["Belgian Lambert 72",
        METHOD["Lambert Conic Conformal (2SP)",
            ID["EPSG",9802]],
        PARAMETER["Latitude of false origin",90,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8821]],
        PARAMETER["Longitude of false origin",4.36748666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8822]],
        PARAMETER["Latitude of 1st standard parallel",51.1666672333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8823]],
        PARAMETER["Latitude of 2nd standard parallel",49.8333339,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8824]],
        PARAMETER["Easting at false origin",150000.013,
            LENGTHUNIT["metre",1],
            ID["EPSG",8826]],
        PARAMETER["Northing at false origin",5400088.438,
            LENGTHUNIT["metre",1],
            ID["EPSG",8827]]],
    CS[Cartesian,2],
        AXIS["easting (X)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["northing (Y)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["unknown"],
        AREA["Belgium - onshore"],
        BBOX[49.5,2.5,51.51,6.4]],
    ID["EPSG",31370]]

Now no +datum= remains in the Proj4 string, but all of the CRS specification is present in the WKT2_2019 string. There is no $proj4string in the "crs" object, it is generated on-the-fly if you ask for it.

We are still working on coercion, but already we have:

> cat(raster::wkt(as(SF, "Spatial")), "\n")
PROJCRS["Belge 1972 / Belgian Lambert 72",
    BASEGEOGCRS["Belge 1972",
        DATUM["Reseau National Belge 1972",
            ELLIPSOID["International 1924",6378388,297,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4313]],
    CONVERSION["Belgian Lambert 72",
        METHOD["Lambert Conic Conformal (2SP)",
            ID["EPSG",9802]],
        PARAMETER["Latitude of false origin",90,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8821]],
        PARAMETER["Longitude of false origin",4.36748666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8822]],
        PARAMETER["Latitude of 1st standard parallel",51.1666672333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8823]],
        PARAMETER["Latitude of 2nd standard parallel",49.8333339,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8824]],
        PARAMETER["Easting at false origin",150000.013,
            LENGTHUNIT["metre",1],
            ID["EPSG",8826]],
        PARAMETER["Northing at false origin",5400088.438,
            LENGTHUNIT["metre",1],
            ID["EPSG",8827]]],
    CS[Cartesian,2],
        AXIS["easting (X)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["northing (Y)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["unknown"],
        AREA["Belgium - onshore"],
        BBOX[49.5,2.5,51.51,6.4]],
    ID["EPSG",31370]] 

Next:


> SP <- coo
> coordinates(SP) <- ~x+y
> proj4string(SP) <- CRS("+init=epsg:31370")
Warning message:
In showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj = prefer_proj) :
  Discarded datum Reseau_National_Belge_1972 in CRS definition
> cat(wkt(SP), "\n")
PROJCRS["Belge 1972 / Belgian Lambert 72",
    BASEGEOGCRS["Belge 1972",
        DATUM["Reseau National Belge 1972",
            ELLIPSOID["International 1924",6378388,297,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4313]],
    CONVERSION["Belgian Lambert 72",
        METHOD["Lambert Conic Conformal (2SP)",
            ID["EPSG",9802]],
        PARAMETER["Latitude of false origin",90,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8821]],
        PARAMETER["Longitude of false origin",4.36748666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8822]],
        PARAMETER["Latitude of 1st standard parallel",51.1666672333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8823]],
        PARAMETER["Latitude of 2nd standard parallel",49.8333339,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8824]],
        PARAMETER["Easting at false origin",150000.013,
            LENGTHUNIT["metre",1],
            ID["EPSG",8826]],
        PARAMETER["Northing at false origin",5400088.438,
            LENGTHUNIT["metre",1],
            ID["EPSG",8827]],
        ID["EPSG",19961]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
    USAGE[
        SCOPE["unknown"],
        AREA["Belgium - onshore"],
        BBOX[49.5,2.5,51.51,6.4]]] 

You note that +towgs84= has gone, that is because the DATUM in WKT2_2019 is absolutely sufficient to generate coordinate operations when needed. PROJ >= 6/GDAL >= 3 do not need to transform to the WGS84 GEOGCRS hub and onward to the target CRS. The warning comes because sp::CRS() generates both the WKT2_2019 string, which is fully specified, and the legacy Proj4 string - with bits missing for modern PROJ/GDAL, which we hope nobody relies on any more - if you do, you have been warned.

I'll leave this here now, referring to the reply on the SE thread. If a raster developer could comment, this would be helpful, but as far as we can see from reverse dependency check, raster seems to have transitioned to using WKT2_2019 (like the other packages) in preference to Proj4 when PROJ >= 6/GDAL >= 3. Since some platforms are still PROJ < 6/GDAL < 3, we have to provide for both settings as far as possible.

like image 181
Roger Bivand Avatar answered Oct 01 '22 17:10

Roger Bivand


Partial answer based on what I think I understand now.
NB : I’m relally not totally certain of this. So please provide feedback if I’m wrong…

The general idea is that sf, sp tend to use by default the new WKT notation (which handles correctly the datum) even if they can display (or retrieve on request) the old and deprecated proj4 string notation (with or without datum).

The situation is less clear so far (to me at least) concerning raster which is able to provide the WKT notation (raster::wkt) as a character string but seems to still rely heavily on proj4 strings.

So the projections should be fine in most of the cases unless you force the use of the proj4 notation. But for raster I’m still puzzled and I miss probably something… I would be very uncofortable to use raster::projectRaster for the moment .

We can now try to understand which answers are correct and why :

library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
library(sp)
library(raster)

# create a raster
r <- raster::raster(system.file("external/test.grd", package="raster"))

# create an sf spatial object
coo <- data.frame(x = c(246000, 247000, 246500), y = c(184000, 186000, 185000))
SF <- st_as_sf(coo, coords = c("x", "y"), crs = 31370)

# create an equivalent sp object
SP <- coo
coordinates(SP) <- ~x+y
proj4string(SP) <- CRS(SRS_string = "EPSG:31370") # better than CRS("+init=epsg:31370") ??

The following approaches seem to be safe because we extract the WKT notation from the raster (as a character string provided by raster::wkt) and we transform the sf and sp points into this new coordinate reference system.

SF_to_r <- st_transform(SF, crs = raster::wkt(r))
raster::extract(r, SF_to_r) # result (correct) : 351.7868 236.4216 309.0073

# note the use of `SRS_string` argument. `CRS(raster::wkt(r))` won't work
SP_to_r <- spTransform(SP, CRS(SRS_string = raster::wkt(r))) 
raster::extract(r, SF_to_r) # result (correct) : 351.7868 236.4216 309.0073

class(raster::wkt(r)) # character

The following approaches seem to work also (same results) when we use raster::crs that returns a CRS class object from sp package. I guess this is because both sf and sp use safely the new WKT notation available from this object (even if the object contains apparently only the proj4 string, the WKT being somewhat “hidden” in a comment attached to the object)

SF_to_r <- st_transform(SF, crs = raster::crs(r))
raster::extract(r, SF_to_r) # result : 351.7868 236.4216 309.0073

SP_to_r <- spTransform(SP, raster::crs(r))
raster::extract(r, SF_to_r) # result : 351.7868 236.4216 309.0073

class(raster::crs(r)) # CRS class from `sp`
str(raster::crs(r)) # 1 slot with the proj4 string
cat(comment(raster::crs(r))) # this is where the WKT notation is "hidden"

When we project the raster, we can certainly expect failure with the 2 following approaches (one with sp, one with sf) because we force the use of the proj4 string (with $proj4string and @projargs that provide a simple character vector). This should be always avoided…

I’m not sure to understand why these two options provide different results but I’m quite confident now that both results are wrong. Maybe they differ because the datum is droped at different moments in the pipeline (the initial strings provided as a character vectors are different between sp and sf) ?

r_to_sf <- raster::projectRaster(r, crs = sf::st_crs(31370)$proj4string)
raster::extract(r_to_sf, SF) # result (wrong) : 341.6399 222.1028 301.2286

r_to_sp <- raster::projectRaster(r, crs = sp::CRS("+init=epsg:31370")@projargs)
raster::extract(r_to_sp, SF) # result (wrong) : 348.5775 329.1199 277.2260

class(sf::st_crs(31370)$proj4string) # character
class(sp::CRS("+init=epsg:31370")@projargs) # character

We could expect that providing a full CRS object (instead of forcing the use of the character proj4 string) would solve the problem. But it does not seem to be the case. Maybe because raster relies internally on the old proj4 strings ??

However according to Roger Bivand :

as far as we can see from reverse dependency check, raster seems to have transitioned to using WKT2_2019 (like the other packages) in preference to Proj4 when PROJ >= 6/GDAL >= 3

So I’m probably wrong somewhere and I still don’t know how I can safely reproject a raster object …

r_to_sp <- raster::projectRaster(r, crs = sp::CRS("+init=epsg:31370"))
raster::extract(r_to_sp, SP) # result (wrong) : 341.6399 222.1028 301.2286

# same result with a slightly different syntax for CRS
r_to_sp <- raster::projectRaster(r, crs = sp::CRS(SRS_string = "EPSG:31370"))
raster::extract(r_to_sp, SP) # result (wrong) : 341.6399 222.1028 301.2286

With stars package we can obtain the “correct” results either by reprojecting the points or the raster. However it seems that the raster::extract function has some features that are not available straightaway with stars (eg computing a weight for each cell when using polygons)

Usefull raster vs stars function comparison

library(stars)
STARS <- stars::read_stars(system.file("external/test.grd", package="raster"))

# reproject the points into the same crs as the stars raster
SF_to_STARS <- st_transform(SF, crs = st_crs(STARS))
aggregate(STARS, SF_to_STARS, function(x) x[1], as_points = FALSE)$test.grd # result (correct) = 351.7868 236.4216 309.0073

# reproject the stars raster into the same crs as the points
STARS_to_SF <- st_transform(STARS, crs = st_crs(SF))
aggregate(STARS_to_SF, SF, function(x) x[1], as_points = FALSE)$test.grd # result (correct) = 351.7868 236.4216 309.0073

This might also be useful to the reflexion ?

Recommenadations from Roger Bivand :

If possible, avoid instantiating “CRS” objects with Proj4 strings, rather use CRS(SRS_string=

For example :

# preferd syntax :
CRS(SRS_string = "OGC:CRS84")
#> Error in if (in_format == 4L) {: valeur manquante là où TRUE / FALSE est requis
# instead of :
CRS("+proj=longlat +datum=WGS84")

So maybe (??) also :

CRS(SRS_string = "EPSG:31370")

instead of :

CRS("+init=epsg:31370")

Avoid proj4string(x) <- proj4string(y) and prefer : slot(x, "proj4string") <- slot(y, "proj4string")

Created on 2020-09-09 by the reprex package (v0.3.0)

like image 31
Gilles Avatar answered Oct 01 '22 18:10

Gilles


Here is my response to your Question 1 :

I understand that there is a need for a new notation/format (WKT) implemented in PROJ6 (eg because of a need of higher precision) but I don’t understand why there is a need to remove the datum part form the old proj4 string notation. Why not just keep it as it has always been (and implement the new features in the new WKT format/notation)

I do not know why the PROJ developers took the decision to break backwards compatibility, but I assume that there were very good reasons for it; and that nobody has volunteered to work on this.

As R/spatial developers (and others who build software with PROJ) we have to live with this. The problem is that we need to accommodate for different PROJ versions (especially on linux systems). Trying to move forward while being backwards compatible has created a terrible mess.

Not being able to use the proj4 notation is a real loss in a scripting environment like R. The proj4 notation can be directly understood; EPSG codes are opaque and their use easily leads to error. Also, if there is no EPSG code available you need to figure out how to write your own WKT.

CRS in raster

The CRS object in raster is the same as in sp and rgdal. It stores both the proj4 and the wkt notations. Roger Bivand has explained why the warnings are given.

Extraction

To extract values from a raster, always transform the points (lines, polygons), not the raster. Transforming a raster will lead to newly estimated values that are different from the original values. See the discussion here

like image 29
Robert Hijmans Avatar answered Oct 01 '22 18:10

Robert Hijmans