I have a raster time series stored in multiple GeoTIFF
files (*.tif
) that I'd like to convert to a single NetCDF
file. The data is uint16
.
I could probably use gdal_translate
to convert each image to netcdf using:
gdal_translate -of netcdf -co FORMAT=NC4 20150520_0164.tif foo.nc
and then some scripting with NCO
to extract dates from filenames and then concatenate, but I was wondering whether I might do this more effectively in Python using xarray
and it's new rasterio
backend.
I can read a file easily:
import glob
import xarray as xr
f = glob.glob('*.tif')
da = xr.open_rasterio(f[0])
da
which returns
<xarray.DataArray (band: 1, y: 5490, x: 5490)>
[30140100 values with dtype=uint16]
Coordinates:
* band (band) int64 1
* y (y) float64 5e+05 5e+05 5e+05 5e+05 5e+05 4.999e+05 4.999e+05 ...
* x (x) float64 8e+05 8e+05 8e+05 8e+05 8.001e+05 8.001e+05 ...
Attributes:
crs: +init=epsg:32620
and I can write one of these to NetCDF:
ds.to_netcdf('foo.nc')
but ideally I would be able to use something like xr.open_mfdataset
, write the time values (extracted from the filenames) and then write the entire aggregation to netCDF
. And have dask
handle the out-of-core memory issues. :-)
Can something like this be done with xarray
and dask
?
Then we can easily convert one band (or a time slice) of the NetCDF data into a GeoTIFF format just use the command: However, we’d like to export all time slices with meaningful names such as mslp_yyyy-mm.tif.
In ArcMap Catalog, browse to the downloaded data ( NetCDF_time_slice_to_Raster.zip) which contains a toolbox inside a geodatabase. Open the NetCDF_time_slice_export script tool. Select the Input_NetCDF_layer and an Output Folder. Click OK to run the tool. The tool exports all the time slices (bands) from the NetCDF raster layer as TIFs.
The command-line tool of gdal_translate provided by GDAL should be the most commonly used option for converting raster data between different formats, while CDO is another excellent command line suite for manipulating and analyzing climate data.
Select the Input_NetCDF_layer and an Output Folder. Click OK to run the tool. The tool exports all the time slices (bands) from the NetCDF raster layer as TIFs. The following shows the Python code used for this script tool.
Xarray should be able to do the concat step for you. I have adapted your example a bit below. It will be up to you to parse the filenames into something useful.
import glob
import pandas as pd
import xarray as xr
def time_index_from_filenames(filenames):
'''helper function to create a pandas DatetimeIndex
Filename example: 20150520_0164.tif'''
return pd.DatetimeIndex([pd.Timestamp(f[:8]) for f in filenames])
filenames = glob.glob('*.tif')
time = xr.Variable('time', time_index_from_filenames(filenames))
chunks = {'x': 5490, 'y': 5490, 'band': 1}
da = xr.concat([xr.open_rasterio(f, chunks=chunks) for f in filenames], dim=time)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With