 

Import netCDF file to Pandas dataframe

Merry Christmas! I am still very new to Python and Pandas, so any help is appreciated. I can read in a netCDF file, and I am now trying to import it into a Pandas DataFrame. The netCDF file is 2D, so I just want to 'dump it in'. I have tried the DataFrame constructor, but it doesn't recognize the netCDF object. Presumably I need to convert the netCDF object to a 2D numpy array first? Again, thanks for any ideas on the best way to do this.

user1911866 asked Dec 26 '12 01:12



People also ask

How do I import netCDF4 into Python?

To install with Anaconda (conda), type conda install netCDF4. Alternatively, you can install with pip. To check that the netCDF4 module is properly installed, start an interactive session in the terminal (type python and press Enter), then run import netCDF4 as nc.

How do I convert NC to CSV?

To convert netCDF to CSV with xarray, open the netCDF file with open_dataset(), convert it to a dataframe with to_dataframe(), and write that dataframe to a CSV file with to_csv().


2 Answers

The xarray library handles arbitrary-dimensional netCDF data, and retains metadata. Xarray provides a simple method of opening netCDF files, and converting them to pandas dataframes:

    import xarray as xr

    ds = xr.open_dataset('/path/to/netcdf')
    df = ds.to_dataframe()

This will create a dataframe with a multi-index with all of the dimensions in it. Unfortunately, Pandas doesn't support arbitrary metadata, so that will be lost in the conversion, but you can keep the ds around, and use the metadata from that.
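To make the multi-index point concrete, here is a minimal sketch that builds a small Dataset in memory instead of opening a file. The variable name `temperature`, the `time` and `station` dimensions, and the `units` attribute are all made up for illustration; a real file will have its own names.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical 2D variable (time x station) standing in for a netCDF file's contents.
ds = xr.Dataset(
    {"temperature": (("time", "station"), np.arange(6.0).reshape(3, 2))},
    coords={
        "time": pd.date_range("2012-12-26", periods=3, freq="D"),
        "station": [0, 1],
    },
)
ds["temperature"].attrs["units"] = "degC"  # made-up metadata

# Long format: one row per (time, station) pair, indexed by a MultiIndex.
df = ds.to_dataframe()

# Wide format: pivot the 'station' level into columns to get a 2D table back.
wide = df["temperature"].unstack("station")

# The attributes are not carried into the DataFrame, but remain on the Dataset.
units = ds["temperature"].attrs["units"]
```

For a 2D file like the one in the question, the `unstack` step is probably what gives the "just dump it in" table, with one row per time and one column per station.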

naught101 answered Sep 17 '22 16:09



If your NetCDF file (or OPeNDAP dataset) follows CF Metadata conventions you can take advantage of them by using the NetCDF4-Python package, which makes accessing them in Pandas really easy. (I'm using the Enthought Python Distribution which includes both Pandas and NetCDF4-Python).

In the example below, the NetCDF file is being served via OPeNDAP, and the NetCDF4-Python library lets you open and work with a remote OPeNDAP dataset just as if it was a local NetCDF file, which is pretty slick. If you want to see the attributes of the NetCDF4 file, point your browser at this link http://geoport-dev.whoi.edu/thredds/dodsC/HUDSON_SVALLEY/5951adc-a1h.nc.html

You should be able to run this without changes:

    from matplotlib import pyplot as plt
    import pandas as pd
    import netCDF4

    url = 'http://geoport-dev.whoi.edu/thredds/dodsC/HUDSON_SVALLEY/5951adc-a1h.nc'
    vname = 'Tx_1211'
    station = 0

    # Open the remote OPeNDAP dataset as if it were a local file
    nc = netCDF4.Dataset(url)
    h = nc.variables[vname]
    times = nc.variables['time']

    # Decode the CF time values into dates and build a time-indexed Series
    jd = netCDF4.num2date(times[:], times.units)
    hs = pd.Series(h[:, station], index=jd)

    # Plot the series, labelled from the variable's attributes
    fig = plt.figure(figsize=(12, 4))
    ax = fig.add_subplot(111)
    hs.plot(ax=ax, title='%s at %s' % (h.long_name, nc.id))
    ax.set_ylabel(h.units)

The result can be seen in this IPython Notebook: http://nbviewer.ipython.org/4615153/
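If you want the whole 2D variable in a DataFrame rather than a single station as a Series, something like the following may work. This is only a sketch: the array and dates below are made-up stand-ins for what `h[:]` and the decoded times would give you, and it assumes the variable comes back as a numpy masked array (which netCDF4-Python often returns when a fill value is defined). The column names are hypothetical too.

```python
import numpy as np
import pandas as pd

# Stand-ins (made-up values) for a 2D netCDF variable read with h[:]:
# a masked array of shape (time, station) where -999.0 marks missing data.
values = np.ma.masked_values(
    np.array([[1.0, 2.0], [-999.0, 4.0], [5.0, 6.0]]), -999.0
)
times = pd.date_range("2012-12-26", periods=3, freq="D")

# Replace masked cells with NaN so pandas treats them as missing values.
data = np.ma.filled(values, np.nan)

# One row per time step, one column per station (hypothetical column names).
df = pd.DataFrame(data, index=times, columns=["station_0", "station_1"])
```

Filling the mask explicitly keeps the missing-value handling visible, instead of relying on how pandas happens to treat masked arrays.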

Rich Signell answered Sep 18 '22 16:09
