NetCDF Data and PostGIS

I had a raster weather data set in NetCDF format, that I wanted to manipulate in PostgreSQL.  Specifically, I wanted to take the temperature and relative humidity data from raster and convert it into a point dataset, in order to do some vector to vector containment queries.  My goals were:

  1. Extract a point from each cell of the raster, using the centroid (resulting in over 51,000 points per band).
  2. Get Relative Humidity and Temperature data for each point.
  3. Create a final table that had one point geometry and a column for RH and T.

In order to extract the raster data from NetCDF, you have to know some basic information about the “bands” that you are attempting to access.  You can access this information using gdalinfo.  In my case, I had a file named “”, so I used “gdalinfo” to get:

Driver: netCDF/Network Common Data Format
Size is 512, 512
Coordinate System is `'
 NC_GLOBAL#History=created by wgrib2
 SUBDATASET_1_DESC=[165x312] latitude (64-bit floating-point)
 SUBDATASET_2_DESC=[165x312] longitude (64-bit floating-point)
 SUBDATASET_3_DESC=[1x165x312] TMP_2maboveground (32-bit floating-point)
 SUBDATASET_4_DESC=[1x165x312] RELH_2maboveground (32-bit floating-point)
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 512.0)
Upper Right ( 512.0, 0.0)
Lower Right ( 512.0, 512.0)
Center ( 256.0, 256.0)

Once I knew the names of the bands that I wanted, I exported each raster band into a separate temporary table, extracted points and values from each of these tables into 2 separate tables with a postgis vector geometry column, and then merged them into a final table with vector geometry column and columns for temp and rh.  This multi-table process may seem a bit overkill, but by extracting each raster into it’s own point table, I was able to make use of spatial indexing to speed the merging of these separate tables into a single final point table (note: the intermediate point tables had to be permanent in order to create the spatial index).  Below is a full log of my command line and PSQL actions (note, it takes less than a minute for all processing the final insert query to run on a dataset that spans the state of Virginia, 42,000 square miles):

From a bash shell prompt:

# convert  temperature
raster2pgsql NETCDF:"":TMP_2maboveground temp_tmp > temp_tmp.sql
# convert rh
raster2pgsql NETCDF:"":RELH_2maboveground rh_tmp > rh_tmp.sql
cat temp_tmp.sql | psql rastertest
cat rh_tmp.sql | psql rastertest
psql rastertest

From PostgreSQL command line:

create table rh_pts_tmp (geom geometry, rh float8);
insert into rh_pts_tmp(geom, rh) 
select a.geom,st_value( b.rast, 1, a.x, a.y)
 from (SELECT (ST_PixelAsCentroids(rast, 1)).*
 FROM rh_tmp WHERE rid = 1) as a,
 rh_tmp as b
create index tmp_rh_gix on rh_pts_tmp using GIST (geom);
create table temp_pts_tmp (geom geometry, temp float8);
insert into temp_pts_tmp(geom, temp) 
select a.geom,st_value( b.rast, 1, a.x, a.y)
 from (SELECT (ST_PixelAsCentroids(rast, 1)).*
 FROM temp_tmp WHERE rid = 1) as a,
 temp_tmp as b
create index tmp_temp_gix on temp_pts_tmp using GIST (geom);
create table weather_points (fid serial, temp float8, rh float8, the_geom geometry);
insert into weather_points (temp, rh, the_geom)
select foo.temp, bar.rh, foo.geom
from temp_pts_tmp as foo 
left outer join rh_pts_tmp as bar
on (foo.geom && bar.geom and foo.geom = bar.geom);
# Clean up 
drop table temp_tmp;
drop table rh_tmp;



Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s