How to use hydrointerp

This section will describe how to use the hydrointerp package. Nearly all outputs are either as Pandas DataFrames or Xarray Datasets.

Necessary imports

For the examples, the numpy, pandas, xarray, and hydrointerp packages are needed.

import numpy as np
import pandas as pd
import xarray as xr
from hydrointerp import Interp, datasets

Loading in appropriate input data

The input must be either a grid as an Xarray Dataset or as points as a Pandas DataFrame. Both of these input data must have associated naming parameters.

### Input Parameters
In [1]: nc1 = 'nasa_gpm_2015-06-18'

In [2]: csv1 = 'ecan_data_2015-06-18'

In [3]: grid_time_name = 'time'

In [4]: grid_x_name = 'lon'

In [5]: grid_y_name = 'lat'

In [6]: grid_data_name = 'precipitationCal'

In [7]: grid_crs = 4326

In [8]: point_time_name = 'date'

In [9]: point_x_name = 'NZTMX'

In [10]: point_y_name = 'NZTMY'

In [11]: point_data_name = 'precip'

In [12]: point_crs = 2193

### Read input data
In [13]: ds = xr.open_dataset(datasets.get_path(nc1))

In [14]: df1 = pd.read_csv(datasets.get_path(csv1), parse_dates=['date'], infer_datetime_format=True)

In [15]: print(df1.head())
        date  precip      NZTMX      NZTMY
0 2015-06-18    96.5  1507022.0  5266024.0
1 2015-06-19   166.0  1507022.0  5266024.0
2 2015-06-18    63.0  1506391.0  5253154.0
3 2015-06-19    73.0  1506391.0  5253154.0
4 2015-06-18    88.5  1482760.0  5244669.0

### Assign nan toplaces wherethe quality index is below 0.4
In [16]: ds2 = ds[[grid_data_name]].where(ds.precipitationQualityIndex > 0.4)

In [17]: print(ds2)
<xarray.Dataset>
Dimensions:           (lat: 160, lon: 150, time: 2)
Coordinates:
  * time              (time) datetime64[ns] 2015-06-18 2015-06-19
  * lon               (lon) float32 165.1 165.1 165.2 ... 179.8 179.9 179.9
  * lat               (lat) float32 -48.95 -48.85 -48.75 ... -33.15 -33.05
Data variables:
    precipitationCal  (time, lon, lat) float32 0.006805 0.07318 0.0 ... 0.0 0.0

### Close the file (by removing the object)
In [18]: ds.close()

### Create example points
In [19]: points_df = df1.loc[[6, 15, 132], [point_x_name, point_y_name]].copy()

In [20]: points_df.rename(columns={point_x_name: 'x', point_y_name: 'y'}, inplace=True)

Initialising Interp

The package and general usage is via the main Interp class. It must be initialised with appropriate datasets and name parameters. Bare in mind, it is not required to have both input grids and points. One set is fine, and the appropriate interpolation methods will appear.

In [21]: interpc = Interp(ds2, grid_time_name, grid_x_name, grid_y_name, grid_data_name, grid_crs, point_data=df1, point_time_name=point_time_name, point_x_name=point_x_name, point_y_name=point_y_name, point_data_name=point_data_name, point_crs=point_crs)

Nan filling

If your grid has nans (which the example does), fill those nans with the grid_interp_na method.

In [22]: nan1 = ds2[grid_data_name].isnull().sum()

In [23]: interpc.grid_interp_na()

In [24]: nan2 = interpc.grid_data['precip'].isnull().sum()

In [25]: assert (nan1 > 0) & (nan2 == 0)

Base Interpolators

All the 2D interpolators you’ll need…

## Parameters
In [26]: to_crs = 2193

In [27]: grid_res = 10000

In [28]: bbox=None

In [29]: order=2

In [30]: extrapolation='constant'

In [31]: cval=np.nan

In [32]: digits = 2

In [33]: min_lat = -48

In [34]: max_lat = -41

In [35]: min_lon = 170

In [36]: max_lon = 178

In [37]: min_val=0

In [38]: method='linear'

## grid to grid
In [39]: interp1 = interpc.grid_to_grid(grid_res, to_crs, bbox, order, extrapolation, min_val=min_val)
Preparing input and output
Running interpolations...
Packaging up the output

In [40]: print(interp1)
<xarray.Dataset>
Dimensions:  (time: 2, x: 140, y: 180)
Coordinates:
  * time     (time) datetime64[ns] 2015-06-18 2015-06-19
  * x        (x) float64 8.568e+05 8.668e+05 8.768e+05 ... 2.237e+06 2.247e+06
  * y        (y) float64 4.548e+06 4.558e+06 4.568e+06 ... 6.328e+06 6.338e+06
Data variables:
    precip   (time, x, y) float32 nan nan nan nan nan ... 0.0 -0.0 0.0 nan nan

## points to grid
In [41]: interp2 = interpc.points_to_grid(grid_res, to_crs, bbox, method, extrapolation, min_val=min_val)
Prepare input and output data
Run interpolations...
2015-06-18 00:00:00
2015-06-19 00:00:00
Packaging up the output

In [42]: print(interp2)
<xarray.Dataset>
Dimensions:  (time: 2, x: 35, y: 33)
Coordinates:
  * time     (time) datetime64[ns] 2015-06-18 2015-06-19
  * y        (y) float64 5.018e+06 5.028e+06 5.038e+06 ... 5.328e+06 5.338e+06
  * x        (x) float64 1.329e+06 1.339e+06 1.349e+06 ... 1.659e+06 1.669e+06
Data variables:
    precip   (time, y, x) float64 nan nan nan nan nan ... nan nan nan nan nan

## grid to points
In [43]: interp3 = interpc.grid_to_points(points_df, to_crs, order, min_val=min_val)
Preparing input and output
Running interpolations...
Packaging up the output

In [44]: print(interp3)
                                    precip
time       x         y                    
2015-06-18 1515946.0 5249806.0  617.679993
           1581327.0 5279063.0   86.559998
           1437006.0 5066885.0   78.220001
2015-06-19 1515946.0 5249806.0   59.439999
           1581327.0 5279063.0   45.299999
           1437006.0 5066885.0    2.220000

## points to points
In [45]: interp4 = interpc.points_to_points(points_df, to_crs, method, min_val=min_val)
2015-06-18 00:00:00
2015-06-19 00:00:00

In [46]: print(interp4)
                                precip
time       x         y                
2015-06-18 1515946.0 5249806.0    45.0
           1581327.0 5279063.0    12.0
           1437006.0 5066885.0    36.5
2015-06-19 1515946.0 5249806.0    55.5
           1581327.0 5279063.0    37.5
           1437006.0 5066885.0    17.0

Adjust grid from points

There is also a method to adjust a grid based on the point_data (bias correction). And a method to run tests on it’s accuracy.

In [47]: interp5 = interpc.adjust_grid_from_points(grid_res, to_crs)
Preparing input and output
Running interpolations...
Packaging up the output
Preparing input and output
Running interpolations...
Packaging up the output
Prepare input and output data
Run interpolations...
2015-06-18 00:00:00
2015-06-19 00:00:00
Packaging up the output

In [48]: print(interp5)
<xarray.Dataset>
Dimensions:  (time: 2, x: 35, y: 32)
Coordinates:
  * x        (x) float64 1.327e+06 1.337e+06 1.347e+06 ... 1.657e+06 1.667e+06
  * y        (y) float64 5.018e+06 5.028e+06 5.038e+06 ... 5.318e+06 5.328e+06
  * time     (time) datetime64[ns] 2015-06-18 2015-06-19
Data variables:
    precip   (time, x, y) float64 15.05 17.17 15.35 20.94 ... 11.09 11.26 7.764

In [49]: interp6 = interpc.validate_grid_from_points(0.08, grid_res, to_crs)
Preparing input and output
Running interpolations...
Packaging up the output
Preparing input and output
Running interpolations...
Packaging up the output
Prepare input and output data
Run interpolations...
2015-06-18 00:00:00
2015-06-19 00:00:00
Packaging up the output
2015-06-18 00:00:00
2015-06-19 00:00:00
Preparing input and output
Running interpolations...
Packaging up the output

In [50]: print(interp6)
                                precip  point_precip  grid_precip
time       x         y                                           
2015-06-18 1418364.0 5135227.0     5.8         15.22         9.04
           1499654.0 5232199.0    33.0         46.40        34.14
           1510512.0 5187224.0    31.0         23.38        26.83
           1515640.0 5225591.0    49.5         40.46        24.75
           1569400.0 5166693.0     6.5         15.57        17.80
           1674522.0 5337964.0     1.5           NaN          NaN
2015-06-19 1418364.0 5135227.0    14.2          8.42         6.91
           1499654.0 5232199.0    49.5         70.27       112.43
           1510512.0 5187224.0    31.5         49.47        51.22
           1515640.0 5225591.0    41.5         48.58        95.60
           1569400.0 5166693.0    28.0         16.37        18.26
           1674522.0 5337964.0     8.5           NaN          NaN