Data wrangling#

Summary: This notebook loads the different datasets used in the analysis into a single netCDF-4 file, with descriptive attributes maintained for each dataset. Each dataset is regridded to the ICESat-2 grid shape [304, 448] (x, y). The datasets used in this notebook are listed below. Data are available on an AWS S3 bucket in netCDF-4 format.

Version history: Version 1 (01/01/2022)

Details on each dataset#

Detailed information about each of the datasets used to compile the final data product is provided below, in the order they appear in the notebook workflow. The information provided here was last updated 08-21-2021.

ATLAS/ICESat-2 Monthly Gridded Sea Ice Freeboard#

NOAA/NSIDC Climate Data Record of Passive Microwave Sea Ice Concentration#

  • Variables used: NOAA/NSIDC sea ice concentration CDR

  • Download link: https://nsidc.org/data/g02202

  • Reference: Meier, W. N., F. Fetterer, A. K. Windnagel, and S. Stewart. 2021. NOAA/NSIDC Climate Data Record of Passive Microwave Sea Ice Concentration, Version 4. Mean monthly aggregated, northern hemisphere. Boulder, Colorado USA. NSIDC: National Snow and Ice Data Center https://doi.org/10.7265/efmz-2t65. 08-21-2021.

  • NOTE: This is provided as a data variable in the ICESat-2 monthly gridded product

ERA5 monthly averaged data on single levels#

  • Variables used: 2m temperature; Mean surface downward long-wave radiation flux

  • Product type: Monthly averaged reanalysis

  • Download link: https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels-monthly-means (a programmatic download sketch using the CDS API follows this list)

  • Reference: Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., Thépaut, J-N. (2019): ERA5 monthly averaged data on single levels from 1979 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). (Accessed on 16-08-2021), 10.24381/cds.f17050d7
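If you prefer to script the download rather than use the CDS web form, a minimal sketch using the cdsapi package is below. This assumes a registered CDS account with an API token in ~/.cdsapirc; the request keys mirror the CDS web form and may need adjusting for your use case.

import cdsapi

# Request the two ERA5 monthly-mean fields used in this notebook (sketch)
c = cdsapi.Client()
c.retrieve(
    'reanalysis-era5-single-levels-monthly-means',
    {
        'product_type': 'monthly_averaged_reanalysis',
        'variable': ['2m_temperature', 'mean_surface_downward_long_wave_radiation_flux'],
        'year': ['2018', '2019', '2020', '2021'],
        'month': ['%02d' % m for m in range(1, 13)],
        'time': '00:00',
        'format': 'netcdf',
    },
    'ERA5_monthly_reanalysis.nc')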

PIOMAS mean monthly ice thickness#

  • Product Type: Sea ice thickness (Volume per unit Area), monthly mean

  • Variables used: Sea ice thickness (Volume per unit Area) monthly mean; Grid lon and lat for scalar fields (txt file)

  • Download link: http://psc.apl.uw.edu/research/projects/arctic-sea-ice-volume-anomaly/data/model_grid

  • Reference: Zhang, Jinlun and D.A. Rothrock: Modeling global sea ice with a thickness and enthalpy distribution model in generalized curvilinear coordinates, Mon. Wea. Rev. 131(5), 845-861, 2003.

  • NOTE: You’ll want to download the heff binary data product, not the text file. On some systems, simply clicking on the gzipped file to unzip it raises an error claiming the archive is empty (it isn’t!). If that happens, unzip the file from the command line instead, i.e. gzip -d file.gz, or use the Python sketch below.
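If you'd rather stay in Python, the standard library can do the same decompression; a minimal sketch (the filename here is illustrative):

import gzip
import shutil

# Decompress a gzipped PIOMAS heff file without the system gzip tool
# ('heff.H2020.gz' is an illustrative filename)
with gzip.open('heff.H2020.gz', 'rb') as f_in, open('heff.H2020', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)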

Global Low Resolution Sea Ice Drifts from the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) Ocean and Sea Ice Satellite Application Facility (OSI SAF)#

Note

Although you’ll see an option to run this notebook in Binder, this notebook is NOT configured to run in Binder. If you want to wrangle the data yourself, each dataset used to compile the final data product can be downloaded from the links above. The final data product produced by this notebook can be downloaded from the Google storage bucket associated with this Jupyter Book.

Import notebook dependencies#

import os
import numpy as np
import xarray as xr
import pandas as pd
from datetime import date
import pyproj 
import scipy.interpolate 
from glob import glob
import matplotlib.pyplot as plt
from utils.read_data_utils import read_book_data # This allows us to read the ICESat-2 data directly from the Google storage bucket
import matplotlib as mpl

# Ignore warnings in the notebook to improve display
import warnings
warnings.filterwarnings('ignore')
# Set some plotting parameters
mpl.rcParams.update({
    "text.usetex": False,  # Use LaTeX for rendering
    "font.family": "sans-serif",
    'mathtext.fontset': 'stixsans',
    "lines.linewidth": 1.,
    "font.size": 8,
    #"lines.alpha": 0.8,
    "axes.labelsize": 8,
    "xtick.labelsize": 8,
    "ytick.labelsize": 8,
    "legend.fontsize": 8
})
mpl.rcParams['font.sans-serif'] = ['Arial']

Read in data#

Define filepaths#

Define filepaths to data on your local machine

ERA5_path = ''        # ERA5 monthly averaged data on single levels
PIOMAS_path = ''      # PIOMAS mean monthly ice thickness
drifts_path = ''      # OSI SAF Global Low Resolution Sea Ice Drifts

Set date range of interest#

We’ll restrict the data to this date range, corresponding to the years/months used in our later analysis.

start_date = "Nov 2018"
end_date = "Apr 2021"
date_range = pd.date_range(start=start_date, end=end_date, freq='MS') # MS indicates a time frequency of start of the month
print(date_range)
DatetimeIndex(['2018-11-01', '2018-12-01', '2019-01-01', '2019-02-01',
               '2019-03-01', '2019-04-01', '2019-05-01', '2019-06-01',
               '2019-07-01', '2019-08-01', '2019-09-01', '2019-10-01',
               '2019-11-01', '2019-12-01', '2020-01-01', '2020-02-01',
               '2020-03-01', '2020-04-01', '2020-05-01', '2020-06-01',
               '2020-07-01', '2020-08-01', '2020-09-01', '2020-10-01',
               '2020-11-01', '2020-12-01', '2021-01-01', '2021-02-01',
               '2021-03-01', '2021-04-01'],
              dtype='datetime64[ns]', freq='MS')

ICESat-2 gridded monthly means#

is2_ds = read_book_data()
is2_ds


<xarray.Dataset> Size: 345MB
Dimensions:               (time: 30, y: 448, x: 304)
Coordinates:
  * time                  (time) datetime64[ns] 240B 2018-11-01 ... 2021-04-01
    longitude             (y, x) float32 545kB ...
    latitude              (y, x) float32 545kB ...
    xgrid                 (y, x) float32 545kB ...
    ygrid                 (y, x) float32 545kB ...
Dimensions without coordinates: y, x
Data variables: (12/20)
    projection            (time) float64 240B ...
    ice_thickness         (time, y, x) float32 16MB ...
    ice_thickness_unc     (time, y, x) float32 16MB ...
    num_segments          (time, y, x) float32 16MB ...
    mean_day_of_month     (time, y, x) float32 16MB ...
    snow_depth            (time, y, x) float32 16MB ...
    ...                    ...
    region_mask           (time, y, x) float32 16MB ...
    piomas_ice_thickness  (time, y, x) float32 16MB ...
    t2m                   (time, y, x) float32 16MB ...
    msdwlwrf              (time, y, x) float32 16MB ...
    x_vel                 (time, y, x) float64 33MB ...
    y_vel                 (time, y, x) float64 33MB ...
Attributes:
    contact:      Alek Petty (alek.a.petty@nasa.gov)
    description:  Gridded Oct 2019 Arctic sea ice thickness and ancillary dat...
    reference:    Data doi: 10.5067/CV6JEXEE31HF. Original methodology descri...
    history:      Created 21/01/22

ERA5 climate reanalysis#

# Read data 
ERA5 = xr.open_dataset(ERA5_path)
ERA5 = ERA5.sel(time = date_range) # Select for date range of interest
ERA5 = ERA5.where(ERA5.latitude > is2_ds.latitude.min()) # Restrict to ICESat-2 latitude
ERA5 = ERA5.sel(expver = 1).drop('expver') # Keep the ERA5 reanalysis stream (expver=1) and drop the expver dimension (expver=5 is the preliminary ERA5T stream)

# Convert t2m temperature from Kelvin to Celsius 
tempCelsius = ERA5['t2m'] - 273.15
tempCelsius.attrs['units'] = 'C' # Change units attribute to C (Celsius)
tempCelsius.attrs['long_name'] = '2 meter temperature'
ERA5 = ERA5.assign(t2m = tempCelsius) # Add to dataset as a new data variable

# Add descriptive attributes 
ERA5.attrs = {'description': 'ERA5 monthly averaged data on single levels from 1979 to present', 
              'website': 'https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels-monthly-means?tab=overview', 
              'contact': 'copernicus-support@ecmwf.int',
              'citation': 'Copernicus Climate Change Service (C3S) (2017): ERA5: Fifth generation of ECMWF atmospheric reanalyses of the global climate. Copernicus Climate Change Service Climate Data Store (CDS), July 2020. https://cds.climate.copernicus.eu/cdsapp#!/home'}
FileNotFoundError: [Errno 2] No such file or directory: '/Users/akpetty/GitHub/icesat2-book/icesat2-book-data/ERA5/ERA5_monthly_reanalysis.nc'

PIOMAS sea ice thickness#

def get_piomas_data(date_range, data_dir): 
    """ Read in PIOMAS yearly files and convert them to an xr.DataArray object 
    
    Args: 
        date_range (pandas DatetimeIndex): date range to grab data for 
        data_dir (str): directory containing data on local drive 
    
    Returns: 
        PIOMAS_da (xr.DataArray): DataArray containing data for the input date range
    
    """

    start_year = date_range[0].year
    end_year = date_range[-1].year

    pio_by_yr = []
    for year in range(start_year, end_year+1): 

        # The most recent year may not contain a complete year of data, so infer 
        # the number of available months from the file size and reshape the 
        # binary data to [month, 120, 360]
        with open(data_dir + 'heff.H' + str(year), 'rb') as data:
            pio_flat = np.fromfile(file=data, dtype='f')
        num_months = pio_flat.size // (120 * 360)
        pio_by_yr += list(pio_flat.reshape([num_months, 120, 360]))
        
    # Get latitude and longitude 
    gridP = np.loadtxt(data_dir + 'grid.dat.txt')
    lonsP = gridP[0:4320, :].reshape([120, 360])
    latsP = gridP[4320:, :].reshape([120, 360])

    # Load the list as an xr.DataArray and add descriptive attributes and coordinates
    time = pd.date_range(start=str(start_year), end=str(end_year) + "-" + str(num_months), freq='MS')
    PIOMAS_da = xr.DataArray(pio_by_yr, 
                             dims=['time','y','x'], 
                             coords={'time': time, 'longitude': (('y','x'), lonsP), 'latitude': (('y','x'), latsP)}, 
                             attrs={'units': 'meters', 
                                    'long_name': 'PIOMAS sea ice thickness', 
                                    'data_download': 'http://psc.apl.uw.edu/research/projects/arctic-sea-ice-volume-anomaly/data/', 
                                    'download_date': '08-2020',
                                    'citation': 'Zhang, J. and D.A. Rothrock: Modeling global sea ice with a thickness and enthalpy distribution model in generalized curvilinear coordinates, Mon. Wea. Rev., 131(5), 845-861, 2003'}, 
                             name="piomas_ice_thickness")
    PIOMAS_da = PIOMAS_da.sel(time=date_range)

    return PIOMAS_da
pio_da = get_piomas_data(date_range=date_range, data_dir=PIOMAS_path)
pio_da = pio_da.sel(time=date_range) # Select for date range of interest
pio_da = pio_da.where(pio_da.latitude > is2_ds.latitude.min()) # Restrict to ICESat-2 latitude 
FileNotFoundError: [Errno 2] No such file or directory: '../icesat2-book-data/PIOMAS/heff.H2018'

OSI-SAF Sea Ice Drifts#

First, we read in the daily data and resample it to monthly means. Then, we reproject the data to the ICESat-2 projection (NSIDC Sea Ice Polar Stereographic North, EPSG 3411).

def get_projected_vectors(fmonthly, proj):
    """ Project OSI-SAF drifts to the given map projection (x/y pointing in the new map projection coordinates)
    
    Args: 
        fmonthly (xr.Dataset): monthly OSI-SAF vectors 
        proj (pyproj.Proj): map projection to use (e.g. pyproj.Proj("+init=EPSG:3411"))
    
    Returns: 
        fmonthly (xr.Dataset): input data reprojected to proj 
    
    """ 
    
    # Transform start/end positions to map projection coordinates (units of m)
    x0, y0 = proj(fmonthly.lon, fmonthly.lat)
    x1, y1 = proj(fmonthly.lon1, fmonthly.lat1)

    # Convert the 2-day displacements to velocities in m/s 
    # (x1-x0 is already in m, so we just divide by two days' worth of seconds)
    xt = (x1-x0)/(60*60*24*2.)
    yt = (y1-y0)/(60*60*24*2.)
    
    fmonthly['xpts'] = xr.DataArray(x0, dims=['yc', 'xc']) 
    fmonthly['ypts'] = xr.DataArray(y0, dims=['yc', 'xc']) 
    fmonthly['x_vel'] = xr.DataArray(xt, dims=['time', 'yc', 'xc']) 
    fmonthly['y_vel'] = xr.DataArray(yt, dims=['time', 'yc', 'xc']) 
    fmonthly['mag_vel'] = xr.DataArray(np.sqrt(xt**2+yt**2), dims=['time', 'yc', 'xc']) 

    fmonthly = fmonthly.drop_vars(['dt0', 'dt1', 'lon1', 'lat1', 'dX', 'dY'])
    
    # Filter out spurious values (we get some unrealistic drifts in the Canadian Archipelago); 
    # monthly mean drift speeds should be well below 1 m/s
    fmonthly = fmonthly.where(fmonthly.mag_vel < 1)

    # Add attributes
    fmonthly.x_vel.attrs = {'description':'along-x component of the ice motion', 'units':'m/s', 'long_name':'sea ice x velocity'}
    fmonthly.y_vel.attrs = {'description':'along-y component of the ice motion', 'units':'m/s', 'long_name':'sea ice y velocity'}
    fmonthly.mag_vel.attrs = {'long_name': 'drift vector magnitude', 'units':'m/s'}
    fmonthly.attrs['citation'] = 'Lavergne, T., Eastwood, S., Teffah, Z., Schyberg, H., and Breivik, L.-A.: Sea ice motion from low-resolution satellite sensors: An alternative method and its validation in the Arctic, J. Geophys. Res., 115, C10032, https://doi.org/10.1029/2009JC005958, 2010.'
    
    return fmonthly
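As a quick, illustrative sanity check on the projection itself (not part of the workflow): EPSG 3411 is a north polar stereographic projection with no false easting/northing, so the North Pole should map to approximately the grid origin.

# Sanity check (illustrative): the North Pole should project to ~(0, 0) in EPSG:3411
check_proj = pyproj.Proj("+init=EPSG:3411")
x_pole, y_pole = check_proj(0.0, 90.0)
print(x_pole, y_pole)  # expect values very close to (0, 0)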
# Read in data 
files = [os.path.join(path, name) for path, subdirs, files in os.walk(drifts_path) for name in files if name.endswith('.nc') and "nh" in name]
files_xr = [xr.open_dataset(file) for file in files]
driftsDaily = xr.concat(files_xr, dim='time').sortby("time")

# Resample monthly 
driftsMonthly = driftsDaily.resample(time='MS', keep_attrs = True).mean() 

# Project to CRS of ICESat-2 grid
monthlyDrifts_proj = get_projected_vectors(fmonthly=driftsMonthly.copy(), proj=pyproj.Proj("+init=EPSG:3411"))
ValueError: must supply at least one object to concatenate

Regrid all datasets to ICESat-2 grid#

In order to merge all the datasets into a single netCDF-4 file, they need to be on the same grid.

# Initialize map projection and project data to it
out_proj = 'EPSG:3411'
out_lons = is2_ds.longitude.values
out_lats = is2_ds.latitude.values

mapProj = pyproj.Proj("+init=" + out_proj)
xptsIS2, yptsIS2 = mapProj(out_lons, out_lats)
def regridToICESat2(dataArrayNEW, xptsNEW, yptsNEW, xptsIS2, yptsIS2):  
    """ Regrid new data to ICESat-2 grid 
    
    Args: 
        dataArrayNEW (xarray DataArray): DataArray to be gridded to ICESat-2 grid 
        xptsNEW (numpy array): x-values of dataArrayNEW projected to ICESat-2 map projection 
        yptsNEW (numpy array): y-values of dataArrayNEW projected to ICESat-2 map projection 
        xptsIS2 (numpy array): ICESat-2 longitude projected to ICESat-2 map projection
        yptsIS2 (numpy array): ICESat-2 latitude projected to ICESat-2 map projection
    
    Returns: 
        gridded (numpy array): data regridded to ICESat-2 map projection
    
    """
    gridded = []
    # Loop over time steps, using nearest-neighbor interpolation to place each one on the ICESat-2 grid
    for i in range(len(dataArrayNEW.values)): 
        monthlyGridded = scipy.interpolate.griddata((xptsNEW.flatten(), yptsNEW.flatten()), dataArrayNEW.values[i].flatten(), (xptsIS2, yptsIS2), method='nearest')
        gridded.append(monthlyGridded)
    gridded = np.array(gridded)
    return gridded
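A minimal synthetic check of the regridder (illustrative only; both grids here are made up): values defined on a coarse source grid should be reproduced, via nearest-neighbor lookup, on a finer target grid.

# Synthetic sanity check of regridToICESat2 (illustrative grids, not real data)
src_x, src_y = np.meshgrid(np.linspace(0, 1, 10), np.linspace(0, 1, 10))
tgt_x, tgt_y = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
da = xr.DataArray(src_x[np.newaxis, :, :], dims=['time', 'y', 'x'])  # one "month"; values equal the x coordinate
check = regridToICESat2(da, src_x, src_y, tgt_x, tgt_y)
print(check.shape)  # (1, 20, 20); gridded values should track the target x coordinate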

ERA5 climate reanalysis#

# Choose data variables of interest 
ERA5Vars = ['t2m','msdwlwrf']

# Initialize map projection and project data to it
mapProj = pyproj.Proj("+init=" + out_proj)
xptsERA, yptsERA = mapProj(*np.meshgrid(ERA5.longitude.values, ERA5.latitude.values))
xptsIS2, yptsIS2 = mapProj(out_lons, out_lats)

ERA5_list = []
for var in ERA5Vars: 
    ERA5gridded = regridToICESat2(ERA5[var], xptsERA, yptsERA, xptsIS2, yptsIS2) 
    ERAArray = xr.DataArray(data = ERA5gridded, 
                            dims = ['time', 'y', 'x'], 
                            coords = {'latitude': (('y','x'), out_lats), 'longitude': (('y','x'), out_lons), 'time':ERA5.time.values}, 
                            name = var)
    ERAArray.attrs = ERA5[var].attrs # Maintain descriptive attributes
    ERAArray = ERAArray.assign_attrs(ERA5.attrs)
    ERA5_list.append(ERAArray)
ERA5_regridded = xr.merge(ERA5_list)
NameError: name 'ERA5' is not defined

PIOMAS sea ice thickness#

# Project data to the ICESat-2 map projection
xptsPIO, yptsPIO = mapProj(pio_da.longitude.values, pio_da.latitude.values)

# Regrid data 
pio_regridded = regridToICESat2(pio_da, xptsPIO, yptsPIO, xptsIS2, yptsIS2)
pio_regridded = xr.DataArray(data = pio_regridded, 
                             dims = ['time', 'y', 'x'], 
                             coords = {'latitude': (('y','x'), out_lats), 'longitude': (('y','x'), out_lons), 'time': pio_da.time.values}, 
                             name = pio_da.name)
pio_regridded = pio_regridded.assign_attrs(pio_da.attrs)
pio_regridded = pio_regridded.to_dataset()
NameError: name 'pio_da' is not defined

OSI-SAF Sea Ice Drifts#

# Project data to the ICESat-2 map projection
xptsDRIFTS, yptsDRIFTS = mapProj(monthlyDrifts_proj.lon.values, monthlyDrifts_proj.lat.values)

# Loop through variables of interest and regrid 
drifts_list = []
for var in ["x_vel","y_vel"]: 
    driftsGridded = regridToICESat2(monthlyDrifts_proj[var], xptsDRIFTS, yptsDRIFTS, xptsIS2, yptsIS2)

    driftsArray = xr.DataArray(data = driftsGridded, 
                               dims = ['time', 'y', 'x'], 
                               coords = {'latitude': (('y','x'), out_lats), 'longitude': (('y','x'), out_lons), "time": monthlyDrifts_proj.time.values}, 
                               name = var)

    driftsArray.attrs = monthlyDrifts_proj[var].attrs
    driftsArray = driftsArray.assign_attrs(monthlyDrifts_proj.attrs)
    drifts_list.append(driftsArray)

drifts_regridded = xr.merge(drifts_list)
NameError: name 'monthlyDrifts_proj' is not defined

Compile and save final dataset#

Now that all the datasets are on the same grid, we can use xarray to merge them.

Combine datasets#

final_ds = xr.merge([is2_ds, pio_regridded, ERA5_regridded, drifts_regridded])
final_ds = final_ds.sel(time=slice("Nov 2018",final_ds.time.values[-1])) # Remove Sep & Oct 2018, which have no data from ICESat-2
NameError: name 'pio_regridded' is not defined

Save data to local machine as a netcdf4 file#

We also uploaded this same file to the Google storage bucket associated with this Jupyter Book.

filename = './data/IS2_jbook_dataset_201811-202104.nc'
save_file = True

if save_file:
    try: 
        final_ds.to_netcdf(path=filename, format='NETCDF4', mode='w')
        print('File ' + '"%s"' % filename + ' saved to directory ' + '"%s"' % os.getcwd())
    except Exception as e: 
        # mode='w' overwrites an existing file, so report the actual error rather than assuming one
        print('Cannot save file: %s' % e)
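As a final check (a sketch, assuming the save succeeded), the compiled file can be re-opened to confirm it round-trips:

# Re-open the saved product to confirm its dimensions survived the round trip (sketch)
check_ds = xr.open_dataset(filename)
print(check_ds.sizes)  # expect time: 30, y: 448, x: 304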