Task Object

class GridmetTask(context: GridMETContext, year: int, variable: GridmetVariable)[source]

Defines a task to download and process data for a single year and variable. Instances of this class can be used to parallelize processing.

Parameters
  • context – Configuration object for the pipeline

  • year – year

  • variable – gridMET band (variable)
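Because each task covers exactly one year and one variable, a batch of tasks can be distributed over a worker pool. The following is a minimal sketch of that idea, not the pipeline's own scheduler. `FakeTask` and `run_all` are hypothetical stand-ins introduced for illustration; the real `GridmetTask` also needs a `GridMETContext`, which is not reproduced here. The band names `tmmx` and `rmax` are used only as sample values.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


class FakeTask:
    """Hypothetical stand-in for GridmetTask: one (year, variable) unit of work."""

    def __init__(self, year: int, variable: str):
        self.year = year
        self.variable = variable

    def execute(self) -> str:
        # A real task would download and process data here.
        return f"{self.variable}:{self.year}"


def run_all(years, variables, max_workers: int = 4):
    """Create one task per (year, variable) pair and run them concurrently."""
    tasks = [FakeTask(y, v) for y, v in product(years, variables)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the order of the input tasks.
        return list(pool.map(lambda t: t.execute(), tasks))


if __name__ == "__main__":
    print(run_all([2001, 2002], ["tmmx", "rmax"]))
```

A thread pool suits the download-bound part of the work; for CPU-bound aggregation a process pool may be the better fit.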

classmethod destination_file_name(context: GridMETContext, year: int, variable: GridmetVariable)[source]

Constructs a file name for a given set of parameters

Parameters
  • context – Configuration object for the pipeline

  • year – year

  • variable – gridMET band (variable)

Returns

File name in the form variable_geography_year.csv[.gz]
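The naming pattern can be illustrated with a short sketch. `build_file_name` and its `compress` flag are hypothetical; the actual classmethod derives the geography and compression setting from the context object, and how it does so is not shown in this reference.

```python
def build_file_name(variable: str, geography: str, year: int,
                    compress: bool = True) -> str:
    """Return a name of the form variable_geography_year.csv[.gz]."""
    name = f"{variable}_{geography}_{year}.csv"
    # The .gz suffix is optional, matching the bracketed part of the pattern.
    return name + ".gz" if compress else name
```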

classmethod find_shape_file(context: GridMETContext, year: int, shape: Shape)[source]

Finds the shapefile for a given type of geography for the closest available year

Parameters
  • context – Configuration object for the pipeline

  • year – year

  • shape – Shape type

Returns

A shapefile for the given year if it exists; otherwise the shapefile for the latest available year before the given year
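The fallback rule described above (exact year if present, otherwise the latest earlier year) can be sketched as follows. `pick_shape_file` and the `available` mapping are hypothetical; how the real method discovers shapefiles on disk is not covered here. Returning `None` when no earlier year exists is an assumption of this sketch, since the reference does not state what happens in that case.

```python
from typing import Dict, Optional


def pick_shape_file(available: Dict[int, str], year: int) -> Optional[str]:
    """Pick the file for `year`, or for the latest year before it."""
    # Keep only years that do not exceed the requested one.
    candidates = [y for y in available if y <= year]
    if not candidates:
        return None  # assumption: no usable shapefile
    return available[max(candidates)]
```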

execute()[source]

Executes the task. First, the download subtask is executed unless the corresponding file has already been downloaded. Then the compute tasks are executed.

Returns

None
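The control flow of this method (skip the download when the file is already present, then run every compute step) might look like the sketch below. `run_pipeline` and its callable arguments are hypothetical names introduced for illustration; the real method works with subtask objects rather than plain functions.

```python
import os
import tempfile
from typing import Callable, Iterable


def run_pipeline(target: str,
                 download: Callable[[str], None],
                 compute_steps: Iterable[Callable[[str], None]]) -> bool:
    """Download `target` if missing, then run each compute step on it.

    Returns True if a download was performed, False if it was skipped.
    """
    downloaded = False
    if not os.path.isfile(target):
        download(target)
        downloaded = True
    for step in compute_steps:
        step(target)
    return downloaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.nc")
        fetch = lambda p: open(p, "w").close()  # fake download: empty file
        print(run_pipeline(path, fetch, [print]))  # download happens
        print(run_pipeline(path, fetch, [print]))  # download is skipped
```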

Subtasks

Downloading

class DownloadGridmetTask(year: int, variable: GridmetVariable, destination: str)[source]

Task to download source file in NCDF4 format

Parameters
  • year – year

  • variable – gridMET band (variable)

  • destination – Destination directory for all downloads

classmethod get_url(year: int, variable: GridmetVariable) → str[source]

Constructs URL given a year and band

Parameters
  • year – year

  • variable – gridMET band (variable)

Returns

URL for download
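As a rough illustration of building a download URL from a year and a band, consider the sketch below. Everything specific in it is an assumption: `BASE_URL` is a placeholder rather than the real data server, the `{variable}_{year}.nc` file pattern is assumed, and `build_url` is a hypothetical helper, not the classmethod itself.

```python
BASE_URL = "https://example.org/gridmet"  # placeholder, not the real server


def build_url(year: int, variable: str, base_url: str = BASE_URL) -> str:
    """Compose a per-year, per-band URL (assumed file naming pattern)."""
    return f"{base_url}/{variable}_{year}.nc"
```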

target()[source]

Returns

File path for downloaded data

execute()[source]

Executes the task

Returns

None

Compute

class ComputeGridmetTask(year: int, variable: GridmetVariable, infile: str, outfile: str, date_filter=None, ram: int = 0)[source]

An abstract class for a computational task that processes data in Unidata netCDF (Version 4) format

Parameters
  • ram

  • date_filter

  • year – year

  • variable – gridMET band (variable)

  • infile – File with source data in NCDF4 format

  • outfile – Resulting CSV file

execute(mode: str = 'wt')[source]

Executes computational task

Parameters

mode (str) – mode to use when opening the result file

Returns

abstract compute_one_day(writer: Collector, day, layer)[source]

Computes required statistics for a single day. This method is called by execute() and is implemented in specific subclasses

Parameters
  • writer – CSV Writer to output the result

  • day – day

  • layer – layer, corresponding to the day

Returns

Nothing
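The relationship between `execute()` and the abstract `compute_one_day()` follows the template-method pattern: the base class loops over days and each subclass supplies the per-day logic. The sketch below shows the pattern in isolation. `DailyTask` and `MeanTask` are hypothetical classes, a plain list stands in for the `Collector` writer, and a list of floats stands in for a netCDF layer.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Tuple


class DailyTask(ABC):
    """Hypothetical base class: iterates over days, delegates per-day work."""

    def __init__(self, days: Iterable[Tuple[str, List[float]]]):
        self.days = list(days)

    def execute(self) -> List[Tuple[str, float]]:
        rows: List[Tuple[str, float]] = []  # stands in for the CSV writer
        for day, layer in self.days:
            self.compute_one_day(rows, day, layer)
        return rows

    @abstractmethod
    def compute_one_day(self, writer: list, day: str,
                        layer: List[float]) -> None:
        """Append the statistics for one day to `writer`."""


class MeanTask(DailyTask):
    """Concrete subclass: records the mean of each day's layer."""

    def compute_one_day(self, writer, day, layer):
        writer.append((day, sum(layer) / len(layer)))
```

Instantiating `DailyTask` directly raises `TypeError`, which is how Python enforces that subclasses must implement the per-day step.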

class ComputeShapesTask(year: int, variable: GridmetVariable, infile: str, outfile: str, strategy: RasterizationStrategy, shapefile: str, geography: Geography, date_filter=None, ram=0)[source]

Class describes a compute task to aggregate data over geography shapes

The data is expected in Unidata netCDF (Version 4) format: https://www.unidata.ucar.edu/software/netcdf/

Parameters
  • ram

  • date_filter

  • year – year

  • variable – gridMET band (variable)

  • infile – File with source data in NCDF4 format

  • outfile – Resulting CSV file

  • strategy – Rasterization strategy to use

  • shapefile – Shapefile for the collection of geographies being used

  • geography – Type of geography, e.g. zip code or county

compute_one_day(writer: Collector, day, layer)[source]

Computes required statistics for a single day. This method is called by execute() and is implemented in specific subclasses

Parameters
  • writer – CSV Writer to output the result

  • day – day

  • layer – layer, corresponding to the day

Returns

Nothing

class ComputePointsTask(year: int, variable: GridmetVariable, infile: str, outfile: str, points_file: str, coordinates: List, metadata: List, date_filter=None, ram=0)[source]

Class describes a compute task to assign data to a collection of points

The data is expected in Unidata netCDF (Version 4) format: https://www.unidata.ucar.edu/software/netcdf/

Parameters
  • ram

  • year – year

  • variable – gridMET band (variable)

  • infile – File with source data in NCDF4 format

  • outfile – Resulting CSV file

  • points_file – Path to a file containing coordinates of points in CSV format

  • coordinates – A two-element list of column names in the CSV file corresponding to coordinates

  • metadata – A list of column names in the CSV file that should be interpreted as metadata (e.g. ZIP, site_id, etc.)
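To make the roles of `coordinates` and `metadata` concrete, the sketch below reads a points file and splits each row into a coordinate pair and a metadata dictionary. `read_points` is a hypothetical helper, not part of the documented API. The column names in the sample data and the ordering of the two coordinate columns (x first, then y) are assumptions.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple

Point = Tuple[Tuple[float, float], Dict[str, str]]


def read_points(handle: TextIO, coordinates: List[str],
                metadata: List[str]) -> List[Point]:
    """Parse CSV rows into ((x, y), metadata) pairs."""
    x_col, y_col = coordinates  # exactly two column names are expected
    points: List[Point] = []
    for row in csv.DictReader(handle):
        xy = (float(row[x_col]), float(row[y_col]))
        meta = {name: row[name] for name in metadata}
        points.append((xy, meta))
    return points


if __name__ == "__main__":
    sample = io.StringIO("site_id,lon,lat\nA,-71.1,42.3\n")
    print(read_points(sample, ["lon", "lat"], ["site_id"]))
```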

execute(mode: str = 'w') → None[source]

Executes computational task

Parameters

mode (str) – mode to use when opening the result file

Returns

compute_one_day(writer: Collector, day, layer)[source]

Computes required statistics for a single day. This method is called by execute() and is implemented in specific subclasses

Parameters
  • writer – CSV Writer to output the result

  • day – day

  • layer – layer, corresponding to the day

Returns

Nothing