model_catalogs.select_date_range

model_catalogs.select_date_range(cat_or_source, start_date, end_date=None, model_source=None, use_forecast_files=None, override=False)

For NOAA OFS unaggregated models: update the urlpath locations in the Source.

For other models, set up the Source so that start_date and end_date are used to filter the resulting Dataset in time. For all models, save start_date and end_date in the Source metadata.

NOAA OFS model sources that require aggregation (currently model_sources “coops-forecast-noagg” and “ncei-archive-noagg”) need the specific path of every file that will be read in. This function finds those paths for the desired date range and returns a Source with the file locations in its urlpath. The function can also be used with models that do not require this step (because their paths are either static or deterministic), but it is not needed there: for those models, start_date and end_date are simply applied to filter the model output after to_dask() is called.
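For models with static or deterministic paths, the date range acts purely as a time filter on the output. The sketch below illustrates that idea using only the standard library. The function filter_times and the hourly time axis are hypothetical stand-ins, not part of model_catalogs, and the inclusive bounds are an assumption.

```python
from datetime import datetime, timedelta


def filter_times(times, start, end=None):
    """Keep timestamps in [start, end]; end=None keeps everything from start on.

    Illustrative only: this is not model_catalogs code.
    """
    return [t for t in times if t >= start and (end is None or t <= end)]


# Made-up hourly time axis covering two days.
times = [datetime(2022, 6, 1) + timedelta(hours=h) for h in range(48)]

# Select 06:00 through 12:00 on the first day (bounds assumed inclusive).
selected = filter_times(times, datetime(2022, 6, 1, 6), datetime(2022, 6, 1, 12))
print(len(selected), selected[0], selected[-1])
```

Passing end=None keeps every timestamp from start onward, which mirrors the open-ended end_date=None request.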

Parameters:
  • cat_or_source (Intake catalog or source) – Catalog containing model_source sources, or single Source.

  • start_date (datetime-interpretable str or pd.Timestamp) – Start of the desired model date range (date and, optionally, time). If the input does not include a time, output is included from the start of that day. If a time is given, it narrows the time range of the results.

  • end_date (datetime-interpretable str, pd.Timestamp, or None; optional) –

    End of the desired model date range (date and, optionally, time). If the input does not include a time, output is included through the end of that day. If a time is given in end_date, it narrows the time range of the results. end_date can be None, which indicates that the user wants all available model output after start_date; this option is not available for unaggregated historical NOAA OFS models, which do not contain forecast files (i.e., model_source “ncei-archive-noagg”).

    There are several use cases to specify:

    • If start_date == end_date, the full day of model output for that date is selected. If the date specified is today and not all times for today are available yet, output from forecast files is used to fill out the day after the nowcast files end.

    • If end_date is None, all available model output will be retrieved starting at start_date. This option doesn’t work for archival unaggregated NOAA OFS models currently.

    • If end_date is in the future, use_forecast_files is set to True and the forecast is read in, but stopped at end_date.

    • The user can set use_forecast_files=True with an end_date in the past to get old forecast model results for end_date for unaggregated NOAA OFS models. This case is probably rarely used and is not regularly tested. The results from this combination of inputs do not align with the results of mc.find_availability(), since the forecast is not the latest.

  • model_source (str, optional) – Which model_source to use. If mc.find_availability() has been run, the code determines which model_source in the Catalog to use based on start_date and end_date. Otherwise a single model_source can be provided, or find_availability() will be run if needed. As an exception, if only one model_source is available in the catalog, it is used without having to be specified.

  • use_forecast_files (bool or None, optional) – This parameter is typically set by the code rather than by the user. The one exception is when the user wants to read in a forecast from the past for a NOAA OFS model, in which case they can pass use_forecast_files=True. Otherwise do not use this parameter directly.

  • override (bool, optional) – Use override=True to find catrefs regardless of freshness.
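The start_date and end_date rules above can be sketched in a few lines. This is a minimal illustration using only the standard library, not the model_catalogs implementation: the names has_time and interpret_range are hypothetical, the case labels are made up for this example, and treating a date-only end_date as covering the whole day is an assumption drawn from the start_date == end_date use case.

```python
from datetime import datetime, time


def has_time(text):
    """Return True if an ISO-style date string also carries a time part."""
    text = text.strip()
    return "T" in text or " " in text


def interpret_range(start_date, end_date, now):
    """Expand date-only inputs and label the use case (illustrative only)."""
    # A date-only string parses to midnight, i.e. the start of that day.
    start = datetime.fromisoformat(start_date)

    if end_date is None:
        # Open-ended request: all available output after start_date.
        return {"start": start, "end": None, "case": "open-ended"}

    end = datetime.fromisoformat(end_date)
    if not has_time(end_date):
        # Assumption: a date-only end_date covers that whole day.
        end = datetime.combine(end.date(), time.max)

    # An end in the future means forecast output is needed to reach it.
    case = "forecast" if end > now else "past"
    return {"start": start, "end": end, "case": case}


now = datetime(2022, 6, 15, 9, 0)
print(interpret_range("2022-06-15", "2022-06-15", now))
print(interpret_range("2022-06-01 06:00", "2022-06-01 12:00", now))
```

In this sketch, a same-day request for today is labeled "forecast" because the end of the day lies after now, which matches the note that forecast files fill out the rest of the day once the nowcast files end.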

Returns:

Intake Source associated with the catalog entry, which now contains source.metadata['start_date'] and source.metadata['end_date']. These metadata values will not necessarily be the same as the input start_date and end_date; they may be adjusted so that the desired output time range is returned. For unaggregated NOAA OFS models, the returned Source has an updated source.urlpath that reflects the newly found file paths for the selected date range.

Return type:

Intake Source

Examples

Find model ‘LMHOFS’ urlpaths from today through the end of the available forecast, directly from the source catalog, without first searching for availability with mc.find_availability():

>>> import pandas as pd
>>> import model_catalogs as mc
>>> main_cat = mc.setup()
>>> today = pd.Timestamp.today()
>>> source = mc.select_date_range(main_cat["LMHOFS"]["coops-forecast-noagg"], start_date=today, end_date=None)

Find urlpaths with select_date_range and have it run find_availability():

>>> source = mc.select_date_range(main_cat['LMHOFS'], start_date=today, end_date=today)