Ubc wrf inclusion #51

Open
SBeairsto wants to merge 5 commits into main from UBC_WRF_inclusion

@SBeairsto
Collaborator

Add UBC WRF Data Loading Support

This PR adds support for loading UBC WRF metgrid output files, which require special handling due to their non-standard time coordinate structure.

Key Changes

New UBC WRF loader module (ubc_wrf_io.py)

  • Time coordinate reconstruction: Parses dates from filenames to replace dummy time coordinates (all zeros)
  • Spin-up timestep handling: Automatically drops the last timestep of each monthly file (which corresponds to the next month's 00:00:00)
  • Optimized for large datasets: Uses minimal coordinate checking and NFS-friendly file handling for 1700+ file datasets
  • Recursive file discovery: Handles nested directory structures (e.g., YYYY/metgrid_YYYY_MM/*.nc)
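
A minimal sketch of the filename parsing and recursive discovery listed above, using only the standard library. The regex and the function names parse_year_month and discover_metgrid_files are hypothetical, and the sketch assumes every data file name ends in metgrid_YYYY_MM.nc; none of this is taken from ubc_wrf_io.py.

```python
import re
from pathlib import Path

# Assumed naming convention: "<optional prefix>metgrid_YYYY_MM.nc"
_METGRID_RE = re.compile(r"metgrid_(\d{4})_(\d{2})\.nc$")


def parse_year_month(path):
    """Return (year, month) parsed from a metgrid file name, or None."""
    match = _METGRID_RE.search(Path(path).name)
    if match is None:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        return None
    return year, month


def discover_metgrid_files(root):
    """Recursively collect metgrid files under root in chronological order."""
    dated = []
    for path in Path(root).rglob("*.nc"):
        year_month = parse_year_month(path)
        if year_month is not None:
            dated.append((year_month, str(path)))
    # Sort by (year, month) first, then by path for a stable order
    return [Path(p) for _, p in sorted(dated)]
```

Sorting on the parsed (year, month) pair rather than on the raw path keeps the order chronological even when files sit at different directory depths.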

Core infrastructure updates

  • Added loader parameter to ClimateModel class for data-source-specific loading strategies
  • Updated load_grid() to dispatch to specialized loaders (e.g., ubc_wrf) based on configuration
  • Enhanced coordinate name recognition for WRF conventions (XLAT, XLONG, Times)
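
One way the loader dispatch could look, as a sketch only. The registry LOADERS, the two stub loader functions, and this load_grid signature are hypothetical; the actual ClimateModel and load_grid() in nc2pt may be structured differently.

```python
def _load_default(path):
    # Stand-in for the generic loading path
    return f"default:{path}"


def _load_ubc_wrf(path):
    # Stand-in for the UBC WRF-specific loading path
    return f"ubc_wrf:{path}"


# Maps the configured loader name to a loading function;
# None means no loader was set in the configuration.
LOADERS = {None: _load_default, "ubc_wrf": _load_ubc_wrf}


def load_grid(path, loader=None):
    """Dispatch to the loader named in the model configuration."""
    try:
        load = LOADERS[loader]
    except KeyError:
        known = sorted(name for name in LOADERS if name is not None)
        raise ValueError(f"Unknown loader {loader!r}; expected one of {known}") from None
    return load(path)
```

A registry keeps load_grid() unchanged when another data source is added, and an unknown loader name fails loudly instead of silently falling back to the default.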

Configuration changes

  • HR model: Updated to use loader: "ubc_wrf" and point to UBC WRF data directory structure
  • New invariant fields: Added land_mask, land_use, and surface_roughness for downscaling applications
  • Surface pressure support: Added PSFC variable configuration for both HR and LR models
  • Enabled all LR variables: Activated uas, vas, tas, and ps by default
  • Updated reference paths: Changed to shared accessible locations on Venus cluster

Bug fixes

  • Fixed is_west_negative flag for temperature and wind variables
  • Corrected standardization/normalization settings for consistency across variables
  • Fixed indentation in lr_emulation metadata paths
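
For context on the is_west_negative fix, here is a sketch of the two longitude conventions such a flag presumably distinguishes. The interpretation (west-negative meaning longitudes in [-180, 180) rather than [0, 360)) and the function name to_west_negative are assumptions, not taken from the nc2pt code.

```python
def to_west_negative(lon_deg):
    """Map a longitude in degrees to the [-180, 180) convention."""
    return ((lon_deg + 180.0) % 360.0) - 180.0
```

Under this assumption, 240 degrees east in the [0, 360) convention becomes -120 in the west-negative one, which is roughly where Western Canada lies.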

Technical Details

UBC WRF file structure handled:

  • Monthly files with pattern: metgrid_YYYY_MM.nc
  • Dummy Time coordinate (all zeros) replaced with proper pd.date_range
  • Hourly frequency with automatic month-boundary handling
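
The time reconstruction can be sketched as follows. The PR builds the axis with pd.date_range; this version uses only the standard library so it runs without dependencies. The function name rebuild_month_times is hypothetical, and the sketch assumes each file starts at 00:00 on the first of the month and ends with the next month's 00:00 step.

```python
from datetime import datetime, timedelta


def rebuild_month_times(year, month, n_steps):
    """Hourly timestamps for one monthly file, minus the trailing step.

    n_steps is the length of the file's dummy (all-zero) time axis.
    """
    start = datetime(year, month, 1)
    times = [start + timedelta(hours=h) for h in range(n_steps)]
    # Drop the final step, which belongs to the next month
    # (mirrors ds.isel(Times=slice(None, -1)) in the loader)
    return times[:-1]
```

For a September file with 721 steps under these assumptions, rebuild_month_times(1999, 9, 721) yields 720 timestamps running from 1999-09-01 00:00 to 1999-09-30 23:00.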

Performance considerations:

  • File discovery over NFS: ~22-30 seconds for full dataset
  • Lazy loading with Dask for memory efficiency
  • Minimal coordinate alignment for reduced overhead

Testing

✅ Tested with the 2014-2017 subset of the UBC WRF dataset (~1700 monthly files) on Venus
✅ Successful lazy loading and metadata computation
✅ Verified proper time coordinate generation and spin-up removal

Example Usage

# config/climate_models/hr.yaml
_target_: nc2pt.climatedata.ClimateModel
name: hr
info: "High Resolution UBC WRF, Western Canada"
loader: "ubc_wrf"  # Activates UBC WRF-specific handling

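# Loader excerpt (Python): rebuild the time axis, then trim the spin-up step
start_date = f"{year}-{month}-01"  # year and month are parsed from the filename

# Drop last timestep (spin-up for next month)
ds = ds.isel(Times=slice(None, -1))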
Contributor

Just a note: the time coordinate in the COMPRESSED_RAIN UBC WRF files is named 'time', whereas in the COMPRESSED_SUBSETTED and COMPRESSED_SNOW files it is named 'Times'. Maybe the loader could check the name of the time coordinate and adjust accordingly, so that it can process UBC WRF precip?

(Note that the precip and snow variables sometimes found in COMPRESSED_SUBSETTED are not necessarily correct and are to be ignored, as per chatting with Tim.)
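
A sketch of the suggested check, assuming the only two names in play are 'Times' and 'time' as described above. The helper find_time_coord and the constant TIME_COORD_CANDIDATES are hypothetical names, not part of the PR.

```python
# Candidate names, in order of preference (assumed from the files described above)
TIME_COORD_CANDIDATES = ("Times", "time")


def find_time_coord(names):
    """Return the first known time-coordinate name present in names."""
    for candidate in TIME_COORD_CANDIDATES:
        if candidate in names:
            return candidate
    raise KeyError(f"No time coordinate among {sorted(names)}")
```

With an xarray dataset this could be called as find_time_coord(ds.dims), and the result passed to isel in dictionary form, e.g. ds.isel({name: ...}), instead of hard-coding Times as a keyword.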

Contributor

Just another quick note on the precip files, should we choose to consider them: they have a different number of time steps per file than the SUBSETTED files.

For instance, COMPRESSED_RAIN_d03_metgrid_1999_09.nc has 720 timesteps, ranging from 1999-09-01_01:00:00 to 1999-10-01_00:00:00 (no Sept 00:00:00 time step!). As per Tim, this is because the precipitation values are valid for the preceding hour.
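
To make the offset concrete, here is a sketch of the hour-ending time axis described above, using only the standard library. The function name hour_ending_times is hypothetical, and the sketch assumes every precipitation file follows the same pattern as the 1999-09 example.

```python
import calendar
from datetime import datetime, timedelta


def hour_ending_times(year, month):
    """Timestamps labelling the end of each hour in the given month."""
    n_hours = calendar.monthrange(year, month)[1] * 24
    start = datetime(year, month, 1)
    # Labels run from 01:00 on the first day to 00:00 on the first of next month
    return [start + timedelta(hours=h) for h in range(1, n_hours + 1)]
```

For September 1999 this gives 720 timestamps from 1999-09-01 01:00 to 1999-10-01 00:00, so unconditionally dropping the last step, as done for the SUBSETTED files, would discard a valid precipitation value.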

Contributor

@bobby-payne bobby-payne left a comment

Looks good! Added one minor comment about UBC-WRF's time coordinate (in its current state it won't be able to process the precipitation variables), but it may not be a big priority depending on our needs.
