Issue with time-averaged History output with mismatched ref_time #4331

@mathomp4

What is happening

@michellefrazer and @sdrabenh (cc @wmputman) encountered an interesting bug(?) today in GEOS History output. Their HISTORY.rc had a time-averaged geosgcm_prog collection, but ref_time was not set to 210000, as one usually does for GEOS runs starting at 21z. Instead, it (mistakenly) had no ref_time at all, which means MAPL falls back to its default, ref_time: 000000.

They saw that when the history didn't have a ref_time set the monthly SLP was about 8 hPa too low. Why would this be?

Well, it turns out if you do this, then the first of the 00z-ref-time collections in a segment will be "empty". CDO sees it as:

> cdo infon stock-v12-2026Jan23-1day-c24-ProgTimeAve.geosgcm_prog_tavg_00zref.20000414_2100z.nc4
    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter name
     1 : 2000-04-14 21:00:00    1000     4704    4704 :                     nan             : DIVG
     2 : 2000-04-14 21:00:00     975     4704    4704 :                     nan             : DIVG
     3 : 2000-04-14 21:00:00     950     4704    4704 :                     nan             : DIVG
     4 : 2000-04-14 21:00:00     925     4704    4704 :                     nan             : DIVG
...
   240 : 2000-04-14 21:00:00    0.02     4704    4704 :                     nan             : OMEGA
   241 : 2000-04-14 21:00:00       0     4704       0 :      0.0000      0.0000      0.0000 : PHIS
   242 : 2000-04-14 21:00:00       0     4704       0 :      0.0000      0.0000      0.0000 : PS
   243 : 2000-04-14 21:00:00    1000     4704    4704 :                     nan             : QG
   244 : 2000-04-14 21:00:00     975     4704    4704 :                     nan             : QG
...
   578 : 2000-04-14 21:00:00    0.02     4704    4704 :                     nan             : RH
   579 : 2000-04-14 21:00:00       0     4704       0 :      0.0000      0.0000      0.0000 : SLP
   580 : 2000-04-14 21:00:00    1000     4704    4704 :                     nan             : T
...

So 2d variables seem to be 0.0 and 3d variables are nan, which I think is CDO's way of saying all MAPL_UNDEF (see footnote 1). But it's essentially an empty collection. After that, every collection is filled in fine. If you then start a new segment, the first collection is again empty and then all is good.
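As a rough illustration of the pattern above, an "empty" record could be flagged before averaging with a check like the following. This is only a sketch: `looks_empty` is a hypothetical helper, not part of MAPL, CDO, or time_ave.x, and it assumes the record's values have already been read into a flat Python list. The all-zero test is a heuristic based on what CDO printed here, so a field that is legitimately zero everywhere would also be flagged.

```python
import math


def looks_empty(values):
    """Return True if every value is NaN or exactly 0.0.

    Hypothetical heuristic: in the CDO listing above, empty 3d fields
    show up as all-NaN and empty 2d fields as all-zero.
    """
    return all(math.isnan(v) or v == 0.0 for v in values)


# Made-up sample values, for illustration only.
empty_3d = [float("nan")] * 4   # like DIVG in the first record
empty_2d = [0.0] * 4            # like SLP in the first record
filled_2d = [101325.0, 100980.0, 101500.0, 99870.0]

print(looks_empty(empty_3d), looks_empty(empty_2d), looks_empty(filled_2d))
```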

So, if we say a segment is 30 days long, a monthly mean averages over 120 six-hourly collections. If one of them contributes an SLP of 0 instead of roughly 1000 hPa, the mean drops by 1000 hPa/120, which is about 8 hPa. (Depending on how time_ave.x would handle an empty collection...)
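The arithmetic can be checked with a few lines of Python. This is a back-of-the-envelope sketch, not a reproduction of what time_ave.x does: it assumes the empty record is averaged in as 0 with equal weight, and it uses a round 1000 hPa as a stand-in for the true SLP. `biased_mean` is a name made up for this example.

```python
def biased_mean(true_value, n_records, n_empty, fill=0.0):
    """Equal-weight mean when n_empty of n_records hold `fill` instead of data."""
    good = n_records - n_empty
    return (good * true_value + n_empty * fill) / n_records


# 30 days of 6-hourly averaged output per segment.
records_per_segment = 30 * 24 // 6

# Round-number SLP in hPa, for illustration only.
true_slp_hpa = 1000.0

mean_hpa = biased_mean(true_slp_hpa, records_per_segment, n_empty=1)
deficit_hpa = true_slp_hpa - mean_hpa

print(records_per_segment, round(deficit_hpa, 2))
```

Under these assumptions the deficit is 1000/120 hPa, consistent with the roughly 8 hPa low bias seen in the monthly SLP.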

What should we do

The question now for @tclune, @atrayano, and @bena-nasa is: what is "correct" here if a user asks in History for 6 hours of time-averaging, but we have only run 3 hours of the model?

  1. Write out an empty file
  2. Write out the three hours of what we have
  3. Error out when we process HISTORY.rc and tell the user we think they are using a ref_time that doesn't match their cap_restart start time
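To make option 3 concrete, here is a minimal sketch of the kind of alignment check it implies. It is not MAPL code: `first_window_is_partial` is a hypothetical function, and it assumes averaging windows end at ref_time plus whole multiples of the collection frequency, so the first window is short whenever the start time is not on that grid.

```python
from datetime import datetime, time, timedelta


def first_window_is_partial(start, ref_time, frequency):
    """Return True if `start` does not fall on the ref_time + k * frequency grid.

    Assumption (not confirmed against MAPL): averaging windows end at
    ref_time + k * frequency, so an off-grid start leaves a short first window.
    """
    ref = datetime.combine(start.date(), ref_time)
    offset = (start - ref) % frequency
    return offset != timedelta(0)


start = datetime(2000, 4, 14, 21, 0)   # run starting at 21z
six_hours = timedelta(hours=6)

# Default ref_time 000000: the first 6-hour window has only 3 hours of model run.
print(first_window_is_partial(start, time(0, 0), six_hours))

# ref_time 210000: windows line up with the start time.
print(first_window_is_partial(start, time(21, 0), six_hours))
```

A check along these lines could raise an error (or just a warning) while HISTORY.rc is being processed, before any output is written.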

Number 1 is what current MAPL does and it seems to have bad consequences.

Number 2 seems nice in some ways, in that users get data for the time they've run, but in a way it is also not what HISTORY.rc asked for. They asked for a 6-hour average, and we didn't have 6 hours to average.

Number 3 might be the safest of all, since we know a ref_time that doesn't match the start time seems to do bad things. But maybe some user has a scenario that depends on the current behavior? 🤷🏼

Footnotes

  1. I believe this is due to our very bad valid_range handling in GEOS/MAPL, see https://github.com/GEOS-ESM/MAPL/issues/1886
