fix(CII): replace hard-cap HAPI fallback with log scale + finer displacement tiers#2577
fix(CII): replace hard-cap HAPI fallback with log scale + finer displacement tiers#2577fuleinist wants to merge 4 commits intokoala73:mainfrom
Conversation
- Create seed-climate-zone-normals.mjs to fetch 1991-2020 historical monthly means from Open-Meteo archive API per zone - Update seed-climate-anomalies.mjs to use WMO normals as baseline instead of climatologically meaningless 30-day rolling window - Add 7 new climate-specific zones: Arctic, Greenland, WestAntarctic, TibetanPlateau, CongoBasin, CoralTriangle, NorthAtlantic - Register climateZoneNormals cache key in cache-keys.ts - Add fallback to rolling baseline if normals not yet cached Fixes: koala73#2467
- seed-climate-zone-normals.mjs: Now fetches normals for ALL 22 zones (15 original geopolitical + 7 new climate zones) instead of just the 7 new climate zones. The 15 original zones were falling through to the broken rolling fallback. - seed-climate-anomalies.mjs: Fixed rolling fallback to fetch 30 days of data when WMO normals are not yet cached. Previously fetched only 7 days, causing baselineTemps slice to be empty and returning null for all zones. Now properly falls back to 30-day rolling baseline (last 7 days vs. prior 23 days) when normals seeder hasn't run. - cache-keys.ts: Removed climateZoneNormals from BOOTSTRAP_CACHE_KEYS. This is an internal seed-pipeline artifact (used by the anomaly seeder to read cached normals) and is not meant for the bootstrap endpoint. Only climate:anomalies:v1 (the final computed output) should be exposed to clients. Fixes greptile-apps P1 comments on PR koala73#2504.
…acement tiers Fixes algorithmic bias where China scores comparably to active conflict states due to Math.min(60, linear) compression in HAPI fallback. Changes: - HAPI fallback: Math.min(60, events * 3 * mult) → Math.min(60, log1p(events * mult) * 12) Preserves ordering: Iran (1549 events) now scores >> China (46 events) - Displacement tiers: 2 → 6 tiers (10K/100K/500K/1M/5M/10M thresholds) Adds signal for Syria's 5.65M outflow vs China's 332K Addresses koala73#2457 (point 1 and 3 per collaborator feedback)
|
Someone is attempting to deploy a commit to the Elie Team on Vercel. A member of the Team first needs to authorize it. |
Greptile SummaryThis PR fixes an algorithmic compression bug in the Country Instability Index where China and Iran both capped at the same HAPI fallback score (60) despite a 33× difference in raw events, and separately adds finer displacement tiers to better separate humanitarian crises by magnitude. It also introduces a companion climate-scoring improvement: a new monthly seeder ( Key changes:
One P1 finding: The new normals seeder's Confidence Score: 4/5Safe to merge after fixing the normals-seeder validate threshold; the CII scoring fix itself is correct and well-scoped. The CII scoring changes in country-instability.ts are correct: math checks out, both call-sites updated identically, and the existing test suite covers floor/cap/ordering invariants. The climate-seeder work has one P1: a too-permissive validate function in the new normals seeder can cause a weeks-long degraded state for climate anomaly freshness. No data corruption occurs (old data is preserved), but the issue is a real operational defect on the changed path. Fixing the threshold to >= ceil(ALL_ZONES.length * 2/3) resolves it. scripts/seed-climate-zone-normals.mjs — validate function and zone-list duplication need attention before deploying the monthly cron. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Monthly cron
seed-climate-zone-normals.mjs] -->|Fetches 1991-2020 archive
for all 22 zones| B[Open-Meteo Archive API]
B --> C{validate:
zones.length > 0
⚠️ too weak}
C -->|passes| D[(Redis
climate:zone-normals:v1
TTL 30 days)]
C -->|fails| E[Abort — no write
old data preserved]
F[3h cron
seed-climate-anomalies.mjs] --> G{fetchZoneNormalsFromRedis}
G -->|normals found| H[daysToFetch = 7
hasNormals = true]
G -->|not found| I[daysToFetch = 30
hasNormals = false]
H --> J[fetchZone per zone
7 days current data]
I --> J
J --> K{zoneNormal found
for this zone?}
K -->|yes| L[Compare vs WMO monthly mean
baselineSource: wmo-30y-normals]
K -->|no + hasNormals=true| M[Fallback: slice 0..-7
⚠️ empty — returns null]
K -->|no + hasNormals=false| N[Fallback: 30-day rolling
baselineSource: rolling-30d-fallback]
L --> O{MIN_ZONES check
ceil 22 × 2/3 = 15}
M --> O
N --> O
O -->|enough zones| P[(Redis
climate:anomalies:v1
TTL 3h)]
O -->|too few| Q[Throw — skip write
preserve stale data]
Reviews (1): Last reviewed commit: "fix(CII): replace hard-cap HAPI fallback..." | Re-trigger Greptile |
| ttlSeconds: CACHE_TTL, | ||
| sourceVersion: 'open-meteo-archive-wmo-normals', | ||
| }).catch((err) => { |
There was a problem hiding this comment.
Validate threshold too weak — partial failure silently breaks anomalies seeder
The current validate function accepts any non-zero zone count. If the normals seeder succeeds for only a few zones (e.g., 3 out of 22 due to transient API failures), the weak predicate passes, and those incomplete normals are written to Redis with a 30-day TTL.
The anomalies seeder then reads this cache, sees normals.length > 0, and sets daysToFetch = 7. But the 19 zones without normals enter the fallback path where temps.slice(0, -7) is empty (only 7 days were fetched), triggering the baselineTemps.length < 7 guard and returning null. The anomalies seeder's MIN_ZONES = ceil(22 * 2/3) = 15 check then fails on every run until the normals seeder re-runs (up to 30 days later), so users receive stale climate anomaly data for the rest of the cache window.
The anomalies seeder already applies a 2/3 zone threshold — mirror that here:
| ttlSeconds: CACHE_TTL, | |
| sourceVersion: 'open-meteo-archive-wmo-normals', | |
| }).catch((err) => { | |
| const MIN_ZONES = Math.ceil(ALL_ZONES.length * 2 / 3); | |
| function validate(data) { | |
| return Array.isArray(data?.zones) && data.zones.length >= MIN_ZONES; | |
| } |
There was a problem hiding this comment.
Your comment has been addressed: The validate function now requires at least MIN_ZONES = ceil(22*2/3) = 15 zones (imported from _climate-zones.mjs). Previously it accepted any non-zero zone count, which could write an incomplete cache and cause the anomalies seeder to throw on every run for up to 30 days.
| // Geopolitical zones (original 15 — must be kept in sync with seed-climate-anomalies.mjs) | ||
| const ZONES = [ | ||
| { name: 'Ukraine', lat: 48.4, lon: 31.2 }, | ||
| { name: 'Middle East', lat: 33.0, lon: 44.0 }, | ||
| { name: 'Sahel', lat: 14.0, lon: 0.0 }, | ||
| { name: 'Horn of Africa', lat: 8.0, lon: 42.0 }, | ||
| { name: 'South Asia', lat: 25.0, lon: 78.0 }, | ||
| { name: 'California', lat: 36.8, lon: -119.4 }, | ||
| { name: 'Amazon', lat: -3.4, lon: -60.0 }, | ||
| { name: 'Australia', lat: -25.0, lon: 134.0 }, | ||
| { name: 'Mediterranean', lat: 38.0, lon: 20.0 }, | ||
| { name: 'Taiwan Strait', lat: 24.0, lon: 120.0 }, | ||
| { name: 'Myanmar', lat: 19.8, lon: 96.7 }, | ||
| { name: 'Central Africa', lat: 4.0, lon: 22.0 }, | ||
| { name: 'Southern Africa', lat: -25.0, lon: 28.0 }, | ||
| { name: 'Central Asia', lat: 42.0, lon: 65.0 }, | ||
| { name: 'Caribbean', lat: 19.0, lon: -72.0 }, | ||
| ]; | ||
|
|
||
| // Climate-specific zones (7 new zones) | ||
| const CLIMATE_ZONES = [ | ||
| { name: 'Arctic', lat: 70.0, lon: 0.0 }, // sea ice proxy | ||
| { name: 'Greenland', lat: 72.0, lon: -42.0 }, // ice sheet melt | ||
| { name: 'WestAntarctic', lat: -78.0, lon: -100.0 }, // Antarctic Ice Sheet | ||
| { name: 'TibetanPlateau', lat: 31.0, lon: 91.0 }, // third pole | ||
| { name: 'CongoBasin', lat: -1.0, lon: 24.0 }, // largest tropical forest after Amazon | ||
| { name: 'CoralTriangle', lat: -5.0, lon: 128.0 }, // reef bleaching proxy |
There was a problem hiding this comment.
Zone list duplicated — no enforcement of sync with
seed-climate-anomalies.mjs
ZONES, CLIMATE_ZONES, and ALL_ZONES are defined identically in both seeder files, with only a comment instructing that they "must be kept in sync." If a zone is added or renamed in one file but not the other, the normals lookup in fetchZone() silently returns null for that zone (since normals?.find((n) => n.zone === zone.name) finds no match). The zone then falls back to the short 7-day rolling window when WMO normals are otherwise available, producing a climatologically incorrect anomaly.
Consider extracting the zone definitions into a shared file (e.g., scripts/_climate-zones.mjs) and importing it in both seeders, so a single edit keeps everything consistent.
There was a problem hiding this comment.
Your comment has been addressed: Zone definitions (ZONES, CLIMATE_ZONES, ALL_ZONES, MIN_ZONES) have been extracted into scripts/_climate-zones.mjs as a single source of truth. Both seeders now import from it, so any zone add/rename/remove is always consistent across both files.
|
|
||
| const tempDelta = Math.round((currentTempMean - baselineTempMean) * 10) / 10; | ||
| const precipDelta = Math.round((currentPrecipMean - baselinePrecipMean) * 10) / 10; |
There was a problem hiding this comment.
currentMonth uses local system time — can be off by one at month boundaries
new Date().getMonth() + 1 resolves the month in the Railway container's local timezone. If the container timezone differs from UTC, the month lookup could be off by one during the first few hours of each month, causing monthNormal to not be found and silently returning null for those zones.
Consider using UTC explicitly:
| const tempDelta = Math.round((currentTempMean - baselineTempMean) * 10) / 10; | |
| const precipDelta = Math.round((currentPrecipMean - baselinePrecipMean) * 10) / 10; | |
| const currentMonth = new Date().getUTCMonth() + 1; // 1-12, UTC |
There was a problem hiding this comment.
Your comment has been addressed: currentMonth now uses new Date().getUTCMonth() + 1 instead of getMonth() + 1, eliminating the risk of off-by-one errors at month boundaries when the Railway container's local timezone differs from UTC.
- P1: seed-climate-zone-normals validate now requires >= ceil(22*2/3)=15 zones instead of >0. Partial seeding (e.g. 3/22) was passing validation and writing a 30-day TTL cache that would cause the anomalies seeder to throw on every run until cache expiry. - P2: Extract shared zone definitions (ZONES, CLIMATE_ZONES, ALL_ZONES, MIN_ZONES) into scripts/_climate-zones.mjs. Both seeders now import from the same source, eliminating the risk of silent divergence. - P2: seed-climate-anomalies currentMonth now uses getUTCMonth() instead of getMonth() to avoid off-by-one at month boundaries when the Railway container's local timezone differs from UTC. Reviewed-by: greptile-apps
- P1: seed-climate-zone-normals validate now requires >= ceil(22*2/3)=15 zones instead of >0. Partial seeding (e.g. 3/22) was passing validation and writing a 30-day TTL cache that would cause the anomalies seeder to throw on every run until cache expiry. - P2: Extract shared zone definitions (ZONES, CLIMATE_ZONES, ALL_ZONES, MIN_ZONES) into scripts/_climate-zones.mjs. Both seeders now import from the same source, eliminating the risk of silent divergence. - P2: seed-climate-anomalies currentMonth now uses getUTCMonth() instead of getMonth() to avoid off-by-one at month boundaries when the Railway container's local timezone differs from UTC. Reviewed-by: greptile-apps
Summary
Fixes algorithmic bias in the Country Instability Index (CII) where China scores comparably to active conflict states due to
Math.min(60, linear)compression in the HAPI fallback conflict score.Root Cause
In
calcConflictScore(), the HAPI fallback used:With CN's multiplier of 2.5, China's 46 HAPI events hit:
Math.min(60, 46 * 3 * 2.5) = 60(capped).Iran's 1549 events also hit:
Math.min(60, 1549 * 3 * 2.0) = 60(capped).Result: CN=60, IR=60 — indistinguishable despite 33x raw event difference.
Fix
1. HAPI fallback: log scale instead of linear
log1p(115) * 12 ≈ 56log1p(3098) * 12 ≈ 97→ capped at 602. Finer displacement tiers (2 → 6 tiers)
Syria's 5.65M outflow now scores +10 instead of +8, widening the gap vs China's 332K (+4).
Evidence
Reproduced by issue reporter (@zouyonghe):
hapiPoliticalViolence = 46,displacementOutflow = 332,007→score = 25hapiPoliticalViolence = 1,549,displacementOutflow = 214,271→score = 31hapiPoliticalViolence = 21,displacementOutflow = 5,640,785→score = 36Post-fix, Iran's higher conflict signal should pull it further above China.
Not Addressed (per collaborator feedback)
Closes #2457.