Skip to content

feat(kalshi): add market_details and market_trades gold layer#9549

Open
los-xyz wants to merge 13 commits intomainfrom
feat/kalshi-market-details-trades
Open

feat(kalshi): add market_details and market_trades gold layer#9549
los-xyz wants to merge 13 commits intomainfrom
feat/kalshi-market-details-trades

Conversation

@los-xyz
Copy link
Copy Markdown
Contributor

@los-xyz los-xyz commented Apr 10, 2026

Summary

Adds two new Kalshi prediction market spells, transforming API-sourced bronze tables into a clean gold layer:

  • kalshi.market_details (TABLE): Market reference table joining markets_raw with event metadata from market_details_raw. Filtered to markets with >= 100 contracts traded.
  • kalshi.market_trades (VIEW): Trade-level table enriched with market metadata via inner join to market_details.

Design choices (bronze → gold)

  • >= 100 contracts filter: drops 85% of markets (dust/empty), keeps 99.7% of volume
  • 12 columns dropped: universally null (functional_strike, mve_*, is_provisional, fee_waiver_expiration_time), constant (response_price_units, notional_value_dollars, price_ranges), always zero (liquidity_dollars), borderline sparse (rules_secondary, primary_participant_key, settlement_timer_seconds), internal (created_hour)
  • INNER join on trades: ensures only trades for meaningful markets flow to gold layer
  • Pricing snapshot kept: despite being latest-state-only, the orderbook/OI columns are well-populated on active markets and useful for current-state analysis

Test plan

  • dbt compile passes for both models
  • CI builds and tests pass
  • Verify market_details unique on ticker
  • Verify market_trades unique on trade_id
  • Spot-check join completeness (NULL rates on enriched columns)

🤖 Generated with Claude Code

Add two new Kalshi prediction market spells built from API bronze tables:

- kalshi.market_details: market reference table joining markets_0003
  with event metadata from market_details_0003. Filtered to markets
  with >= 100 contracts traded (6.5M of 39.8M markets, 99.7% of volume).
  Drops 12 universally null/constant columns (55 → 43).

- kalshi.market_trades: trade-level view enriched with market metadata
  via inner join to market_details, filtering out dust market trades.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added WIP work in progress dbt: daily covers the Daily dbt subproject labels Apr 10, 2026
los-xyz and others added 2 commits April 10, 2026 11:39
- Rename sources from _0003 to _raw (market_trades_raw, markets_raw, market_details_raw)
- Update contributor from dpettas to allelosi

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@los-xyz los-xyz requested a review from jeff-dude April 10, 2026 10:44
los-xyz and others added 4 commits April 10, 2026 13:45
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… in details

- market_trades: add amount_usd (yes_price_dollars * count_fp) and _updated_at
- market_details: extract category from product_metadata JSON

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jeff-dude
Copy link
Copy Markdown
Member

pushed changes to make the models incremental, since larger tables.
would be good if you could test the output build in CI here. note that CI tables get dropped over the weekend, so you may need to rerun the actions to rebuild if looking later.
plz ensure these outputs still match what you expect

@los-xyz los-xyz marked this pull request as ready for review April 12, 2026 16:07
@cursor
Copy link
Copy Markdown

cursor bot commented Apr 12, 2026

PR Summary

Medium Risk
Adds new incremental/merge dbt models and sources for Kalshi, including join logic and rolling-window incremental predicates that can impact backfill size and correctness if source timestamps or keys behave unexpectedly.

Overview
Introduces a new Kalshi gold layer with two incremental merge models: kalshi.market_details (market reference data) and kalshi.market_trades (trade facts enriched with market metadata).

market_details joins markets_raw with market_details_raw, filters to markets with volume_fp >= 100, and adds a watermark_ts to drive incremental refreshes; market_trades is partitioned by block_month, merges on (block_month, trade_id), and reprocesses trades for tickers whose market_details.watermark_ts changed to keep dimension fields current.

Also registers three new Kalshi raw sources (markets_raw, market_details_raw, market_trades_raw) and adds schema docs/tests enforcing non-null and uniqueness constraints for the new models.

Reviewed by Cursor Bugbot for commit 4f4ca39. Configure here.

@github-actions github-actions bot added ready-for-review this PR development is complete, please review and removed WIP work in progress labels Apr 12, 2026
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Missing deduplication on event_details CTE causes row fan-out
    • Added QUALIFY with ROW_NUMBER() to event_details CTE to keep only the latest row per event_ticker, preventing join fan-out and duplicate ticker rows.

Create PR

Or push these changes by commenting:

@cursor push 491ff46c63
Preview (491ff46c63)
diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_details.sql
@@ -54,6 +54,7 @@
 		strike_period,
 		last_updated_ts
 	from {{ source('kalshi', 'market_details_raw') }}
+	qualify row_number() over (partition by event_ticker order by last_updated_ts desc) = 1
 )
 
 select

You can send follow-ups to the cloud agent here.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 4f4ca39. Configure here.

los-xyz and others added 5 commits April 12, 2026 18:17
Adds QUALIFY ROW_NUMBER() to keep only the latest row per event_ticker
from market_details_raw, preventing potential duplicate ticker rows if
the raw source ever contains multiple snapshots per event.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…_details

category (from $.category) is NULL until ingestion adds the field.
competition and competition_scope are available now from product_metadata
and give useful values (e.g., "Pro Football", "College Basketball (M)", "Game").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QUALIFY is not supported in Trino/DuneSQL. Rewrote event_details
deduplication as a subquery with ROW_NUMBER() + WHERE rn = 1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dbt: daily covers the Daily dbt subproject ready-for-review this PR development is complete, please review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants