feat(console): summarize batch notifications by table #1323
Open
absorbb wants to merge 1 commit into
When a batch connection writes to multiple tables, a connection-level problem (e.g. a broken destination) used to produce one email/Slack message per table, so clients with many tables received many duplicates. Batch notifications now aggregate across tables and emit a single PARTIAL notification per channel when some tables are failing and others are succeeding (FAILED when all fail, RECOVERED when all return to SUCCESS). Behavior is controlled by a new "Summarize Batch Notifications by Table" toggle, enabled by default and surfaced both per user (in My Email Notifications) and per Slack channel. Per-table StatusChange rows are still written as before; only the notification delivery is aggregated, so the bulker dashboard view is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
if (view.aggStatus === "SUCCESS") {
  // Only notify on SUCCESS if it represents a recovery from a prior failure.
  try {
    const prevRow = await db.prisma().statusChange.findUnique({ where: { id: state.statusChangeId } });
state.statusChangeId stores view.aggMaxId (a real per-table StatusChange id), not the previous aggregate status. That means recovery detection can be wrong: if the previous aggregate was PARTIAL but the max-id table row was SUCCESS, this branch suppresses RECOVERED (doNotify = false) even though the connection actually recovered.
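One possible way to address this is to persist the last notified aggregate status alongside the row id, so recovery detection does not depend on which per-table row happens to have the max id. The sketch below is only an illustration: the `lastNotifiedAggStatus` field, the `AggNotificationState` interface, and the `shouldNotifyRecovered` helper are hypothetical names and are not part of this PR.

```typescript
// Aggregate statuses as described in the PR summary.
type AggStatus = "SUCCESS" | "PARTIAL" | "FAILED";

// Hypothetical state shape: `lastNotifiedAggStatus` is an assumed extra field,
// stored next to the existing statusChangeId.
interface AggNotificationState {
  statusChangeId: number;
  lastNotifiedAggStatus?: AggStatus;
}

// Returns true only when the aggregate is SUCCESS now and the last notified
// aggregate was a failure of some kind (FAILED or PARTIAL).
function shouldNotifyRecovered(current: AggStatus, state?: AggNotificationState): boolean {
  if (current !== "SUCCESS") return false;
  const prev = state?.lastNotifiedAggStatus;
  return prev === "FAILED" || prev === "PARTIAL";
}
```

With this shape, a previous PARTIAL aggregate yields a RECOVERED notification even if the max-id table row was itself SUCCESS, and a SUCCESS with no prior state stays silent.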
aggIncidentDetails,
aggMaxId: maxId,
aggTimestamp: maxIdEntity.timestamp!,
aggStartedAt: earliestIncidentStart ?? maxIdEntity.startedAt!,
For failedCount === 0 (the recovery case), earliestIncidentStart is always undefined, so aggStartedAt falls back to maxIdEntity.startedAt (the latest success timestamp). Later RECOVERED uses this as incidentStartedAt, so the email reports the incident as starting at recovery time instead of when failures began.
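A minimal sketch of one way to keep the original incident start through recovery, assuming the start time recorded while the incident was active can be read back from the stored notification state. The `storedIncidentStart` parameter and the `resolveIncidentStartedAt` helper are hypothetical and do not appear in this PR.

```typescript
// Picks the incident start for the aggregate view.
//  - earliestIncidentStart: earliest start among currently failing tables
//    (undefined when failedCount === 0, as noted in the comment above)
//  - storedIncidentStart: start persisted while the incident was active
//    (hypothetical; assumed to live in the aggregate notification state)
//  - fallbackStartedAt: startedAt of the max-id entity, the existing fallback
function resolveIncidentStartedAt(
  earliestIncidentStart: Date | undefined,
  storedIncidentStart: Date | undefined,
  fallbackStartedAt: Date
): Date {
  return earliestIncidentStart ?? storedIncidentStart ?? fallbackStartedAt;
}

// Example: the incident began at 10:00 and all tables succeeded again at 12:30.
const incidentBegan = new Date("2024-01-01T10:00:00Z");
const recoveredAt = new Date("2024-01-01T12:30:00Z");
const startForRecoveredEmail = resolveIncidentStartedAt(undefined, incidentBegan, recoveredAt);
```

In the example, `startForRecoveredEmail` is the 10:00 incident start rather than the 12:30 recovery time, which is the value the RECOVERED email would need.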
Summary
- Batch notifications are aggregated across tables: each channel receives a single notification with PARTIAL status when only some tables are failing (instead of one notification per table).
- New summarizeBatchNotificationsByTable toggle that defaults to true and is surfaced both per user (My Email Notifications) and per Slack channel.
- Per-table StatusChange rows are unchanged; only the notification routing is aggregated, so the bulker dashboard view is preserved.

Why
Many clients push to multiple tables through one connection. When a problem occurs at the connection level (e.g. destination broken), the cron at /api/admin/notifications produced one email/Slack message per table, flooding inboxes for a single underlying issue.

How
- UserNotificationsPreferences and NotificationChannel schemas gain summarizeBatchNotificationsByTable: boolean (default true); a Switch is added to the per-user settings and to the workspace Slack channel editor.
- processStatusChanges (pages/api/admin/notifications.ts): SUCCESS → SUCCESS (rendered as FIRST_RUN/RECOVERED depending on prior state), SUCCESS → FAILED, PARTIAL with streamsFailed="N of M" and a per-table breakdown.
- NotificationState for the aggregate is stored against tableName="", so flapping/recurring de-dup keeps working at the connection grain.
- RECOVERED is detected via one Prisma lookup of the previously-notified statusChangeId.

Test plan
- Run /api/admin/notifications?dryRun=true against a workspace whose batch connection has tables in mixed SUCCESS/FAILED state; verify exactly one PARTIAL notification per channel.
- After all tables return to SUCCESS, verify one RECOVERED notification per channel.
- … loadNotificationsChannels).

🤖 Generated with Claude Code
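The aggregation rule described in this PR (FAILED when all tables fail, PARTIAL when only some fail, SUCCESS otherwise) can be sketched roughly as below. This is not the code in processStatusChanges: the `TableStatus`, `AggregateView`, and `aggregateTableStatuses` names are hypothetical, and setting `streamsFailed` only for PARTIAL is an assumption.

```typescript
// Hypothetical types for illustration; not taken from the PR's code.
type TableStatus = "SUCCESS" | "FAILED";
type AggStatus = "SUCCESS" | "PARTIAL" | "FAILED";

interface AggregateView {
  aggStatus: AggStatus;
  streamsFailed?: string; // "N of M"; assumed to be set only for PARTIAL
  failedTables: string[]; // basis for a per-table breakdown
}

// Collapses the latest status of every table on one batch connection
// into a single connection-level view.
function aggregateTableStatuses(latest: Record<string, TableStatus>): AggregateView {
  const tables = Object.keys(latest).sort();
  if (tables.length === 0) {
    throw new Error("no tables to aggregate");
  }
  const failedTables = tables.filter((t) => latest[t] === "FAILED");
  if (failedTables.length === 0) {
    return { aggStatus: "SUCCESS", failedTables };
  }
  if (failedTables.length === tables.length) {
    return { aggStatus: "FAILED", failedTables };
  }
  return {
    aggStatus: "PARTIAL",
    streamsFailed: `${failedTables.length} of ${tables.length}`,
    failedTables,
  };
}

// Example: two of three tables are failing on the same connection.
const mixedView = aggregateTableStatuses({ orders: "SUCCESS", users: "FAILED", events: "FAILED" });
```

Here `mixedView.aggStatus` is "PARTIAL" and `mixedView.streamsFailed` is "2 of 3", so one notification per channel can carry the whole picture instead of one per table.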