DB refresh: Decrease number of hard failures & notifications #1020

Seeing an average of 1-3 DB refresh GitHub action failure notifications in the inbox each day is a bit of a burden.

These are usually due to:
- Threshold for draft cset finalization / expansion to be resolved: We set this to 2 hours, but sometimes it takes longer than that. The refresh fails if it still sees the cset as unresolved after 2 hours.
- Timeouts: #1019 
- Other miscellaneous, temporary infrastructure errors, like a temporary enclave [err 406](https://github.com/jhu-bids/TermHub/actions/runs/16086157159/job/45397517405), or GitHub or PyPI having a random blip.

What we should do:
- Log every refresh error (and probably every success, too) in the DB. Only raise an error in the GitHub action if the refresh is _persistently failing_, e.g. failing for something like 6 refreshes in a row (2 hours).
- At that point, throw an error.
  - When it is thrown, look at the DB, collect all the errors that have occurred since the last time an error was reported, and print them all in the log. Maybe show them as a table, with one column for the datetime, another for the error type / name, and another for details. Sort by datetime, or by type and then datetime.
- After that, throw errors at most once per day. That is, if the refresh fails and it sees that it has already thrown an error in the last 24 hours, just exit quietly ("success"), because we already know about it.
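
A rough sketch of how this could work, not a worked-out design. It uses `sqlite3` only so the example is self-contained; the real refresh would use the project's own DB. The table names (`refresh_log`, `alert_log`), the function names, and the constants are all hypothetical, and the streak of 6 assumes a refresh roughly every 20 minutes.

```python
import sqlite3
from datetime import datetime, timedelta

FAILURE_STREAK = 6                    # assumed: ~2 hours of consecutive failed refreshes
ALERT_INTERVAL = timedelta(hours=24)  # raise a hard error at most once per day


def init_db(conn):
    # Hypothetical schema: one row per refresh run, plus a log of raised alerts.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS refresh_log (id INTEGER PRIMARY KEY, ts TEXT, "
        "status TEXT, err_type TEXT, details TEXT, reported INTEGER DEFAULT 0)")
    conn.execute("CREATE TABLE IF NOT EXISTS alert_log (ts TEXT)")


def log_run(conn, ts, err_type=None, details=""):
    """Record every refresh outcome, success or failure."""
    status = "failure" if err_type else "success"
    conn.execute(
        "INSERT INTO refresh_log (ts, status, err_type, details) VALUES (?, ?, ?, ?)",
        (ts.isoformat(timespec="seconds"), status, err_type, details))


def should_raise(conn, now):
    """True only if the last FAILURE_STREAK runs all failed and no alert was raised in 24h."""
    rows = conn.execute(
        "SELECT status FROM refresh_log ORDER BY id DESC LIMIT ?",
        (FAILURE_STREAK,)).fetchall()
    if len(rows) < FAILURE_STREAK or any(status != "failure" for (status,) in rows):
        return False
    last_alert = conn.execute("SELECT MAX(ts) FROM alert_log").fetchone()[0]
    return last_alert is None or now - datetime.fromisoformat(last_alert) >= ALERT_INTERVAL


def build_report(conn, now):
    """Collect all not-yet-reported errors as a text table, then mark them reported."""
    rows = conn.execute(
        "SELECT ts, err_type, details FROM refresh_log "
        "WHERE status = 'failure' AND reported = 0 ORDER BY err_type, ts").fetchall()
    lines = [f"{'datetime':<20}{'error':<22}details"]
    lines += [f"{ts:<20}{err:<22}{details}" for ts, err, details in rows]
    conn.execute("UPDATE refresh_log SET reported = 1 WHERE status = 'failure'")
    conn.execute("INSERT INTO alert_log (ts) VALUES (?)",
                 (now.isoformat(timespec="seconds"),))
    return "\n".join(lines)


def finish_refresh(conn, now, err_type=None, details=""):
    """Log the outcome. Return a report if the action should fail, else None (exit quietly)."""
    log_run(conn, now, err_type, details)
    report = build_report(conn, now) if err_type and should_raise(conn, now) else None
    conn.commit()
    return report
```

The refresh script would call `finish_refresh(...)` once at the end of each run. If it returns a report, print it and exit non-zero so the GitHub action fails; otherwise exit 0, even if this particular refresh failed.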
