When spirit is invoked via the API, it is possible to call the status API (spirit/pkg/status/progress.go, lines 3 to 21 at a1c422e):

    // Progress is returned as a struct because we may add more to it later.
    // It is designed for wrappers (like a GUI) to be able to summarize the
    // current status without parsing log output.
    type Progress struct {
        CurrentState State  // current state, i.e. CopyRows
        Summary      string // text based representation, i.e. "12.5% copyRows ETA 1h 30m"

        // Tables contains per-table progress for multi-table migrations.
        // For single-table migrations, this will have one entry.
        Tables []TableProgress
    }

    // TableProgress tracks progress for a single table in the migration.
    type TableProgress struct {
        TableName  string // name of the table being migrated
        RowsCopied uint64 // rows copied so far
        RowsTotal  uint64 // total rows expected
        IsComplete bool   // true if this table's copy is complete
    }

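
To illustrate how a wrapper might consume these structs without parsing logs, here is a minimal sketch. The `State` type is a local stand-in (the real definition lives in the spirit codebase), and `overallPercent` is a hypothetical helper, not part of spirit's API.

```go
package main

import "fmt"

// State is a stand-in for spirit's state type; using a string here is an
// assumption made only so the sketch compiles on its own.
type State string

// Progress and TableProgress mirror the struct definitions quoted above.
type Progress struct {
	CurrentState State
	Summary      string
	Tables       []TableProgress
}

type TableProgress struct {
	TableName  string
	RowsCopied uint64
	RowsTotal  uint64
	IsComplete bool
}

// overallPercent is a hypothetical helper: it sums rows across all tables and
// returns the copied share as a percentage. Completed tables count as fully
// copied, and RowsCopied is clamped to RowsTotal in case the total is an
// underestimate.
func overallPercent(p Progress) float64 {
	var copied, total uint64
	for _, t := range p.Tables {
		total += t.RowsTotal
		c := t.RowsCopied
		if t.IsComplete || c > t.RowsTotal {
			c = t.RowsTotal
		}
		copied += c
	}
	if total == 0 {
		return 0
	}
	return float64(copied) / float64(total) * 100
}

func main() {
	// Example data only; a real wrapper would obtain Progress from the status API.
	p := Progress{
		CurrentState: "copyRows",
		Summary:      "12.5% copyRows ETA 1h 30m",
		Tables: []TableProgress{
			{TableName: "users", RowsCopied: 100, RowsTotal: 1000},
			{TableName: "orders", RowsCopied: 150, RowsTotal: 1000},
		},
	}
	fmt.Printf("state=%s overall=%.1f%% summary=%q\n",
		p.CurrentState, overallPercent(p), p.Summary)
}
```
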
We use this internally, but noticed two limitations we would like to address:
1. On recovery, the state is effectively "reset" because the migration must walk through copyRows, checksum, etc. again, even if these states are brief. For GUI clients this can look confusing, for example after a pod failure while waiting on the sentinel. It might be easier to advertise the CurrentState as "recovering". The Summary and TableProgress can stay unchanged; it is just a string change to the human-readable state.
2. We should provide context on whether the migration is throttled. Currently this is only available by looking at the logs.
For an implementation of (1), the rough idea is to report "recovering" if `usedResumeFromCheckpoint == true` and both the copier progress and the checksum progress are very high.
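
A rough sketch of both ideas might look like the following. Everything here is an assumption for illustration: the `resumeInfo` struct, the `advertisedState` function, the 0.99 cut-off for "very high", the string-based `State` constants, and the `Throttled` field are hypothetical names and values, not existing spirit code. The sketch also leaves open when "recovering" should stop being shown.

```go
package main

import "fmt"

// State is a local stand-in for spirit's state type (assumption).
type State string

// Hypothetical state names used only by this sketch.
const (
	StateCopyRows   State = "copyRows"
	StateChecksum   State = "checksum"
	StateRecovering State = "recovering"
)

// recoveryThreshold is an assumed cut-off for "very high" progress; the
// issue does not specify a value.
const recoveryThreshold = 0.99

// resumeInfo groups the hypothetical inputs to the heuristic.
type resumeInfo struct {
	UsedResumeFromCheckpoint bool
	CopierProgress           float64 // fraction in [0, 1]
	ChecksumProgress         float64 // fraction in [0, 1]
}

// Progress is a trimmed-down version of the status struct (Tables omitted)
// with a hypothetical Throttled field for limitation (2).
type Progress struct {
	CurrentState State
	Summary      string
	Throttled    bool // hypothetical: true while the migration is being throttled
}

// advertisedState returns "recovering" in place of the actual state when the
// migration resumed from a checkpoint and both copier and checksum progress
// are at or above the threshold. Otherwise the actual state is returned.
func advertisedState(actual State, r resumeInfo) State {
	if r.UsedResumeFromCheckpoint &&
		r.CopierProgress >= recoveryThreshold &&
		r.ChecksumProgress >= recoveryThreshold {
		return StateRecovering
	}
	return actual
}

func main() {
	r := resumeInfo{UsedResumeFromCheckpoint: true, CopierProgress: 0.999, ChecksumProgress: 1.0}
	p := Progress{
		CurrentState: advertisedState(StateCopyRows, r),
		Summary:      "99.9% copyRows ETA 1m", // Summary is left unchanged, as proposed
		Throttled:    false,
	}
	fmt.Printf("%+v\n", p)
}
```

Keeping the check in one small function means only the advertised state changes, while Summary and the per-table progress continue to report the real values.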