Apply a fixed window before writing row metrics #590

Merged
feast-ci-bot merged 1 commit into feast-dev:master from davidheryanto:apply-windowing-before-write-row-metrics
Mar 31, 2020
@davidheryanto
Collaborator

What this PR does / why we need it:
Apply a fixed window and send aggregate Feature Row metrics instead of sending every Feature Row metric directly. This keeps the metrics collector from being overwhelmed and starting to drop metrics.
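
To illustrate the idea of fixed windowing, here is a minimal Python sketch that buckets per-row lag values by window start so that one aggregate per window can be emitted instead of one metric per row. It is not the code in this PR: the 30-second window size, the function names, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical window size; the PR description does not state the actual value.
WINDOW_SECONDS = 30


def window_start(timestamp_s, size_s=WINDOW_SECONDS):
    """Return the start (in seconds) of the fixed window containing timestamp_s."""
    return int(timestamp_s // size_s) * size_s


def group_by_fixed_window(rows, size_s=WINDOW_SECONDS):
    """Group (processing_time_s, lag_ms) pairs by the start of their fixed window."""
    windows = defaultdict(list)
    for processing_time_s, lag_ms in rows:
        windows[window_start(processing_time_s, size_s)].append(lag_ms)
    return dict(windows)


# Made-up sample data: five rows spread over three windows.
rows = [(0.5, 120), (12.0, 80), (29.9, 200), (30.0, 50), (61.2, 90)]
grouped = group_by_fixed_window(rows)
```

With this grouping, five rows produce three windows, so the collector would receive three sets of aggregates rather than five individual samples.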

Which issue(s) this PR fixes:

Fixes #528

Does this PR introduce a user-facing change?:
If Telegraf is currently used to export StatsD metrics as Prometheus metrics, the names of the generated Prometheus metrics change:

- feast_ingestion_feature_row_lag_ms_90_percentile ->  
  feast_ingestion_feature_row_lag_ms_percentile_90
- feast_ingestion_feature_row_lag_ms_99_percentile ->  
  feast_ingestion_feature_row_lag_ms_percentile_99
...

This keeps the naming consistent with the feature value metrics (feature_value_percentile_90, feature_value_percentile_99), i.e. percentile_x rather than x_percentile.
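
For anyone updating dashboards or alert rules, the rename from x_percentile to percentile_x can be expressed as a small helper. This is only a sketch: the function name and regular expression are hypothetical and not part of Feast.

```python
import re

# Matches old-style names that end in "_<number>_percentile".
_OLD_STYLE = re.compile(r"^(?P<prefix>.+)_(?P<pct>\d+)_percentile$")


def to_new_percentile_name(metric_name):
    """Rewrite '<prefix>_<x>_percentile' as '<prefix>_percentile_<x>'.

    Names that do not follow the old pattern are returned unchanged.
    """
    match = _OLD_STYLE.match(metric_name)
    if match is None:
        return metric_name
    return f"{match.group('prefix')}_percentile_{match.group('pct')}"


old_name = "feast_ingestion_feature_row_lag_ms_90_percentile"
new_name = to_new_percentile_name(old_name)
```

Names that already use the new style pass through untouched, so the helper can be applied to a mixed list of metric names.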

The feature_row_event_time_epoch_ms metric is no longer written to StatsD. In our experience this metric is rarely used and the lag metrics seem to suffice. Dropping it also reduces the number of metrics sent.

In summary these are the Feature Row StatsD metrics written at every fixed window:

Gauge:

  • feature_row_lag_ms_min
  • feature_row_lag_ms_max
  • feature_row_lag_ms_mean
  • feature_row_lag_ms_percentile_90
  • feature_row_lag_ms_percentile_95
  • feature_row_lag_ms_percentile_99
  • feature_value_lag_ms_min
  • feature_value_lag_ms_max
  • feature_value_lag_ms_mean
  • feature_value_lag_ms_percentile_90
  • feature_value_lag_ms_percentile_95
  • feature_value_lag_ms_percentile_99

Count:

  • feature_row_ingested_count
  • feature_value_missing_count
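
As a rough illustration of what one window's output could look like, the sketch below computes the row lag gauges from a list of lag values and formats them in the plain StatsD line format (`name:value|g` for gauges, `name:value|c` for counters). The percentile method (nearest rank), the helper names, and the sample data are assumptions; the PR description does not say how percentiles are computed, and metric tags are omitted.

```python
def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (assumed method)."""
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def summarize_window(lags_ms, prefix="feature_row_lag_ms"):
    """Compute min, max, mean and percentile gauges for one window."""
    if not lags_ms:
        raise ValueError("cannot summarize an empty window")
    ordered = sorted(lags_ms)
    gauges = {
        f"{prefix}_min": ordered[0],
        f"{prefix}_max": ordered[-1],
        f"{prefix}_mean": sum(ordered) / len(ordered),
    }
    for pct in (90, 95, 99):
        gauges[f"{prefix}_percentile_{pct}"] = nearest_rank_percentile(ordered, pct)
    return gauges


def to_statsd_lines(gauges, counts):
    """Format gauges and counters as plain StatsD lines (no tags)."""
    lines = [f"{name}:{value}|g" for name, value in gauges.items()]
    lines += [f"{name}:{value}|c" for name, value in counts.items()]
    return lines


# Made-up sample: 100 rows in one window with lags of 1..100 ms.
lags = list(range(1, 101))
gauges = summarize_window(lags)
lines = to_statsd_lines(gauges, {"feature_row_ingested_count": len(lags)})
```

In this example, 100 rows collapse into six gauge lines and one counter line per window, which is the reduction in traffic the change is aiming for.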

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidheryanto

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@woop
Member

woop commented Mar 31, 2020

/lgtm

Development

Successfully merging this pull request may close these issues.

Prevent Beam jobs from overloading StatsD

3 participants