Add security violations filter processor, and plug it in to the default OTLP logs pipeline when TcplogReceiver is configured#1576

Open
kamalchaturvedi wants to merge 3 commits into nginx:main from kamalchaturvedi:security_violations_filter_processor

@kamalchaturvedi kamalchaturvedi commented Mar 25, 2026

Proposed changes

COMMIT 1:

Add a new security violations filter processor that uses the first message as a one-time gate. On the very first log record, it checks:
- The body is a string (not int, bytes, etc.).
- The body has exactly 28 pipe-separated fields matching the secops-dashboard-log profile format.

If either check fails, the gate closes permanently: all subsequent messages are dropped with negligible overhead (the processor returns early, before iterating over any records) until the OTel collector is restarted. If both checks pass, the gate opens permanently and all future string-bodied records flow through.
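
The gate described above can be sketched in plain Go as follows. This is a minimal illustration, not the processor's actual code: the type `firstRecordGate`, its fields, and the `Allow` method are hypothetical names, the body is modelled as `any` rather than a collector log-record value, and the separator and field count are parameters instead of fixed values.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// gateState records the one-time decision made on the first record.
type gateState int

const (
	gateUndecided gateState = iota
	gateOpen
	gateClosed
)

// firstRecordGate is a hypothetical stand-in for the processor's gate.
type firstRecordGate struct {
	mu        sync.Mutex
	state     gateState
	separator string // field separator expected in the body (assumption)
	fields    int    // exact number of fields expected (assumption)
}

// Allow reports whether a record with the given body should be forwarded.
func (g *firstRecordGate) Allow(body any) bool {
	g.mu.Lock()
	defer g.mu.Unlock()

	// Closed gate: return before inspecting the body at all.
	if g.state == gateClosed {
		return false
	}

	text, isString := body.(string)

	// Open gate: only string-bodied records flow through.
	if g.state == gateOpen {
		return isString
	}

	// Undecided: the first record fixes the state for good.
	if isString && len(strings.Split(text, g.separator)) == g.fields {
		g.state = gateOpen
		return true
	}
	g.state = gateClosed
	return false
}

func main() {
	gate := &firstRecordGate{separator: "|", fields: 3}
	fmt.Println(gate.Allow("a|b|c"))    // valid first record opens the gate
	fmt.Println(gate.Allow("anything")) // later string bodies are forwarded
	fmt.Println(gate.Allow(42))         // non-string bodies are dropped
}
```

In this sketch the only way to reset the decision is to construct a new gate, which mirrors the "until the OTel collector is restarted" behaviour described above.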

COMMIT 2:

Add support for NAP V5 (containerized mode) by also accepting a configured syslog IP that matches the docker0 interface IP, in addition to 127.0.0.1.

COMMIT 3:

Plug the security violations filter processor into the default logs pipeline, replacing:
- logsgzipprocessor: This processor was introduced for NGINX One for a project that has since been deprecated and is no longer in scope.
- securityviolationsprocessor: This processor converted each syslog security violation into a JSON-bodied log record before forwarding it. That deterministic JSON conversion added processing overhead to the agent for every single log record.

Verification

  • Default OTel pipeline once the agent has configured TcplogReceiver (that is, when the app_protect_security_log directive has syslog:server=127.0.0.1:1514 configured):
  pipelines:
    metrics/default:
      receivers:
        - hostmetrics
      processors:
        - batch/default_metrics
        - resource/default
      exporters:
        - otlp/default
    logs/default:
      receivers:
        - tcplog/nginx_app_protect
      processors:
        - securityviolationsfilter/default
        - batch/default_logs
        - resource/default
      exporters:
        - otlp/default
  • If the first log the processor encounters contains the expected number of CSV fields for the configured log profile, the violation is forwarded to the management plane (as per the batch settings). All subsequent violations are also forwarded.

  • If the first log the processor encounters does not contain the expected number of CSV fields for the configured log profile, the violation is filtered out. All subsequent violations are also filtered out and are not forwarded by the agent.

In that error case, the following error appears in the /var/log/nginx-agent/opentelemetry-collector-agent.log file:

2026-03-25T21:44:42.233Z	error	securityviolationsfilterprocessor/processor.go:102	Security violation log does not appear to be CSV format. Ensure the NAP logging profile uses the secops-dashboard-log format. All security violation logs will be dropped until the collector is restarted.	{"resource": {"service.instance.id": "3ff23ae7-7713-4745-bc28-cfc2198f8eb4", "service.name": "otel-nginx-agent", "service.version": "v3.8.0"}, "otelcol.component.id": "securityviolationsfilter/default", "otelcol.component.kind": "processor", "otelcol.pipeline.id": "logs/default", "otelcol.signal": "logs", "expected_fields": 27, "actual_fields": 1}
github.com/nginx/agent/v3/internal/collector/securityviolationsfilterprocessor.(*securityViolationsFilterProcessor).ConsumeLogs.(*securityViolationsFilterProcessor).ConsumeLogs.ResourceLogsSlice.All.func2.(*securityViolationsFilterProcessor).ConsumeLogs-range1.(*securityViolationsFilterProcessor).ConsumeLogs.(*securityViolationsFilterProcessor).ConsumeLogs.ResourceLogsSlice.All.func2.(*securityViolationsFilterProcessor).ConsumeLogs-range1.ScopeLogsSlice.All.func5.(*securityViolationsFilterProcessor).ConsumeLogs.(*securityViolationsFilterProcessor).ConsumeLogs.ResourceLogsSlice.All.func2.(*securityViolationsFilterProcessor).ConsumeLogs-range1-range4.func6.1
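
To check this condition from a script, the sketch below pulls `expected_fields` and `actual_fields` out of a log line shaped like the one above. It assumes the line is tab-separated with a trailing JSON object, as in the sample; the function name `fieldCounts` is hypothetical and the sample line in `main` is shortened.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// fieldCounts extracts the expected and actual field counts from the JSON
// object that forms the last tab-separated segment of a collector log line.
func fieldCounts(line string) (expected, actual int, err error) {
	parts := strings.Split(line, "\t")
	last := parts[len(parts)-1]

	// Pointers distinguish "key missing" from a legitimate zero value.
	var payload struct {
		Expected *int `json:"expected_fields"`
		Actual   *int `json:"actual_fields"`
	}
	if err := json.Unmarshal([]byte(last), &payload); err != nil {
		return 0, 0, err
	}
	if payload.Expected == nil || payload.Actual == nil {
		return 0, 0, errors.New("field counts not present in log line")
	}
	return *payload.Expected, *payload.Actual, nil
}

func main() {
	line := "2026-03-25T21:44:42.233Z\terror\tprocessor.go:102\tmessage text\t" +
		`{"otelcol.signal": "logs", "expected_fields": 27, "actual_fields": 1}`

	expected, actual, err := fieldCounts(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("expected=%d actual=%d\n", expected, actual)
}
```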

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING document
  • I have run make install-tools and have attached any dependency changes to this pull request
  • If applicable, I have added tests that prove my fix is effective or that my feature works
  • If applicable, I have checked that any relevant tests pass after adding my changes
  • If applicable, I have updated any relevant documentation (README.md)
  • If applicable, I have tested my cross-platform changes on Ubuntu 22, Redhat 8, SUSE 15 and FreeBSD 13

@kamalchaturvedi kamalchaturvedi requested a review from a team as a code owner March 25, 2026 03:13
@kamalchaturvedi kamalchaturvedi force-pushed the security_violations_filter_processor branch from 9ce64b2 to 482dc34 Compare March 26, 2026 05:47
@github-actions github-actions bot added the chore (Pull requests for routine tasks) and dependencies labels Mar 26, 2026
@kamalchaturvedi kamalchaturvedi changed the title from "Draft: Add security violations filter processor, and plug it in to the default OTLP logs pipeline when TcplogReceiver is configured" to "Add security violations filter processor, and plug it in to the default OTLP logs pipeline when TcplogReceiver is configured" Mar 26, 2026
}
}

func enforceSafeLogBatchProcessors(col *Collector) {
Collaborator

Not sure if this function should be here. Hardcoding the batch processor values in the code means that we can't change them in the future. It also means that if a user wants to send logs to a different exporter, they can't use their own log batch processor, as it would be overridden by this.

Contributor Author

Hmm. Trying to see what we can do to ensure the customer does not set huge values for the batch processor when talking to N1C.
In our testing, we found that larger values cause rpc error: code = ResourceExhausted desc = grpc: received message larger than max to be returned, which is a permanent error for the OTel collector (it does not retry the export request).
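
For illustration, a guard of the kind under discussion could be pictured as a clamp applied to the batch settings. The sketch below is not the code from this PR: the `batchSettings` type, the `clampBatch` function, and the cap of 1000 records are all hypothetical. Only the two setting names it mirrors, `send_batch_size` and `send_batch_max_size`, come from the OTel batch processor.

```go
package main

import "fmt"

// batchSettings mirrors the two size settings of the OTel batch processor
// (send_batch_size and send_batch_max_size). The struct itself is hypothetical.
type batchSettings struct {
	SendBatchSize    uint32
	SendBatchMaxSize uint32
}

// clampBatch bounds both sizes by maxRecords. In the batch processor a
// send_batch_max_size of 0 means "no upper limit", so 0 is replaced by the cap.
func clampBatch(in batchSettings, maxRecords uint32) batchSettings {
	out := in
	if out.SendBatchMaxSize == 0 || out.SendBatchMaxSize > maxRecords {
		out.SendBatchMaxSize = maxRecords
	}
	if out.SendBatchSize > out.SendBatchMaxSize {
		out.SendBatchSize = out.SendBatchMaxSize
	}
	return out
}

func main() {
	const hypotheticalCap = 1000 // illustrative value only
	userSettings := batchSettings{SendBatchSize: 50000}
	fmt.Printf("%+v\n", clampBatch(userSettings, hypotheticalCap))
}
```

A clamp like this silently replaces whatever the user configured, which is the trade-off being weighed in this thread.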

Any suggestions to handle it safely?

Collaborator

In our documentation, we do warn users that if they are customising their OTel configuration, overriding default processors can cause issues with sending telemetry to their management plane. https://docs.nginx.com/nginx-one-console/agent/configure-otel-metrics/#example-usage

In the future, we would like to implement a mechanism where the management plane can configure the OTel collector remotely, so that even if a user overrides a default processor, the management plane can undo it by sending its desired configuration when the agent connects.
This would allow management planes to change these settings in the future without requiring updates to agent code.

Contributor Author

So do you suggest we remove this safety check, or is it fine as is?

Collaborator

I would suggest removing this check.

Contributor Author

Okay, removed the commit that introduced that check.

codecov bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 85.81560% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.07%. Comparing base (4a95e53) to head (a944b3e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/config/config.go | 81.81% | 4 Missing and 4 partials ⚠️ |
| internal/datasource/config/nginx_config_parser.go | 71.42% | 5 Missing and 3 partials ⚠️ |
| ...ector/securityviolationsfilterprocessor/factory.go | 42.85% | 3 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1576      +/-   ##
==========================================
+ Coverage   85.02%   85.07%   +0.04%     
==========================================
  Files         103      105       +2     
  Lines       13589    13719     +130     
==========================================
+ Hits        11554    11671     +117     
- Misses       1518     1529      +11     
- Partials      517      519       +2     
| Files with missing lines | Coverage Δ |
|---|---|
| internal/collector/factories.go | 100.00% <100.00%> (ø) |
| ...tor/securityviolationsfilterprocessor/processor.go | 100.00% <100.00%> (ø) |
| internal/config/types.go | 87.09% <ø> (ø) |
| ...ector/securityviolationsfilterprocessor/factory.go | 42.85% <42.85%> (ø) |
| internal/config/config.go | 88.41% <81.81%> (+0.70%) ⬆️ |
| internal/datasource/config/nginx_config_parser.go | 79.25% <71.42%> (-0.22%) ⬇️ |

... and 2 files with indirect coverage changes


Δ = absolute <relative> (impact), ø = not affected, ? = missing data

…st message as a one-time gate — on the very first log record, it checks:

- The body is a string (not int, bytes, etc.).
- The body has exactly 27 pipe-separated fields matching the secops_dashboard-log profile format.

If either check fails, the gate closes permanently — all subsequent messages are dropped with zero overhead (early return before any iteration) until the OTel collector is restarted. If it passes, the gate opens permanently and all future string-bodied records flow through.
…nfigured against docker0 interface IP as well (along with 127.0.0.1)
@kamalchaturvedi kamalchaturvedi force-pushed the security_violations_filter_processor branch from 8f02abb to a944b3e Compare April 2, 2026 02:20
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Apr 2, 2026
…ne, replacing the:

- logsgzipprocessor: This processor was introduced for NGINX One for a deprecated project, which is not in scope anymore.
- securityviolationsprocessor: This processor was converting the syslog security violation into a JSON body log record format, and then forwarding it. This deterministic JSON log body parsing is an overhead for the agent to process for every single log record.
@kamalchaturvedi kamalchaturvedi force-pushed the security_violations_filter_processor branch from a944b3e to 1f88bd7 Compare April 3, 2026 19:37
Labels

chore (Pull requests for routine tasks), dependencies, documentation (Improvements or additions to documentation)

3 participants