Best-practice: anonymize secrets in VCR test cassettes (+ minor SDK jinja2 trust-boundary comment)

Hi Traceloop team,

While testing [AI PatchLab](https://github.com/elfrost/ai-patchlab) (an open-source local-first security scanner) on a few mid-popularity Python AI projects, I scanned openllmetry at approximately commit `72fc45e` and wanted to flag one best-practice improvement plus one minor SDK code-clarity note. Filing both as a single courtesy issue.

Full curated write-up of the scan (with FP analysis, methodology, and the findings AI PatchLab got wrong): https://elfrost.github.io/ai-patchlab/scans/traceloop-openllmetry.html

## 1. Anonymize secrets in VCR cassettes before recording

Of the 26 high-severity findings in the scan, 25 are Gitleaks matches in `packages/**/tests/cassettes/**.yaml`:

- 11× `aws-access-token` matches in `opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_*/`
- 8× `jwt` matches in `opentelemetry-instrumentation-watsonx/tests/`
- 6× `generic-api-key` matches (including PostHog `phc_…` public keys in haystack cassettes)

**None of these are credential leaks today**: the AWS findings are access key IDs without their corresponding secret keys (the SigV4 signature in the cassette is valid only for that one already-replayed request), the JWTs carry obviously placeholder claims (`sub: noone@ibm.com`, `account.bss: abc123`), and the PostHog `phc_` keys are public, write-only event-ingestion identifiers by design.

But this is still worth addressing because:
- **Cassettes leak metadata**: which AWS account, which Bedrock model, which day, which API surface. For an observability SDK that ships to enterprises, that's worth scrubbing.
- **One bad re-record away from a real secret**: if VCR isn't configured to anonymize, the next contributor who records a cassette with a real prod key against a different provider can commit it by accident. Unfiltered cassettes are a recurring source of real-world key leaks across Python OSS.

Recommended fix: configure VCR's `filter_headers`, `filter_query_parameters`, and `before_record_response` in the shared test setup (probably each package's `conftest.py` or a shared `tests/common/`):

```python
import vcr

vcr_config = vcr.VCR(
    filter_headers=[
        ('authorization', 'REDACTED'),
        ('x-api-key', 'REDACTED'),
    ],
    filter_query_parameters=[
        ('api_key', 'REDACTED'),
    ],
    # Optional: response body scrub for tokens/JWTs returned from auth endpoints
    before_record_response=lambda response: response,  # add custom redaction if needed
)
```

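VCR filters run when a cassette is recorded and do not rewrite existing files, so the current cassettes would need to be re-recorded (or scrubbed by hand) under this configuration. After that, a re-scan should clear the 25 cassette findings out of the 26 high-severity ones, provided `before_record_response` redacts any tokens that appear in response bodies. It would also sharply reduce the risk of a real secret slipping in on a future re-record.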
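The `before_record_response` hook in the snippet above is a pass-through, so here is a minimal sketch of what a body scrubber could look like. The function name `scrub_response`, the `VCR_KWARGS` dict, the two regexes, and the `x-amz-security-token` header entry are my own illustrations, not anything taken from the openllmetry repo; the patterns would need tuning against the actual Gitleaks matches. It assumes vcrpy's usual response dict shape, where the body lives at `response["body"]["string"]` as bytes.

```python
import re

# Hypothetical patterns: adjust them to whatever the scanner actually flags.
JWT_RE = re.compile(rb"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")
AWS_KEY_ID_RE = re.compile(rb"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def scrub_response(response):
    """Redact token-shaped strings from a response body before it is recorded."""
    body = response.get("body", {}).get("string")
    # Only bytes bodies are handled here; anything else is passed through as-is.
    if isinstance(body, bytes):
        body = JWT_RE.sub(b"REDACTED_JWT", body)
        body = AWS_KEY_ID_RE.sub(b"REDACTED_AWS_KEY_ID", body)
        response["body"]["string"] = body
    return response


# Keyword arguments intended for vcr.VCR(**VCR_KWARGS). If the test suites use
# the pytest-recording plugin (an assumption), the same dict could be returned
# from its `vcr_config` fixture instead.
VCR_KWARGS = {
    "filter_headers": [
        ("authorization", "REDACTED"),
        ("x-api-key", "REDACTED"),
        ("x-amz-security-token", "REDACTED"),
    ],
    "filter_query_parameters": [("api_key", "REDACTED")],
    # Decompress gzip/deflate bodies first so the regexes can see the text.
    "decode_compressed_response": True,
    "before_record_response": scrub_response,
}
```

Replacing matches with fixed strings keeps the cassette replayable as long as the tests don't assert on the token values themselves; any test that does would need its expectation updated after re-recording.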

## 2. `packages/traceloop-sdk/traceloop/sdk/prompts/client.py:44` — a comment on the `jinja2.Environment()` use

```python
obj._jinja_env = Environment()
```

A Semgrep rule (`direct-use-of-jinja2`) flags this because `Environment()` defaults to `autoescape=False`, which would be a real concern when rendering to HTML. Here the Environment is used to render LLM prompts, where `autoescape=True` would actively damage the output (escaping `<`, `>`, `&` etc. that may be intentional in the prompt).

So the current code is correct — just suggesting a one-line comment so future contributors and security scanners don't keep flagging this:

```python
# autoescape disabled: rendered output goes to an LLM as a prompt, not to HTML
obj._jinja_env = Environment()
```
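For anyone who wants to see the difference concretely, here is a small sketch comparing the two settings. The template text and the `snippet` variable are made up for illustration and are not taken from the SDK. Note that Jinja2's autoescaping applies to interpolated values, so it is the variables passed into a prompt that would get mangled.

```python
from jinja2 import Environment

# Hypothetical prompt template and input, for illustration only.
template = "Summarize this snippet: {{ snippet }}"
snippet = "if a < b && c > d: pass"

# Default Environment: values are inserted verbatim.
plain = Environment().from_string(template).render(snippet=snippet)

# autoescape=True: HTML-special characters in values become entities.
escaped = Environment(autoescape=True).from_string(template).render(snippet=snippet)

print(plain)    # Summarize this snippet: if a < b && c > d: pass
print(escaped)  # Summarize this snippet: if a &lt; b &amp;&amp; c &gt; d: pass
```

The second output is what an LLM would receive with autoescaping on, which is why leaving it off looks like the right call for prompt rendering.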

---

Both items are low-priority. Happy to open separate PRs if useful. Thanks for openllmetry — the rest of the scan turned up only false positives or by-design patterns (token-count logger calls, plugin-discovery dynamic imports, sample-app calculator with whitelisted `eval`), which is a good sign about the codebase overall.
