Silent value corruption: YAML scalars with underscore-separated digits are mangled on round-trip #2161

@joelgriffiths

What happens:
A value like USERNAME: 12345_54321 comes back as 1234554321 after sops -e followed by sops -d. The underscore is gone. No error, no warning, nothing in the output to tell me anything changed.

To Reproduce:

❯ cat test.yaml
apiVersion: v1
kind: ConfigMap
metadata:
    name: test_cm
data:
    USERNAME: 12345_54321
❯ sops -e --age age12ua3m79hvhg8kc0p8lkzzu78fac9rtgy424retgc58prztrfccyqm0urdg test.yaml | \
sops -d --filename-override test.yaml /dev/stdin
apiVersion: v1
kind: ConfigMap
metadata:
    name: test_cm
data:
    USERNAME: 1234554321

This matters more than the other round-trip issues because it isn't just a formatting problem like #864 or a quoting problem like #760 or #949. Those are annoying but the value survives. This one changes the actual decrypted value of a secret. For a tool whose entire job is preserving secret values through encryption and decryption, silently mutating the value is a real problem. There's no signal to the operator that anything happened until something downstream fails to authenticate, and at that point you have no idea why.

Root cause
go-yaml parses 12345_54321 as the integer 1234554321, treating the underscore as a YAML 1.1 digit separator. YAML 1.2 dropped underscore separators from the integer schema, but go-yaml documents that it intentionally retains some 1.1 behaviors for backward compatibility (the README calls this out explicitly for octals, and the same posture covers this case). SOPS operates on the parsed data model, so when it serializes back out, it emits the canonical integer form and the underscore is gone forever.
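To make the resolution step concrete, here is a minimal Python sketch of the YAML 1.1 decimal-integer rule described above. It is not go-yaml's code; `resolve_yaml11_int` and the regex constant are hypothetical names for illustration, and only the base-10 form is covered.

```python
import re

# YAML 1.1 base-10 integer form: optional sign, then digits that may
# contain underscores as separators. (Assumption: other bases are ignored.)
YAML11_DECIMAL_INT = re.compile(r"^[-+]?(0|[1-9][0-9_]*)$")


def resolve_yaml11_int(scalar):
    """Return the int a YAML 1.1 style resolver would produce, or None.

    Hypothetical helper: it mimics the rule, not any real parser's internals.
    """
    if not YAML11_DECIMAL_INT.match(scalar):
        return None
    # The separators are dropped here, so the original text is unrecoverable.
    return int(scalar.replace("_", ""))


original = "12345_54321"
value = resolve_yaml11_int(original)
print(value)                   # 1234554321
print(str(value) == original)  # False: the serialized form lost the underscore
```

Once the scalar has been resolved to an integer, nothing in the data model records that an underscore was ever there, which is why the serializer cannot put it back.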

I understand that you can't just reject unquoted values like this, because the same byte sequence is genuinely ambiguous:

A: 12345_54321   # could be the string "12345_54321"
B: 12345_54321   # could be an int 1234554321 with 1.1-style separators

There's no way for SOPS to tell those apart from the file alone. So I'm not asking for a behavior change that breaks existing users.

What I'm asking for is a warning. Before SOPS writes output, it could parse and reserialize the plaintext and compare the result to the original. If they differ in ways beyond whitespace, print something to stderr like:

warning: value at .USERNAME will not round-trip: "12345_54321" -> 1234554321

That's it. A simple heads-up so people understand that their value may have been mutated before they ship a corrupted secret into production.

A --strict flag that hard-errors on the same condition would also be great as an opt-in for people who are using SOPS for credentials and want guarantees, but the warning alone could help.
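A rough sketch of what the check could look like, written in Python for brevity rather than as SOPS code. Everything here is hypothetical: `check_round_trip`, `canonical_form`, `RoundTripError`, the `strict` parameter standing in for a `--strict` flag, and the path format. It assumes the caller already has each unquoted scalar's raw text keyed by its path, and it only knows about underscore-separated decimal integers.

```python
import re
import sys

# Same YAML 1.1 base-10 integer pattern as in the root-cause discussion.
YAML11_DECIMAL_INT = re.compile(r"^[-+]?(0|[1-9][0-9_]*)$")


class RoundTripError(Exception):
    """Raised in strict mode when a value would not survive a round-trip."""


def canonical_form(raw):
    # Hypothetical: the text a serializer would emit after the scalar
    # has been resolved to an integer. Non-integers pass through unchanged.
    if YAML11_DECIMAL_INT.match(raw):
        return str(int(raw.replace("_", "")))
    return raw


def check_round_trip(scalars, strict=False):
    """scalars maps a path to the raw, unquoted scalar text from the file."""
    warnings = []
    for path, raw in scalars.items():
        canon = canonical_form(raw)
        if canon != raw:
            warnings.append(
                f'warning: value at {path} will not round-trip: "{raw}" -> {canon}'
            )
    if strict and warnings:
        raise RoundTripError("\n".join(warnings))
    for message in warnings:
        print(message, file=sys.stderr)
    return warnings


check_round_trip({".metadata.name": "test_cm", ".data.USERNAME": "12345_54321"})
```

With `strict=False` the call above only writes the warning to stderr and continues; with `strict=True` the same input raises `RoundTripError` before any output would be written.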

Related issues
#760, #864, #949, #1068. Same general class of round-trip fidelity problems. This one is worse because it changes the value, not the formatting.


Labels: enhancement, not-a-bug (SOPS behaves as designed, though apparently not as the user expected), stores/yaml
