proto2pydantic

A protoc plugin that generates Pydantic v2 BaseModel classes from Protocol Buffer definitions — with google.api.field_behavior support.

Why this exists

Proto3 has no native required keyword. Every field silently accepts zero values, which means proto-generated models can't enforce that critical fields like message_id or url are actually provided.

Google's solution is google.api.field_behavior, a set of annotations that mark fields as REQUIRED, OPTIONAL, OUTPUT_ONLY, etc.:

string url = 1 [(google.api.field_behavior) = REQUIRED];  // must be provided
string tenant = 2;                                         // optional, zero-value OK
optional int32 history_length = 3;                         // explicitly optional

The problem: no existing proto-to-Pydantic tool reads these annotations.

| Tool | Reads field_behavior? |
| --- | --- |
| protobuf-to-pydantic | ❌ Uses PGV or custom p2p: comments |
| protoc-gen-pydantic | ❌ Uses optional keyword only |
| protobuf-pydantic-gen | ❌ Basic mapping only |
| python-betterproto | ❌ Uses optional keyword only |
| proto2pydantic | ✅ |

This means projects like A2A that use field_behavior annotations extensively have had to go through an intermediate JSON Schema step to get proper validation in Pydantic. proto2pydantic eliminates that indirection.

Features

  • 🔒 field_behavior support: REQUIRED → required Pydantic field, OUTPUT_ONLY → exclude=True
  • buf/validate support: proto validation rules → Pydantic Field() constraints
  • 🐍 Idiomatic Python: snake_case fields, str Enums, oneof → union types
  • 📦 Well-known types: Struct → dict[str, Any], Timestamp → datetime
  • 🔌 buf native: works as a local or remote buf plugin
  • ⚙️ Configurable: custom base class, camelCase aliases, output filename (see CONFIG.md for full reference)
  • 🔄 Topological sort: models ordered so dependencies are defined before use
  • 📋 __all__ exports: generated files include a clean public API list
  • 🅰️ A2A / ProtoJSON preset: preset=a2a for full ProtoJSON compatibility (camelCase, raw enums, to_proto_json(), RFC 3339 timestamps, base64 bytes)
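
The topological sort feature can be illustrated with a small sketch. The plugin itself is written in Go, so this is not its implementation; it only shows the ordering idea using Python's standard graphlib module. The dependency map and model names below are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each model lists the models its fields reference.
deps = {
    'Task': {'TaskStatus', 'Message'},
    'TaskStatus': {'TaskState'},
    'Message': set(),
    'TaskState': set(),
}

# static_order() yields each node only after all of its dependencies,
# which is the order in which classes can be written to a single .py file.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Emitting classes in this order means a model never references a name that has not been defined yet, so no forward references are needed in the generated module.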

Install

go install github.com/protocgen/proto2pydantic@latest

Usage

With buf

# buf.gen.yaml
version: v2
plugins:
  - local: protoc-gen-proto2pydantic
    out: src/types
    opt:
      - preset=a2a
      - base_class=a2a._base.A2ABaseModel
      - output_file=types.py

Then run:

buf generate

With protoc

protoc --proto2pydantic_out=./output \
       --proto2pydantic_opt=base_class=myapp.BaseModel \
       your_service.proto

How it works

.proto file → protoc/buf → proto2pydantic → .py with Pydantic models

Given:

message AgentInterface {
  string url = 1 [(google.api.field_behavior) = REQUIRED];
  string protocol_binding = 2 [(google.api.field_behavior) = REQUIRED];
  string tenant = 3;
  string protocol_version = 4 [(google.api.field_behavior) = REQUIRED];
}

Generates:

class AgentInterface(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
        alias_generator=to_camel,
    )

    url: str = Field(..., description='The URL where this interface is available.')
    protocol_binding: str = Field(..., description='The protocol binding supported at this URL.')
    tenant: str = Field(default='', description='Tenant ID.')
    protocol_version: str = Field(..., description='The version of the A2A protocol.')

  • Field(...) = required; Pydantic raises ValidationError if the field is missing
  • Field(default='') = proto3 zero-value default; the field is optional
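
A minimal sketch of what this difference means at runtime, assuming Pydantic v2 is installed. The class is a trimmed, hand-written copy of the generated model above (alias config and descriptions omitted), and the URL is a placeholder.

```python
from pydantic import BaseModel, Field, ValidationError


class AgentInterface(BaseModel):
    # Trimmed copy of the generated model: one required, one defaulted field.
    url: str = Field(...)
    tenant: str = Field(default='')


# Omitting the defaulted field is fine: it falls back to the proto3 zero value.
ok = AgentInterface(url='https://example.com/a2a')

# Omitting the required field raises ValidationError with a 'missing' error.
missing = []
try:
    AgentInterface(tenant='acme')
except ValidationError as exc:
    missing = [err['loc'] for err in exc.errors() if err['type'] == 'missing']

print(repr(ok.tenant), missing)
```

Here ok.tenant is the empty string, while the second construction fails and reports url as the missing field.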

Validation with buf/validate

proto2pydantic reads buf/validate (the successor to protoc-gen-validate) and maps constraints to Pydantic Field() arguments:

import "buf/validate/validate.proto";

message CreateUserRequest {
  string email = 1 [
    (google.api.field_behavior) = REQUIRED,
    (buf.validate.field).string.email = true
  ];
  string name = 2 [
    (buf.validate.field).string = {min_len: 1, max_len: 100}
  ];
  int32 age = 3 [
    (buf.validate.field).int32 = {gte: 0, lte: 150}
  ];
  repeated string tags = 4 [
    (buf.validate.field).repeated = {min_items: 1, max_items: 10}
  ];
}

Generates:

class CreateUserRequest(BaseModel):
    email: str = Field(...)
    name: str = Field(default='', min_length=1, max_length=100)
    age: int = Field(default=0, ge=0, le=150)
    tags: list[str] | None = Field(default=None, min_length=1, max_length=10)

| buf/validate rule | Pydantic Field() |
| --- | --- |
| required | Field(...): no default, field is required |
| string.min_len | min_length= |
| string.max_len | max_length= |
| string.pattern | pattern= |
| int32.gte / float.gte | ge= |
| int32.lte / float.lte | le= |
| int32.gt / float.gt | gt= |
| int32.lt / float.lt | lt= |
| repeated.min_items | min_length= |
| repeated.max_items | max_length= |
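
To see the mapped constraints in action, here is a small sketch assuming Pydantic v2. The model is a hand-copied subset of the generated CreateUserRequest above, and failed_fields is a hypothetical helper written only for this illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class CreateUserRequest(BaseModel):
    # Two fields copied from the generated output shown earlier.
    name: str = Field(default='', min_length=1, max_length=100)
    age: int = Field(default=0, ge=0, le=150)


def failed_fields(**kwargs):
    """Return the sorted names of fields that fail validation (empty if none)."""
    try:
        CreateUserRequest(**kwargs)
    except ValidationError as exc:
        return sorted(str(err['loc'][0]) for err in exc.errors())
    return []


print(failed_fields(name='Ada', age=36))       # valid input
print(failed_fields(name='x' * 101, age=-1))   # both constraints violated
print(failed_fields())                         # defaults only
```

One thing to keep in mind: Pydantic does not validate default values unless validate_default is enabled, so the last call passes even though the default name '' is shorter than min_length=1. The constraint only applies to values that are explicitly provided.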

field_behavior annotations

| Annotation | Effect |
| --- | --- |
| REQUIRED | No default value → Pydantic requires the field |
| OUTPUT_ONLY | Field(exclude=True) → excluded from model_dump() |
| OPTIONAL | Treated as proto3 default (zero-value) |
| (none) | proto3 zero-value default |
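
The effect of the OUTPUT_ONLY mapping can be sketched as follows, assuming Pydantic v2. The Book model and its fields are hypothetical, and the exact Field() arguments the plugin emits alongside exclude=True are an assumption.

```python
from pydantic import BaseModel, Field


class Book(BaseModel):
    # Hypothetical model written by hand for illustration.
    title: str = Field(...)                        # REQUIRED
    etag: str = Field(default='', exclude=True)    # OUTPUT_ONLY
    subtitle: str = Field(default='')              # no annotation


book = Book(title='Dune', etag='abc123')
dumped = book.model_dump()
print(dumped)
```

The excluded field is still readable on the instance (book.etag), but it does not appear in the dictionary returned by model_dump().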

Options

| Option | Description | Example |
| --- | --- | --- |
| preset | Preset configuration. a2a auto-sets alias_generator=camel + enum_style=raw for ProtoJSON | a2a |
| base_class | Custom base class for models | a2a._base.A2ABaseModel |
| alias_generator | Add model_config with Pydantic's to_camel for ProtoJSON-compatible lowerCamelCase aliases (populate_by_name=True allows both snake_case and camelCase input) | camel |
| enum_style | Enum generation style. raw preserves original proto names (e.g., TASK_STATE_COMPLETED) and includes UNSPECIFIED values for ProtoJSON compatibility. Default strips prefix and lowercases | raw |
| output_file | Override output filename | types.py |
| strip_proto_suffix | Use foo.py instead of foo_pb2_pydantic.py | true |
| description | Override module-level docstring | A2A type definitions |

Type Mapping

| Proto | Python |
| --- | --- |
| string | str |
| int32, int64, etc. | int |
| float, double | float |
| bool | bool |
| bytes | bytes (with base64 @field_serializer for ProtoJSON) |
| repeated T | list[T] |
| map<K, V> | dict[K, V] |
| optional T | T \| None |
| oneof | T1 \| T2 \| ... \| None |
| google.protobuf.Struct | dict[str, Any] |
| google.protobuf.Timestamp | datetime (with RFC 3339 @field_serializer for ProtoJSON) |
| google.protobuf.Value | Any |
| Enum (default) | str Enum (prefix-stripped, lowercase) |
| Enum (enum_style=raw) | str Enum (original proto names, e.g., TASK_STATE_COMPLETED) |
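
The two enum styles can be contrasted with a short sketch. This is an illustration only: default_style_name is a hypothetical helper that approximates "prefix-stripped, lowercase", and the exact member names and values in real generated code may differ.

```python
import re
from enum import Enum


def default_style_name(enum_name: str, value_name: str) -> str:
    """Hypothetical helper: ('TaskState', 'TASK_STATE_WORKING') -> 'working'."""
    # Derive the SCREAMING_SNAKE_CASE prefix from the CamelCase enum name.
    prefix = re.sub(r'(?<!^)(?=[A-Z])', '_', enum_name).upper() + '_'
    if value_name.startswith(prefix):
        value_name = value_name[len(prefix):]
    return value_name.lower()


class TaskState(str, Enum):
    # enum_style=raw: original proto names are kept, UNSPECIFIED included.
    TASK_STATE_UNSPECIFIED = 'TASK_STATE_UNSPECIFIED'
    TASK_STATE_WORKING = 'TASK_STATE_WORKING'


print(default_style_name('TaskState', 'TASK_STATE_WORKING'))
print(TaskState('TASK_STATE_WORKING') is TaskState.TASK_STATE_WORKING)
```

Because the enum subclasses str, a raw ProtoJSON string such as "TASK_STATE_WORKING" can be looked up by value and compares equal to the member.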

ProtoJSON / A2A Support

For projects following the ProtoJSON specification (like A2A per ADR-001), use the preset=a2a option:

# buf.gen.yaml
plugins:
  - local: protoc-gen-proto2pydantic
    out: src/types
    opt:
      - preset=a2a

This enables:

| ProtoJSON Requirement | What preset=a2a does |
| --- | --- |
| camelCase field names | alias_generator=to_camel + populate_by_name=True |
| SCREAMING_SNAKE_CASE enums | enum_style=raw preserves original proto names |
| UNSPECIFIED enum values | Included (not skipped) |
| Null omission | to_proto_json() method on every model |
| Timestamp → RFC 3339 | @field_serializer emitting "2025-01-01T10:00:00.000Z" |
| bytes → base64 | @field_serializer emitting base64-encoded strings |
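
The serialization rules in this table can be sketched with the standard library alone. This is not the generated to_proto_json() method, whose implementation is not shown here; proto_json_sketch and snake_to_camel are hypothetical helpers that apply the same four rules to a plain dict.

```python
import base64
from datetime import datetime, timezone


def snake_to_camel(name: str) -> str:
    head, *rest = name.split('_')
    return head + ''.join(part.capitalize() for part in rest)


def proto_json_sketch(data: dict) -> dict:
    out = {}
    for key, value in data.items():
        if value is None:
            continue  # null omission
        if isinstance(value, dict):
            value = proto_json_sketch(value)
        elif isinstance(value, datetime):
            # Expects a timezone-aware datetime; emits RFC 3339 in UTC.
            utc = value.astimezone(timezone.utc)
            value = utc.isoformat(timespec='milliseconds').replace('+00:00', 'Z')
        elif isinstance(value, bytes):
            value = base64.b64encode(value).decode('ascii')
        out[snake_to_camel(key)] = value
    return out


sample = {
    'context_id': 'ctx-456',
    'created_at': datetime(2025, 1, 1, 10, 0, 0, tzinfo=timezone.utc),
    'payload': b'hi',
    'metadata': None,
}
result = proto_json_sketch(sample)
print(result)
```

With the sample input, the None entry is dropped, keys become camelCase, the timestamp is rendered as "2025-01-01T10:00:00.000Z", and the bytes value becomes the base64 string "aGk=".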

Round-trip example

from generated_types import Task, TaskState

# Deserialize ProtoJSON → Pydantic (camelCase keys accepted)
task = Task.model_validate({
    "id": "task-123",
    "contextId": "ctx-456",
    "status": {"state": "TASK_STATE_WORKING"}
})

# Pythonic access
assert task.context_id == "ctx-456"
assert task.status.state == TaskState.TASK_STATE_WORKING

# Serialize back to ProtoJSON (camelCase keys, no None values)
proto_json = task.to_proto_json()
# {"id": "task-123", "contextId": "ctx-456", "status": {"state": "TASK_STATE_WORKING"}}

Contributing

See CONTRIBUTING.md for development setup, PR process, and commit signing requirements.

Security & Supply Chain

All release binaries include SLSA Level 3 provenance. Verify a downloaded binary:

# Install the verifier
go install github.com/slsa-framework/slsa-verifier/v2/cli/slsa-verifier@latest

# Verify
slsa-verifier verify-artifact proto2pydantic_0.2.0_linux_amd64.tar.gz \
  --provenance-path multiple.intoto.jsonl \
  --source-uri github.com/protocgen/proto2pydantic

Additional security measures:

| Measure | Details |
| --- | --- |
| SLSA L3 provenance | Signed build attestations for every release |
| CodeQL | Semantic code analysis on every PR + weekly scan |
| govulncheck | Go vulnerability database checks in CI |
| gosec | Go security linter in CI |
| Signed commits | Required on main via repository ruleset |
| Immutable tags | Release tags cannot be deleted or force-pushed |
| Dependabot | Automated dependency updates (Go modules + GitHub Actions) |

See SECURITY.md for reporting vulnerabilities.

License

Apache-2.0
