Add intpack FOR encoding and packfile offset index codec#649
Add intpack FOR encoding and packfile offset index codec#649tamirms wants to merge 1 commit intofeature/full-historyfrom
Conversation
36fb67c to
2532614
Compare
There was a problem hiding this comment.
Pull request overview
Adds foundational packfile building blocks for the “full history RPC v2” effort: a Frame-of-Reference (FOR) integer codec and a FOR-based offset index codec with CRC32C integrity checking.
Changes:
- Introduces FOR encoding/decoding for
[]uint32groups with a fixed tail metadata layout. - Implements an offset index encoder/decoder that stores per-record byte sizes in FOR-128 groups and validates integrity via CRC32C and structural checks.
- Adds unit tests covering round-trips, corruption detection, and key error cases for both codecs.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| cmd/stellar-rpc/internal/packfile/for.go | Adds FOR group encoder/decoder and bit-pack/unpack helpers. |
| cmd/stellar-rpc/internal/packfile/for_test.go | Tests FOR codec roundtrip, layout, buffer reuse, and error paths. |
| cmd/stellar-rpc/internal/packfile/index.go | Adds FOR-128 offset index codec with CRC32C and sanity checks. |
| cmd/stellar-rpc/internal/packfile/index_test.go | Tests index roundtrip, CRC corruption detection, and sanity check failures. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func decodeIndex(buf []byte, recordCount int, indexSize int, indexBase int64) ([]int64, error) { | ||
| if indexSize < 4 { | ||
| return nil, fmt.Errorf("%w: index too small (%d bytes)", ErrCorrupt, indexSize) | ||
| } | ||
|
|
||
| // Sanity-check recordCount against indexSize to prevent OOM from crafted trailers. | ||
| // Each FOR group of up to 128 records requires at least 6 bytes (1B packed + 1B width + 4B min). | ||
| maxGroups := (indexSize - 4) / 6 // subtract CRC, divide by min group size | ||
| maxRecords := maxGroups * groupSize | ||
| if recordCount > maxRecords { | ||
| return nil, fmt.Errorf("%w: recordCount %d implausible for indexSize %d", ErrCorrupt, recordCount, indexSize) | ||
| } |
There was a problem hiding this comment.
decodeIndex should reject negative recordCount. As written, make([]uint32, recordCount) will panic for recordCount < 0, which could be triggered by corrupt/untrusted metadata. Add an early check like if recordCount < 0 { return nil, ... }.
2532614 to
e5a9c20
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func encodeIndex(offsets []int64) ([]byte, error) { | ||
| recordCount := len(offsets) - 1 | ||
| if recordCount > math.MaxUint32 { | ||
| return nil, fmt.Errorf("packfile: record count %d exceeds uint32 max", recordCount) | ||
| } |
There was a problem hiding this comment.
encodeIndex doesn’t validate its input slice. If len(offsets)==0, recordCount becomes -1 and the function returns a CRC-only index, which is not a valid encoding. Also consider validating that offsets[0] == 0, since decodeIndex always reconstructs offsets starting from 0 (non-zero starts won’t round-trip correctly).
e5a9c20 to
6036c01
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
6036c01 to
a625401
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
7cd92b5 to
edc3ca7
Compare
edc3ca7 to
2aa5deb
Compare
First in a series of PRs checking in the packfile implementation for full history RPC v2. This PR adds: - intpack: standalone Frame of Reference (FOR) integer encoding/decoding package (intpack.go, intpack_test.go) - packfile: offset index codec using FOR groups with CRC32C (index.go, index_test.go) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2aa5deb to
2749f73
Compare
Summary
First in a series of PRs checking in the packfile implementation for full history RPC v2. See #633 for the full design doc.
This PR adds two packages:
intpack (
intpack.go) — A standalone Frame of Reference integer codec. Encodes groups of uint32 values by subtracting the group minimum and bit-packing residuals at the minimum bit width needed. Each encoded group ends with a 1-byte width and 4-byte minimum, so callers can locate the metadata from the tail without knowing the packed size. Used by the packfile offset index (this PR), per-record item size indexes (later PR), and the event store app data encoding.Offset index codec (
index.go) — Builds on intpack to encode/decode the file-level index that maps record numbers to byte positions on disk. Rather than storing absolute offsets (which grow with file size), the index stores record byte sizes as FOR groups of up to 128 values with a trailing CRC32C. On open, all groups are decoded into a flat[]int64offset table for O(1) lookup. Includes an OOM guard that validates record count against index size before allocating, and a structural invariant check (prefix-sum must equal the known index base offset).Test plan
intpack_test.go— roundtrip encoding/decoding, full uint32 range (width 32), error cases (invalid width, truncated payload, n<=0), buffer reuse, binary layout verification, decoding with prefix data, crafted width-0 inputindex_test.go— roundtrip with various sizes (single record, multi-group, partial last group), CRC corruption detection, implausible record count rejection, index base mismatch detection, negative record count, buffer too small, non-monotonic offsets, zero records, empty offsets, non-zero start, delta exceeding uint32, large deltas near MaxUint32🤖 Generated with Claude Code