Skip to content

Add intpack FOR encoding and packfile offset index codec#649

Open
tamirms wants to merge 1 commit intofeature/full-historyfrom
add-packfile-core
Open

Add intpack FOR encoding and packfile offset index codec#649
tamirms wants to merge 1 commit intofeature/full-historyfrom
add-packfile-core

Conversation

@tamirms
Copy link
Copy Markdown
Contributor

@tamirms tamirms commented Apr 1, 2026

Summary

First in a series of PRs checking in the packfile implementation for full history RPC v2. See #633 for the full design doc.

This PR adds two packages:

intpack (intpack.go) — A standalone Frame of Reference integer codec. Encodes groups of uint32 values by subtracting the group minimum and bit-packing residuals at the minimum bit width needed. Each encoded group ends with a 1-byte width and 4-byte minimum, so callers can locate the metadata from the tail without knowing the packed size. Used by the packfile offset index (this PR), per-record item size indexes (later PR), and the event store app data encoding.

Offset index codec (index.go) — Builds on intpack to encode/decode the file-level index that maps record numbers to byte positions on disk. Rather than storing absolute offsets (which grow with file size), the index stores record byte sizes as FOR groups of up to 128 values with a trailing CRC32C. On open, all groups are decoded into a flat []int64 offset table for O(1) lookup. Includes an OOM guard that validates record count against index size before allocating, and a structural invariant check (prefix-sum must equal the known index base offset).

Test plan

  • intpack_test.go — roundtrip encoding/decoding, full uint32 range (width 32), error cases (invalid width, truncated payload, n<=0), buffer reuse, binary layout verification, decoding with prefix data, crafted width-0 input
  • index_test.go — roundtrip with various sizes (single record, multi-group, partial last group), CRC corruption detection, implausible record count rejection, index base mismatch detection, negative record count, buffer too small, non-monotonic offsets, zero records, empty offsets, non-zero start, delta exceeding uint32, large deltas near MaxUint32

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 1, 2026 15:48
@tamirms tamirms force-pushed the add-packfile-core branch from 36fb67c to 2532614 Compare April 1, 2026 15:52
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds foundational packfile building blocks for the “full history RPC v2” effort: a Frame-of-Reference (FOR) integer codec and a FOR-based offset index codec with CRC32C integrity checking.

Changes:

  • Introduces FOR encoding/decoding for []uint32 groups with a fixed tail metadata layout.
  • Implements an offset index encoder/decoder that stores per-record byte sizes in FOR-128 groups and validates integrity via CRC32C and structural checks.
  • Adds unit tests covering round-trips, corruption detection, and key error cases for both codecs.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
cmd/stellar-rpc/internal/packfile/for.go Adds FOR group encoder/decoder and bit-pack/unpack helpers.
cmd/stellar-rpc/internal/packfile/for_test.go Tests FOR codec roundtrip, layout, buffer reuse, and error paths.
cmd/stellar-rpc/internal/packfile/index.go Adds FOR-128 offset index codec with CRC32C and sanity checks.
cmd/stellar-rpc/internal/packfile/index_test.go Tests index roundtrip, CRC corruption detection, and sanity check failures.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +25 to +36
func decodeIndex(buf []byte, recordCount int, indexSize int, indexBase int64) ([]int64, error) {
if indexSize < 4 {
return nil, fmt.Errorf("%w: index too small (%d bytes)", ErrCorrupt, indexSize)
}

// Sanity-check recordCount against indexSize to prevent OOM from crafted trailers.
// Each FOR group of up to 128 records requires at least 6 bytes (1B packed + 1B width + 4B min).
maxGroups := (indexSize - 4) / 6 // subtract CRC, divide by min group size
maxRecords := maxGroups * groupSize
if recordCount > maxRecords {
return nil, fmt.Errorf("%w: recordCount %d implausible for indexSize %d", ErrCorrupt, recordCount, indexSize)
}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decodeIndex should reject negative recordCount. As written, make([]uint32, recordCount) will panic for recordCount < 0, which could be triggered by corrupt/untrusted metadata. Add an early check like if recordCount < 0 { return nil, ... }.

Copilot uses AI. Check for mistakes.
@tamirms tamirms force-pushed the add-packfile-core branch from 2532614 to e5a9c20 Compare April 1, 2026 15:56
@tamirms tamirms changed the base branch from main to feature/full-history April 1, 2026 15:58
@tamirms tamirms requested a review from Copilot April 1, 2026 15:59
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +106 to +110
func encodeIndex(offsets []int64) ([]byte, error) {
recordCount := len(offsets) - 1
if recordCount > math.MaxUint32 {
return nil, fmt.Errorf("packfile: record count %d exceeds uint32 max", recordCount)
}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encodeIndex doesn’t validate its input slice. If len(offsets)==0, recordCount becomes -1 and the function returns a CRC-only index, which is not a valid encoding. Also consider validating that offsets[0] == 0, since decodeIndex always reconstructs offsets starting from 0 (non-zero starts won’t round-trip correctly).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@tamirms tamirms force-pushed the add-packfile-core branch 6 times, most recently from 7cd92b5 to edc3ca7 Compare April 1, 2026 19:57
@tamirms tamirms requested a review from a team April 1, 2026 23:37
@tamirms tamirms force-pushed the add-packfile-core branch from edc3ca7 to 2aa5deb Compare April 2, 2026 10:20
@tamirms tamirms changed the title Add packfile FOR encoding and offset index codec Add intpack FOR encoding and packfile offset index codec Apr 2, 2026
First in a series of PRs checking in the packfile implementation for
full history RPC v2. This PR adds:

- intpack: standalone Frame of Reference (FOR) integer encoding/decoding
  package (intpack.go, intpack_test.go)
- packfile: offset index codec using FOR groups with CRC32C
  (index.go, index_test.go)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tamirms tamirms force-pushed the add-packfile-core branch from 2aa5deb to 2749f73 Compare April 2, 2026 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants