feature: Masked() and MaskedToBuf() — apply bitmask to keys and merge colliding containers #32

Open: aliszka wants to merge 2 commits into optimizations_vol4 from mask


@aliszka aliszka commented Mar 13, 2026

Motivation

Bitmap keys encode dimensions in the upper bits. Callers need to project a bitmap onto a sub-dimension by masking out the upper bits and merging containers whose keys collide under the mask — without materialising an intermediate bitmap per dimension.

Approach

Masked(mask uint64) iterates over all source containers, applies the mask to each key, and OR-merges containers that collapse to the same masked key into the result. The lowest 16 bits of the mask are forced to zero since keys never use them.
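
The mask-and-merge semantics described above can be sketched with a toy model. This is not the library's container code: `toyBitmap`, `container`, `add`, `masked`, and `values` are hypothetical names, and a map of sets stands in for the real packed containers. It only illustrates that keys are masked (with the low 16 bits of the mask cleared) and that containers landing on the same masked key are OR-merged.

```go
package main

import (
	"fmt"
	"sort"
)

// container is a toy stand-in for a real container: a set of low 16-bit values.
type container map[uint16]struct{}

// toyBitmap maps a key (a value with its low 16 bits zeroed) to a container.
type toyBitmap map[uint64]container

// add splits x into a key (upper 48 bits) and a low 16-bit value and stores it.
func (b toyBitmap) add(x uint64) {
	key := x &^ 0xFFFF
	if b[key] == nil {
		b[key] = container{}
	}
	b[key][uint16(x)] = struct{}{}
}

// masked applies mask to every key and OR-merges containers whose masked keys collide.
func (b toyBitmap) masked(mask uint64) toyBitmap {
	mask &^= 0xFFFF // keys never use the low 16 bits, so they are forced to zero
	out := toyBitmap{}
	for key, c := range b {
		mk := key & mask
		if out[mk] == nil {
			out[mk] = container{}
		}
		for lo := range c {
			out[mk][lo] = struct{}{} // set union plays the role of the container OR
		}
	}
	return out
}

// values returns every stored value in ascending order.
func (b toyBitmap) values() []uint64 {
	var vs []uint64
	for key, c := range b {
		for lo := range c {
			vs = append(vs, key|uint64(lo))
		}
	}
	sort.Slice(vs, func(i, j int) bool { return vs[i] < vs[j] })
	return vs
}

func main() {
	b := toyBitmap{}
	b.add(0x0001_0000_0000_0005) // dimension 1, value 5
	b.add(0x0002_0000_0000_0005) // dimension 2, value 5
	b.add(0x0002_0000_0000_0007) // dimension 2, value 7

	// Masking out the top 16 bits collapses both dimensions onto key 0.
	fmt.Println(b.masked(0x0000_FFFF_FFFF_FFFF).values())
}
```

In this model the two source keys collapse to a single masked key, so the duplicate value 5 appears once in the result alongside 7.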

Two optimisations in the merge loop keep allocations minimal:

  • Key space is pre-sized upfront (expandConditionally(an, 0)) so the loop never shifts container data to make room for new keys.
  • When an inline OR fails (container grew beyond its current allocation) and the old container sits at the tail of b.data, the slice is trimmed and regrown in place, avoiding a dead container fragment.
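
The trim-and-regrow idea in the second bullet can be illustrated on a toy flat buffer. The `arena` type and its `appendContainer` and `replace` methods are hypothetical and much simpler than `b.data`; the sketch only assumes that containers are stored back to back and that a grown container is re-appended at the tail. The check `off+oldLen == len(a.data)` mirrors the tail condition flagged for review.

```go
package main

import "fmt"

// arena is a toy flat buffer that stores containers back to back.
type arena struct {
	data []uint16
	dead int // words that no key refers to any more
}

// appendContainer copies c to the tail of the buffer and returns its offset.
func (a *arena) appendContainer(c []uint16) int {
	off := len(a.data)
	a.data = append(a.data, c...)
	return off
}

// replace swaps the container at [off, off+oldLen) for a larger one and
// returns the offset of the new copy.
func (a *arena) replace(off, oldLen int, grown []uint16) int {
	if off+oldLen == len(a.data) {
		// The old container is the tail: trim it, then regrow in place.
		a.data = a.data[:off]
	} else {
		// The old container sits in the middle: its words become dead space.
		a.dead += oldLen
	}
	return a.appendContainer(grown)
}

func main() {
	a := &arena{}
	x := a.appendContainer([]uint16{1, 2})
	y := a.appendContainer([]uint16{3})

	// y is at the tail, so it is regrown at the same offset with no dead space.
	y = a.replace(y, 1, []uint16{3, 4, 5})
	fmt.Println(y, len(a.data), a.dead)

	// x is not at the tail, so the old copy is left behind as dead space.
	x = a.replace(x, 2, []uint16{1, 2, 9})
	fmt.Println(x, len(a.data), a.dead)
}
```

When the tail condition holds, the buffer ends up exactly as large as the live containers; otherwise it grows by the full size of the new container and the old copy is wasted.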

MaskedToBuf(mask, buf) is the same operation but writes into a caller-supplied byte slice via NewBitmapToBuf, avoiding heap allocation when the caller can reuse a pre-allocated buffer.
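
The buffer-reuse pattern behind MaskedToBuf can be sketched without the library. `withBuf` is a hypothetical helper, not the signature of NewBitmapToBuf, and the fallback to a fresh allocation when the buffer is too small is an assumption made for illustration. The sketch uses the `clear` builtin, which requires Go 1.21 or later.

```go
package main

import "fmt"

// withBuf returns a zeroed n-byte slice. It reuses buf's backing array when the
// capacity is sufficient and (by assumption) allocates a new slice otherwise.
func withBuf(buf []byte, n int) (out []byte, reused bool) {
	if cap(buf) < n {
		return make([]byte, n), false
	}
	out = buf[:n]
	clear(out) // zero any stale bytes left by a previous use
	return out, true
}

func main() {
	scratch := make([]byte, 0, 64) // allocated once by the caller

	for i := 0; i < 3; i++ {
		out, reused := withBuf(scratch, 32)
		out[0] = byte(i + 1) // dirty the buffer; the next call clears it again
		fmt.Println(len(out), reused)
	}

	// A request larger than the scratch capacity cannot reuse it.
	_, reused := withBuf(scratch, 128)
	fmt.Println(reused)
}
```

A caller that projects many dimensions in a loop would allocate the scratch buffer once and pass it to every call, provided each result is consumed before the buffer is reused.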

Key areas for review

  • bitmap_opt.go:maskedInto: the trim-and-regrow path (b.data = b.data[:boff]) is only safe when boff+len(bc) == len(b.data); verify the condition is tight
  • bitmap_opt.go:maskedInto: expandConditionally(an, 0) pre-sizes for the worst case (no key collisions); with heavy collisions the key space is over-allocated, but this is bounded by the source key count

Risks and mitigations

  • Trim-and-regrow leaves dead space when the replaced container is not at the tail: accepted trade-off; sequential key iteration almost always produces containers in append order, so the tail condition holds in the common case
  • Mask low-bit truncation is silent: callers passing a mask with any of the low 16 bits set get them cleared without an error; this is documented on the method

Testing

Tests cover nil input, no key overlap, key collision with OR merge (including multi-key collapse under zero mask), mask boundary conditions (zero mask, identity mask, low-16-bits ignored), and the buffer-swap logic in the OR accumulation loop for non-overlapping arrays, overlapping arrays, array-to-bitmap conversion, and three-container chains.


@orca-security-eu orca-security-eu bot left a comment


Orca Security Scan Summary

Check: status (issues by priority)
Infrastructure as Code: Passed (high 0, medium 0, low 0, info 0)
SAST: Passed (high 0, medium 0, low 0, info 0)
Secrets: Passed (high 0, medium 0, low 0, info 0)
Vulnerabilities: Passed (high 0, medium 0, low 0, info 0)

@aliszka aliszka changed the title from "feature: introduces masked method" to "feature: introduces masked methods" Mar 13, 2026
@aliszka aliszka force-pushed the mask branch 3 times, most recently from 52b58bb to 4a51c7d Compare March 17, 2026 17:46
@aliszka aliszka changed the base branch from main to optimizations_vol4 April 7, 2026 10:09
aliszka and others added 2 commits April 7, 2026 12:13
Creates a bitmap backed by a caller-supplied byte slice instead of a
heap allocation. The _ptr field retains a GC reference to the original
[]byte so the underlying memory is not collected while the bitmap is live.

Uses clear() to zero the buffer before initialisation — consistent with
the rest of the codebase after Memclr was removed.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

Masked() applies a bitmask to every key and returns a new bitmap where
keys that collapse to the same masked value have their containers merged
via container-level OR operations.

MaskedToBuf() is the same but uses a caller-provided byte slice as the
underlying buffer, avoiding heap allocation when the buffer is large
enough.

Optimizations:
- Pre-size key space upfront to avoid repeated key expansion and
  container memmoves during the merge loop.
- When an inline merge fails and the replaced container is at the end
  of b.data, trim and regrow in place to eliminate dead container space.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@orca-security-eu orca-security-eu bot left a comment


Orca Security Scan Summary

Check: status (issues by priority)
Secrets: Passed (high 0, medium 0, low 0, info 0)

@aliszka aliszka changed the title from "feature: introduces masked methods" to "feature: Masked() and MaskedToBuf() — apply bitmask to keys and merge colliding containers" Apr 7, 2026