Add context option to .import for passing custom data #996

Open
dimitrikochnev wants to merge 1 commit into toptal:master from lockvoid:context
Conversation

@dimitrikochnev

Motivation

When importing objects, the data needed for indexing (e.g., embeddings, computed attributes) is often already available in memory at the call site. However, crutch blocks are currently forced to re-fetch this data from the database (even with direct_import), resulting in redundant queries and wasted resources.

There is currently no mechanism for passing this pre-computed data from the import call site down into crutch blocks or field value procs.

This design aligns with established patterns in the Ruby ecosystem, such as graphql-ruby and ActiveModelSerializers. In those libraries, a context object is similarly passed through the resolution or serialization stack to share request-level state (like current user, auth tokens, or pre-loaded data) without relying on global state.

Solution

Add an optional context: keyword argument to import/import! that flows through the entire indexing pipeline. Context is an arbitrary hash, defaulting to {}.

# Pass pre-computed data to avoid redundant DB queries
MyIndex.import!(objects, context: { embeddings: precomputed_embeddings })

Context in crutch blocks (2nd argument)

crutch :embeddings do |collection, context|
  # Use pre-computed data if available, otherwise fetch from DB
  context[:embeddings] || load_embeddings(collection)
end

Context in field value procs (3rd argument)

field :embedding, value: ->(object, crutches, context) {
  context[:override] || crutches.embeddings[object.id]
}

Both are fully backward-compatible — existing 1-arg crutch blocks and 1-2 arg field procs continue to work unchanged via arity-based dispatch.
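
The arity-based dispatch can be sketched roughly like this (a simplified illustration, not the actual Chewy internals; `call_crutch` is a hypothetical helper name):

```ruby
# Dispatch a crutch block based on how many parameters it declares.
# One-parameter blocks keep their legacy signature; blocks declaring a
# second parameter additionally receive the context hash.
def call_crutch(block, collection, context)
  if block.arity.abs >= 2
    block.call(collection, context)
  else
    block.call(collection)
  end
end

legacy = ->(collection) { collection.map(&:to_s) }
modern = ->(collection, context) { context[:override] || collection.map(&:to_s) }

call_crutch(legacy, [1, 2], {})                   # => ["1", "2"]
call_crutch(modern, [1, 2], { override: ["x"] })  # => ["x"]
```

The same idea applies to field value procs, just with one extra leading argument.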

@dimitrikochnev dimitrikochnev requested a review from a team as a code owner January 29, 2026 11:47
@bbatsov
Member

bbatsov commented Feb 25, 2026

Master has been updated with CI fixes and compatibility changes (#998) — we now target Ruby 3.2+ and Rails 7.2+. Could you rebase this PR on top of master so CI can run properly? Thanks!

@dimitrikochnev
Author

Let me find where that is :-)

@dimitrikochnev
Author

@bbatsov yep, done

@bbatsov
Member

bbatsov commented Feb 25, 2026

Interesting feature. The design is clean — flowing context: through the pipeline with arity-based backward compatibility is the right approach.

A couple of questions:

  • Could you rebase onto current master? (I've made quite a few changes today, sorry about that!)
  • The PR is well-tested but could use a changelog entry.

Will review the code in detail after rebase.

@dimitrikochnev
Author

dimitrikochnev commented Feb 26, 2026

Done.

Here's a practical example that stores video embeddings:

class Video::EncodeSemanticProcessor < ProcessorMan::Processor
  include MediaProcessing

  option :model, :string, default: "auto"

  def process
    upstream_media, upstream_video = require_upstreams(0 => :media, 1 => :video)

    result = with_progress(0.1, 0.9) do |progress|
      Ferment::Engine.stream("/v1/video/encode-semantic", video_in: upstream_video.cook.blob.url, **options.to_params) do |event|
        progress.call(event[:progress]) if event[:type] == :progress
      end
    end

    MultiTenancy.with(network.user) do
      visual_slices, base64_embeddings = build_visual_slices(result)

      media.update!(visual_slices: visual_slices)

      Media::Embedding.bulk_upsert(media, base64_embeddings)  # <--- Store raw embeddings in the db

      Chewy.strategy(:atomic) do
        context = { embeddings: Media::Embedding.to_floats(base64_embeddings) }

        MediaIndex.delete_for(media, type: 'Media::Data::VisualSlice')
        MediaIndex.import!(Array.wrap(media.visual_slices), direct_import: true, context: context) # <--- Don't perform a huge round trip to the db, classic reindex still works since the embeddings will be fetched from the origin db
      end
    end

    succeed(extract_data(result, strip_nested: true))
  end

  private

    def build_visual_slices(result)
      embeddings = {}

      slices = result.fetch(:slices).map do |slice_data|
        slice = Media::Data::VisualSlice.new(
          id: media_cid(:visual_slice, slice_data.fetch(:position)),
          blob_id: media.id,
          blob_type: media.class.name,
          user_id: media.user_id,
          position: slice_data.fetch(:position),
          start_time: slice_data.fetch(:start_time),
          duration: slice_data.fetch(:duration),
          first_frame: slice_data.fetch(:first_frame),
          frame_count: slice_data.fetch(:frame_count),
          semantic_embedding_model: result.fetch(:model),
          semantic_embedding_dim: result.fetch(:embedding_dim),
        )

        embeddings[slice.embedding_key] = slice_data.fetch(:embedding) # <--- HUGE

        slice
      end

      [slices, embeddings]
    end
end
