Batched shot image generation

Generates N images for a shot in a single user action. Used in three flows:

  1. Initial generation (sketch → image, via POST /workbench/generateShotImage)
  2. Edit (via the shot-image-edit guru chat, edit_shot_image tool)
  3. Restart from sketch (via guru or direct api, restart_from_sketch tool)

count == 1 keeps the existing single-image code paths byte-identical. count >= 2 opts into the batch path. count is capped at SHOT_IMAGE_GEN_MAX_BATCH_COUNT (defined in sharedConstants.ts).

How count gets to the backend

| Flow | Source | Path |
| --- | --- | --- |
| Initial-gen | count field on POST /workbench/generateShotImage body | workbench.validator.ts |
| Edit / restart-from-sketch | referenceData.imageGenerationCount on the user chat message | guruChat.validator.ts |

For the chat flows the LLM never sees the count — it's read from the latest user message's referenceData (shotImageEditGuru.getLatestImageGenerationCount) so a tampering tool call can't change it. FE owns entitlement gating.
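
As a minimal sketch of that lookup: the function below walks the chat history backwards and trusts only the latest user message. The ChatMessage shape and the name readLatestImageGenerationCount are hypothetical stand-ins, not the real shotImageEditGuru code; the cap value of 4 and the fallback to 1 come from the Defensive validation section.

```typescript
// Hypothetical message shape; only referenceData.imageGenerationCount
// is taken from this document.
type ChatMessage = {
  role: "user" | "assistant" | "tool";
  referenceData?: { imageGenerationCount?: unknown };
};

// Value stated under "Defensive validation".
const SHOT_IMAGE_GEN_MAX_BATCH_COUNT = 4;

// Returns the count carried by the most recent user message.
// Anything supplied by the model or a tool call is never consulted.
// Missing or out-of-range values fall back to 1 (single-image path).
function readLatestImageGenerationCount(messages: ChatMessage[]): number {
  const newestFirst = [...messages].reverse();
  for (const msg of newestFirst) {
    if (msg.role !== "user") continue;
    const raw = msg.referenceData?.imageGenerationCount;
    if (
      typeof raw === "number" &&
      Number.isInteger(raw) &&
      raw >= 1 &&
      raw <= SHOT_IMAGE_GEN_MAX_BATCH_COUNT
    ) {
      return raw;
    }
    return 1; // latest user message had no usable count
  }
  return 1; // no user message at all
}
```

Because the value is read from stored user messages rather than tool arguments, a tool call that claims a different count has no effect on what gets dispatched.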

Two emission patterns

| Flow | When backend HTTP returns | Socket events |
| --- | --- | --- |
| Initial-gen batch | Immediately, with placeholder shot | One consolidated SHOT_UPDATED after all N slots settle |
| Edit / restart-from-sketch batch | Immediately, tool result includes { status: 'pending', batchId, batchSize, placeholderJobIds, shot } | N progressive SHOT_IMAGE_BATCH_SLOT_UPDATED, one per slot as it lands |

Initial-gen is consolidated because most providers are slow (Replicate / Fal can take 20–60s) and the FE renders the result in one place. Edit-batch is progressive because edits are fast (Gemini, ~5–10s in parallel) and live in chat, where progressive feedback feels better.
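
The difference between the two patterns can be sketched as follows. Slot, settle, emitConsolidated, and emitProgressive are illustrative names, and the emit callback stands in for whatever socket layer is actually used; the real SHOT_UPDATED payload also carries the shot, which is omitted here for brevity.

```typescript
// Hypothetical slot runner: resolves to image URLs or rejects on failure.
type Slot = { jobId: string; run: () => Promise<string[]> };
type SlotOutcome = {
  jobId: string;
  status: "completed" | "failed";
  images?: string[];
  error?: string;
};
type Emit = (event: string, payload: Record<string, unknown>) => void;

// Converts a rejection into a "failed" outcome so one bad slot
// never aborts its siblings.
async function settle(slot: Slot): Promise<SlotOutcome> {
  try {
    return { jobId: slot.jobId, status: "completed", images: await slot.run() };
  } catch (err) {
    return { jobId: slot.jobId, status: "failed", error: String(err) };
  }
}

// Initial-gen style: run all slots in parallel, emit once at the end.
async function emitConsolidated(slots: Slot[], emit: Emit): Promise<void> {
  const jobs = await Promise.all(slots.map((slot) => settle(slot)));
  emit("SHOT_UPDATED", { job: jobs[0], jobs });
}

// Edit / restart style: run all slots in parallel, emit as each one lands.
async function emitProgressive(batchId: string, slots: Slot[], emit: Emit): Promise<void> {
  await Promise.all(
    slots.map(async (slot, batchIndex) => {
      const outcome = await settle(slot);
      emit("SHOT_IMAGE_BATCH_SLOT_UPDATED", {
        batchId,
        batchIndex,
        batchSize: slots.length,
        ...outcome,
      });
    }),
  );
}
```

In both variants the slots run concurrently; only the emission timing differs.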

Socket event payloads

SHOT_UPDATED (existing event, unchanged shape — used by initial-gen):

{ shot, job, jobs }   // job=jobs[0] for legacy handlers; jobs=full list

SHOT_IMAGE_BATCH_SLOT_UPDATED (new event, edit/restart only):

{ batchId, batchIndex, batchSize, jobId, status, images?, error? }

Slot events carry only images[], not the full shot. The full shot is delivered once via the synchronous tool result before any slot events arrive, so the FE picks up metadata changes (description, cameraAngle, descriptionHistory) up front.

Both events are wrapped in the standard JaduSpine envelope: { topic, data: { event, status, isSuccess, message, data: <payload>, timestamp } }.
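
For consumers, the payload and envelope could be typed roughly as below. The field names follow the shapes listed above; the field types (for example string or number for status and timestamp on the envelope, and the element type of images) are assumptions, as is the unwrapSlotUpdate helper.

```typescript
type SlotStatus = "processing" | "completed" | "failed";

// Field names from the SHOT_IMAGE_BATCH_SLOT_UPDATED payload; types are assumed.
interface ShotImageBatchSlotUpdated {
  batchId: string;
  batchIndex: number;
  batchSize: number;
  jobId: string;
  status: SlotStatus;
  images?: unknown[];
  error?: string;
}

// Field names from the JaduSpine envelope; types are assumed.
interface SpineEnvelope<T> {
  topic: string;
  data: {
    event: string;
    status: string | number;
    isSuccess: boolean;
    message: string;
    data: T;
    timestamp: string | number;
  };
}

// Hypothetical helper: returns the slot payload if the envelope carries
// a well-formed slot update, otherwise null.
function unwrapSlotUpdate(envelope: SpineEnvelope<unknown>): ShotImageBatchSlotUpdated | null {
  if (envelope.data.event !== "SHOT_IMAGE_BATCH_SLOT_UPDATED") return null;
  const p = envelope.data.data as Partial<ShotImageBatchSlotUpdated> | null;
  if (
    !p ||
    typeof p.batchId !== "string" ||
    typeof p.jobId !== "string" ||
    typeof p.batchIndex !== "number" ||
    typeof p.batchSize !== "number"
  ) {
    return null;
  }
  return p as ShotImageBatchSlotUpdated;
}
```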

How batched images are stored

Each batched image is an ImageMedia row in shot.images[]. Cohort identity lives in imageParams:

imageParams: {
  assetGenJobId: string;
  assetGenModelConfig: ModelConfigType;
  assetGenJobStatus: JobStatus;     // 'processing' | 'completed' | 'failed'
  batchId: string;                  // shared across the cohort
  batchSize: number;                // total expected
  batchIndex: number;               // 0..N-1, stable order
  // ...optional fields like generationRefs from sketch-to-shot
}

For count <= 1 none of batchId / batchSize / batchIndex are written — backward-compatible with FE that doesn't know about batches.

FE groups by batchId to render the cohort and sorts by batchIndex for stable display order. While slots are pending, imageParams.assetGenJobStatus === 'processing' and imageURL === '' — render those as skeleton tiles.
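
A minimal sketch of that FE grouping, assuming a trimmed-down ShotImage type that keeps only the fields used here. groupCohorts and isSkeleton are illustrative names, not existing helpers.

```typescript
// Trimmed-down image row; the real ImageMedia type has more fields.
type ShotImage = {
  imageURL: string;
  imageParams: {
    assetGenJobId: string;
    assetGenJobStatus: "processing" | "completed" | "failed";
    batchId?: string;
    batchSize?: number;
    batchIndex?: number;
  };
};

// Groups images by batchId and sorts each cohort by batchIndex.
// Rows without a batchId (count <= 1) are skipped.
function groupCohorts(images: ShotImage[]): Map<string, ShotImage[]> {
  const cohorts = new Map<string, ShotImage[]>();
  for (const img of images) {
    const id = img.imageParams.batchId;
    if (id === undefined) continue;
    const list = cohorts.get(id) ?? [];
    list.push(img);
    cohorts.set(id, list);
  }
  cohorts.forEach((list) => {
    list.sort((a, b) => (a.imageParams.batchIndex ?? 0) - (b.imageParams.batchIndex ?? 0));
  });
  return cohorts;
}

// A pending slot: still processing and no URL yet, so render a skeleton tile.
function isSkeleton(img: ShotImage): boolean {
  return img.imageParams.assetGenJobStatus === "processing" && img.imageURL === "";
}
```

Sorting by batchIndex rather than arrival order keeps tiles from jumping around as slots land out of order.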

Selection lifecycle

The contract: isSelected: true always marks a displayable image. Empty placeholders are never selected.

  1. Placeholder insert: addShotImageBatch inserts N entries with isSelected: false. Any pre-existing displayable selection (image with non-empty URL) is preserved.
  2. First slot to land successfully: patchShotImageJobResult runs a guarded atomic update. If no slot in this batch is already selected, it deselects everything else and selects this slot. Concurrent landings can't race because the guard is in the doc-level filter.
  3. Subsequent slots: leave selection alone. The user can swap deliberately via the existing updateShotImageSelection endpoint.

This matches the single-image gen UX: a freshly generated image takes over selection from the prior one.
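
The lifecycle above can be modelled in memory as follows. This is only a simulation of the selection rules, not the Mongo-backed implementation; Img, insertPlaceholders, and landSlot are hypothetical names.

```typescript
type Img = { jobId: string; batchId?: string; imageURL: string; isSelected: boolean };

// Step 1: append N unselected placeholders; existing selection is untouched.
function insertPlaceholders(images: Img[], batchId: string, jobIds: string[]): Img[] {
  const placeholders = jobIds.map((jobId) => ({ jobId, batchId, imageURL: "", isSelected: false }));
  return [...images, ...placeholders];
}

// Steps 2 and 3: a slot lands successfully. If no slot in its batch is
// selected yet, it takes over selection; otherwise selection is left alone.
function landSlot(images: Img[], jobId: string, imageURL: string): Img[] {
  const slot = images.find((i) => i.jobId === jobId);
  if (!slot) return images;
  const batchAlreadySelected = images.some(
    (i) => i.batchId === slot.batchId && i.isSelected,
  );
  return images.map((i) => {
    if (i.jobId === jobId) {
      return { ...i, imageURL, isSelected: batchAlreadySelected ? i.isSelected : true };
    }
    return batchAlreadySelected ? i : { ...i, isSelected: false };
  });
}
```

In the real service the "is any slot in this batch already selected" check has to live in the update filter itself; doing it as a separate read, as this simulation does, would reintroduce the race.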

Concurrency

N slot runs happen in parallel. patchShotImageJobResult uses Mongo arrayFilters keyed on imageParams.assetGenJobId so each runner writes only its own slot. The selection promotion in step 2 above is also atomic — see shotImageBatch.service.test.ts and the storyVideosModel concurrency tests for the regression guards.
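
A sketch of what a per-slot write keyed on assetGenJobId might look like. The top-level images path, the _id filter, and the buildSlotPatch helper are assumptions for illustration; the actual document layout in storyVideosModel may nest the array differently, and the selection-promotion guard is not shown.

```typescript
// Plain-object description of an update, so the sketch has no driver dependency.
type SlotPatch = {
  filter: Record<string, unknown>;
  update: { $set: Record<string, unknown> };
  options: { arrayFilters: Array<Record<string, unknown>> };
};

// Builds an update that touches only the array element whose
// imageParams.assetGenJobId matches jobId.
function buildSlotPatch(docId: string, jobId: string, imageURL: string): SlotPatch {
  return {
    filter: { _id: docId },
    update: {
      $set: {
        // $[slot] is bound by the arrayFilters entry below.
        "images.$[slot].imageURL": imageURL,
        "images.$[slot].imageParams.assetGenJobStatus": "completed",
      },
    },
    options: {
      arrayFilters: [{ "slot.imageParams.assetGenJobId": jobId }],
    },
  };
}
```

With the MongoDB Node.js driver the three parts map onto collection.updateOne(patch.filter, patch.update, patch.options). Since each runner's filter matches a different element, parallel writers do not overwrite each other's slots.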

Activity logging

One GENERATE_SHOT_IMAGE (or EDIT_SHOT_IMAGE) project activity row per slot — the activity feed reflects every generated image, not just one. Dispatch creates N rows in PROCESSING (each keyed to a placeholder job id); each slot's runner flips its own row to COMPLETED / FAILED when its result lands. Cohort grouping is available via imageParams.batchId on the resulting shot images if a consumer wants it.

Defensive validation

Three layers enforce the count cap:

  1. Validator (SHOT_IMAGE_GEN_MAX_BATCH_COUNT = 4)
  2. getLatestImageGenerationCount falls back to 1 for any out-of-range value
  3. The dispatch*Batch methods throw if count is outside [2, MAX], a last-resort guard if the upper layers ever regress.
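
The last-resort guard in the dispatch layer could look roughly like this. assertBatchCount and routeByCount are hypothetical names, and the error type is an assumption.

```typescript
// Value stated in the validator layer above.
const SHOT_IMAGE_GEN_MAX_BATCH_COUNT = 4;

// Throws unless count is an integer in [2, MAX]. Intended to run at the
// top of a batch dispatch method, after the upper layers have validated.
function assertBatchCount(count: number): void {
  if (
    !Number.isInteger(count) ||
    count < 2 ||
    count > SHOT_IMAGE_GEN_MAX_BATCH_COUNT
  ) {
    throw new RangeError(
      `batch count must be an integer in [2, ${SHOT_IMAGE_GEN_MAX_BATCH_COUNT}], got ${count}`,
    );
  }
}

// Chooses the code path: count <= 1 stays on the single-image path,
// anything else goes through the guarded batch path.
function routeByCount(count: number): "single" | "batch" {
  if (count <= 1) return "single";
  assertBatchCount(count);
  return "batch";
}
```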

Future work

  • Conditional batch dispatch based on edit magnitude (skip batch for small touch-ups). See @todo on runBatchedShotImageEdit.
  • Per-slot retry on transient provider failures.
  • Optional heterogeneous batches (different models per slot) — the API already accepts an array of configs, FE just feeds it [config, config, ...] today.