AssetGen Module

Voice Settings

Assigns character voices via ElevenLabs. Each project asset stores its voices as a lean `VoiceEntry[]`; display data is resolved at runtime from the source collections.

Generated Voice Flow

User describes a voice -> POST /elevenLabs/text-to-voice/design -> returns 3 previews with audio_base_64
User picks one -> POST /elevenLabs/text-to-voice -> permanent voice created in ElevenLabs + saved to voices collection

Library Voice Flow

User picks a marketplace voice -> POST /elevenLabs/addVoice -> voice saved to voices collection
A TTS job then runs with the character's dialog text -> POST /createJob -> custom audio stored in AssetGenJob.outputLinks.result

Pitch Variation Flow (Remix)

User selects pitch (high/low) + strength -> POST /elevenLabs/remix -> returns 3 remixed previews
User picks one -> POST /elevenLabs/text-to-voice -> permanent voice saved to voices collection

Data Resolution (on voice settings open)

| Endpoint | Source | Returns |
| --- | --- | --- |
| `POST /voices/getByIds` | `voices` collection | `name`, `preview_url`, `user_id`, `created_at` |
| `POST /getJobsByIds` | `AssetGenJob` collection | `outputLinks.result` (library voice audio) |
| `POST /users/resolveNames` | `users` collection | map of `userId` to display name |

All batch endpoints accept at most 50 items per request (validated with Zod). The frontend splits larger arrays into chunks that fit this cap.
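The chunking step can be sketched as follows. This is a minimal illustration, not the frontend's actual code: `chunk`, `fetchInChunks`, and the `fetchBatch` callback are hypothetical names, and only the cap of 50 comes from the text above.

```typescript
// Cap stated in this document; the constant name is an assumption.
const MAX_BATCH_SIZE = 50;

// Hypothetical helper: split `items` into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical runner: calls `fetchBatch` once per chunk and merges the results.
// `fetchBatch` stands in for a call to one of the batch endpoints.
async function fetchInChunks<T, R>(
  ids: readonly T[],
  fetchBatch: (batch: T[]) => Promise<R[]>,
): Promise<R[]> {
  const perChunk = await Promise.all(chunk(ids).map((batch) => fetchBatch(batch)));
  // Flatten the per-chunk arrays into one result list, preserving chunk order.
  return ([] as R[]).concat(...perChunk);
}
```

With 120 IDs this issues three requests of 50, 50, and 20 items. The requests run in parallel here; a sequential loop would work equally well if the backend is rate limited.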

Audio Resolution Logic

Generated voice -> voices collection preview_url (custom dialog audio from design step)
Library voice -> AssetGenJob outputLinks.result (custom dialog TTS on B2)
Variation voice -> voices collection preview_url (remix audio)
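The three rules above amount to a small lookup. The sketch below assumes the voice kind is available as one of three string labels and that the voice and job documents have already been fetched; the type names, the labels, and `resolveAudioUrl` are illustrative, not taken from the codebase.

```typescript
// Assumed labels for the three voice kinds described above.
type VoiceKind = "generated" | "library" | "variation";

// Only the fields this sketch reads; the real documents carry more.
interface VoiceDoc {
  preview_url?: string;
}
interface JobDoc {
  outputLinks?: { result?: string };
}

// Pick the audio URL for a voice according to its kind.
// Returns undefined when the required document or field is missing.
function resolveAudioUrl(
  kind: VoiceKind,
  voiceDoc?: VoiceDoc,
  jobDoc?: JobDoc,
): string | undefined {
  switch (kind) {
    case "library":
      // Library voices: custom dialog TTS stored on the job.
      return jobDoc?.outputLinks?.result;
    case "generated":
    case "variation":
      // Generated and variation voices: preview audio stored on the voice.
      return voiceDoc?.preview_url;
  }
}
```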

Collections

| Collection | Purpose |
| --- | --- |
| `voices` (MongoDB) | Local cache of ElevenLabs voice metadata (premade + user voices) |
| `AssetGenJob` | TTS job results; custom audio for library voices lives here alongside other job types |
| Project asset `voices[]` | Lean `VoiceEntry` array: `voiceId`, `selected`, `source`, `assetGenJobId?`, `variations?` |
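As a rough sketch of the lean entry shape and how it feeds the batch lookups: the field names below come from this document, but every field type, the shape of a variation, and the `collectLookupIds` helper are assumptions made for illustration.

```typescript
// Assumed shape of a variation, based on the fields named in this document.
interface VoiceVariation {
  voiceId: string;
  pitch: "high" | "low";
  strength: number; // numeric type is an assumption
}

// Field names from the table above; the types are assumptions.
interface VoiceEntry {
  voiceId: string;
  selected: boolean;
  source: string;
  assetGenJobId?: string;
  variations?: VoiceVariation[];
}

// Hypothetical helper: gather the de-duplicated IDs needed to resolve display data.
function collectLookupIds(entries: readonly VoiceEntry[]): {
  voiceIds: string[];
  jobIds: string[];
} {
  const voiceIds = new Set<string>();
  const jobIds = new Set<string>();
  for (const entry of entries) {
    voiceIds.add(entry.voiceId);
    if (entry.assetGenJobId !== undefined) {
      jobIds.add(entry.assetGenJobId);
    }
    for (const variation of entry.variations ?? []) {
      voiceIds.add(variation.voiceId);
    }
  }
  return { voiceIds: Array.from(voiceIds), jobIds: Array.from(jobIds) };
}
```

Because the entries hold only IDs and flags, names and audio URLs never go stale on the project asset; they are looked up from the source collections each time.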

Dialog Preview

Each character has a dialogPreview string (min 100 chars) generated on character creation. Template: "My name is {title} and I'm the {role} in {project}." Padded with character description if under 100 chars. Used as the spoken text for all voice previews.
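A minimal sketch of the template-and-pad rule follows. The function name, parameter names, and the single-space separator are assumptions. The text does not say what happens if the padded string is still under 100 characters, so this sketch simply returns it as is.

```typescript
// Minimum length stated above; the constant name is an assumption.
const MIN_PREVIEW_LENGTH = 100;

// Hypothetical builder for the dialogPreview string.
function buildDialogPreview(
  title: string,
  role: string,
  project: string,
  description: string,
): string {
  const base = `My name is ${title} and I'm the ${role} in ${project}.`;
  if (base.length >= MIN_PREVIEW_LENGTH) {
    return base;
  }
  // Pad with the character description when the template alone is too short.
  // Behaviour when the result is still short is not specified in the source.
  return `${base} ${description}`.trim();
}
```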

Voice Selection (Assign / Restore)

After creating a voice via the endpoints above, the frontend selects it on the character through a dedicated endpoint in the projects module:

PATCH /projects/:projectId/assets/:assetId/voices/select
Body: { voiceId, source?, assetGenJobId?, variation?: { voiceId, pitch, strength } }

The backend determines the action automatically:

| `source` present? | `variation` present? | `voiceId` exists? | Action | Activity logged |
| --- | --- | --- | --- | --- |
| yes | no | n/a | Add new VoiceEntry, select it | `ASSIGN_CHARACTER_VOICE` |
| no | no | yes | Select existing voice | `RESTORE_CHARACTER_VOICE` |
| n/a | yes (new) | n/a | Add new Variation on parent | `ASSIGN_CHARACTER_VOICE_VARIATION` |
| n/a | yes (existing) | yes | Select existing variation | `RESTORE_CHARACTER_VOICE_VARIATION` |
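The decision table can be read as the function below. This is a sketch of the table only, not the code in `projects.service.ts`: the type names and `decideAction` are hypothetical, a variation counts as "existing" when its `voiceId` is already listed on the parent entry (an assumption), and the `null` result for a request that matches no row is also an assumption, since the table does not cover that case.

```typescript
type VoiceActivity =
  | "ASSIGN_CHARACTER_VOICE"
  | "RESTORE_CHARACTER_VOICE"
  | "ASSIGN_CHARACTER_VOICE_VARIATION"
  | "RESTORE_CHARACTER_VOICE_VARIATION";

// Body fields from the PATCH request above; the types are assumptions.
interface SelectVoiceBody {
  voiceId: string;
  source?: string;
  assetGenJobId?: string;
  variation?: { voiceId: string; pitch: string; strength: number };
}

// Only the stored fields this sketch needs.
interface StoredVoice {
  voiceId: string;
  variations?: { voiceId: string }[];
}

function decideAction(
  body: SelectVoiceBody,
  stored: readonly StoredVoice[],
): VoiceActivity | null {
  const parent = stored.find((v) => v.voiceId === body.voiceId);

  // Variation rows are checked first because `source` is not considered for them.
  if (body.variation !== undefined) {
    const variationId = body.variation.voiceId;
    const exists = (parent?.variations ?? []).some((v) => v.voiceId === variationId);
    return exists
      ? "RESTORE_CHARACTER_VOICE_VARIATION"
      : "ASSIGN_CHARACTER_VOICE_VARIATION";
  }
  if (body.source !== undefined) {
    return "ASSIGN_CHARACTER_VOICE";
  }
  if (parent !== undefined) {
    return "RESTORE_CHARACTER_VOICE";
  }
  // No source, no variation, unknown voiceId: not covered by the table.
  return null;
}
```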

Activity logging is fire-and-forget. Display data (voiceName, audioUrl) is resolved from source collections before logging:

| Voice type | Resolved via | `voiceName` from | `audioUrl` from |
| --- | --- | --- | --- |
| Generated | `VoicesModel.getVoicesByIds()` | `name` | `preview_url` |
| Library | `AssetJobModel.getJobsByIds()` | `modelConfig.modelTitle` | `outputLinks.result` |
| Variation | `VoicesModel.getVoicesByIds()` | `name` | `preview_url` |
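"Fire-and-forget" here means the request does not wait for the activity log to be written, and a logging failure does not fail the request. One common way to express that in TypeScript is sketched below; `fireAndForget` and its `onError` callback are hypothetical names, and the actual conventions are described in src/projectActivities/README.md.

```typescript
// Hypothetical helper: start an async task without awaiting it.
// Both synchronous throws and rejected promises are routed to `onError`,
// so the caller never sees a failure from the task.
function fireAndForget(
  task: () => Promise<void>,
  onError: (err: unknown) => void = () => {},
): void {
  // Promise.resolve().then(task) defers the call and also captures
  // exceptions thrown synchronously inside `task`.
  void Promise.resolve().then(task).catch(onError);
}
```

A caller would resolve `voiceName` and `audioUrl` first, then pass the logging call as `task` and continue with the response without awaiting it.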

Key files: src/projects/projects.service.ts (selectVoice, handleBaseVoice, handleVariation), src/projects/projects.validator.ts (validateSelectVoiceRequest), src/projects/projects.controller.ts (selectVoice). See src/projectActivities/README.md for activity logging conventions.