AssetGen Module
Voice Settings
Character voice assignment via ElevenLabs. Voices are stored as lean VoiceEntry[] on the project asset -- display data is resolved at runtime from source collections.
Generated Voice Flow
- User describes a voice -> POST /elevenLabs/text-to-voice/design -> returns 3 previews with audio_base_64
- User picks one -> POST /elevenLabs/text-to-voice -> permanent voice created in ElevenLabs + saved to voices collection
Library Voice Flow
- User picks a marketplace voice -> POST /elevenLabs/addVoice -> voice saved to voices collection
- TTS job runs with character's dialog text -> POST /createJob -> custom audio stored in AssetGenJob.outputLinks.result
Pitch Variation Flow (Remix)
- User selects pitch (high/low) + strength -> POST /elevenLabs/remix -> returns 3 remixed previews
- User picks one -> POST /elevenLabs/text-to-voice -> permanent voice saved to voices collection
Data Resolution (on voice settings open)
| Endpoint | Source | Returns |
|---|---|---|
| POST /voices/getByIds | voices collection | name, preview_url, user_id, created_at |
| POST /getJobsByIds | AssetGenJob collection | outputLinks.result (library voice audio) |
| POST /users/resolveNames | users collection | userId -> display name map |
All batch endpoints are capped at 50 items per request (Zod validated). The frontend splits larger arrays into chunks of at most 50 to match.
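A minimal sketch of the frontend chunking, assuming a plain helper; the names `MAX_BATCH_SIZE`, `chunk`, and `fetchAllByIds` are hypothetical and not taken from the codebase. `fetchBatch` stands in for whichever batch endpoint is being called.

```typescript
// Hypothetical constant mirroring the documented cap of 50 items per request.
const MAX_BATCH_SIZE = 50;

// Split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Issue one request per chunk and merge the results in order.
// `fetchBatch` is a stand-in for a call such as POST /voices/getByIds.
async function fetchAllByIds<T>(
  ids: readonly string[],
  fetchBatch: (batch: string[]) => Promise<T[]>,
): Promise<T[]> {
  const pages = await Promise.all(chunk(ids).map((batch) => fetchBatch(batch)));
  return ([] as T[]).concat(...pages);
}
```

With 120 ids this produces three requests of 50, 50, and 20 items, so every request stays within the validated limit.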
Audio Resolution Logic
- Generated voice -> voices collection preview_url (custom dialog audio from design step)
- Library voice -> AssetGenJob outputLinks.result (custom dialog TTS on B2)
- Variation voice -> voices collection preview_url (remix audio)
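The three rules above can be sketched as a single lookup. This is an illustration only: the `VoiceKind` labels, the `VoiceDoc` and `JobDoc` shapes, and the function name `resolveAudioUrl` are assumptions, and only the `preview_url` and `outputLinks.result` field names come from this document.

```typescript
// Hypothetical labels for the three voice types described above.
type VoiceKind = "generated" | "library" | "variation";

// Assumed subsets of the stored documents; only these fields matter here.
interface VoiceDoc {
  preview_url?: string;
}
interface JobDoc {
  outputLinks?: { result?: string };
}

// Pick the audio URL for a voice according to its type.
// Library voices read from the AssetGenJob; the others read from the voices collection.
function resolveAudioUrl(
  kind: VoiceKind,
  voice?: VoiceDoc,
  job?: JobDoc,
): string | undefined {
  switch (kind) {
    case "library":
      return job?.outputLinks?.result;
    case "generated":
    case "variation":
      return voice?.preview_url;
  }
}
```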
Collections
| Collection | Purpose |
|---|---|
| voices (MongoDB) | Local cache of ElevenLabs voice metadata (premade + user voices) |
| AssetGenJob | TTS job results; custom audio for library voices lives here, alongside other job types |
| Project asset voices[] | Lean VoiceEntry array: voiceId, selected, source, assetGenJobId?, variations? |
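For orientation, the lean entry could be typed roughly as follows. The field names come from the table above and from the select request body; every field type, the `VoiceVariation` name, and the `getSelectedVoice` helper are assumptions made for illustration.

```typescript
// Assumed shape of a variation, based on the fields { voiceId, pitch, strength }.
interface VoiceVariation {
  voiceId: string;
  pitch: "high" | "low";
  strength: number; // assumed numeric; the actual type is not documented here
}

// Assumed shape of a lean VoiceEntry. No display data (name, audio URL) is stored.
interface VoiceEntry {
  voiceId: string;
  selected: boolean;
  source: string; // allowed values are not documented here
  assetGenJobId?: string;
  variations?: VoiceVariation[];
}

// Hypothetical helper: return the entry currently marked as selected, if any.
function getSelectedVoice(entries: readonly VoiceEntry[]): VoiceEntry | undefined {
  return entries.find((entry) => entry.selected);
}
```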
Dialog Preview
Each character has a dialogPreview string (min 100 chars) generated on character creation. Template: "My name is {title} and I'm the {role} in {project}." Padded with character description if under 100 chars. Used as the spoken text for all voice previews.
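A sketch of how the preview text might be assembled from the template and padding rule above. The function name, parameter names, and the choice to append the description after a single space are assumptions. What happens when the padded text is still under 100 chars is not stated here, so the sketch simply returns what it has.

```typescript
// Documented minimum length of the preview text.
const MIN_DIALOG_PREVIEW_LENGTH = 100;

// Build the spoken preview text for a character.
// Pads with the character description only when the template alone is too short.
function buildDialogPreview(
  title: string,
  role: string,
  project: string,
  description: string,
): string {
  const base = `My name is ${title} and I'm the ${role} in ${project}.`;
  if (base.length >= MIN_DIALOG_PREVIEW_LENGTH) {
    return base;
  }
  // Assumption: the description is appended after a single space.
  return `${base} ${description}`.trim();
}
```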
Voice Selection (Assign / Restore)
After creating a voice via the endpoints above, the FE selects it on the character via a dedicated endpoint in the projects module:
PATCH /projects/:projectId/assets/:assetId/voices/select
Body: { voiceId, source?, assetGenJobId?, variation?: { voiceId, pitch, strength } }
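As an illustration, the request could be assembled on the FE like this. The path and body fields follow the lines above; the field types, the `SelectVoiceBody` name, and the `buildSelectVoiceRequest` helper are hypothetical, and the HTTP client is left out.

```typescript
// Assumed typing of the request body; only the field names are documented.
interface SelectVoiceBody {
  voiceId: string;
  source?: string;
  assetGenJobId?: string;
  variation?: { voiceId: string; pitch: "high" | "low"; strength: number };
}

// Build the method, path, and serialized body for the select call.
function buildSelectVoiceRequest(
  projectId: string,
  assetId: string,
  body: SelectVoiceBody,
): { method: "PATCH"; path: string; body: string } {
  const path =
    `/projects/${encodeURIComponent(projectId)}` +
    `/assets/${encodeURIComponent(assetId)}/voices/select`;
  return { method: "PATCH", path, body: JSON.stringify(body) };
}
```

Sending only `voiceId` asks the BE to re-select a voice that is already on the character, while including `source` or `variation` adds a new entry.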
The BE determines the action automatically:
| source present? | variation present? | voiceId exists? | Action | Activity Logged |
|---|---|---|---|---|
| yes | no | — | Add new VoiceEntry, select it | ASSIGN_CHARACTER_VOICE |
| no | no | yes | Select existing voice | RESTORE_CHARACTER_VOICE |
| — | yes (new) | — | Add new Variation on parent | ASSIGN_CHARACTER_VOICE_VARIATION |
| — | yes (existing) | yes | Select existing variation | RESTORE_CHARACTER_VOICE_VARIATION |
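The decision table can be sketched as a pure function. This is not the code in `projects.service.ts`; the types, the function name `decideActivity`, and the order of checks (variation first, then source, then existing voice) are assumptions. Combinations the table does not cover return `null`.

```typescript
type Activity =
  | "ASSIGN_CHARACTER_VOICE"
  | "RESTORE_CHARACTER_VOICE"
  | "ASSIGN_CHARACTER_VOICE_VARIATION"
  | "RESTORE_CHARACTER_VOICE_VARIATION";

// Assumed minimal shapes; only the fields needed for the decision.
interface StoredVoice {
  voiceId: string;
  variations?: { voiceId: string }[];
}
interface SelectBody {
  voiceId: string;
  source?: string;
  variation?: { voiceId: string };
}

// Map a select request onto one of the four documented actions.
function decideActivity(body: SelectBody, voices: readonly StoredVoice[]): Activity | null {
  const parent = voices.find((v) => v.voiceId === body.voiceId);
  const variation = body.variation;

  if (variation) {
    // "existing" means the parent already holds a variation with this voiceId.
    const exists = parent?.variations?.some((v) => v.voiceId === variation.voiceId) ?? false;
    return exists ? "RESTORE_CHARACTER_VOICE_VARIATION" : "ASSIGN_CHARACTER_VOICE_VARIATION";
  }
  if (body.source) {
    return "ASSIGN_CHARACTER_VOICE";
  }
  // No source and no variation: only valid when the voice is already on the asset.
  return parent ? "RESTORE_CHARACTER_VOICE" : null;
}
```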
Activity logging is fire-and-forget. Display data (voiceName, audioUrl) is resolved from source collections before logging:
| Voice type | voiceName from | audioUrl from |
|---|---|---|
| Generated | VoicesModel.getVoicesByIds() -> name | -> preview_url |
| Library | AssetJobModel.getJobsByIds() -> modelConfig.modelTitle | -> outputLinks.result |
| Variation | VoicesModel.getVoicesByIds() -> name | -> preview_url |
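Fire-and-forget here means the select request does not wait for the activity log and does not fail if logging fails. A generic sketch of that pattern, with a hypothetical `fireAndForget` helper that is not part of the codebase:

```typescript
// Start an async task without awaiting it and route any failure to `onError`.
// Wrapping in Promise.resolve().then(...) also captures synchronous throws from `task`.
function fireAndForget(
  task: () => Promise<unknown>,
  onError: (err: unknown) => void = () => {},
): void {
  Promise.resolve()
    .then(task)
    .catch(onError);
}
```

A caller would resolve `voiceName` and `audioUrl` inside `task`, write the activity, and return the HTTP response without awaiting either step.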
Key files: src/projects/projects.service.ts (selectVoice, handleBaseVoice, handleVariation), src/projects/projects.validator.ts (validateSelectVoiceRequest), src/projects/projects.controller.ts (selectVoice). See src/projectActivities/README.md for activity logging conventions.