
videoLab - LTX 2.3

Updated: May 13, 2026

Tags: tool, movie, speech, video, voice, tts

Type: Workflows
File: Config, Other, 209.75 KB (1 variant available)
Published: May 13, 2026
Base Model: LTXV 2.3
Hash (AutoV2): 6C903AE4B7

jbj151

*When copying and pasting the workflow, the Audio VAEs disconnect, and Math Expression #74 has to be reconnected from its port b to Frame Rate.

*Check out my other workflows for swap and upscale techniques

LTX 2.3 Super speaks — but now it speaks in your voice.

  • Your Custom Voice

  • LTX audio merged in

  • Lip sync stability

This workflow combines voice cloning with LTX 2.3's native audio generation. One prompt produces two synchronized audio tracks — cloned dialogue and AI-generated background sound — merged and conditioned into the video latent.
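To make the "merged" step concrete, here is a minimal sketch of mixing a dialogue track with a background track in plain Python. It is only an illustration, not the workflow's actual nodes: the function name `mix_tracks`, the gain parameters, and the assumption that both tracks are mono float samples at the same sample rate are all hypothetical.

```python
def mix_tracks(voice, ambience, voice_gain=1.0, ambience_gain=0.5):
    """Mix two mono tracks (floats in [-1.0, 1.0]) into one track.

    Hypothetical helper: assumes both tracks share the same sample rate.
    """
    length = max(len(voice), len(ambience))
    # Pad the shorter track with silence so both cover the same duration.
    voice = list(voice) + [0.0] * (length - len(voice))
    ambience = list(ambience) + [0.0] * (length - len(ambience))
    # Weighted sum, sample by sample.
    mixed = [voice_gain * v + ambience_gain * a for v, a in zip(voice, ambience)]
    # Scale the whole track down only if the sum would clip.
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > 1.0:
        mixed = [s / peak for s in mixed]
    return mixed
```

Keeping the background gain below the voice gain is one simple way to keep the cloned dialogue intelligible over the generated ambience. The right balance depends on the material.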

Your voice. Your script. LTX's world. One prompt.

LTX 2.3 — AI Video Workflow (part 4 of Lab series)

Four inputs. One cinematic video.

Drop in an image, a voice sample, your scene description, and your character's dialog — the workflow takes it from there.

✍️ Script-Driven Dialog: Write exactly what you want your character to say in the Dialog Script node. Your words, your story. The workflow handles all the technical execution automatically.

🎬 Scene Description: Describe the scene, the mood, the environment. The AI reads this alongside your dialog and builds everything else, so no prompt engineering knowledge is needed.

🤖 Fully Automatic Prompting: Two local AI enhancers run silently in the background. One writes a complete structured video generation prompt from your scene and dialog. The other crafts precise voice delivery direction (pace, tone, emotion, texture), which is fed directly into the voice engine. You never write a prompt manually.

🎙️ VoX Voice Cloning: Provide any short voice sample and VoxCPM2 clones it. Your character speaks your script in that voice, with the AI-generated delivery direction shaping how it's performed: not just what is said, but how.

🔁 Dual Voice Reinforcement: The video and audio pipelines are locked together. The video prompt instructs the model on lipsync behavior while the audio direction drives matching emotional delivery through VoX, giving a unified performance from both sides simultaneously.

⚙️ GGUF & Safetensors Compatible: Run whichever model format your hardware prefers, GGUF quantized or full safetensors. Both are supported out of the box.
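If you are unsure which format a downloaded model file is in, the two can be told apart by their first bytes: GGUF files start with the ASCII magic `GGUF`, while safetensors files start with an 8-byte little-endian header length followed by a JSON header. The sketch below checks for that. The function name `detect_model_format` and the 100 MB sanity limit on the header are assumptions for illustration, and the workflow's own loader nodes do not depend on this.

```python
import json
import struct


def detect_model_format(path):
    """Return "gguf", "safetensors", or "unknown" based on the file header."""
    with open(path, "rb") as f:
        head = f.read(8)
        # GGUF files begin with the 4-byte ASCII magic "GGUF".
        if head[:4] == b"GGUF":
            return "gguf"
        if len(head) == 8:
            # safetensors: unsigned 64-bit little-endian length of the JSON header.
            (header_len,) = struct.unpack("<Q", head)
            # Assumed sanity limit so a random file does not trigger a huge read.
            if 0 < header_len <= 100_000_000:
                try:
                    json.loads(f.read(header_len).decode("utf-8"))
                    return "safetensors"
                except ValueError:
                    # Covers both invalid UTF-8 and invalid JSON.
                    pass
    return "unknown"
```

Checking the header rather than the file extension also catches files that were renamed after download.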