ComfyUI Flash Head Workflow: Ultrafast Head Lip-Sync

Updated: Mar 9, 2026

File: Archive Other, 18.04 KB

Type: Workflows

Stats: 62

Published: Mar 9, 2026

Base Model: Qwen

Hash (AutoV2): ED8A3B84FA

Video Introduction:

 

 

 

Click here to try the workflows online:

(Note: Some nodes are built by RunningHub, so the workflows may not work if you download them and run them offline.)

 

Open Source Address: https://github.com/Soul-AILab/SoulX-FlashHead

 

(The workflows can be downloaded via the links below: open a link and use the download button in the top right corner. Because my local machine has limited VRAM, I have not been able to test them locally. If you are not familiar with running ComfyUI locally, it is best to run them online. The FlashHead node is built on RunningHub (RH).)
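
If you do want to try a downloaded workflow offline, it can help to list the node types it uses before loading it, so you can see which ones your local ComfyUI install lacks. The following is a minimal sketch, not part of the workflows themselves. It assumes the workflow is a JSON file in one of the two common ComfyUI export shapes (a UI export with a top-level `nodes` list, or an API export keyed by node id). The node name `FlashHeadSampler` is a hypothetical placeholder, not a confirmed node name.

```python
import json
from collections import Counter


def node_types(workflow: dict) -> Counter:
    """Count the node types used in a ComfyUI workflow dict."""
    if isinstance(workflow.get("nodes"), list):
        # UI export: {"nodes": [{"id": 1, "type": "LoadImage", ...}, ...]}
        names = [n.get("type") for n in workflow["nodes"] if isinstance(n, dict)]
    else:
        # API export: {"1": {"class_type": "LoadImage", "inputs": {...}}, ...}
        names = [n.get("class_type") for n in workflow.values() if isinstance(n, dict)]
    return Counter(name for name in names if name)


def missing_types(workflow: dict, installed: set) -> list:
    """Return the node types the workflow needs that are not in `installed`."""
    return sorted(set(node_types(workflow)) - set(installed))


def load_workflow(path: str) -> dict:
    """Read a workflow JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Inline example; "FlashHeadSampler" is a made-up name for illustration only.
example = {"nodes": [{"id": 1, "type": "LoadImage"},
                     {"id": 2, "type": "FlashHeadSampler"}]}
print(missing_types(example, installed={"LoadImage"}))
```

The `installed` set has to come from your own setup, for example from the node list your ComfyUI instance reports. Any type that shows up as missing is a candidate for one of the RunningHub-built nodes mentioned above.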

 

Workflow: AA--Ultra-Fast Digital Human FlashHead

Experience Link: https://www.runninghub.ai/post/2030340894288781313/?inviteCode=rh-v1401

 

Workflow: AA--Emotion Control Digital Human - Ultra-Fast FlashHead + Index Voice Cloning (8 Emotion Controls)

Experience Link: https://www.runninghub.ai/post/2030585203055398914/?inviteCode=rh-v1401

 

Workflow: AA--Preset Voice Ultra-Fast Digital Human - FlashHead + QwenTTS - One Image, 9 Voices

Experience Link: https://www.runninghub.ai/post/2030658043398070273/?inviteCode=rh-v1401

 

Workflow: AA--Fully Automatic Ultra-Fast Digital Human - FlashHead + Qwen Sound Design - Auto-Prompt from One Image - Digital Human Card Pull!

Experience Link: https://www.runninghub.ai/post/2030589859588476930/?inviteCode=rh-v1401

 

### Introduction to Flash Head Digital Human Workflows

 

Flash Head is a digital human generation project running on ComfyUI, focused on speed. It achieves extreme video generation speed by only driving the head region for lip-sync, sacrificing dynamics in other parts of the body.

 

#### Core Features:

 

*   Ultimate Speed: At 512p resolution, generating a 5-second video takes only about 30 seconds.

*   Two Models: Offers Pro and Light versions. The Light version is three times faster than Pro at the cost of quality, which makes it suitable for quick validation.

*   Image Requirement: Must use a facial close-up image; otherwise, the model cannot recognize the head and lips.
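
To put the speed figures above in perspective, here is a rough back-of-the-envelope sketch. It rests on two assumptions that the description does not confirm: that generation time scales linearly with clip length, and that the 30-second figure refers to the Pro model. Treat the results as estimates only.

```python
def seconds_per_video_second(generation_s: float, clip_s: float) -> float:
    """Compute time spent per second of output video."""
    return generation_s / clip_s


def estimate_generation_time(clip_s: float, ratio: float, speedup: float = 1.0) -> float:
    """Estimate generation time, assuming linear scaling with clip length."""
    return clip_s * ratio / speedup


# Figures from the description: a 5-second clip at 512p takes about 30 seconds.
ratio = seconds_per_video_second(30, 5)                             # 6.0
pro_minute = estimate_generation_time(60, ratio)                    # 360.0 s
# Assumption: the 30 s figure is for Pro, and Light is 3x faster.
light_minute = estimate_generation_time(60, ratio, speedup=3.0)     # 120.0 s

print(ratio, pro_minute, light_minute)
```

Under these assumptions a one-minute clip would take roughly six minutes, or about two minutes if the Light speedup applies.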

 

#### Main Workflows:

 

The following workflows are introduced to meet different application scenarios:

 

1.  Basic Workflow

    *   The simplest version, containing only 6 core nodes.

 

2.  Voice Cloning Digital Human

    *   Allows you to upload an image and reference audio to clone the voice and drive the digital human.

 

3.  Voice Preset Digital Human

    *   Similar to cloning, but uses pre-set voices within the workflow, eliminating the need for user uploads.

 

4.  Sound Design Digital Human

    *   Fully Automatic Workflow: You only need to upload an image. The model analyzes the image via a VQA prompting node, automatically generates a voice prompt, and then a TTS model designs and generates the sound based on that prompt.
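
The data flow of the fully automatic workflow can be sketched as three chained stages. This is only an illustration of the order of operations described above, not the actual ComfyUI graph. The class name, the field names, and the stub functions are all hypothetical, and the real nodes will take more inputs than shown here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SoundDesignPipeline:
    """Hypothetical stand-in for the image -> voice prompt -> audio -> video chain."""
    describe_image: Callable[[str], str]         # VQA stage: image path -> voice prompt
    synthesize_voice: Callable[[str], bytes]     # TTS stage: voice prompt -> audio bytes
    animate_head: Callable[[str, bytes], str]    # lip-sync stage: image + audio -> video path

    def run(self, image_path: str) -> str:
        voice_prompt = self.describe_image(image_path)
        audio = self.synthesize_voice(voice_prompt)
        return self.animate_head(image_path, audio)


# Stub stages so the sketch runs without any models installed.
pipeline = SoundDesignPipeline(
    describe_image=lambda path: f"calm voice matching {path}",
    synthesize_voice=lambda prompt: prompt.encode("utf-8"),
    animate_head=lambda path, audio: f"{path}.mp4 ({len(audio)} audio bytes)",
)
print(pipeline.run("portrait.png"))
```

The only user input is the image path, which mirrors the "upload one image" usage of this workflow. Everything else is derived from the previous stage's output.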

 

#### Summary:

 

Overall, the Flash Head workflows perform well in scenarios that demand maximum speed (such as real-time interaction and rapid prototyping) and are "worth trying out." However, there is still a gap in generation quality and stability compared with more mature solutions like Infinite Talk, so for now they are "not recommended for productivity."