
MiniMax Remover Authorized Video Watermark Cleanup Workflow

Updated: May 9, 2026

Type: Workflows (Archive, 6.99 KB)

Reviews: 23

Published: May 9, 2026

Base Model: Wan Video 14B t2v

Hash (AutoV2): 61FB3C1856

Creator: AIKSK

This ComfyUI workflow is designed for authorized video watermark cleanup, masked video object removal, and region-based video restoration using the MiniMax Remover model route. The main purpose of this workflow is to let creators load a video, define the unwanted region with a mask, reconstruct the masked area through a WanVideo MiniMax Remover pipeline, and export a cleaned video while preserving the original audio track.

This workflow is not a normal video upscale or frame interpolation graph. It focuses on masked video inpainting. Instead of changing the entire video, the workflow targets a selected area and asks the model to repair that region across frames. This makes it useful for removing unwanted overlays, temporary marks, test labels, UI elements, unwanted objects, or visual distractions from videos that you own or have permission to edit.

The workflow is built around WanVideo MiniMax Remover, using Wan2_1-MiniMaxRemover_1_3B_fp16.safetensors as the main video restoration model. It also uses Wan2_1_VAE_bf16.safetensors as the Wan video VAE, umt5_xxl_fp16.safetensors as the Wan text encoder, and a LightX2V distill LoRA route for faster generation. The graph combines video loading, frame resizing, mask preparation, latent encoding, remover embedding generation, WanVideo sampling, decoding, and final video recombination.

The first stage uses VHS_LoadVideo to import the source video. This node extracts the video frames, frame count, audio, and video information. In the included setup, the video is loaded at 16 fps, every frame is selected, and no frames are skipped. This gives the workflow a stable frame sequence for the restoration process. The original audio is also passed forward so the final cleaned video can keep the source sound.
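The frame-selection behavior described above can be sketched as a small helper. The function name and parameter names below are illustrative, not VHS_LoadVideo's actual implementation:

```python
# Hypothetical sketch of VHS_LoadVideo-style frame selection (illustrative
# names; the real node works on decoded video streams, not index lists).

def select_frame_indices(total_frames, skip_first=0, select_every_nth=1, frame_cap=0):
    """Return the indices of frames that would be loaded.

    skip_first       -- frames dropped from the start of the clip
    select_every_nth -- keep every Nth remaining frame (1 = keep all)
    frame_cap        -- maximum number of frames to load (0 = no cap)
    """
    indices = list(range(skip_first, total_frames, select_every_nth))
    if frame_cap > 0:
        indices = indices[:frame_cap]
    return indices

# In the included setup every frame is selected and none are skipped:
print(select_frame_indices(48))  # [0, 1, ..., 47]

# A hypothetical partial selection for quick tests:
print(select_frame_indices(48, skip_first=8, select_every_nth=2, frame_cap=5))
```

Keeping the selection logic this simple is what makes the frame sequence stable and reproducible for the restoration stages that follow.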

After loading, the workflow uses ImageResizeKJv2 to resize the video frames into a controlled working size. In the uploaded setup, the frame size is prepared around 1536 x 1536 with Lanczos scaling, center crop behavior, and divisible-by-16 alignment. This is important because video restoration models are sensitive to resolution, aspect ratio, and alignment. A controlled input size helps the model process the frames more consistently.
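A minimal sketch of the divisible-by-16 alignment, assuming a simple round-down rule. The real ImageResizeKJv2 node also handles the Lanczos scaling and center cropping mentioned above:

```python
# Illustrative sketch of divisible-by-16 dimension snapping, assuming a
# round-down rule (the actual node may round or crop differently).

def align_dims(width, height, multiple=16):
    """Snap width/height down to the nearest multiple of `multiple`."""
    return (width // multiple) * multiple, (height // multiple) * multiple

print(align_dims(1536, 1536))  # (1536, 1536) -- already aligned
print(align_dims(1920, 1080))  # (1920, 1072)
```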

The workflow then extracts a frame from the batch using JWImageExtractFromBatch. This is useful for preparing and checking the mask. Since video watermark or object cleanup depends heavily on the mask, users need a clear reference frame to mark the unwanted area. The workflow includes preview nodes so the source frame and mask area can be inspected before running the restoration model.

The mask section is one of the most important parts of the workflow. A mask image is loaded through LoadImage, then repeated across the video frame batch with MaskRepeatBatch. This means one mask can be applied consistently across the entire video sequence. This is useful when the unwanted object or watermark stays in the same screen position throughout the clip.
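Conceptually, MaskRepeatBatch broadcasts one mask across the frame dimension. A NumPy sketch of that idea follows; the function name, shapes, and frame count are illustrative, not the node's internals:

```python
import numpy as np

# Illustrative sketch of repeating one H x W mask across a frame batch,
# as MaskRepeatBatch does conceptually (names and shapes are assumptions).

def repeat_mask(mask, num_frames):
    """Broadcast a single-frame mask to shape (num_frames, H, W)."""
    return np.repeat(mask[None, ...], num_frames, axis=0)

mask = np.zeros((512, 512), dtype=np.float32)
mask[100:200, 300:420] = 1.0          # mark the unwanted region on one frame
batch = repeat_mask(mask, 81)         # hypothetical 81-frame clip
print(batch.shape)  # (81, 512, 512)
```

Because every frame gets an identical copy, this approach only suits a watermark or object that stays in the same screen position for the whole clip.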

The workflow also uses GrowMaskWithBlur. This expands the mask by around 40 pixels and applies blur around the mask edge. This is essential for video cleanup because a mask that is too tight can leave leftover edges, ghosting, or visible boundaries. A slightly expanded and softened mask gives the model enough room to blend the repaired area into the surrounding frame.
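The grow-and-blur step can be approximated in plain NumPy. This is a crude stand-in, an iterative 3x3 max filter for the dilation and repeated box blurs instead of a true Gaussian, not the node's actual code:

```python
import numpy as np

def grow_and_feather(mask, grow_px=40, blur_px=8):
    """Expand a binary mask by ~grow_px pixels, then soften its edge.

    A rough approximation of GrowMaskWithBlur: each max-filter pass grows
    the region by about one pixel; repeated 3x3 box blurs approximate a
    Gaussian falloff at the boundary.
    """
    m = mask.astype(np.float32)
    h, w = m.shape
    for _ in range(grow_px):                      # dilation passes
        p = np.pad(m, 1)
        m = np.max(np.stack([p[i:i + h, j:j + w]
                             for i in (0, 1, 2) for j in (0, 1, 2)]), axis=0)
    for _ in range(blur_px):                      # box-blur feathering passes
        p = np.pad(m, 1, mode="edge")
        m = np.mean(np.stack([p[i:i + h, j:j + w]
                              for i in (0, 1, 2) for j in (0, 1, 2)]), axis=0)
    return m

m = np.zeros((64, 64), dtype=np.float32)
m[30:34, 30:34] = 1.0                             # tiny hypothetical watermark
grown = grow_and_feather(m, grow_px=5, blur_px=2)
```

The key property to notice is the soft gradient at the new boundary: values fall from 1.0 inside the grown region toward 0.0 outside, which is what lets the model blend the repaired area instead of leaving a hard seam.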

The mask is converted into image form through MaskToImage nodes for preview and internal processing. The workflow also uses SolidMask and ImageCompositeMasked to create a visual mask-check output. This helps users confirm whether the selected removal area is correct before sending it into the WanVideo model. Good mask preparation is often more important than changing sampling settings.

The video frames and mask are encoded into latent form through WanVideoEncode. The graph contains two WanVideoEncode routes: one for the video frames and one for the mask-related input. These latents are then passed into WanVideoMiniMaxRemoverEmbeds. This node prepares the special image embeddings required by the MiniMax Remover model. In the included setup, the remover embedding node uses width, height, and frame count information from the processed video, allowing the model to understand the spatial and temporal structure of the input.

The prompt section uses CR Prompt Text and WanVideoTextEncode. The example prompt is simple, describing the target restored content as “Japanese street houses.” This prompt gives the model a scene direction for filling the masked area. For best results, the prompt should describe what should naturally exist behind the removed area. For example, if the mask covers a watermark on a wall, the prompt should describe the wall texture. If it covers a street sign, the prompt should describe the background street or building surface.

The negative prompt includes “blurry” style suppression, helping the model avoid unclear reconstruction. For video cleanup, the negative prompt does not need to be overly complex. The most important control is still the mask quality and prompt relevance. A clean prompt that describes the background behind the removed object is usually better than a long unrelated artistic prompt.

The main restoration stage is WanVideoSampler. It receives the MiniMax Remover model, image embeddings, text embeddings, and sampling settings. In the uploaded setup, the sampler uses 6 steps, CFG 1, shift around 13, UniPC scheduler, and randomized seed. This makes the workflow relatively fast and suitable for iterative testing. Users can test several seeds if the repaired area does not blend well in the first result.
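For reference, the sampler settings above collected into a plain config dict. The key names here are illustrative and may not match WanVideoSampler's actual widget names:

```python
import random

# The uploaded setup's sampler settings as a plain dict (key names are
# assumptions, not WanVideoSampler's real widget names).
sampler_config = {
    "steps": 6,          # few steps keeps iteration fast on the distill route
    "cfg": 1.0,          # low guidance, typical for distilled LoRA sampling
    "shift": 13,
    "scheduler": "unipc",
    "seed": random.randint(0, 2**32 - 1),  # re-randomized for each retry
}
```

Because the seed is randomized per run, re-queueing the graph is the cheapest way to get a differently blended repair when the first result shows seams.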

After sampling, WanVideoDecode converts the restored latent video back into image frames. These frames represent the cleaned video result. The workflow then uses VHS_VideoCombine to combine the output frames back into an MP4 file. The original audio from VHS_LoadVideo is connected into the combine node, so the final video keeps the source sound, timing, narration, music, or effects.

The final video output uses H.264 MP4, the yuv420p pixel format, CRF 19, and 16 fps. This makes the result easy to preview, share, and upload to platforms such as RunningHub, YouTube, Bilibili, or Civitai. The workflow also includes separate video combine outputs for the mask preview and the cleaned video, which is useful when demonstrating the before, mask, and after stages.
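Outside ComfyUI, a roughly equivalent ffmpeg invocation for these export settings can be built as a command list. Paths and the frame-filename pattern are placeholders:

```python
# Approximate ffmpeg equivalent of the VHS_VideoCombine export settings
# (H.264, yuv420p, CRF 19, 16 fps, source audio muxed back in).
# "cleaned_%05d.png" and "source.mp4" are placeholder paths.

def build_export_cmd(frames_pattern, audio_src, out_path, fps=16, crf=19):
    return [
        "ffmpeg",
        "-framerate", str(fps), "-i", frames_pattern,  # cleaned frame sequence
        "-i", audio_src,                               # original video (audio source)
        "-map", "0:v", "-map", "1:a?",                 # video + optional audio stream
        "-c:v", "libx264", "-crf", str(crf),
        "-pix_fmt", "yuv420p",                         # broad player compatibility
        "-c:a", "aac", "-shortest",
        out_path,
    ]

cmd = build_export_cmd("cleaned_%05d.png", "source.mp4", "cleaned.mp4")
```

The `1:a?` mapping makes the audio stream optional, so the same command works whether or not the source clip actually carries sound.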

Main features:

- MiniMax Remover video cleanup workflow

- Mask-based video inpainting

- Uses Wan2_1-MiniMaxRemover_1_3B_fp16

- WanVideoWrapper pipeline

- Wan2.1 VAE support

- UMT5 text encoder support

- LightX2V distill LoRA acceleration

- VHS_LoadVideo video import

- Original audio preservation

- ImageResizeKJv2 frame preprocessing

- Manual mask input support

- MaskRepeatBatch for applying one mask across video frames

- GrowMaskWithBlur for smoother mask boundaries

- WanVideoMiniMaxRemoverEmbeds for remover conditioning

- WanVideoSampler restoration stage

- WanVideoDecode frame output

- VHS_VideoCombine MP4 export

Recommended use cases:

Authorized video watermark cleanup, unwanted overlay removal, temporary label cleanup, UI element removal, test mark removal, object removal, region-based video restoration, AI video post-production, social media video cleanup, RunningHub workflow demonstration, Civitai video workflow showcase, Bilibili tutorial preparation, and YouTube video polishing.

Suggested workflow:

Start by loading the source video. Use a video where the unwanted area stays relatively stable in the frame. This workflow works best when the watermark, label, overlay, or object does not move much. If the unwanted element moves across the screen, you may need a moving mask or to split the video into sections.

Prepare the mask carefully. The mask should fully cover the unwanted region, including its edges and any shadow or glow around it. If the mask is too small, parts of the watermark or object may remain. If the mask is too large, the model may change more of the video than necessary.

Use GrowMaskWithBlur to expand and soften the mask. The included setup expands the mask and applies blur, which helps reduce hard seams. For small text overlays, a moderate expansion is usually enough. For larger logos or objects, increase the mask size slightly so the model has room to reconstruct the background.

Use the preview nodes to check the mask before generation. Do not skip this step. If the mask is wrong, the final output will likely be wrong. Check whether the mask aligns with the unwanted region after resizing and batching.

Write a prompt that describes the background behind the removed area. If the covered area should become sky, write sky. If it should become wall, write wall texture. If it should become street, write street background. The prompt should help the model reconstruct what should exist in the masked region.

Run a short test first. Video restoration can be heavy, so test a short clip or a small frame range before processing a long video. Check whether the repaired area blends naturally and whether there is flickering across frames.

If the cleaned area flickers, try improving the mask, simplifying the prompt, or testing another seed. If the model creates strange objects in the repaired region, make the prompt more specific and less creative. For restoration tasks, the prompt should usually be practical rather than artistic.

If the result looks blurry, check the source quality and mask size. Very large masked regions are harder to reconstruct than small overlays. If the area is too large, the model has to invent too much missing content, which can reduce consistency.

After the cleaned frames are decoded, use VHS_VideoCombine to export the final video. Keep the original audio connected if you want the output to preserve the source soundtrack. Check the final video for audio sync, frame rate, and visual consistency.

Responsible use note: use this workflow only on videos you own, have permission to edit, or are legally allowed to modify. Do not use it to remove ownership marks, creator credits, licensing marks, or platform watermarks from content you do not have rights to use.

This workflow is designed as a practical masked video restoration pipeline for ComfyUI users. It combines video loading, frame resizing, mask preparation, WanVideo MiniMax Remover embeddings, video inpainting, audio preservation, and MP4 export into one graph. It is useful for creators who need controlled video cleanup and region repair before publishing authorized content.

🎥 YouTube Video Tutorial

Want to know what this workflow actually does and how to start fast?

This video explains what the tool is, how to launch the workflow instantly, and shares my core design logic — no local setup, no complicated environment.

Everything starts directly on RunningHub, so you can experience it in action first.

👉 YouTube Tutorial: https://youtu.be/yp0dcysS4XY

Before you begin, I recommend watching the video thoroughly — getting the full context helps you understand the tool faster and avoid common detours.

⚙️ RunningHub Workflow

Try the workflow online right now — no installation required.

👉 Workflow: https://www.runninghub.ai/post/2021475821034151938/?inviteCode=rh-v1111

If the results meet your expectations, you can later deploy it locally for customization.

🎁 Fan Benefits: Register to get 1000 points, plus 100 points per daily login, and enjoy 4090-class performance with 48 GB of memory!

📺 Bilibili Updates (Mainland China & Asia-Pacific)

If you’re in the Asia-Pacific region, you can watch the video below to see the workflow demonstration and creative breakdown.

📺 Bilibili Video: https://www.bilibili.com/video/BV1REcbzdEdn/

☕ Support Me on Ko-fi

If you find my content helpful and want to support future creations, you can buy me a coffee ☕.

Every bit of support helps me keep creating — just like a spark that can ignite a blazing flame.

👉 Ko-fi: https://ko-fi.com/aiksk

💼 Business Contact

For collaboration or inquiries, please contact aiksk95 on WeChat.

🎥 YouTube Video Tutorial

Want to know what kind of tool this workflow is and how to launch it quickly?

The video mainly covers the tool's purpose, how to start it quickly, and my design approach.

The demonstration runs directly on RunningHub, so you can see the actual results right away.

👉 YouTube Tutorial: https://youtu.be/yp0dcysS4XY

Before you begin, try to watch the video in full; grasping the overall approach makes it easier to get started and helps you avoid common detours.

⚙️ Try the Workflow Online

You can try it online right now, with no installation required.

👉 Workflow: https://www.runninghub.ai/post/2021475821034151938/?inviteCode=rh-v1111

Open the link above to run the workflow directly and watch the generated results in real time.

If the results look good, you can also deploy it locally for customization.

🎁 Fan Benefits: Register to get 1000 points, plus 100 points per daily login, and enjoy 4090-class performance with 48 GB of memory!

📺 Bilibili Updates (Mainland China & Asia-Pacific)

If you are in mainland China or the Asia-Pacific region, the video below shows the workflow's real-world results and a breakdown of the design ideas.

📺 Bilibili Video: https://www.bilibili.com/video/BV1REcbzdEdn/

I also keep model resources updated on Quark Netdisk:

👉 https://pan.quark.cn/s/20c6f6f8d87b

These resources are mainly intended for local users, for creation and learning.