Foot Anime Yolo11m

- Format: PickleTensor (38.64 MB)
- Type: Detection
- Published: Jun 13, 2026
- Base Model: Other
- Hash (AutoV2): F43D21E5BE

Anime foot detector (YOLO11m) — ADetailer / Impact Pack


Also on Hugging Face, with the same files plus ONNX exports (ONNX files could not be uploaded to this page): https://huggingface.co/Claquasse/foot_anime_yolo

A small YOLO11m detector that finds feet in anime and illustration images (single class foot). Drop it into an ADetailer or Impact Pack pass to auto-fix feet, which diffusion models often render badly. Built for the Anima model, but it works on anime-style art in general, so it should transfer to other anime or illustration generators.


Three versions are provided. v3 is the one to use — best box accuracy and the widest coverage. v2 is the previous best and a little better at plain, clearly visible feet. v1 is the first and weakest, kept for reference.


Files


Each version ships as .pt (load directly in ComfyUI or Ultralytics). The matching .onnx exports (non-pickle, for ONNX Runtime) are available from the Hugging Face repository linked above.


- foot_anime_yolo11m_v3.pt — production (recommended)

- foot_anime_yolo11m_v2.pt — previous production

- foot_anime_yolo11m_v1.pt — reference
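
For a quick check outside ComfyUI, a .pt file can be loaded with the Ultralytics Python package. This is a minimal sketch rather than the author's workflow: it assumes `ultralytics` is installed and the weights file sits in the working directory. `detect_feet` and `to_records` are hypothetical helper names, and reusing 0.45 as the confidence threshold is an assumption borrowed from the Impact Pack suggestion.

```python
def to_records(boxes_xyxy, scores):
    """Pair each [x1, y1, x2, y2] box with its confidence score."""
    return [
        {"box": [round(v, 1) for v in box], "score": round(s, 3)}
        for box, s in zip(boxes_xyxy, scores)
    ]


def detect_feet(image_path, weights="foot_anime_yolo11m_v3.pt", conf=0.45):
    """Run the detector on one image and return plain-Python records."""
    # Imported lazily so to_records() works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    result = model.predict(image_path, conf=conf, verbose=False)[0]
    # boxes.xyxy holds pixel corners, boxes.conf the per-box confidence.
    return to_records(result.boxes.xyxy.tolist(), result.boxes.conf.tolist())
```

Calling `detect_feet("sample.png")` (a placeholder file name) would return one dictionary per detected foot, which is convenient for checking the threshold before wiring the model into a larger workflow.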


Install


ComfyUI (Impact Pack): put the .pt in ComfyUI/models/ultralytics/bbox/, load it with UltralyticsDetectorProvider, and feed the bounding box into a detail or inpaint pass. A bbox threshold near 0.45 is a sensible default.


A1111 / Forge (ADetailer): put the .pt in stable-diffusion-webui/models/adetailer/ and select it as the ADetailer model.
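
The two install targets above can also be scripted. The sketch below only copies a file into the folders named in this section. `install_weights` and `MODEL_DIRS` are hypothetical names, and the root folder of your install is something you supply yourself.

```python
import shutil
from pathlib import Path

# Subfolders quoted in the install notes, relative to each UI's root folder.
MODEL_DIRS = {
    "comfyui": Path("models/ultralytics/bbox"),
    "a1111": Path("models/adetailer"),
}


def install_weights(weights_file, ui_root, ui):
    """Copy a .pt file into the detector folder of the given UI install."""
    target_dir = Path(ui_root) / MODEL_DIRS[ui]
    # Create the folder if this is the first detector being installed.
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(weights_file, target_dir))
```

For example, `install_weights("foot_anime_yolo11m_v3.pt", "ComfyUI", "comfyui")` would place the file under `ComfyUI/models/ultralytics/bbox/`, assuming both paths are relative to the current directory.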


Benchmark


Evaluated on a held-out set of 100 generated anime images (185 feet), none of which appeared in the training data. Scores are mAP50 and mAP50-95.


| model | mAP50 | mAP50-95 |
|---|---|---|
| v1 | 0.28 | 0.08 |
| v2 | 0.81 | 0.50 |
| v3 | 0.81 | 0.59 |


v3 localizes feet most accurately in every image type, as its higher mAP50-95 reflects, and matches or beats v2 at finding feet. Open-toe footwear is the hardest case for all of them.
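
To see why two models can tie on mAP50 yet differ on mAP50-95, it helps to recall that mAP50 accepts a detection once its IoU with the ground-truth box reaches 0.50, while mAP50-95 averages over the thresholds 0.50 to 0.95 in steps of 0.05. The sketch below illustrates only that thresholding step, not a full mAP computation. The boxes are made-up toy values, and `iou` and `thresholds_passed` are hypothetical helper names.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds averaged by mAP50-95: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """List the IoU thresholds at which pred counts as a correct detection."""
    overlap = iou(pred, truth)
    return [t for t in THRESHOLDS if overlap >= t]


truth = [100, 100, 200, 200]     # toy ground-truth box
loose = [90, 90, 220, 220]       # roughly right, but oversized
tight = [102, 101, 201, 199]     # nearly exact

print(thresholds_passed(loose, truth))       # counts at 0.50 and 0.55 only
print(len(thresholds_passed(tight, truth)))  # counts at all ten thresholds
```

Both toy predictions count as hits for mAP50, but only the second keeps counting as the threshold rises, which is the kind of difference the mAP50-95 column captures.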


The preview images overlay all three versions and a generic YOLOv8x foot detector on the same frame (red = v3, green = v2, blue = v1, yellow = generic YOLOv8x reference), so you can compare them directly.


Notes and scope


Trained on bare anime feet, mined from Danbooru and labeled with DWPose keypoints, plus the public-domain ANFDet set, a few hundred hand-labeled images, and feet-free images as hard negatives. v3 was trained on roughly 286k images. Footwear, sandals, and stockings sit outside the primary case, though v3 generalizes to them noticeably better than v1 or v2. Tuned for anime and illustration, not photographs.


The boxes are meant to feed a refiner, not to stand alone. v2 and v3 draw slightly looser boxes that wrap the whole foot, which is what you want for an inpaint pass.
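
If you feed the boxes into your own refiner script instead of ADetailer or Impact Pack, the usual step is to grow each box slightly so the inpaint region includes some surrounding context. The sketch below shows one way to do that. `pad_box` is a hypothetical helper and the 15% margin is an arbitrary illustrative value, not a setting recommended by the author. ADetailer and Impact Pack have their own padding and crop settings, so this is only relevant to custom pipelines.

```python
def pad_box(box, image_size, margin=0.15):
    """Grow an [x1, y1, x2, y2] box by a fraction of its size, clamped to the image."""
    x1, y1, x2, y2 = box
    width, height = image_size
    # Margin is relative to the box, so small feet get small padding.
    dx, dy = (x2 - x1) * margin, (y2 - y1) * margin
    return [
        max(0, round(x1 - dx)),
        max(0, round(y1 - dy)),
        min(width, round(x2 + dx)),
        min(height, round(y2 + dy)),
    ]
```

For a 200 by 200 pixel box at `[100, 200, 300, 400]` in a 1024 by 1024 image, a margin of 0.1 adds 20 pixels on each side. Boxes near the border are clipped to the image rather than extended past it.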


License: AGPL-3.0 (inherited from Ultralytics YOLO). If you serve these weights over a network, AGPL's source-availability terms apply. The AGPL license is the authoritative one regardless of the toggles on this page.


Support

Building these means mining and labeling hundreds of thousands of images and renting GPUs to train on them, which takes real time and money. If the models are useful to you and you want to chip in, it is appreciated and never expected: https://ko-fi.com/claquasse