
Dataset Tools

Name: Dataset Tools
Rating: 5 (30 reviews)
Author: ktiseos_nyx

246

Updated: Apr 14, 2026

tool

tools toolkit dataset python dataset prepration


Dataset Tools for Imaging and Captioning

Collection - 5 items

Details

Type: Other
Reviews: Positive (3)
Published: Apr 14, 2026
Base Model: Other
Hash (AutoV2): 4EE201CB1B

About this version

ktiseos_nyx

Dataset Tools: An AI Metadata Viewer

For the old PyQt6 branch instructions, please scroll down or visit this branch.

Dataset Tools NextJS Edition is a local-first web application for browsing AI image datasets with comprehensive metadata extraction. Built from the ground up in TypeScript — no Python dependencies, no OpenCV duct tape, no NumPy startup tax. Running on Next.js 16, React 19, and shadcn/ui components.

Community-Driven Development

This project is inspired by stable-diffusion-prompt-reader and thrives on community contributions. Found a bug? Have a workflow that won't parse? Want to add support for a new tool? We welcome forks, fixes, and pull requests!


Installation

Clone repo

git clone https://github.com/Ktiseos-Nyx/Dataset-Tools.git
cd Dataset-Tools

Install dependencies (Node.js 20.9+ required; Next.js 16 no longer supports Node.js 18)

npm install

Start dev server (For Local Testing)

npm run dev

For production:

npm run build && npm start

Usage

Start the app: npm run dev → open http://localhost:3000
Browse files: Use the file tree sidebar, or click the folder icon to pick any directory.
Drag & drop: Drop an image anywhere in the app — it'll find the folder and load thumbnails.
Inspect metadata: Click any image → metadata panel shows prompts, parameters, LoRAs, and workflow info.
Customize: Settings panel has theme, accent colors, font size, thumbnail size, and file display options.

When metadata fails to parse

Check browser console for parser logs.
Note the workflow structure (ComfyUI? A1111? Custom nodes?).
File an issue with:
- Console error snippet
- Workflow type + custom nodes used
- Minimal repro image (if shareable)

Current Capabilities

| Feature | Status | Details |
|---|---|---|
| Metadata Parsing | ✅ | 90% success rate. Graph-tracing engine for ComfyUI, field-based detection for A1111/Forge/NovelAI. |
| Image Viewing | ✅ | PNG, JPG, JPEG, WebP. Zoom (25-400%), rotation, fit-to-container. |
| File Browsing | ✅ | Recursive lazy-loading file tree. Browse any folder on your system. |
| Drag & Drop | ✅ | Drop an image to auto-detect its folder and extract metadata. |
| Thumbnails | ✅ | Sharp-powered WebP thumbnails with disk cache (.thumbcache/). |
| Sorting | ✅ | Sort by name, date modified, or file size. (Thumbnail sorting coming soon.) |
| Accent Colors | ✅ | 7 color themes (zinc, red, orange, green, blue, violet, pink) with dark mode support. |
| WebP Metadata | ⚠️ | Viewing works. Metadata extraction is in development. (Not all WebP files will have metadata, but some animated ones will.) |
| ComfyUI Workflows | ✅ | 3-phase extraction: field-based scan → graph trace → type-match fallback. Handles custom nodes, service detection and more. |
| GitHub Lookup | ✅ | If a node isn't found on the first pass, a GitHub search is used to look it up. |
| Custom Workflow Graph Viewer | ✅ | View an image's embedded workflow inside the app. |
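To make the "field-based detection" row more concrete, here is a minimal sketch of how an A1111/Forge-style `parameters` string can be split into fields. It assumes the commonly seen layout (prompt text, an optional `Negative prompt:` line, and a final `Steps: ...` line of comma-separated `Key: value` pairs). The names `A1111Metadata` and `parseA1111Parameters` are hypothetical and are not taken from the project's code.

```typescript
// Hypothetical sketch: not the project's actual parser.
interface A1111Metadata {
  prompt: string;
  negativePrompt: string;
  settings: Record<string, string>;
}

function parseA1111Parameters(raw: string): A1111Metadata {
  const lines = raw.trim().split(/\r?\n/);

  // Assumption: the settings line is the last line and starts with "Steps:".
  let settingsLine = "";
  const last = lines[lines.length - 1] ?? "";
  if (/^Steps:\s/.test(last)) {
    settingsLine = last;
    lines.pop();
  }

  // Everything before "Negative prompt:" is the positive prompt.
  const negIndex = lines.findIndex((l) => l.startsWith("Negative prompt:"));
  const promptLines = negIndex === -1 ? lines : lines.slice(0, negIndex);
  const negLines = negIndex === -1 ? [] : lines.slice(negIndex);

  const settings: Record<string, string> = {};
  // Matches `Key: value` pairs; a value may be a quoted string containing commas.
  const pairPattern = /\s*([^:,]+):\s*("(?:\\.|[^"\\])*"|[^,]*)(?:,|$)/g;
  let match: RegExpExecArray | null;
  while ((match = pairPattern.exec(settingsLine)) !== null) {
    settings[match[1].trim()] = match[2].trim();
  }

  return {
    prompt: promptLines.join("\n").trim(),
    negativePrompt: negLines.join("\n").replace(/^Negative prompt:\s*/, "").trim(),
    settings,
  };
}
```

Real-world strings vary between tools and versions, so a production parser would need more fallbacks than this sketch has.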

Supported Formats

A1111 / Forge — PNG tEXt chunks, JPEG EXIF
ComfyUI — JSON workflow with node graph resolution
NovelAI — PNG metadata
Civitai — UTF-16-LE JPEG UserComment
Standard EXIF/IPTC/XMP — All image formats
PNG mislabeled as JPEG - magic byte detection
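Magic byte detection means trusting the first bytes of a file rather than its extension. The sketch below checks the standard PNG, JPEG, and WebP signatures. The function names (`sniffImageKind`, `extensionMismatch`) are illustrative assumptions, not the project's API.

```typescript
// Illustrative sketch; names are hypothetical.
type ImageKind = "png" | "jpeg" | "webp" | "unknown";

const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
const JPEG_SIGNATURE = [0xff, 0xd8, 0xff];
const RIFF_TAG = [0x52, 0x49, 0x46, 0x46]; // "RIFF"
const WEBP_TAG = [0x57, 0x45, 0x42, 0x50]; // "WEBP"

// True when `bytes` contains `sig` starting at `offset`.
function hasBytesAt(bytes: Uint8Array, sig: number[], offset = 0): boolean {
  if (bytes.length < offset + sig.length) return false;
  return sig.every((b, i) => bytes[offset + i] === b);
}

function sniffImageKind(bytes: Uint8Array): ImageKind {
  if (hasBytesAt(bytes, PNG_SIGNATURE)) return "png";
  if (hasBytesAt(bytes, JPEG_SIGNATURE)) return "jpeg";
  // WebP layout: "RIFF", a 4-byte size, then "WEBP".
  if (hasBytesAt(bytes, RIFF_TAG) && hasBytesAt(bytes, WEBP_TAG, 8)) return "webp";
  return "unknown";
}

// True when the extension claims one format but the bytes say another.
function extensionMismatch(fileName: string, bytes: Uint8Array): boolean {
  const ext = fileName.toLowerCase().split(".").pop() ?? "";
  const claimed: ImageKind =
    ext === "jpg" || ext === "jpeg" ? "jpeg"
    : ext === "png" ? "png"
    : ext === "webp" ? "webp"
    : "unknown";
  const actual = sniffImageKind(bytes);
  return actual !== "unknown" && claimed !== actual;
}
```

With a check like this, a file named `image.jpg` that actually holds PNG data can be routed to the PNG parser instead of failing in the JPEG one.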

Tech Stack

Framework: Next.js 16 (App Router), React 19, TypeScript 5
UI: shadcn/ui + Radix UI, Lucide icons, Tailwind CSS v4 (OKLch color space)
Thumbnails: Sharp (libvips) with WebP disk caching
Metadata: Pure JS parsing — PNG chunks, JPEG EXIF (exif-parser), ComfyUI graph traversal
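As an illustration of what pure JS PNG chunk parsing involves, here is a minimal sketch that walks a PNG's chunks and collects `tEXt` entries (A1111 conventionally stores its data under the `parameters` keyword, ComfyUI under `prompt` and `workflow`). It is a simplified assumption of the approach, not the project's implementation: it skips CRC validation and ignores `iTXt`/`zTXt` chunks, and the function names are hypothetical.

```typescript
// Hypothetical sketch; skips CRC checks and compressed/international text chunks.

// Decode bytes as Latin-1, the encoding the PNG spec uses for tEXt chunks.
function latin1(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

function readPngTextChunks(png: Uint8Array): Record<string, string> {
  const view = new DataView(png.buffer, png.byteOffset, png.byteLength);
  const result: Record<string, string> = {};
  let offset = 8; // skip the 8-byte PNG signature

  // Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
  while (offset + 12 <= png.length) {
    const length = view.getUint32(offset); // DataView reads big-endian by default
    const type = latin1(png.subarray(offset + 4, offset + 8));
    const dataStart = offset + 8;
    const dataEnd = dataStart + length;
    if (dataEnd + 4 > png.length) break; // truncated file

    if (type === "tEXt") {
      // tEXt data: keyword, a NUL separator, then the text.
      const data = png.subarray(dataStart, dataEnd);
      const sep = data.indexOf(0);
      if (sep > 0) {
        result[latin1(data.subarray(0, sep))] = latin1(data.subarray(sep + 1));
      }
    }
    if (type === "IEND") break;
    offset = dataEnd + 4; // jump over the CRC to the next chunk
  }
  return result;
}
```

The returned map can then be handed to a format-specific parser depending on which keywords are present.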

Service Detection

As in the Python edition, we try to make sure that as many popular and niche generation services as possible are tagged for easy detection. If your workflow or website has a custom pattern for resource detection, our tool can likely find it. If it hasn't been added to the tool yet, just flick an issue up and we'll hook it up ASAP.

Examples include Tensorart, Forge, Arc En Ciel, CivitAI and Yodayo.

Many tools, whether desktop or remote, leave recognizable patterns, so the key lies either in the metadata handling or in the resource identification.

Why 90% > 65% Matters

The Python edition relied on fragile heuristics. This engine uses deterministic graph traversal with proper node relationship mapping. It follows wires backwards from sampler nodes to find prompts, identifies nodes by their data (not just class_type), and handles platform-wrapped node names via substring matching. When it fails, logs show exactly why.
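The backwards wire-following described above can be sketched as follows. This assumes ComfyUI's API-format prompt JSON, where each node has a `class_type` and `inputs`, and a wire is stored as `[sourceNodeId, outputIndex]`. The helper names and the sampler heuristic are illustrative assumptions, not the project's actual engine.

```typescript
// Hypothetical sketch of backward graph tracing; not the project's engine.
type Link = [string, number];

interface ComfyNode {
  class_type: string;
  inputs: Record<string, unknown>;
}
type ComfyGraph = Record<string, ComfyNode>;

function isLink(value: unknown): value is Link {
  return (
    Array.isArray(value) &&
    value.length === 2 &&
    typeof value[0] === "string" &&
    typeof value[1] === "number"
  );
}

// Identify samplers by their data: wired positive/negative inputs, plus either
// a class name containing "sampler" (substring match) or a numeric steps input.
function looksLikeSampler(node: ComfyNode): boolean {
  const wired = isLink(node.inputs["positive"]) && isLink(node.inputs["negative"]);
  const named = node.class_type.toLowerCase().includes("sampler");
  const hasSteps = typeof node.inputs["steps"] === "number";
  return wired && (named || hasSteps);
}

// Follow a wire backwards until a node with a literal string `text` input is found.
function traceText(
  graph: ComfyGraph,
  link: Link,
  role: string,
  seen: Set<string> = new Set<string>()
): string | null {
  const id = link[0];
  if (seen.has(id)) return null; // cycle guard
  seen.add(id);
  const node = graph[id];
  if (!node) return null;
  if (typeof node.inputs["text"] === "string") return node.inputs["text"];

  // Prefer the input named after the role, so pass-through nodes that carry
  // both positive and negative conditioning are followed on the right branch.
  const names = Object.keys(node.inputs);
  const ordered = names.filter((n) => n === role).concat(names.filter((n) => n !== role));
  for (const name of ordered) {
    const value = node.inputs[name];
    if (isLink(value)) {
      const found = traceText(graph, value, role, seen);
      if (found !== null) return found;
    }
  }
  return null;
}

function extractPrompts(
  graph: ComfyGraph
): { positive: string | null; negative: string | null } | null {
  for (const id of Object.keys(graph)) {
    const node = graph[id];
    if (!node || !looksLikeSampler(node)) continue;
    const pos = node.inputs["positive"];
    const neg = node.inputs["negative"];
    if (isLink(pos) && isLink(neg)) {
      return {
        positive: traceText(graph, pos, "positive"),
        negative: traceText(graph, neg, "negative"),
      };
    }
  }
  return null;
}
```

Because the sampler check does not depend on an exact `class_type`, a platform-wrapped sampler name can still be recognized as long as its inputs look like a sampler's.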

Why this exists

The Python edition worked at 65% success rate with heuristic spaghetti. This NextJS engine hits 90% success rate on complex ComfyUI workflows using deterministic graph traversal. Metadata is parsed in pure JavaScript — no waiting for Python to boot, no OpenCV overhead.

Development Stage

While this is working 99% better than our original Python app, please be aware that as we move this into "ALPHA TESTING" there will be more bugs. We can't pre-catch bugs as thoroughly as we tried to for the original Python edition, so we're hoping that, given the amount of work we've put into porting this to a much easier format, you can help test.

Contributing

Found a parsing failure?

Open an issue with the details above. Real-world edge cases are how we push past 90%.

Want to improve the parser?

Fork repo → npm install → npm run dev
Metadata extraction lives in app/api/metadata/route.ts
Test with images from the Metadata Samples/ folder
Submit a PR with before/after evidence

Ideas for contributors

WebP metadata chunk parser
Editable metadata (write back to files)
SQLite indexing for faster folder browsing
ComfyUI workflow visualization
Batch metadata export (CSV/JSON)
Parser debugger panel showing traversal steps

Q&A

Q: Are you working on putting this into an executable format?

A: Yes, eventually we'll port this to Electron or see what Tauri needs. For ease of use we're likely going to use Electron, as we're not sure how Tauri would affect the rendering of metadata or anything else.

Q: Are you aware that NextJS/Node has more CVEs and is the Number ONE VIBE CODED language this side of the moon?

A: Yes, but just like our Python version and anything else we build, we're not like other "VIBE CODED TOOLS": we demand security, peace of mind and a way through the mess. Unlike the trainer, which uses major ML stacks to survive, we can promise you A LOT more security with this. As long as you're not installing this on an OpenClaw instance, you're fine.

Q: But Chrome Sucks!

A: Duskfallcrew personally recommends the use of Vivaldi, which is a CHROMIUM fork. Electron is only as "BLOATED" as the packages you port with it, along with web trackers and unmitigated cache, image sizes and lazy loading issues. When we port our executable mode, we'll make sure the thing runs on every flipping potato machine out there!

Q: Why is this not already 100% Done?

A: Because I have 20 projects on the go? Plus I'm currently the solo dork with AI assistance. Yes, Joel does supervise and add things to the code, and his ComfyUI lookup tool is what powers this. Exception: Claude translated it to Node, because Joel's actually a real-life developer and he's not got time to do EVERYTHING. Plus when he's not at work or with his family, he's an A+ Smexy FFXIV player! He beats our Miqo'te's poor fashion choices just by existing!

License

GNU General Public License v3.0

Acknowledgements

Core Parsing Logic & Inspiration: This project incorporates and significantly adapts parsing functionality from Stable Diffusion Prompt Reader by receyuki. Our sincere thanks for this foundational work. Original repository: stable-diffusion-prompt-reader. The original MIT license for this vendored code is included in the NOTICE.md file.
Traugdor, for the supervision, the memes and this: Python ComfyUI Node Finder
Everyone at Arc En Ciel for your continued support.
Anthropic - Pls Keep Sending us Free Credits, we're broke!
Anzhc for continued support and motivation.
Our peers and the wider AI and open-source communities for their continuous support and inspiration.
Mempalace, for neurodivergent memory support in local development (Mempalace @ GitHub)
AI Language Models (like those from Google, OpenAI, Anthropic) for assistance with code generation, documentation, and problem-solving during development.
...and many more!

SPECIAL THANKS

Supervised by: traugdor
Special Thanks to contributors: Open Source Community, Whitevamp, Exdysa, and so many more.
Special Thanks to Anthropic for the many insanely valuable free credits during marketing ploys. While we're only on the USD 20 a month plan, anything helps, so throw more of these our way because development has become sort of a job for us!

Support Development

PYTHON INSTRUCTIONS:

Dataset-Tools is a desktop application designed to help users browse and manage their image and text datasets, particularly those used with AI art generation tools like Stable Diffusion. Developed using PyQt6, it provides a simple and intuitive graphical interface for browsing images, viewing metadata, and examining associated text prompts. This project is inspired by tools within the AI art community (receyuki/stable-diffusion-prompt-reader) and aims to empower users in improving their dataset curation workflow.

Daily updates are here: https://github.com/Ktiseos-Nyx/Dataset-Tools

How to Use Dataset-Tools

Requirements

To run the program, you will need the following software:

Python:

Python.org or Try uv

Git:

Launching the Application

Open your terminal of choice (e.g. PowerShell, cmd, zsh, bash).
git clone or download the Dataset-Tools repository from GitHub:
```
git clone https://github.com/Ktiseos-Nyx/Dataset-Tools.git
```
Move into the Dataset-Tools folder and pip install the required dependencies:
```
cd Dataset-Tools
pip install .
```
NOTE: uv users
```
cd Dataset-Tools
uv pip install .
```
Run the application with the dataset-tools command:
```
dataset-tools
```

You're in!

User Interface Overview

The application window has the following main components:

Current Folder: Displays the path of the currently loaded folder.
Open Folder: A button to select a folder containing images and text files.
Image List: Displays a list of images and text files found in the selected folder.
Image Preview: An area to display a selected image.
Metadata Box: A text area to display the extracted metadata from the selected image (including Stable Diffusion prompt, settings, etc.).
Prompt Text: A text label to display the prompt from the selected image.
Text File Content Area: A text area to display the content of any associated text files.

Managing Images and Text

Selecting Images: Click on an image or text file in the list to display its preview, metadata, and associated text content.
Viewing Metadata: Metadata associated with the selected image, such as steps, samplers, seeds, and more, is displayed in the metadata text area.
Viewing Text: The content of any text file associated with the selected image is displayed in the text box.

Key Features

Graphical User Interface (GUI): Built with PyQt6 for a modern and cross-platform experience.
Image Previews: Quickly view images in a dedicated preview area.
Metadata Extraction: Extract and display relevant metadata from PNG image files, especially those generated from Stable Diffusion.
Text Viewing: Display the content of text files.
Clear Layout: A simple and intuitive layout, with list view on the left, and preview on the right.

Future Developments

Thumbnail Generation: Implement thumbnails for faster browsing.
JPEG Metadata: Add support for extracting metadata from JPEG files.
Themes: Introduce customizable themes for appearance.
Filtering/Sorting: Options to filter and sort files.
Better User Experience: Test on different operating systems and screen resolutions to optimize user experience.
Video Tutorials: Create video tutorials to show users how to use the program.
Text Tutorials: Create detailed tutorials with text and images to show users how to use the program.

Am available for commissions.

Recipe:

1 cup sass

3 cups WHOOPS.

5 cups WHERES THE. CHEETOS?

Bake at 450 for 24 hours and then call the nearest fire department.

Read a few memes while you're in the ER for severe burns

(KIDDING.)

Our Discord: https://discord.gg/HhBSvM9gBY

Earth & Dusk Media https://discord.gg/5t2kYxt7An

Backups: https://huggingface.co/EarthnDusk

Send a Pizza: https://ko-fi.com/duskfallcrew/

WE ARE PROUDLY SUPPORTED BY: https://yodayo.com/ / https://moescape.ai/

JOIN OUR DA GROUP: https://www.deviantart.com/diffusionai

JOIN OUR SUBREDDIT: https://www.reddit.com/r/earthndusk/