Every ComfyUI-generated PNG contains something remarkable: the complete workflow that created it, embedded directly in the file. The seed, the sampler settings, the prompt text, every model reference—it is all there, stored in PNG text chunks most tools ignore.
I have spent months parsing thousands of these files while building Numonic's metadata extraction pipeline. This guide shares what I have learned about how ComfyUI encodes workflow data, where to find specific parameters, and how to extract them programmatically—whether you are building tools, organizing your generations, or just trying to figure out what settings produced that one perfect image.
Part of our series The Complete Guide to ComfyUI Asset Management
PNG Structure Basics
Before diving into ComfyUI specifics, let us understand how PNG files store data. A PNG file is composed of a signature followed by a series of chunks. Each chunk contains a specific type of data.
PNG Signature
Every valid PNG file begins with an 8-byte signature:
89 50 4E 47 0D 0A 1A 0A
| P N G \r \n ^Z \n
In decimal: 137 80 78 71 13 10 26 10

This signature serves multiple purposes: it identifies the file as PNG, detects transmission errors that might corrupt line endings, and stops display of the file in text-only viewers.
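Before parsing chunks, it is worth validating the signature. A minimal sketch:

```python
# Sketch: validate the 8-byte PNG signature before parsing chunks.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])  # \x89PNG\r\n\x1a\n

def is_png(data: bytes) -> bool:
    """Return True if the byte stream starts with the PNG signature."""
    return data[:8] == PNG_SIGNATURE
```

Anything that fails this check is not a PNG and cannot carry ComfyUI metadata in the form described below.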
Chunk Structure
After the signature, the file contains a sequence of chunks:
+-------------+--------------+-------------+----------+
| Length | Type | Data | CRC |
| (4 bytes) | (4 bytes) | (N bytes) | (4 bytes)|
| big-endian | ASCII | payload | checksum |
+-------------+--------------+-------------+----------+

- Length: 4 bytes, big-endian unsigned integer specifying the length of the Data field
- Type: 4 ASCII characters identifying the chunk type (e.g., IHDR, IDAT, tEXt)
- Data: The chunk payload, length specified by the Length field
- CRC: 4-byte CRC32 checksum of Type + Data
Critical chunks that every PNG must have:
- IHDR: Image header (dimensions, bit depth, color type)
- IDAT: Image data (compressed pixel data)
- IEND: Image end marker
ComfyUI stores its metadata in ancillary text chunks—chunks that can be safely ignored by decoders that do not understand them.
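The chunk layout above translates directly into a small iterator. This is a sketch that assumes the input already passed the signature check; it validates each CRC, though a tolerant parser might choose to log and continue instead:

```python
# Sketch: walk the chunk sequence of an in-memory PNG and yield
# (type, payload) pairs. Assumes the 8-byte signature was already verified.
import struct
import zlib

def iter_chunks(data: bytes):
    pos = 8  # skip the 8-byte signature
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])   # big-endian length
        ctype = data[pos + 4:pos + 8].decode("ascii")        # 4-char ASCII type
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(data[pos + 4:pos + 8 + length]) != crc:
            raise ValueError(f"bad CRC in {ctype} chunk")
        yield ctype, payload
        pos += 12 + length  # length field + type + data + CRC
        if ctype == "IEND":
            break
```

Filtering the yielded pairs for `tEXt`, `zTXt`, and `iTXt` types gives you every candidate metadata chunk in the file.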
Text Chunk Types: tEXt, zTXt, and iTXt
PNG defines three types of text chunks for storing keyword-value pairs. ComfyUI can use any of these depending on the workflow size and configuration.
tEXt (Uncompressed Text)
The simplest text chunk format:
+------------------+-------------+--------------------+
| Keyword | Null byte | Text |
| (1-79 bytes) | (0x00) | (0+ bytes) |
| Latin-1 | separator | Latin-1 |
+------------------+-------------+--------------------+
Example:
Keyword: "prompt"
Null: 0x00
Text: "masterpiece, highly detailed, portrait"

Limitations: Latin-1 encoding only (no Unicode), uncompressed (inefficient for large workflows), keyword limited to 79 characters.
zTXt (Compressed Text)
Compressed text chunk using zlib/deflate compression. This is what ComfyUI typically uses for workflow data because workflows can be very large.
+--------------+-----------+--------------------+-----------------+
| Keyword | Null | Compression | Compressed |
| (1-79 bytes)| (0x00) | Method (1 byte) | Text |
| Latin-1 | separator | (0 = deflate) | (zlib deflate) |
+--------------+-----------+--------------------+-----------------+
Compression method 0 (zlib deflate) is the only defined method.

Key point: When extracting zTXt chunks, you must decompress the text using zlib's inflate function before parsing.
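A minimal zTXt decoder, given a raw chunk payload as laid out above:

```python
# Sketch: decode a zTXt chunk payload into (keyword, text).
# Layout: keyword / 0x00 separator / method byte / zlib stream.
import zlib

def decode_ztxt(payload: bytes) -> tuple[str, str]:
    keyword, rest = payload.split(b"\x00", 1)
    method, compressed = rest[0], rest[1:]
    if method != 0:
        raise ValueError(f"unknown zTXt compression method {method}")
    return keyword.decode("latin-1"), zlib.decompress(compressed).decode("latin-1")
```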
iTXt (International Text)
The most flexible format, supporting UTF-8 encoding and optional compression:
+----------+------+-----------+-----------+----------+------+--------------+------+----------+
| Keyword | Null | Compress | Compress | Language | Null | Translated | Null | Text |
| (Latin-1)|(0x00)| Flag | Method | Tag |(0x00)| Keyword |(0x00)| (UTF-8) |
+----------+------+-----------+-----------+----------+------+--------------+------+----------+
Compression flag: 0 = uncompressed, 1 = compressed
Compression method: 0 = deflate (only when flag = 1)

Advantage: Full Unicode support for prompts containing non-Latin characters.
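The extra fields make iTXt slightly more involved to decode. A sketch that honors the compression flag but, for brevity, discards the language tag and translated keyword:

```python
# Sketch: decode an iTXt chunk payload into (keyword, text).
# Layout: keyword / null / flag / method / language tag / null /
#         translated keyword / null / UTF-8 text.
import zlib

def decode_itxt(payload: bytes) -> tuple[str, str]:
    keyword, rest = payload.split(b"\x00", 1)
    flag, method = rest[0], rest[1]
    lang, rest2 = rest[2:].split(b"\x00", 1)    # language tag (discarded)
    translated, text = rest2.split(b"\x00", 1)  # translated keyword (discarded)
    if flag == 1:
        text = zlib.decompress(text)            # method 0 = deflate
    return keyword.decode("latin-1"), text.decode("utf-8")
```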
ComfyUI Keywords: workflow and prompt
ComfyUI uses two primary keywords to store metadata:
Keyword: "workflow"
Contains the UI/visual workflow - the full node graph as displayed in the ComfyUI interface. Includes node positions, colors, and all visual layout information.
Format: JSON object with nodes array and links array
Keyword: "prompt"
Contains the execution workflow - the actual parameters used during generation. This is what you need for reproducibility.
Format: JSON object with node IDs as keys, each containing class_type and inputs
Important distinction: The "prompt" keyword does not contain just the text prompt—it contains the entire execution state. The name is historical from early ComfyUI development.
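For quick experiments, Pillow merges all three chunk types into a single `text` dict on PNG images, so pulling both keywords takes a few lines. A sketch (assumes Pillow is installed; the two-keyword layout matches standard ComfyUI output, but files produced by custom save nodes may differ):

```python
# Sketch: pull both ComfyUI keywords with Pillow. `src` can be a file
# path or a file-like object.
import json
from PIL import Image

def load_comfy_metadata(src):
    with Image.open(src) as im:
        text = getattr(im, "text", {})  # keyword -> string value
    prompt = json.loads(text["prompt"]) if "prompt" in text else None
    workflow = json.loads(text["workflow"]) if "workflow" in text else None
    return prompt, workflow
```

As discussed later, this convenience breaks down on malformed JSON and unusual chunk encodings, but it is the right starting point.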
Workflow JSON Structure
Execution Format (prompt keyword)
This is the critical format for parameter extraction. Each node is keyed by its ID:
{
"1": {
"class_type": "CheckpointLoaderSimple",
"inputs": {
"ckpt_name": "dreamshaper_8.safetensors"
}
},
"2": {
"class_type": "CLIPTextEncode",
"inputs": {
"text": "masterpiece, highly detailed portrait",
"clip": ["1", 1]
},
"_meta": {
"title": "Positive Prompt"
}
},
"3": {
"class_type": "CLIPTextEncode",
"inputs": {
"text": "blurry, low quality, distorted",
"clip": ["1", 1]
},
"_meta": {
"title": "Negative Prompt"
}
},
"4": {
"class_type": "EmptyLatentImage",
"inputs": {
"width": 512,
"height": 768,
"batch_size": 1
}
},
"5": {
"class_type": "KSampler",
"inputs": {
"seed": 42,
"steps": 20,
"cfg": 7.5,
"sampler_name": "euler_ancestral",
"scheduler": "normal",
"denoise": 1.0,
"model": ["1", 0],
"positive": ["2", 0],
"negative": ["3", 0],
"latent_image": ["4", 0]
}
}
}

Node Structure Explained
- class_type: The node type (e.g., KSampler, CLIPTextEncode)
- inputs: Object containing all node parameters
- _meta: Optional metadata including custom node titles
Connection Format
Connections between nodes are represented as arrays with two elements:
"model": ["1", 0]
| |
| +-- Output slot index (0-based)
+------- Source node ID (string)

This means "connect to output slot 0 of node 1". Literal values (not connections) are stored directly.
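Telling connections apart from literals is the first step of any graph traversal. A sketch:

```python
# Sketch: distinguish a literal input value from a node connection.
# A connection is a two-element [source_node_id, output_slot] list.

def is_connection(value) -> bool:
    return (isinstance(value, list) and len(value) == 2
            and isinstance(value[0], str) and isinstance(value[1], int))

def resolve(prompt: dict, value):
    """Return the source node dict for a connection, or the literal itself."""
    return prompt[value[0]] if is_connection(value) else value
```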
Where Parameters Are Stored
Here is a comprehensive reference for finding specific parameters:
Generation Parameters (KSampler / KSamplerAdvanced)
| Parameter | Input Key | Type | Typical Values |
|---|---|---|---|
| Seed | seed or noise_seed | integer | 0 to 2^64-1 |
| Steps | steps | integer | 10-150 |
| CFG Scale | cfg | float | 1.0-30.0 |
| Sampler | sampler_name | string | euler, dpmpp_2m, etc. |
| Scheduler | scheduler | string | normal, karras, etc. |
| Denoise | denoise | float | 0.0-1.0 |
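Given the table above, a basic extractor for core nodes is a dictionary walk. This sketch covers only the two stock sampler classes and normalizes `noise_seed` (KSamplerAdvanced) to `seed`; custom sampler nodes would need their own entries:

```python
# Sketch: collect core generation parameters from every stock sampler
# node in an execution-format ("prompt" keyword) workflow dict.

SAMPLER_CLASSES = {"KSampler", "KSamplerAdvanced"}

def extract_sampler_params(prompt: dict) -> list[dict]:
    results = []
    for node_id, node in prompt.items():
        if node.get("class_type") not in SAMPLER_CLASSES:
            continue
        inputs = node.get("inputs", {})
        params = {"node_id": node_id}
        for key in ("steps", "cfg", "sampler_name", "scheduler", "denoise"):
            if key in inputs:
                params[key] = inputs[key]
        seed = inputs.get("seed", inputs.get("noise_seed"))
        if seed is not None:
            params["seed"] = seed  # normalize noise_seed -> seed
        results.append(params)
    return results
```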
Image Dimensions (EmptyLatentImage)
| Parameter | Input Key | Notes |
|---|---|---|
| Width | width | Usually divisible by 8 or 64 |
| Height | height | Usually divisible by 8 or 64 |
| Batch Size | batch_size | Number of images per batch |
Model References
| Model Type | Node Class | Input Key |
|---|---|---|
| Checkpoint | CheckpointLoaderSimple | ckpt_name |
| LoRA | LoraLoader | lora_name, strength_model, strength_clip |
| VAE | VAELoader | vae_name |
| ControlNet | ControlNetLoader | control_net_name |
Text Prompts (CLIPTextEncode)
Prompts are stored in CLIPTextEncode nodes. Finding positive vs negative prompts requires tracing connections:
- Positive prompt: Connected to the KSampler's positive input
- Negative prompt: Connected to the KSampler's negative input
- Alternative: Check _meta.title for "Positive" or "Negative" labels
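The connection-tracing approach can be sketched in a few lines. This handles only the simple case where the conditioning input connects directly to a CLIPTextEncode node; real workflows often route through ControlNet or conditioning-combine nodes, which would need recursive traversal:

```python
# Sketch: trace each sampler's positive/negative inputs one hop back
# to the CLIPTextEncode nodes that feed them.

def find_prompts(prompt: dict) -> list[tuple[str, str]]:
    """Return a (positive_text, negative_text) pair per sampler node."""
    pairs = []
    for node in prompt.values():
        if node.get("class_type") not in ("KSampler", "KSamplerAdvanced"):
            continue
        texts = []
        for role in ("positive", "negative"):
            link = node["inputs"].get(role)
            src = prompt.get(link[0]) if isinstance(link, list) else None
            if src and src.get("class_type") == "CLIPTextEncode":
                texts.append(src["inputs"].get("text", ""))
            else:
                texts.append("")  # conditioning routed through another node
        pairs.append(tuple(texts))
    return pairs
```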
Why Extraction Is Harder Than It Looks
Extracting ComfyUI metadata sounds straightforward until you try it at scale. A quick check with exiftool confirms the data exists:
# See if metadata exists
exiftool -PNG:all ComfyUI_00001.png

But getting from "the data exists" to "I can search by seed value across 10,000 images" is where the real complexity begins. Here is what I discovered building extraction at scale:
The Compression Problem
ComfyUI uses zTXt (compressed) chunks for large workflows. Your code must detect the chunk type and decompress accordingly—tEXt chunks are plain text, zTXt requires zlib inflation, and iTXt adds language tags and optional compression on top of that.
Simple libraries like PIL hide this complexity for basic cases, but fail silently on edge cases. We have seen workflows where the compression method byte is malformed, or where the decompressed output exceeds expected bounds.
The JSON Sanitization Nightmare
ComfyUI workflows often contain non-standard JSON that breaks parsers: NaN values, Infinity, trailing commas, single quotes, even JavaScript-style comments. Each of these requires special handling before you can parse the workflow.
We have identified over a dozen distinct JSON malformation patterns in the wild. Miss one and your extraction silently fails on a subset of files.
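To make the problem concrete, here is a minimal lenient parser handling just two of those patterns, trailing commas and line comments. Python's standard json module already accepts NaN/Infinity, and `parse_constant` lets you preserve them instead of silently nulling them; the comment regex is deliberately restricted to line starts so it cannot mangle URLs inside strings. This is an illustration, not the dozen-plus patterns a production sanitizer needs:

```python
# Sketch: tolerate trailing commas and //-comments before handing off
# to the standard parser. parse_constant keeps NaN/Infinity
# distinguishable rather than losing them.
import json
import re

def parse_lenient(text: str):
    text = re.sub(r"^\s*//[^\n]*", "", text, flags=re.M)  # line-start comments
    text = re.sub(r",\s*([}\]])", r"\1", text)            # trailing commas
    return json.loads(text, parse_constant=lambda c: {"__constant__": c})
```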
Schema Versioning
ComfyUI's workflow format has evolved. Version 0.4 uses six-element arrays for connections. Version 1.0 uses object-based link structures. Your extraction code needs to handle both—and detect which version you are dealing with, since many files lack explicit version markers.
Custom Node Chaos
The ComfyUI ecosystem has thousands of custom nodes, each with unique parameter names and structures. Finding "the model" requires pattern matching across dozens of possible field names. Finding "the prompt" requires graph traversal to trace connections back to the sampler node.
We maintain a database of node signatures and continuously update it as new custom nodes emerge.
Skip the parsing headaches
I have spent months solving these edge cases so you do not have to. Numonic handles compression, schema versions, custom nodes, and malformed JSON automatically. Just sync your ComfyUI output folder and search by any parameter.
See how it works

The Scale Problem
Even if you solve extraction, you still need to make the data searchable. Which means parsing every file, normalizing the schema, indexing by parameter, handling updates when you regenerate images—and doing it fast enough that search feels instant.
I have seen teams spend weeks building extraction code only to discover it fails on 15% of their files due to edge cases they did not anticipate. And that is before they tackle search, organization, or lineage tracking.
Common Issues That Break DIY Solutions
These are the issues I see most often when teams try to build their own extraction pipelines:
1. Metadata Disappears After Editing
Photoshop, GIMP, Canva, and most online tools strip PNG metadata when saving. One round of editing and your workflow data is gone forever. This is the most common way teams lose provenance—and they often do not realize it until months later when they need to reproduce something.
The challenge: You need to extract and store metadata separately before any editing, then maintain the association between the edited file and its original provenance. This is not a parsing problem—it is an asset management problem.
2. Inconsistent Chunk Types
ComfyUI uses tEXt (uncompressed) for small workflows and zTXt (compressed) for large ones. But the threshold is not documented, and some custom node packs override this behavior. Your code must detect and handle both—plus iTXt for international characters—without failing silently.
3. Malformed JSON Everywhere
ComfyUI workflows regularly contain invalid JSON: NaN values, Infinity, trailing commas, single quotes, JavaScript comments, and more. Standard JSON parsers reject these. You need robust sanitization that handles every edge case without corrupting legitimate data.
The tricky part: some of these "invalid" values are meaningful. NaN in a seed field might indicate randomization. Simply replacing with null loses information.
4. Custom Nodes Use Non-Standard Parameters
The ComfyUI ecosystem has thousands of custom nodes. Each one invents its own parameter names: ckpt_name, model_name, checkpoint, unet_name—all meaning the same thing. Finding "the model" requires pattern matching across dozens of possible field names and node types.
We maintain a continuously updated database of node signatures to handle this. Building your own means constantly chasing new custom nodes.
5. Multiple Samplers, No Clear Primary
Complex workflows have multiple KSampler nodes: initial generation, refinement passes, upscaling, inpainting. Which one is "the" sampler? The answer requires graph traversal—tracing connections backward from the SaveImage node to find the generation chain.
Simple heuristics (first sampler found) fail on real-world workflows. Proper extraction requires understanding workflow topology.
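The backward traversal can be sketched as a breadth-first search from the SaveImage node, so the first sampler reached is the one closest to the output (the final pass). This handles only the documented two-element connection format and the two stock sampler classes:

```python
# Sketch: find the sampler that produced the saved image by walking
# connections backward from SaveImage, breadth-first.
from collections import deque

def find_primary_sampler(prompt: dict):
    """Return the node ID of the sampler nearest the SaveImage node."""
    save_ids = [nid for nid, n in prompt.items()
                if n.get("class_type") == "SaveImage"]
    queue, seen = deque(save_ids), set(save_ids)
    while queue:
        nid = queue.popleft()
        node = prompt[nid]
        if node.get("class_type") in ("KSampler", "KSamplerAdvanced"):
            return nid
        for value in node.get("inputs", {}).values():
            # Follow connections: two-element [node_id, slot] lists.
            if (isinstance(value, list) and len(value) == 2
                    and value[0] in prompt and value[0] not in seen):
                seen.add(value[0])
                queue.append(value[0])
    return None
```

Whether "nearest the output" is the right definition of primary depends on your use case; a base-then-refiner workflow might instead want the earliest sampler in the chain.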
6. Lineage Tracking Across Variations
When you upscale an image, or create a variation, or use img2img, the output is related to the input—but ComfyUI does not track this relationship in metadata. Building lineage requires matching images by workflow structure, seed values, and timing heuristics.
This is where DIY solutions usually give up. It is not an extraction problem anymore—it is a graph database problem.
Key Takeaways
1. ComfyUI stores complete workflow data in PNG text chunks under the "workflow" (UI format) and "prompt" (execution parameters) keywords
2. Extraction requires handling three chunk types, compression, and malformed JSON—simple scripts fail on 10-15% of real-world files
3. Finding specific parameters requires graph traversal, not just field lookup—prompts, models, and seeds are scattered across connected nodes
4. Custom nodes add complexity: thousands of nodes with non-standard parameter names require continuous maintenance
5. Extraction is just the beginning: making metadata searchable, tracking lineage, and preserving provenance through edits requires purpose-built infrastructure
The Bottom Line
Understanding PNG metadata structure is valuable. Building and maintaining extraction at scale is a different challenge entirely—one that pulls focus from your actual creative work. Numonic handles all of this automatically so you can spend your time creating, not parsing.
Try Numonic free