Technical Reference

ComfyUI Metadata Schema Documentation

The exact JSON structures ComfyUI embeds in PNG tEXt/zTXt chunks, with annotated examples for five common workflow types.

How ComfyUI Stores Metadata

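Unless metadata saving is disabled, every PNG saved by ComfyUI contains two JSON objects embedded as text chunks. These are standard PNG tEXt or zTXt (compressed) ancillary chunks. They sit alongside the image data rather than in the file header, and any PNG library that exposes text chunks can read them.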

"prompt" chunk

The execution data. A flat map of node IDs to their class types, inputs, and connections. This is what you need to extract prompts, seeds, models, and parameters.

"workflow" chunk

The visual graph. Contains node positions, sizes, groups, links, and canvas state. Needed to recreate the workflow in ComfyUI's UI, but not for data extraction.

For MP4 files created with ComfyUI-VideoHelperSuite, the same JSON is stored in the ©cmt (Comment) atom as {"prompt": {...}, "workflow": {...}}.
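The PNG storage described above can be read with nothing but the standard library. The following is a minimal sketch, not ComfyUI's own code: the function names `read_png_text_chunks` and `read_comfyui_metadata` are hypothetical, CRCs are not verified, and iTXt chunks are ignored.

```python
from __future__ import annotations

import json
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict[str, str]:
    """Return {keyword: text} for every tEXt and zTXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks: dict[str, str] = {}
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Chunk layout: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        payload = data[offset + 8:offset + 8 + length]
        offset += 12 + length
        if chunk_type == b"tEXt":
            keyword, _, text = payload.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"zTXt":
            keyword, _, rest = payload.partition(b"\x00")
            # rest[0] is the compression method byte; the zlib stream follows it.
            chunks[keyword.decode("latin-1")] = zlib.decompress(rest[1:]).decode("latin-1")
        elif chunk_type == b"IEND":
            break
    return chunks


def read_comfyui_metadata(data: bytes) -> dict[str, dict]:
    """Parse the "prompt" and "workflow" chunks, skipping any that are absent."""
    text = read_png_text_chunks(data)
    return {key: json.loads(text[key]) for key in ("prompt", "workflow") if key in text}
```

Typical use would be `read_comfyui_metadata(open(path, "rb").read())`, where `path` points at a PNG saved by ComfyUI. A missing key in the result means the corresponding chunk was not present in the file.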

Prompt Schema (Execution Data)

PNG chunk key: prompt

Field	Type	Req	Description
▾{nodeId}	object	yes	Each top-level key is a string identifying a node in the execution graph, typically numeric ("1", "2", etc.).
class_type	string	yes	The node type, e.g. "KSampler", "CLIPTextEncode". Must match a registered node in NODE_CLASS_MAPPINGS.
inputs	object	yes	Parameters for this node. Values are either direct (string, number, boolean) or a connection reference ["nodeId", outputIndex].
_meta	object	no	Optional metadata. Contains a "title" field with the display name shown in the ComfyUI canvas.
is_changed	string[]	no	Array of hashes indicating changed inputs. Used by ComfyUI Cloud for cache invalidation.

Connection references: When an input value is an array like ["4", 1], it means "connect to output slot 1 of node 4". The first element is always a string (node ID), the second is an integer (output index).
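The connection format above can be followed programmatically. This is a small sketch under stated assumptions: `is_connection` and `upstream_node` are hypothetical helper names, and the shape check is a heuristic, since a custom node could in principle accept a literal two-element list as a direct value.

```python
from __future__ import annotations

from typing import Any


def is_connection(value: Any) -> bool:
    """True if value looks like a connection reference ["nodeId", outputIndex]."""
    return (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and isinstance(value[1], int)
        and not isinstance(value[1], bool)
    )


def upstream_node(prompt: dict, node_id: str, input_name: str) -> tuple[str, dict] | None:
    """Follow one input of a node to the node feeding it; None for direct values."""
    value = prompt[node_id]["inputs"].get(input_name)
    if not is_connection(value):
        return None
    source_id = value[0]
    return source_id, prompt[source_id]


# Illustrative two-node prompt; the checkpoint file name is made up.
prompt = {
    "4": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": "model.safetensors"}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "a landscape", "clip": ["4", 1]}},
}
source = upstream_node(prompt, "6", "clip")  # the node wired into input "clip" of node "6"
```

Here `source` is the pair of node ID `"4"` and its node object, while `upstream_node(prompt, "6", "text")` returns `None` because `text` is a direct value.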

Workflow Schema (Visual Graph)

PNG chunk key: workflow · Schema version: 1.0

Field	Type	Req	Description
version	number	yes	Schema version. 1 for modern workflows (Jan 2024+), 0.4 for legacy LiteGraph format.
state	object	yes	Graph counters: lastNodeId, lastLinkId, lastGroupId, lastRerouteId. Used for generating unique IDs.
▾nodes	array	yes	Array of node objects with visual properties (position, size, flags, color) and widget values.
id	number | string	yes	Unique node identifier matching prompt keys.
type	string	yes	Class type (same as prompt class_type).
pos	[x, y]	yes	Canvas position as [x, y] coordinates.
size	[w, h]	yes	Node dimensions [width, height] in pixels.
flags	object	yes	Display flags: collapsed, pinned, allow_interaction, horizontal.
mode	number	yes	0 = normal (always execute), 2 = muted (never execute), 4 = bypassed. Pinning is stored in flags, not in mode.
widgets_values	array | object	no	Widget parameter values in display order. Legacy format; prefer reading from prompt inputs.
inputs	array	no	Input slot definitions with name, type, and link ID reference.
outputs	array	no	Output slot definitions with name, type, and array of connected link IDs.
links	array	no	Connection definitions: { id, origin_id, origin_slot, target_id, target_slot, type }. Maps to prompt connection refs.
groups	array	no	Visual grouping boxes. Cosmetic only — no execution effect. Each has title, bounding [x,y,w,h], color.
config	object	no	Canvas configuration: links_ontop (boolean), align_to_grid (boolean).
extra	object	no	Workspace metadata including ds (scale/offset for viewport), info (name, author, description, software), linkExtensions.
reroutes	array	no	Visual reroute waypoints for links. Cosmetic — no execution effect.
models	array	no	Model references with name, url, hash, hash_type, directory. Added by model management extensions.
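To illustrate how workflow links map to prompt connection references, here is a hedged sketch that rebuilds `["nodeId", outputIndex]` pairs from the visual graph. Several things are assumptions rather than facts from the table above: the function names, the key `link` on each input slot, and the legacy array layout `[id, origin_id, origin_slot, target_id, target_slot, type]`. It ignores reroutes and muted or bypassed nodes, so its output can differ from the real prompt chunk.

```python
from __future__ import annotations


def normalize_link(link) -> dict:
    """Return a link as a dict, accepting the object form or an array form."""
    if isinstance(link, dict):
        return link
    # Assumed legacy layout: [id, origin_id, origin_slot, target_id, target_slot, type]
    keys = ("id", "origin_id", "origin_slot", "target_id", "target_slot", "type")
    return dict(zip(keys, link))


def connections_from_workflow(workflow: dict) -> dict[str, dict[str, list]]:
    """Map each node ID to {input_name: [origin_node_id, origin_slot]}."""
    links = {link["id"]: link for link in map(normalize_link, workflow.get("links", []))}
    result: dict[str, dict[str, list]] = {}
    for node in workflow.get("nodes", []):
        for slot in node.get("inputs", []):
            # The "link" key name on input slots is an assumption.
            link = links.get(slot.get("link"))
            if link is not None:
                result.setdefault(str(node["id"]), {})[slot["name"]] = [
                    str(link["origin_id"]),
                    link["origin_slot"],
                ]
    return result
```

Node IDs are converted to strings because prompt keys are strings, while workflow node IDs may be numbers.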

KSampler Parameter Reference

The KSampler node is the core of most generation workflows. These are the parameters that most affect output quality.

Parameter	Type	Values	Description
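seed	integer	`0 – 2^64−1`	Random seed for reproducibility. The same seed with the same parameters, model, and software/hardware setup generally reproduces the same image. JavaScript-based tools lose integer precision above 2^53.
steps	integer	`1 – 10000 (typical: 20–30)`	Number of denoising steps. More steps generally add detail, with diminishing returns, at the cost of slower generation.
cfg	float	`0.0 – 100.0 (typical: 6–8)`	Classifier-free guidance scale. Higher values follow the prompt more closely; lower values give the model more freedom.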
sampler_name	string	`euler, euler_ancestral, dpmpp_2m, dpmpp_2m_sde, ddim, uni_pc ...`	Sampling algorithm. Different samplers produce different aesthetics at the same step count.
scheduler	string	`normal, karras, exponential, sgm_uniform, simple, ddim_uniform, beta`	Noise schedule. "karras" is popular for sharper results; "normal" is the default.
denoise	float	`0.0 – 1.0`	1.0 = full generation (txt2img). Lower values preserve more of the input image (img2img). 0.65 is a common starting point.
model	reference	`["nodeId", 0]`	Connection to checkpoint model output.
positive	reference	`["nodeId", 0]`	Connection to positive conditioning (CLIPTextEncode).
negative	reference	`["nodeId", 0]`	Connection to negative conditioning (CLIPTextEncode).
latent_image	reference	`["nodeId", 0]`	Connection to latent input (EmptyLatentImage for txt2img, VAEEncode for img2img).

KSamplerAdvanced uses noise_seed instead of seed, has no denoise input, and adds add_noise, start_at_step, end_at_step, and return_with_leftover_noise parameters.
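As an illustration of reading these parameters from a prompt chunk, the sketch below collects the literal sampler settings from KSampler and KSamplerAdvanced nodes. The name `sampler_settings` and the output layout are hypothetical choices for this example. Parameters wired to another node (connection references) are skipped rather than resolved.

```python
from __future__ import annotations

SAMPLER_FIELDS = ("seed", "noise_seed", "steps", "cfg", "sampler_name", "scheduler", "denoise")


def sampler_settings(prompt: dict) -> list[dict]:
    """Collect directly-set sampler parameters from KSampler and KSamplerAdvanced nodes."""
    found = []
    for node_id, node in prompt.items():
        if node.get("class_type") not in ("KSampler", "KSamplerAdvanced"):
            continue
        inputs = node.get("inputs", {})
        settings = {"node_id": node_id, "class_type": node["class_type"]}
        for name in SAMPLER_FIELDS:
            value = inputs.get(name)
            # Lists are connection references, not literal values, so skip them.
            if value is None or isinstance(value, list):
                continue
            # Report KSamplerAdvanced's "noise_seed" under the common key "seed".
            settings["seed" if name == "noise_seed" else name] = value
        found.append(settings)
    return found
```

Normalizing `noise_seed` to `seed` keeps downstream code from having to branch on the node type when it only needs the seed.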

Annotated Workflow Examples

Real prompt-chunk JSON for five common workflow types: text-to-image (txt2img), image-to-image (img2img), ControlNet, IP-Adapter (image prompt), and video (AnimateDiff).

Text-to-Image (txt2img)

Key Nodes

CheckpointLoaderSimple, CLIPTextEncode (x2), EmptyLatentImage, KSampler, VAEDecode, SaveImage

How to Identify This Workflow Type

Uses EmptyLatentImage (no image input) and KSampler with denoise: 1.0.

{
  "4": {
    "class_type": "CheckpointLoaderSimple",
    "inputs": {
      "ckpt_name": "v1-5-pruned-emaonly.safetensors"
    },
    "_meta": { "title": "Load Checkpoint" }
  },
  "6": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "text": "a beautiful landscape, golden hour, mountains",
      "clip": ["4", 1]
    },
    "_meta": { "title": "Positive Prompt" }
  },
  "7": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "text": "blurry, low quality, watermark",
      "clip": ["4", 1]
    },
    "_meta": { "title": "Negative Prompt" }
  },
  "5": {
    "class_type": "EmptyLatentImage",
    "inputs": {
      "width": 512,
      "height": 512,
      "batch_size": 1
    },
    "_meta": { "title": "Empty Latent Image" }
  },
  "3": {
    "class_type": "KSampler",
    "inputs": {
      "seed": 156680208700286,
      "steps": 20,
      "cfg": 8.0,
      "sampler_name": "euler",
      "scheduler": "normal",
      "denoise": 1.0,
      "model": ["4", 0],
      "positive": ["6", 0],
      "negative": ["7", 0],
      "latent_image": ["5", 0]
    },
    "_meta": { "title": "KSampler" }
  },
  "8": {
    "class_type": "VAEDecode",
    "inputs": {
      "samples": ["3", 0],
      "vae": ["4", 2]
    },
    "_meta": { "title": "VAE Decode" }
  },
  "9": {
    "class_type": "SaveImage",
    "inputs": {
      "filename_prefix": "ComfyUI",
      "images": ["8", 0]
    },
    "_meta": { "title": "Save Image" }
  }
}
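The identification rule for this workflow type can be expressed as a short check. The sketch below is a heuristic under assumptions: `looks_like_txt2img` is a hypothetical name, and the check is slightly stricter than the rule stated above because it requires the EmptyLatentImage node to be wired directly into the KSampler's latent_image input. Workflows that use a custom empty-latent node or a different sampler node would not match.

```python
from __future__ import annotations


def looks_like_txt2img(prompt: dict) -> bool:
    """Heuristic: a KSampler at denoise 1.0 fed directly by an EmptyLatentImage node."""
    for node in prompt.values():
        if node.get("class_type") != "KSampler":
            continue
        inputs = node.get("inputs", {})
        latent = inputs.get("latent_image")
        # latent_image should be a connection reference ["nodeId", outputIndex].
        if not (isinstance(latent, list) and len(latent) == 2):
            continue
        source = prompt.get(latent[0], {})
        if source.get("class_type") == "EmptyLatentImage" and inputs.get("denoise") == 1.0:
            return True
    return False
```

Applied to the example above, the check follows `"latent_image": ["5", 0]` from node "3" to node "5", which is an EmptyLatentImage, and node "3" has `denoise` set to 1.0, so the function returns `True`.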

Parsing metadata at scale?

These schemas cover standard workflows. In practice, custom nodes create unpredictable structures — nested groups, non-standard widget formats, missing metadata fields. Numonic handles all variants automatically and maps the data to a searchable, structured catalog.


Note: This documentation reflects the ComfyUI metadata format as of February 2026. Custom nodes may extend or modify these schemas. ComfyUI is an open-source project — schema changes may occur between releases. Always validate metadata against the actual PNG chunk contents for your specific version.

