Technical Architecture

Cross-Tool Provenance

Creative work flows across tool boundaries — ComfyUI to Photoshop to Midjourney. Each tool has its own metadata format, its own provenance model, and no awareness of what came before. Maintaining provenance chains across ecosystems is an unsolved architectural problem.

February 25, 2026 · 10 min · Numonic Team
[Figure: abstract visualization of a neon network of glowing nodes]

A concept artist generates a character pose in Midjourney. They bring the result into Photoshop for color correction, compositing, and detail refinement. The refined image goes into ComfyUI as an input for ControlNet-guided generation. The output becomes a texture reference for a 3D modeler working in Blender. This four-tool pipeline is not unusual — it is how professional creative work actually happens.

At each boundary crossing, something is lost. Midjourney embeds its generation parameters in EXIF description fields. Photoshop writes its edit history into XMP packets but doesn't read Midjourney's EXIF parameters. ComfyUI records its workflow graph in PNG chunks but has no concept of the Photoshop edits that preceded it. By the time the asset reaches the end of the pipeline, its provenance chain is fragmented across four incompatible metadata systems — and no single system has the complete picture.

The Forces at Work

Several forces make cross-tool provenance architecturally difficult:

  • No shared identity system: Each tool identifies assets differently. Midjourney uses Discord message IDs and job hashes. ComfyUI uses filenames on local disk. Photoshop tracks documents by internal UUIDs. There is no shared identifier that follows an asset across tool boundaries. When a ComfyUI output is imported into Photoshop, the connection between the two representations exists only in the creator's memory.
  • Incompatible metadata formats: The two metadata problem is compounded across tools. Each tool writes to different containers within image files, using different schemas, with different semantics. Midjourney's “--ar 16:9” parameter string and ComfyUI's workflow JSON are both “generation parameters,” but they share no structural similarity.
  • Destructive handoffs: Many creative tools strip or overwrite metadata from previous tools during import. Photoshop's “Save for Web” removes EXIF data. Some image viewers strip PNG text chunks when re-saving. The handoff between tools isn't just metadata-incompatible — it is actively metadata-destructive.
  • No awareness of predecessors: Tools are designed as endpoints, not pipeline stages. ComfyUI doesn't know that its input image came from Midjourney via Photoshop. It treats every input as opaque pixel data. Without awareness of what came before, a tool cannot extend a provenance chain — only start a new one.

The Problem

Creative work that flows across tool boundaries loses its provenance at each crossing. No tool in the generative creative ecosystem maintains awareness of what happened in other tools. The result is that an asset's complete history — the sequence of decisions, transformations, and generations that produced it — exists nowhere except in the creator's recollection. This makes reproducibility impossible, compliance auditing unreliable, and creative attribution incomplete.

The problem is not that individual tools lack provenance. ComfyUI's workflow metadata is remarkably detailed. Midjourney records every generation parameter. The problem is that these provenance systems are islands — each comprehensive within its own boundaries, but fundamentally disconnected from every other tool in the pipeline.

Toward Cross-Tool Provenance

Solving cross-tool provenance requires a system that sits outside any individual tool and maintains the connections between them. Several architectural approaches are emerging:

1. External Provenance Registry

Instead of relying on each tool to maintain provenance awareness, an external system observes when assets move between tools and records the connections. When a ComfyUI output is imported into Photoshop, the registry notes: “Asset B (Photoshop document) derives from Asset A (ComfyUI output).” The connection exists in the registry, not in either tool's native metadata.

This approach works because it requires no cooperation from the tools themselves. It observes file system events, detects when assets enter and exit tools based on file creation patterns, and infers relationships from temporal proximity and content similarity. The metadata inversion pattern — capturing what tools emit rather than requiring tools to declare — applies here.
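A minimal sketch of such a registry, assuming temporal proximity is the only inference signal. The `ProvenanceRegistry` class and its five-minute window are hypothetical choices for illustration, not an existing API; a real system would combine this signal with content similarity before asserting an edge.

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceRegistry:
    """External registry: derivation edges live here, not in any tool's metadata.

    Hypothetical sketch. Edges are inferred purely from temporal proximity
    of observed file events, so they are probabilistic, not proven.
    """
    edges: list = field(default_factory=list)  # (child, parent, reason) tuples
    seen: dict = field(default_factory=dict)   # path -> timestamp first observed

    def observe(self, path: str, timestamp: float, window: float = 300.0) -> None:
        """Record a new asset and link it to any asset seen within `window` seconds."""
        for prior, t in self.seen.items():
            if 0 <= timestamp - t <= window:
                self.edges.append((path, prior, "temporal-proximity"))
        self.seen[path] = timestamp
```

In this sketch, a ComfyUI output observed at one moment and a Photoshop document observed two minutes later get an inferred edge between them, even though neither tool recorded the connection.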

2. Content-Based Linking

When an asset is modified across tools, its pixel content changes — but often not drastically. Perceptual hashing can detect that a Photoshop-edited version of a Midjourney output is “derived from” the original, even though the files are byte-for-byte different. This allows provenance linking without shared identifiers — the content itself becomes the connection.
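A toy illustration of the idea, using a pure-Python average hash over a grayscale pixel grid. Real systems downscale the full image first and use library implementations (e.g. imagehash); the functions and thresholds here are illustrative assumptions.

```python
def average_hash(pixels: list) -> int:
    """Average-hash a grayscale grid (rows of 0-255 values) into a bit integer.

    Each bit is 1 if the corresponding pixel is at or above the grid's mean.
    Illustrative sketch: real perceptual hashes downscale the image to a
    fixed grid (e.g. 8x8) before hashing.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return sum(1 << i for i, p in enumerate(flat) if p >= mean)

def hamming_distance(h1: int, h2: int) -> int:
    """Number of differing bits between two hashes; small means 'likely derived'."""
    return bin(h1 ^ h2).count("1")
```

A color-corrected edit shifts pixel values slightly, so most bits survive and the Hamming distance stays low, while an unrelated image diverges; thresholding that distance is what lets the content itself stand in for a shared identifier.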

3. The C2PA Vision — And Its Current Limits

The C2PA standard aims to solve exactly this problem through cryptographic provenance chains. Each tool would append a signed manifest recording what it did, creating an unbroken chain from creation to distribution. In theory, this is the ideal solution. In practice, adoption remains uneven: most generative AI tools do not yet produce or consume C2PA manifests, and the standard's assumption of cooperative participants does not match the current ecosystem.

4. Hybrid Approaches

The pragmatic response combines multiple strategies: use metadata extraction where it exists, content-based linking where metadata is stripped, temporal clustering to group related assets into creative sessions, and manual annotation as a fallback. No single strategy covers all cases, but layered together they can reconstruct provenance chains that no individual tool maintains.
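One way to sketch that layering is a resolver that tries each strategy in priority order and falls through to the next when one fails. The strategy callables and their ordering below are hypothetical placeholders for the metadata, content-hash, temporal, and manual layers described above.

```python
from typing import Callable, Optional

# A strategy takes an asset path and returns a parent asset path, or None.
Strategy = Callable[[str], Optional[str]]

def resolve_parent(asset: str, strategies: list) -> Optional[tuple]:
    """Try (name, strategy) pairs in priority order.

    Returns (parent, strategy_name) from the first strategy that finds a
    parent, or None if every layer falls through. Illustrative sketch:
    real resolvers would attach a confidence score per strategy.
    """
    for name, strategy in strategies:
        parent = strategy(asset)
        if parent is not None:
            return parent, name
    return None
```

The ordering encodes trust: explicit metadata beats a perceptual-hash match, which beats a temporal guess, with manual annotation as the last resort.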

Consequences

Benefits

  • Reconstructed creative history: Teams can trace an asset back through its full pipeline — from initial generation through refinement to final output — even when the tools involved had no awareness of each other.
  • Compliance across the pipeline: When EU AI Act Article 50 requires disclosure of AI-generated content, cross-tool provenance allows teams to determine whether any stage of the pipeline involved generative AI — not just the final tool.
  • Attribution integrity: In collaborative teams, knowing which tools and which artists contributed to a final asset enables proper creative attribution.

Costs and Limitations

  • Inference is imperfect: Provenance relationships inferred from content similarity and temporal proximity are probabilistic, not deterministic. A system may incorrectly link unrelated assets or miss genuine connections.
  • Complexity scales with tool count: Each additional tool in the pipeline adds another format to understand, another handoff pattern to observe, and another potential point of metadata loss. The combinatorial complexity grows quickly.
  • Privacy tension: Comprehensive provenance tracking records which tools were used, which models, which parameters. This is valuable for compliance but represents a privacy consideration — the metadata persistence problem applies to provenance data itself.

Related Patterns

  • The Two Metadata Problem — the format divergence that makes cross-tool provenance fundamentally difficult.
  • Metadata Inversion — the capture-first pattern that enables provenance collection without tool cooperation.
  • Metadata Persistence — the compliance dimension of provenance data that survives across tool boundaries.
  • Keyword Search Failure — the search challenge that cross-tool provenance metadata compounds when every tool describes assets differently.

See What Metadata Your Tools Leave Behind

Upload any AI-generated image and discover what provenance data is embedded — from ComfyUI workflow graphs to Midjourney parameters.

Try the Metadata Inspector