SAGE: Structured Agentic Graph Editing for Software Diagrams

James C. Davis; Neal Singh; Tyler Sivertsen

arxiv: 2607.01102 · v1 · pith:ZBFVUK3Pnew · submitted 2026-07-01 · 💻 cs.SE

Pith reviewed 2026-07-02 08:07 UTC · model grok-4.3

keywords software diagrams · prompt-guided editing · graph representation · Draw.io · Mermaid · diagram validation · structured editing · agentic workflow

The pith

SAGE turns natural language edit requests into validated graph operations on diagram structure.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SAGE, a browser-based system that converts software diagrams into an editable graph form so that natural language prompts can drive changes without destroying layout or meaning. It breaks each request into structured intents, maps those intents to explicit graph operations, applies validation and repair steps for common XML problems, and keeps the results as versioned artifacts. This matters because direct LLM edits to diagram files frequently break connectivity or visual consistency, and the graph layer is meant to catch and prevent those failures. Evaluation through unit tests and a Kubernetes architecture example measures how often the output remains structurally valid and how often unrelated parts stay untouched.

Core claim

SAGE maps diagrams into an editable graph representation, translates natural language requests into structured edit intents, analyzes those intents into graph-oriented operation steps, validates and repairs common Draw.io XML issues, and stores successful results as recoverable versioned artifacts. The design keeps structured state management separate from model-driven interpretation, though some XML edits still rely on model assistance. The tool also supports direct canvas editing and a mask-based image workflow.
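
To make the intent-to-operation step concrete, the sketch below shows one way a structured edit intent might be expanded into an ordered list of graph operations. The `EditIntent` and `GraphOp` types, their field names, the action names, and `plan_operations` are all hypothetical; the text above does not give the paper's actual intent schema.

```python
from dataclasses import dataclass, field


# Hypothetical intent schema; field and action names are assumptions, not SAGE's format.
@dataclass
class EditIntent:
    action: str                   # e.g. "insert_between" or "rename"
    target_ids: tuple[str, ...]   # existing element ids the request refers to
    label: str = ""               # label for a new or renamed element


@dataclass
class GraphOp:
    kind: str                     # "add_node", "add_edge", "remove_edge", or "set_label"
    args: dict = field(default_factory=dict)


def plan_operations(intent: EditIntent, new_id: str) -> list[GraphOp]:
    """Expand one intent into an ordered list of atomic graph operations."""
    if intent.action == "insert_between":
        src, dst = intent.target_ids
        return [
            GraphOp("remove_edge", {"source": src, "target": dst}),
            GraphOp("add_node", {"id": new_id, "label": intent.label}),
            GraphOp("add_edge", {"source": src, "target": new_id}),
            GraphOp("add_edge", {"source": new_id, "target": dst}),
        ]
    if intent.action == "rename":
        (node_id,) = intent.target_ids
        return [GraphOp("set_label", {"id": node_id, "label": intent.label})]
    raise ValueError(f"unsupported action: {intent.action}")


# "Put a cache between the API and the database", already parsed into an intent.
intent = EditIntent("insert_between", ("api", "db"), label="Cache")
ops = plan_operations(intent, new_id="cache")
print([op.kind for op in ops])
```

Keeping the plan as plain data means each step can be checked against the graph before anything is mutated, which is the kind of separation between interpretation and state management the core claim describes.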

What carries the argument

The editable graph representation that converts diagram elements into nodes and edges so intents can be decomposed into atomic, verifiable graph operations.
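
As a rough illustration of what "atomic, verifiable graph operations" could mean, the sketch below models a diagram as nodes and directed edges and checks each operation's preconditions before changing state. The `DiagramGraph` class and its method names are assumptions for illustration, not the paper's data model.

```python
from dataclasses import dataclass, field


@dataclass
class DiagramGraph:
    """Minimal stand-in for an editable diagram graph (hypothetical)."""
    nodes: dict[str, str] = field(default_factory=dict)        # id -> label
    edges: set[tuple[str, str]] = field(default_factory=set)   # (source, target)

    def add_node(self, node_id: str, label: str) -> None:
        if node_id in self.nodes:
            raise ValueError(f"duplicate node id: {node_id}")
        self.nodes[node_id] = label

    def add_edge(self, source: str, target: str) -> None:
        # Verify both endpoints exist before mutating state.
        for end in (source, target):
            if end not in self.nodes:
                raise KeyError(f"unknown node: {end}")
        self.edges.add((source, target))

    def remove_node(self, node_id: str) -> None:
        if node_id not in self.nodes:
            raise KeyError(f"unknown node: {node_id}")
        del self.nodes[node_id]
        # Drop incident edges so no edge is left dangling.
        self.edges = {e for e in self.edges if node_id not in e}


g = DiagramGraph()
g.add_node("api", "API Server")
g.add_node("db", "etcd")
g.add_edge("api", "db")
g.remove_node("db")
print(g.nodes, g.edges)
```

Because every operation either fully applies or raises before touching state, a failed step leaves the graph in its previous consistent form.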

If this is right

Diagram edits succeed while leaving unrelated elements unchanged.
Common Draw.io XML errors are automatically detected and repaired before output.
Each successful edit produces a recoverable versioned artifact.
The same pipeline supports both Draw.io and Mermaid formats.
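
One common class of Draw.io XML problem is an edge cell whose `source` or `target` points at a cell id that no longer exists. The sketch below detects such edges in a small `mxGraphModel` fragment and removes them. Which issues SAGE actually checks, and how it repairs them, is not specified in the text above; the removal policy and the function names here are assumptions.

```python
import xml.etree.ElementTree as ET

# Small hand-written Draw.io-style model; edge "e2" targets a missing cell.
SAMPLE = """<mxGraphModel><root>
  <mxCell id="0"/>
  <mxCell id="1" parent="0"/>
  <mxCell id="n1" value="API" vertex="1" parent="1"/>
  <mxCell id="n2" value="DB" vertex="1" parent="1"/>
  <mxCell id="e1" edge="1" source="n1" target="n2" parent="1"/>
  <mxCell id="e2" edge="1" source="n1" target="gone" parent="1"/>
</root></mxGraphModel>"""


def find_dangling_edges(model: ET.Element) -> list[str]:
    """Return ids of edge cells whose source or target id does not exist."""
    ids = {cell.get("id") for cell in model.iter("mxCell")}
    bad = []
    for cell in model.iter("mxCell"):
        if cell.get("edge") != "1":
            continue
        for attr in ("source", "target"):
            ref = cell.get(attr)
            # An absent attribute is an unconnected end, which is allowed.
            if ref is not None and ref not in ids:
                bad.append(cell.get("id"))
                break
    return bad


def repair_dangling_edges(model: ET.Element) -> list[str]:
    """Remove dangling edge cells in place (assumed policy); return removed ids."""
    bad = set(find_dangling_edges(model))
    for parent in list(model.iter()):
        for child in list(parent):
            if child.tag == "mxCell" and child.get("id") in bad:
                parent.remove(child)
    return sorted(bad)


model = ET.fromstring(SAMPLE)
print(find_dangling_edges(model))
print(repair_dangling_edges(model))
print(find_dangling_edges(model))
```

A gentler policy would clear only the broken attribute and keep the edge as an unconnected line; deletion is used here simply because it is the shortest to show.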

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The separation of graph state from model interpretation could let other tools reuse the validation layer even when different language models supply the intents.
Because the operations are explicit graph steps, the approach might extend to automated testing of diagram consistency rules beyond what the current validator checks.
The versioned artifact store opens a path to undo or diff sequences of prompt-driven changes without manual file management.
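
The undo and diff idea can be sketched with an in-memory store that keeps one serialized snapshot per successful edit. The `VersionStore` class and its methods are hypothetical, and persistence is deliberately left out; the sketch only shows the shape of the operations.

```python
import difflib


class VersionStore:
    """In-memory stand-in for a versioned artifact store (hypothetical API)."""

    def __init__(self) -> None:
        self._versions: list[str] = []

    def commit(self, xml: str) -> int:
        """Store a snapshot and return its version number."""
        self._versions.append(xml)
        return len(self._versions) - 1

    def undo(self) -> str:
        """Discard the latest snapshot and return the one before it."""
        if len(self._versions) < 2:
            raise IndexError("nothing to undo")
        self._versions.pop()
        return self._versions[-1]

    def diff(self, old: int, new: int) -> list[str]:
        """Line-based unified diff between two stored snapshots."""
        return list(difflib.unified_diff(
            self._versions[old].splitlines(),
            self._versions[new].splitlines(),
            fromfile=f"v{old}", tofile=f"v{new}", lineterm="",
        ))


store = VersionStore()
v0 = store.commit('<node id="a"/>\n<node id="b"/>')
v1 = store.commit('<node id="a"/>\n<node id="b"/>\n<node id="c"/>')
print("\n".join(store.diff(v0, v1)))
print(store.undo())
```

A text diff is the simplest option; a diff over graph elements would be more robust to reordered XML, at the cost of parsing each snapshot.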

Load-bearing premise

Natural language prompts can be translated reliably into graph operations that preserve visual layout, editable structure, and semantic relationships without introducing inconsistencies that the validation step cannot catch.

What would settle it

A natural language prompt whose resulting diagram passes all validation checks yet contains broken connectivity or altered semantics that a human editor would reject.
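
The sketch below constructs a toy instance of that scenario: an edit that leaves every edge endpoint valid, so a purely structural check passes, while a node that used to be reachable is now cut off. The two checks and the Kubernetes-flavored node names are illustrative assumptions, not SAGE's validator or the paper's case study data.

```python
def is_structurally_valid(nodes: set[str], edges: set[tuple[str, str]]) -> bool:
    """Structural check only: every edge endpoint refers to an existing node."""
    return all(s in nodes and t in nodes for s, t in edges)


def reachable(start: str, edges: set[tuple[str, str]]) -> set[str]:
    """Nodes reachable from start when following edge direction."""
    seen, stack = {start}, [start]
    while stack:
        cur = stack.pop()
        for s, t in edges:
            if s == cur and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def lost_reachability(start: str,
                      before_edges: set[tuple[str, str]],
                      after_edges: set[tuple[str, str]],
                      after_nodes: set[str]) -> set[str]:
    """Nodes that survive the edit but are no longer reachable from start."""
    return (reachable(start, before_edges) & after_nodes) - reachable(start, after_edges)


nodes = {"ingress", "service", "pod"}
before = {("ingress", "service"), ("service", "pod")}
# Suppose a relabeling request accidentally drops the service -> pod edge.
after = {("ingress", "service")}

print(is_structurally_valid(nodes, after))
print(lost_reachability("ingress", before, after, nodes))
```

Here the structural check reports a valid diagram even though `pod` has been disconnected, which is the kind of output a human editor would likely reject.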

Figures

Figures reproduced from arXiv: 2607.01102 by James C. Davis, Neal Singh, Tyler Sivertsen.

**Figure 1.** Diagram-editing workflow in SAGE. validates and repairs common serialized XML issues, and stores successful results as versioned artifacts. Language-model reasoning is used for interpretation, target resolution, and some prompt-guided XML transformations. Structured application logic performs graph edits, serialization, validation, repair, and recovery. Contributions: We describe the representations and a…

**Figure 1.** In the diagram editor, the DiagramModel is the model, the SVG canvas is the view, and user actions update structured state before rendering. This keeps the canvas from becoming the source of truth: the diagram model remains serializable, validatable, restorable, and exportable as Draw.io-compatible XML. The image workflow reuses the same session and artifact infrastructure, but operates over raster images …

**Figure 2.** Kubernetes case study artifacts: the reference diagram, final structured diagram-editing result, and final image-editing

Original abstract

Software diagrams are difficult to edit through human-friendly interfaces because edits expressed in natural language must still preserve visual layout, editable structure, and semantic relationships. As a step forward, we present SAGE, a browser-based tool for prompt-guided editing of Draw.io and Mermaid-style engineering diagrams. The tool maps diagrams into an editable graph representation, translates natural language requests into structured edit intents, analyzes those intents into graph-oriented operation steps, validates and repairs common Draw.io XML issues, and stores successful results as recoverable versioned artifacts. This design separates structured state management from model-driven interpretation, while acknowledging that some prompt-guided XML edits remain model-assisted. The tool also supports direct canvas editing and a secondary mask-based image-editing workflow. We evaluate the system using unit tests and a Kubernetes architecture case study, measuring structural validity, edit success, preservation of unrelated elements, and failure causes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

SAGE is a tool paper that describes a modular pipeline for prompt-based diagram editing via graphs and XML repair, but supplies no numbers on how well it performs.

SAGE turns Draw.io and Mermaid diagrams into an editable graph, parses natural language requests into structured intents, sequences graph operations, repairs common XML problems, and versions the results. The design keeps structured state separate from the model interpretation, which is a clear and useful split.

The paper does a solid job laying out that pipeline and adding direct canvas editing plus a mask-based image fallback. The unit tests plus Kubernetes case study directly check the quantities the system claims to handle: structural validity, edit success, and preservation of unrelated elements. Acknowledging that some edits stay model-assisted is also straightforward.

The main gap is the evaluation. The abstract says the authors measure failure causes, but the paper reports no rates, no baselines, and no comparison to simpler LLM rewriting. Without those numbers it is hard to judge whether the validation step actually catches most inconsistencies or whether the intent-to-operation translation is reliable in practice.

This is for researchers and tool builders working on AI-assisted software engineering or diagram editors. A reader who wants a worked example of state separation in an agentic workflow will find the architecture details useful.

It deserves peer review because the system is concrete, the design choices are explicit, and the evaluation plan is reasonable even if the current results section is light on data. Send it to referees.

Referee Report

1 major / 2 minor

Summary. The paper presents SAGE, a browser-based tool for prompt-guided editing of Draw.io and Mermaid diagrams. It maps diagrams to an editable graph representation, translates natural language requests into structured edit intents, decomposes intents into graph operations, validates and repairs common Draw.io XML issues, and stores results as versioned artifacts. The design separates structured state management from model-driven interpretation (while noting residual model-assisted edits) and also supports direct canvas editing plus a mask-based image workflow. Evaluation consists of unit tests plus a single Kubernetes architecture case study that measures structural validity, edit success, preservation of unrelated elements, and failure causes.

Significance. If the pipeline performs as described, the modular separation of graph representation, intent parsing, operation sequencing, and XML repair offers a principled way to combine LLM assistance with structural guarantees in diagram editing. The explicit acknowledgment that some edits remain model-assisted and the use of recoverable versioned artifacts are honest and practical design choices. The approach could reduce layout and semantic drift compared with direct XML or image edits, but the absence of quantitative results limits assessment of real-world reliability.

major comments (1)

[Abstract / Evaluation] Abstract and Evaluation section: the paper states that unit tests and the Kubernetes case study measure structural validity, edit success, preservation of unrelated elements, and failure causes, yet supplies no numerical results, success/failure rates, or baselines. Without these data it is impossible to evaluate whether the natural-language-to-graph-operation translation reliably avoids inconsistencies that the validation step cannot catch.

minor comments (2)

The description of the graph representation and the exact format of structured edit intents would benefit from a small concrete example or schema excerpt to make the pipeline easier to replicate.
The paper mentions a secondary mask-based image-editing workflow but provides no details on how it interacts with the primary graph-based path; a brief comparison or integration note would improve clarity.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the evaluation and for highlighting the need for quantitative results. We address the major comment below.

Referee: [Abstract / Evaluation] Abstract and Evaluation section: the paper states that unit tests and the Kubernetes case study measure structural validity, edit success, preservation of unrelated elements, and failure causes, yet supplies no numerical results, success/failure rates, or baselines. Without these data it is impossible to evaluate whether the natural-language-to-graph-operation translation reliably avoids inconsistencies that the validation step cannot catch.

Authors: We agree that the current manuscript does not report the specific numerical outcomes from the unit tests and Kubernetes case study, even though it describes the metrics collected. This omission limits the ability to assess reliability as noted. In the revised manuscript we will add a dedicated subsection (or table) in the Evaluation section that reports the measured values for structural validity rates, edit success counts/rates, preservation of unrelated elements, and categorized failure causes, together with any available comparison points from the test suite. This will directly address the concern about assessing the translation step's effectiveness beyond the validation layer.

revision: yes

Circularity Check

0 steps flagged

No circularity: system description with direct evaluation

The paper is a tool description for SAGE, presenting a modular pipeline (graph mapping, intent translation, operation sequencing, XML validation) evaluated via unit tests and a case study. No equations, fitted parameters, predictions, or self-citations appear in the provided text. The central claims are measured directly by the evaluation metrics named (structural validity, edit success), with no reduction of outputs to inputs by construction. This is the expected non-finding for an engineering systems paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical claims or new entities; the work is an engineering tool description.

pith-pipeline@v0.9.1-grok · 5673 in / 996 out tokens · 19155 ms · 2026-07-02T08:07:32.474549+00:00 · methodology

Reference graph

Works this paper leans on

13 extracted references · 1 canonical work page

[1] Grady Booch, James Rumbaugh, and Ivar Jacobson. 1999. The Unified Modeling Language User Guide. Addison-Wesley.

[2] Paul Clements, Felix Bachmann, Len Bass, David Garlan, James Ivers, Reed Little, Paulo Merson, Robert Nord, and Judith Stafford. 2010. Documenting Software Architectures: Views and Beyond (2 ed.). Addison-Wesley Professional.

[3] Google AI for Developers. 2026. Gemini API Docs. https://ai.google.dev

[4] JGraph Ltd. 2026. draw.io Documentation. https://www.drawio.com/doc/

[5] Philippe Kruchten. 1995. The 4+1 View Model of Architecture. IEEE Software 12, 6 (1995), 42–50. doi:10.1109/52.469759

[6] Mermaid. 2026. Mermaid Documentation. https://mermaid.js.org/intro/

[7] Meta Open Source. 2026. React Documentation. https://react.dev/

[8] Microsoft. 2026. TypeScript Documentation. https://www.typescriptlang.org

[9] OpenAI. 2026. OpenAI API Reference. https://developers.openai.com

[10] Prisma Data, Inc. 2026. Prisma Documentation. https://www.prisma.io/docs

[11] SQLite Consortium. 2026. SQLite Documentation. https://www.sqlite.org

[12] The Kubernetes Authors. 2026. Kubernetes Components. https://kubernetes.io/docs/concepts/architecture/

[13] Vercel. 2026. Next.js Documentation. https://nextjs.org/docs
