SAGE: Structured Agentic Graph Editing for Software Diagrams
Pith reviewed 2026-07-02 08:07 UTC · model grok-4.3
The pith
SAGE turns natural language edit requests into validated graph operations on diagram structure.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SAGE maps diagrams into an editable graph representation, translates natural language requests into structured edit intents, decomposes those intents into graph-oriented operation steps, validates and repairs common Draw.io XML issues, and stores successful results as recoverable versioned artifacts. The design keeps structured state management separate from model-driven interpretation, though some XML edits still rely on model assistance. The tool also supports direct canvas editing and a mask-based image workflow.
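As a rough illustration of the first stage, the sketch below reads Draw.io-style XML into a node and edge structure. It relies only on the public Draw.io file layout (mxCell elements flagged with vertex="1" or edge="1"). The cell ids, labels, and the function name parse_diagram are assumptions made for this example, not SAGE's actual code.

```python
import xml.etree.ElementTree as ET

# Illustrative Draw.io-style XML; the cell ids and labels are made up.
DRAWIO_XML = """
<mxGraphModel>
  <root>
    <mxCell id="0"/>
    <mxCell id="1" parent="0"/>
    <mxCell id="api" value="API Server" vertex="1" parent="1"/>
    <mxCell id="etcd" value="etcd" vertex="1" parent="1"/>
    <mxCell id="e1" edge="1" source="api" target="etcd" parent="1"/>
  </root>
</mxGraphModel>
"""


def parse_diagram(xml_text):
    """Return (nodes, edges) extracted from the mxCell elements."""
    root = ET.fromstring(xml_text.strip())
    nodes = {}  # cell id -> label
    edges = []  # (edge id, source id, target id)
    for cell in root.iter("mxCell"):
        if cell.get("vertex") == "1":
            nodes[cell.get("id")] = cell.get("value", "")
        elif cell.get("edge") == "1":
            edges.append((cell.get("id"), cell.get("source"), cell.get("target")))
    return nodes, edges


nodes, edges = parse_diagram(DRAWIO_XML)
print(nodes)
print(edges)
```

The two structural cells with ids "0" and "1" carry neither flag, so they are skipped; a fuller reader would also keep geometry and style so that layout survives a round trip.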
What carries the argument
The editable graph representation that converts diagram elements into nodes and edges so intents can be decomposed into atomic, verifiable graph operations.
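One way to picture "atomic, verifiable graph operations" is a small set of functions that each check their preconditions before mutating the graph, applied to a working copy so that a failing step leaves the original untouched. The Graph class, the operation names, and apply_steps are hypothetical names for this sketch; the paper does not publish its operation set.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node id -> label
    edges: dict = field(default_factory=dict)  # edge id -> (source id, target id)


def add_node(g, node_id, label):
    if node_id in g.nodes:
        raise ValueError(f"node {node_id!r} already exists")
    g.nodes[node_id] = label


def add_edge(g, edge_id, source, target):
    if edge_id in g.edges:
        raise ValueError(f"edge {edge_id!r} already exists")
    for end in (source, target):
        if end not in g.nodes:
            raise ValueError(f"edge endpoint {end!r} does not exist")
    g.edges[edge_id] = (source, target)


def remove_node(g, node_id):
    if node_id not in g.nodes:
        raise ValueError(f"node {node_id!r} does not exist")
    del g.nodes[node_id]
    # Drop edges that touched the removed node so none are left dangling.
    g.edges = {eid: ends for eid, ends in g.edges.items() if node_id not in ends}


def apply_steps(g, steps):
    """Apply steps to a copy and return it; on error the input graph is unchanged."""
    work = Graph(dict(g.nodes), dict(g.edges))
    for op, args in steps:
        op(work, *args)
    return work


g = Graph({"api": "API Server"})
g2 = apply_steps(g, [(add_node, ("cache", "Redis")), (add_edge, ("e1", "api", "cache"))])
print(g2)
```

Working on a copy is a deliberate simplification: it gives all-or-nothing behavior for a sequence of steps without needing an explicit rollback log.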
If this is right
- Diagram edits succeed while leaving unrelated elements unchanged.
- Common Draw.io XML errors are automatically detected and repaired before output.
- Each successful edit produces a recoverable versioned artifact.
- The same pipeline supports both Draw.io and Mermaid formats.
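The first consequence above, that unrelated elements stay unchanged, can be phrased as a simple check: every element that differs between the before and after states must be one the edit was meant to touch. The sketch below assumes elements are stored as a dict of id to attribute dict; that layout and both function names are assumptions for illustration.

```python
def changed_ids(before, after):
    """Ids whose attributes differ, or that exist on only one side."""
    return {k for k in before.keys() | after.keys() if before.get(k) != after.get(k)}


def unrelated_preserved(before, after, targeted):
    """True when every changed element is in the set the edit targeted."""
    return changed_ids(before, after) <= set(targeted)


before = {
    "api": {"label": "API Server", "x": 40, "y": 40},
    "etcd": {"label": "etcd", "x": 240, "y": 40},
}
# A rename that touches only the targeted node.
good = {
    "api": {"label": "kube-apiserver", "x": 40, "y": 40},
    "etcd": {"label": "etcd", "x": 240, "y": 40},
}
# The same rename, but an unrelated node was also moved.
bad = {
    "api": {"label": "kube-apiserver", "x": 40, "y": 40},
    "etcd": {"label": "etcd", "x": 300, "y": 40},
}

print(unrelated_preserved(before, good, {"api"}))
print(unrelated_preserved(before, bad, {"api"}))
```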
Where Pith is reading between the lines
- The separation of graph state from model interpretation could let other tools reuse the validation layer even when different language models supply the intents.
- Because the operations are explicit graph steps, the approach might extend to automated testing of diagram consistency rules beyond what the current validator checks.
- The versioned artifact store opens a path to undo or diff sequences of prompt-driven changes without manual file management.
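The undo and diff idea in the last point can be sketched with an in-memory list of snapshots. This is only a toy under stated assumptions: the VersionStore class and its methods are hypothetical, and a real store would presumably persist artifacts rather than hold them in memory.

```python
import copy


class VersionStore:
    """Keeps a deep copy of every committed diagram state (id -> label here)."""

    def __init__(self, initial):
        self._versions = [copy.deepcopy(initial)]

    def commit(self, state):
        """Store a new version and return its index."""
        self._versions.append(copy.deepcopy(state))
        return len(self._versions) - 1

    def get(self, version):
        return copy.deepcopy(self._versions[version])

    def undo(self):
        """Drop the latest version (never the initial one) and return the current state."""
        if len(self._versions) > 1:
            self._versions.pop()
        return self.get(-1)

    def diff(self, old, new):
        a, b = self._versions[old], self._versions[new]
        return {
            "added": sorted(b.keys() - a.keys()),
            "removed": sorted(a.keys() - b.keys()),
            "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
        }


store = VersionStore({"api": "API Server"})
v1 = store.commit({"api": "API Server", "cache": "Redis"})
print(store.diff(0, v1))
print(store.undo())
```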
Load-bearing premise
Natural language prompts can be translated reliably into graph operations that preserve visual layout, editable structure, and semantic relationships without introducing inconsistencies that the validation step cannot catch.
What would settle it
A natural language prompt whose resulting diagram passes all validation checks yet contains broken connectivity or altered semantics that a human editor would reject.
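To make this failure mode concrete, the sketch below uses a deliberately narrow structural validator that only checks for dangling edge endpoints. An edit that reverses an edge passes the check even though the meaning of the diagram has changed. The validator and the sample data are assumptions for illustration and say nothing about what SAGE's real validator covers.

```python
def validate(nodes, edges):
    """Return a list of structural problems; an empty list means the check passed."""
    problems = []
    for edge_id, (source, target) in edges.items():
        for end in (source, target):
            if end not in nodes:
                problems.append(f"{edge_id}: missing endpoint {end}")
    return problems


nodes = {"svc": "Service", "pod": "Pod"}
original = {"e1": ("svc", "pod")}       # Service routes to Pod
reversed_edit = {"e1": ("pod", "svc")}  # direction flipped by a bad edit
dangling = {"e1": ("svc", "ghost")}     # endpoint that does not exist

print(validate(nodes, reversed_edit))
print(validate(nodes, dangling))
```

The reversed edge is structurally sound, so only a semantic check (for example, comparing edge direction against the intent) or a human reviewer would flag it.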
Original abstract
Software diagrams are difficult to edit through human-friendly interfaces because edits expressed in natural language must still preserve visual layout, editable structure, and semantic relationships. As a step forward, we present SAGE, a browser-based tool for prompt-guided editing of Draw.io and Mermaid-style engineering diagrams. The tool maps diagrams into an editable graph representation, translates natural language requests into structured edit intents, analyzes those intents into graph-oriented operation steps, validates and repairs common Draw.io XML issues, and stores successful results as recoverable versioned artifacts. This design separates structured state management from model-driven interpretation, while acknowledging that some prompt-guided XML edits remain model-assisted. The tool also supports direct canvas editing and a secondary mask-based image-editing workflow. We evaluate the system using unit tests and a Kubernetes architecture case study, measuring structural validity, edit success, preservation of unrelated elements, and failure causes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents SAGE, a browser-based tool for prompt-guided editing of Draw.io and Mermaid diagrams. It maps diagrams to an editable graph representation, translates natural language requests into structured edit intents, decomposes intents into graph operations, validates and repairs common Draw.io XML issues, and stores results as versioned artifacts. The design separates structured state management from model-driven interpretation (while noting residual model-assisted edits) and also supports direct canvas editing plus a mask-based image workflow. Evaluation consists of unit tests plus a single Kubernetes architecture case study that measures structural validity, edit success, preservation of unrelated elements, and failure causes.
Significance. If the pipeline performs as described, the modular separation of graph representation, intent parsing, operation sequencing, and XML repair offers a principled way to combine LLM assistance with structural guarantees in diagram editing. The explicit acknowledgment that some edits remain model-assisted and the use of recoverable versioned artifacts are honest and practical design choices. The approach could reduce layout and semantic drift compared with direct XML or image edits, but the absence of quantitative results limits assessment of real-world reliability.
major comments (1)
- [Abstract / Evaluation] Abstract and Evaluation section: the paper states that unit tests and the Kubernetes case study measure structural validity, edit success, preservation of unrelated elements, and failure causes, yet supplies no numerical results, success/failure rates, or baselines. Without these data it is impossible to evaluate whether the natural-language-to-graph-operation translation reliably avoids inconsistencies that the validation step cannot catch.
minor comments (2)
- The description of the graph representation and the exact format of structured edit intents would benefit from a small concrete example or schema excerpt to make the pipeline easier to replicate.
- The paper mentions a secondary mask-based image-editing workflow but provides no details on how it interacts with the primary graph-based path; a brief comparison or integration note would improve clarity.
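For the first minor comment, a schema excerpt might look something like the sketch below. It is entirely hypothetical: the field names, the allowed actions, and the JSON layout are assumptions made here to show what a structured edit intent could contain, not the format the paper uses.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical action vocabulary for this sketch.
ALLOWED_ACTIONS = {"add_node", "remove_node", "rename_node", "add_edge", "remove_edge"}


@dataclass(frozen=True)
class EditIntent:
    action: str
    target: str                   # id of the element being created or modified
    label: Optional[str] = None   # new label, for add_node / rename_node
    source: Optional[str] = None  # edge source id, for add_edge
    dest: Optional[str] = None    # edge target id, for add_edge


def parse_intent(raw):
    """Parse a JSON intent; unknown or missing fields raise TypeError."""
    intent = EditIntent(**json.loads(raw))
    if intent.action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action {intent.action!r}")
    if intent.action == "add_edge" and not (intent.source and intent.dest):
        raise ValueError("add_edge needs both source and dest")
    return intent


raw = '{"action": "add_edge", "target": "e2", "source": "api", "dest": "cache"}'
print(parse_intent(raw))
```

Rejecting unknown actions and incomplete edges at parse time keeps model output from reaching the graph layer unchecked, which is the separation the paper describes.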
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the evaluation and for highlighting the need for quantitative results. We address the major comment below.
- Referee: [Abstract / Evaluation] Abstract and Evaluation section: the paper states that unit tests and the Kubernetes case study measure structural validity, edit success, preservation of unrelated elements, and failure causes, yet supplies no numerical results, success/failure rates, or baselines. Without these data it is impossible to evaluate whether the natural-language-to-graph-operation translation reliably avoids inconsistencies that the validation step cannot catch.
  Authors: We agree that the current manuscript does not report the specific numerical outcomes from the unit tests and Kubernetes case study, even though it describes the metrics collected. This omission limits the ability to assess reliability as noted. In the revised manuscript we will add a dedicated subsection (or table) in the Evaluation section that reports the measured values for structural validity rates, edit success counts/rates, preservation of unrelated elements, and categorized failure causes, together with any available comparison points from the test suite. This will directly address the concern about assessing the translation step's effectiveness beyond the validation layer.
  Revision: yes
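A results table of the kind promised here could be produced by aggregating per-edit records along the four metrics the paper names. The sketch below shows one possible aggregation. The EditRun record, the summarize function, and above all the sample records are placeholders invented for illustration; they are not results from the paper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class EditRun:
    valid_xml: bool                      # output passed structural validation
    edit_succeeded: bool                 # requested change was actually made
    unrelated_preserved: bool            # no untargeted element changed
    failure_cause: Optional[str] = None  # category label, if the run failed


def summarize(runs):
    """Aggregate per-run records into rates and a failure-cause tally."""
    n = len(runs)
    if n == 0:
        raise ValueError("no runs to summarize")
    return {
        "runs": n,
        "structural_validity_rate": sum(r.valid_xml for r in runs) / n,
        "edit_success_rate": sum(r.edit_succeeded for r in runs) / n,
        "preservation_rate": sum(r.unrelated_preserved for r in runs) / n,
        "failure_causes": dict(Counter(r.failure_cause for r in runs if r.failure_cause)),
    }


# Placeholder records only; NOT data from the paper.
runs = [
    EditRun(True, True, True),
    EditRun(True, False, True, "wrong_target"),
    EditRun(False, False, False, "malformed_xml"),
    EditRun(True, True, True),
]
summary = summarize(runs)
print(summary)
```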
Circularity Check
No circularity: system description with direct evaluation
The paper is a tool description for SAGE, presenting a modular pipeline (graph mapping, intent translation, operation sequencing, XML validation) evaluated via unit tests and a case study. No equations, fitted parameters, predictions, or self-citations appear in the provided text. The central claims are measured directly by the evaluation metrics named (structural validity, edit success), with no reduction of outputs to inputs by construction. This is the expected non-finding for an engineering systems paper.