SciDiagram Question Generator — LangGraph Workflow

Fully expanded view: get_graph(xray=True) inlines the DE/DU subgraphs, showing every node and retry loop

Mermaid Source

graph TD;
	__start__([<p>__start__</p>]):::first
	extract_prefix_node(extract_prefix_node)
	build_d2c_node(build_d2c_node)
	render_d2c_node(render_d2c_node)
	select_de_node(select_de_node)
	select_du_node(select_du_node)
	merge_results_node(merge_results_node)
	__end__([<p>__end__</p>]):::last
	__start__ --> extract_prefix_node;
	__start__ --> select_du_node;
	build_d2c_node --> render_d2c_node;
	extract_prefix_node --> build_d2c_node;
	extract_prefix_node --> select_de_node;
	generate_de_answer_node\3a__end__ --> merge_results_node;
	generate_du_answer_node\3a__end__ --> merge_results_node;
	render_d2c_node --> merge_results_node;
	select_de_node -.->|"Send x N"| generate_de_answer_node\3agenerate_code;
	select_du_node -.->|"Send x N"| generate_du_answer_node\3agenerate_answer;
	merge_results_node --> __end__;
	subgraph generate_de_answer_node
	generate_de_answer_node\3agenerate_code(generate_code)
	generate_de_answer_node\3acompile(compile)
	generate_de_answer_node\3avision_check(vision_check)
	generate_de_answer_node\3afix_code(fix_code)
	generate_de_answer_node\3acheck_retry(check_retry)
	generate_de_answer_node\3ade_success(de_success)
	generate_de_answer_node\3ade_failed(de_failed)
	generate_de_answer_node\3a__end__(<p>__end__</p>)
	generate_de_answer_node\3acheck_retry -.->|"exhausted"| generate_de_answer_node\3ade_failed;
	generate_de_answer_node\3acheck_retry -.->|"retry"| generate_de_answer_node\3afix_code;
	generate_de_answer_node\3acompile -.->|"compile fail"| generate_de_answer_node\3acheck_retry;
	generate_de_answer_node\3acompile -.->|"compile ok"| generate_de_answer_node\3avision_check;
	generate_de_answer_node\3afix_code --> generate_de_answer_node\3acompile;
	generate_de_answer_node\3agenerate_code --> generate_de_answer_node\3acompile;
	generate_de_answer_node\3avision_check -.->|"not passed"| generate_de_answer_node\3acheck_retry;
	generate_de_answer_node\3avision_check -.->|"passed"| generate_de_answer_node\3ade_success;
	generate_de_answer_node\3ade_failed --> generate_de_answer_node\3a__end__;
	generate_de_answer_node\3ade_success --> generate_de_answer_node\3a__end__;
	end
	subgraph generate_du_answer_node
	generate_du_answer_node\3agenerate_answer(generate_answer)
	generate_du_answer_node\3avalidate_answer(validate_answer)
	generate_du_answer_node\3aregenerate(regenerate)
	generate_du_answer_node\3acheck_retry(check_retry)
	generate_du_answer_node\3adu_success(du_success)
	generate_du_answer_node\3adu_resolve(du_resolve)
	generate_du_answer_node\3a__end__(<p>__end__</p>)
	generate_du_answer_node\3acheck_retry -.->|"exhausted"| generate_du_answer_node\3adu_resolve;
	generate_du_answer_node\3acheck_retry -.->|"retry"| generate_du_answer_node\3aregenerate;
	generate_du_answer_node\3agenerate_answer --> generate_du_answer_node\3avalidate_answer;
	generate_du_answer_node\3aregenerate --> generate_du_answer_node\3avalidate_answer;
	generate_du_answer_node\3avalidate_answer -.->|"incorrect"| generate_du_answer_node\3acheck_retry;
	generate_du_answer_node\3avalidate_answer -.->|"correct"| generate_du_answer_node\3adu_success;
	generate_du_answer_node\3adu_resolve --> generate_du_answer_node\3a__end__;
	generate_du_answer_node\3adu_success --> generate_du_answer_node\3a__end__;
	end
	classDef default fill:#e8f4fd,stroke:#3b82f6,stroke-width:2px,color:#1e293b,font-weight:500
	classDef first fill:#dcfce7,stroke:#16a34a,stroke-width:2.5px,color:#155724,font-weight:700
	classDef last fill:#dcfce7,stroke:#16a34a,stroke-width:2.5px,color:#155724,font-weight:700
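
The Mermaid source above can be turned back into an edge list for inspection. The sketch below is only an illustration: the helper names `decode_id` and `parse_edges` are hypothetical, and it assumes that `\3a` in a node id is the escaped form of `:` (hex 3a), so that `generate_de_answer_node\3acompile` stands for `generate_de_answer_node:compile`. Only that one escape is handled, because a generic hex decoder would be ambiguous for ids such as `\3ade_success`.

```python
import re

# Matches 'a --> b;' (fixed edge) and 'a -.->|"label"| b;' (conditional edge).
EDGE_RE = re.compile(
    r'^\s*(?P<src>\S+)\s+(?P<arrow>-->|-\.->)\s*'
    r'(?:\|"(?P<label>[^"]*)"\|\s*)?(?P<dst>\S+?);\s*$'
)


def decode_id(mermaid_id: str) -> str:
    # Assumption: "\3a" encodes ":" between a subgraph name and its inner node.
    return mermaid_id.replace("\\3a", ":")


def parse_edges(mermaid_source: str) -> list:
    """Return one dict per edge line; all other lines are skipped."""
    edges = []
    for line in mermaid_source.splitlines():
        match = EDGE_RE.match(line)
        if match is None:
            continue  # node declarations, subgraph/end, classDef, graph TD
        edges.append({
            "src": decode_id(match.group("src")),
            "dst": decode_id(match.group("dst")),
            "label": match.group("label"),  # None for fixed edges
            "conditional": match.group("arrow") == "-.->",
        })
    return edges
```

Filtering the result on `conditional` separates the routing decisions (dashed arrows) from the fixed edges.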

Top-level view: subgraphs collapsed into single nodes. START → extract_prefix + select_du in parallel → D2C/DE/DU → merge → END

Mermaid Source

graph TD;
	__start__([<p>__start__</p>]):::first
	extract_prefix_node(extract_prefix_node)
	build_d2c_node(build_d2c_node)
	render_d2c_node(render_d2c_node)
	select_de_node(select_de_node)
	generate_de_answer_node(generate_de_answer_node)
	select_du_node(select_du_node)
	generate_du_answer_node(generate_du_answer_node)
	merge_results_node(merge_results_node)
	__end__([<p>__end__</p>]):::last
	__start__ --> extract_prefix_node;
	__start__ --> select_du_node;
	build_d2c_node --> render_d2c_node;
	extract_prefix_node --> build_d2c_node;
	extract_prefix_node --> select_de_node;
	generate_de_answer_node --> merge_results_node;
	generate_du_answer_node --> merge_results_node;
	render_d2c_node --> merge_results_node;
	select_de_node -.->|"Send x N"| generate_de_answer_node;
	select_du_node -.->|"Send x N"| generate_du_answer_node;
	merge_results_node --> __end__;
	classDef default fill:#e8f4fd,stroke:#3b82f6,stroke-width:2px,color:#1e293b,font-weight:500
	classDef first fill:#dcfce7,stroke:#16a34a,stroke-width:2.5px,color:#155724,font-weight:700
	classDef last fill:#dcfce7,stroke:#16a34a,stroke-width:2.5px,color:#155724,font-weight:700
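
To see which top-level nodes can run side by side, the edges above can be grouped into dependency levels. This is a plain-Python sketch of the graph's shape, not LangGraph's actual scheduler: the `dependency_levels` helper is hypothetical, and the "Send x N" fan-out is treated as a single edge.

```python
from collections import defaultdict

# Edges copied from the top-level Mermaid source above.
EDGES = [
    ("__start__", "extract_prefix_node"),
    ("__start__", "select_du_node"),
    ("extract_prefix_node", "build_d2c_node"),
    ("extract_prefix_node", "select_de_node"),
    ("build_d2c_node", "render_d2c_node"),
    ("select_de_node", "generate_de_answer_node"),  # "Send x N" fan-out
    ("select_du_node", "generate_du_answer_node"),  # "Send x N" fan-out
    ("render_d2c_node", "merge_results_node"),
    ("generate_de_answer_node", "merge_results_node"),
    ("generate_du_answer_node", "merge_results_node"),
    ("merge_results_node", "__end__"),
]


def dependency_levels(edges):
    """Group nodes so that each level depends only on earlier levels."""
    preds = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        preds[dst].add(src)
        nodes.update((src, dst))
    levels, done = [], set()
    while len(done) < len(nodes):
        ready = sorted(n for n in nodes - done if preds[n] <= done)
        if not ready:
            raise ValueError("cycle detected")
        levels.append(ready)
        done.update(ready)
    return levels
```

Under these assumptions the second level contains both `extract_prefix_node` and `select_du_node`, which matches the parallel start described in the caption.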

DE Worker: generate_code → compile → vision_check → success, or a retry loop (fix_code, with the model upgraded on the final attempt) → de_success / de_failed

Mermaid Source

graph TD;
	__start__([<p>__start__</p>]):::first
	generate_code(generate_code)
	compile(compile)
	vision_check(vision_check)
	fix_code(fix_code)
	check_retry(check_retry)
	de_success(de_success)
	de_failed(de_failed)
	__end__([<p>__end__</p>]):::last
	__start__ --> generate_code;
	check_retry -.->|"exhausted"| de_failed;
	check_retry -.->|"has budget"| fix_code;
	compile -.->|"compile fail"| check_retry;
	compile -.->|"compile ok"| vision_check;
	fix_code --> compile;
	generate_code --> compile;
	vision_check -.->|"not passed"| check_retry;
	vision_check -.->|"passed"| de_success;
	de_failed --> __end__;
	de_success --> __end__;
	classDef default fill:#e8f4fd,stroke:#3b82f6,stroke-width:2px,color:#1e293b,font-weight:500
	classDef first fill:#dcfce7,stroke:#16a34a,stroke-width:2.5px,color:#155724,font-weight:700
	classDef last fill:#dcfce7,stroke:#16a34a,stroke-width:2.5px,color:#155724,font-weight:700
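
The DE worker's routing can be sketched as three small functions plus a driver loop. Everything here is an assumption made for illustration: the state keys (`compile_ok`, `edit_applied`, `retry_count`, `max_retries`, `escalate_model`), the callables passed in, and the rule that the model is escalated when the last retry is reached. The real nodes call an LLM, a LaTeX compiler, and a vision model.

```python
def route_after_compile(state: dict) -> str:
    # "compile ok" -> vision_check, "compile fail" -> check_retry
    return "vision_check" if state["compile_ok"] else "check_retry"


def route_after_vision(state: dict) -> str:
    # "passed" -> de_success, "not passed" -> check_retry
    return "de_success" if state["edit_applied"] else "check_retry"


def route_check_retry(state: dict) -> str:
    # "has budget" -> fix_code, "exhausted" -> de_failed
    return "fix_code" if state["retry_count"] < state["max_retries"] else "de_failed"


def run_de_worker(generate_code, compile_code, vision_check, fix_code, max_retries=2):
    """Walk the DE graph once; returns (final_state, visited_nodes)."""
    state = {"code": generate_code(), "retry_count": 0, "max_retries": max_retries}
    trace = ["generate_code"]
    while True:
        state["compile_ok"], state["compile_error"] = compile_code(state["code"])
        trace.append("compile")
        if route_after_compile(state) == "vision_check":
            state["edit_applied"], state["vision_feedback"] = vision_check(state["code"])
            trace.append("vision_check")
            if route_after_vision(state) == "de_success":
                trace.append("de_success")
                return state, trace
        trace.append("check_retry")
        if route_check_retry(state) == "de_failed":
            trace.append("de_failed")
            return state, trace
        state["retry_count"] += 1
        # Assumed escalation rule: flag the final attempt for a stronger model.
        state["escalate_model"] = state["retry_count"] == state["max_retries"]
        state["code"] = fix_code(state)
        trace.append("fix_code")
```

Both failure kinds share one retry budget in this sketch, because compile failures and vision failures both pass through `check_retry` in the diagram.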

DU Worker: generate_answer → validate_answer → du_success if correct / retry via regenerate if incorrect → du_resolve once retries are exhausted (adopts the validator's answer)

Mermaid Source

graph TD;
	__start__([<p>__start__</p>]):::first
	generate_answer(generate_answer)
	validate_answer(validate_answer)
	regenerate(regenerate)
	check_retry(check_retry)
	du_success(du_success)
	du_resolve(du_resolve)
	__end__([<p>__end__</p>]):::last
	__start__ --> generate_answer;
	check_retry -.->|"exhausted"| du_resolve;
	check_retry -.->|"has budget"| regenerate;
	generate_answer --> validate_answer;
	regenerate --> validate_answer;
	validate_answer -.->|"incorrect"| check_retry;
	validate_answer -.->|"correct"| du_success;
	du_resolve --> __end__;
	du_success --> __end__;
	classDef default fill:#e8f4fd,stroke:#3b82f6,stroke-width:2px,color:#1e293b,font-weight:500
	classDef first fill:#dcfce7,stroke:#16a34a,stroke-width:2.5px,color:#155724,font-weight:700
	classDef last fill:#dcfce7,stroke:#16a34a,stroke-width:2.5px,color:#155724,font-weight:700
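
A compact sketch of the DU loop, including the `du_resolve` fallback that adopts the validator's answer. The callables and the shape of their return values are hypothetical stand-ins for the LLM-backed nodes.

```python
def run_du_worker(generate_answer, validate_answer, regenerate, max_retries=2):
    """validate_answer(answer) -> (is_correct, validator_answer, feedback)."""
    answer = generate_answer()
    retries = 0
    while True:
        is_correct, validator_answer, feedback = validate_answer(answer)
        if is_correct:
            return {"answer": answer, "status": "du_success", "retries": retries}
        if retries >= max_retries:
            # du_resolve: the budget is exhausted, so take the validator's answer.
            return {"answer": validator_answer, "status": "du_resolve", "retries": retries}
        retries += 1
        answer = regenerate(answer, feedback)
```

Unlike the DE worker, this loop has no failure exit: every question ends with some answer, either the generator's or the validator's.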

Legend: fixed edge (-->), conditional edge (-.->), regular node, Start / End, routing label (edge text), subgraph container


LLM Prompt Templates

Main Workflow Nodes

extract_prefix_node d2c_prefix_extract.txt Generator

Extracts the LaTeX prefix (documentclass + packages), shared by D2C and DE

# Role
You are a LaTeX environment configuration expert. Your task is to extract a "standard compilation environment prefix" for Diagram2Code (D2C) evaluation tasks.

# Goal
Extract the minimal but complete preamble (everything before `\begin{document}`) from the given LaTeX source code. This prefix will be provided to models as a controlled compilation environment.

# Input
**Source Code:**
```latex
{SOURCE_CODE}
```

# Extraction Rules

## MUST Include
1. `\documentclass` declaration with all options
2. All `\usepackage` statements
3. All `\usetikzlibrary` and `\usepgfplotslibrary` statements
4. All custom macro definitions (`\newcommand`, `\def`, `\definecolor`, etc.)
5. All style settings (`\tikzset`, `\pgfplotsset`, etc.)
6. Package configuration commands (e.g., `\pgfplotsset{compat=...}`)

## MUST NOT Include
1. `\begin{document}` - Do NOT add this at the end
2. `\end{document}`
3. Any drawing code or document body content
4. Comments (remove all `%...` lines)

# Output Format (STRICTLY follow this JSON format)

```json
{
  "prefix_code": "<extracted LaTeX preamble from \\documentclass to before \\begin{document}>",
  "packages_used": ["package1", "package2", "..."]
}
```

IMPORTANT:
- The "prefix_code" field contains the complete preamble code
- The "packages_used" field lists all package names (without the surrounding `\usepackage{}`)
- Escape backslashes in JSON strings (use `\\` for `\`)
- Do NOT include markdown code blocks inside the JSON string
- Return ONLY the JSON object, no other text

# Example

**Input:**
```latex
\documentclass[border=5pt]{standalone}
\usepackage{chemfig}
\begin{document}
\chemfig{R-C(=[::+60]O)-[::-60]O-[::-60]C(=[::+60]O)-[::-60]R}
\end{document}
```

**Output:**
```json
{
  "prefix_code": "\\documentclass[border=5pt]{standalone}\n\\usepackage{chemfig}",
  "packages_used": ["chemfig"]
}
```
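
The template above delegates extraction to an LLM. A deterministic baseline is handy for sanity-checking its output; the sketch below is one possible version and not the pipeline's own code. `extract_prefix` is a hypothetical helper that removes whole-line comments only and does not handle a `\begin{document}` that appears inside a comment.

```python
import re

# Captures the package list of \usepackage[opts]{a,b,c}.
USEPACKAGE_RE = re.compile(r"\\usepackage(?:\[[^\]]*\])?\{([^}]*)\}")


def extract_prefix(source: str) -> dict:
    """Return the preamble before \\begin{document} plus the package names."""
    preamble, sep, _ = source.partition("\\begin{document}")
    if not sep:
        raise ValueError("no \\begin{document} found")
    # Drop whole-line comments; keep every other line as written.
    kept = [ln.rstrip() for ln in preamble.splitlines()
            if not ln.lstrip().startswith("%")]
    prefix_code = "\n".join(kept).strip()
    packages = []
    for group in USEPACKAGE_RE.findall(prefix_code):
        for name in group.split(","):
            name = name.strip()
            if name and name not in packages:
                packages.append(name)
    return {"prefix_code": prefix_code, "packages_used": packages}
```

Passing the result through `json.dumps` produces the escaped-backslash form that the template asks for.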

build_d2c_node d2c_question.txt Generator

Generates the D2C evaluation prompt from the image and the prefix

Convert the diagram in the image into a complete, executable LaTeX code block.
Your output must be the full, compilable code, starting exactly with the provided snippet below.

**Start with this:**

```latex
{PREFIX_CODE}
```
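
Several templates on this page contain literal JSON braces next to `{UPPER_CASE}` placeholders, which `str.format` would try to interpret as fields. One way to fill them is a single-pass replacement restricted to known keys. `render_prompt` is a hypothetical helper; how the actual pipeline fills its templates is not shown in the source.

```python
import re


def render_prompt(template: str, values: dict) -> str:
    """Replace {KEY} for each key in `values`; leave all other braces alone."""
    if not values:
        return template
    missing = [k for k in values if "{" + k + "}" not in template]
    if missing:
        raise KeyError(f"placeholders not found in template: {missing}")
    pattern = re.compile("|".join(re.escape("{" + k + "}") for k in values))
    # A function replacement is inserted literally, so LaTeX backslashes in the
    # values are safe, and inserted text is never rescanned for placeholders.
    return pattern.sub(lambda m: values[m.group(0)[1:-1]], template)
```

For the template above, `render_prompt(template, {"PREFIX_CODE": prefix})` would place the extracted preamble inside the latex fence.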

select_de_node de_task_selection.txt Generator

Analyzes the image and code, then selects and instantiates the Top-K editing tasks from edit_instructions.yaml

# Task Description

Analyze the provided **Scientific Diagram** and its corresponding **Source Code**. Your goal is to select and instantiate the **Top {TOP_K}** most relevant editing instructions from the provided **Candidate Task List**.

You must generate a valid JSON object of instantiated editing tasks that target different evaluation dimensions.

# Input Data

**1. Target Image:**
*(Image is provided separately)*

**2. Source Code:**
```latex
{IMAGE_CODE}
```

**3. Diagram Category:**
{DIAGRAM_CATEGORY}

**4. Evaluation Dimensions:**
{DIMENSION_INFO}

**5. Candidate Task List (filtered for this category):**
{TASK_CANDIDATES}
*(Each template includes an ID, a Template Text with placeholders, and category-specific examples)*

# Execution Guidelines

1. **Cross-Modal Grounding**:
   - Use the **Source Code** to identify specific entity names, labels, coordinates, and values (e.g., finding that the red node is labeled `Node_A` or the x-axis is labeled `Time (s)`).
   - Use the **Image** to confirm visual attributes (colors, layout, shapes).

2. **Instantiation (Critical)**:
   - You MUST fill in ALL placeholders (e.g., `[TARGET_ELEMENT]`, `[NEW_COLOR]`, `[OLD_VALUE]`) with **specific content** found in the code/image.
   - **Do not** leave any brackets in the final `edit_instruction`.
   - *Bad*: "Change color of [TARGET] to red."
   - *Good*: "Change the fill color of the 'Revenue 2024' bar to red."
   - *Bad*: "Remove [ELEMENT] from the diagram."
   - *Good*: "Remove capacitor C1 from the diagram."

3. **Selection Criteria (Top {TOP_K})**:
   - **(a) Detail Understanding**: Prioritize tasks that demonstrate understanding of specific code structures (e.g., modifying specific data points, connections) over generic whole-image changes.
   - **(b) Scientific Utility**: Prioritize edits that make scientific sense (e.g., correcting data values, standardizing units, highlighting key regions).
   - **(c) Dimension Diversity (MANDATORY)**: Each of the {TOP_K} selected tasks **MUST** target a **different** evaluation dimension. Never select two tasks from the same dimension. Four dimensions are available (Color, Text, Scope, Layout), so this is always satisfiable as long as {TOP_K} is at most 4: choose the best task from each of {TOP_K} different dimensions.
   - **(d) Feasibility**: The edit must be achievable by modifying the LaTeX source code.
   - **(e) Layout Coverage**: If the candidate list contains **Layout (L)** type tasks, you **MUST** select at least one Layout task.

# Output Format (STRICTLY follow this JSON format)

```json
{
  "selected_tasks": [
    {
      "task_id": "L1",
      "dimension": "layout",
      "edit_instruction": "Convert the bar chart into a horizontal bar chart while preserving all data values and labels.",
      "reasoning": "Layout transformation that tests holistic structural change, triggering scope/color/text dimensions simultaneously."
    },
    {
      "task_id": "T1",
      "dimension": "text",
      "edit_instruction": "Rename the x-axis label from 'Time (s)' to 'Duration (ms)'.",
      "reasoning": "Text fidelity task targeting a specific axis label found in the source code."
    },
    {
      "task_id": "S1",
      "dimension": "scope",
      "edit_instruction": "Move the legend from the top-right corner to the bottom-center of the chart.",
      "reasoning": "Scope task that tests spatial/bbox fidelity by repositioning a specific element."
    }
  ]
}
```

IMPORTANT:
- The "selected_tasks" array MUST contain exactly {TOP_K} items
- Each item MUST have all required fields: task_id, dimension, edit_instruction, reasoning
- Each task MUST target a different dimension — no two tasks can share the same dimension value
- task_id must match the ID from candidate list (e.g., "C1", "T2", "S1", "S7", "L1")
- dimension must be one of: "color", "text", "scope", "layout"
- Do NOT include markdown code blocks inside the JSON string
- Return ONLY the JSON object, no other text
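
The rules in the IMPORTANT list above can be checked mechanically before a selection is accepted. This is a sketch under assumptions: `validate_de_selection` is a hypothetical helper, and the unfilled-placeholder test is a heuristic that flags bracketed all-caps tokens such as `[TARGET]`.

```python
import json
import re

ALLOWED_DIMENSIONS = {"color", "text", "scope", "layout"}
REQUIRED_FIELDS = ("task_id", "dimension", "edit_instruction", "reasoning")
# Heuristic: three or more uppercase letters/underscores inside brackets.
PLACEHOLDER_RE = re.compile(r"\[[A-Z][A-Z_]{2,}\]")


def validate_de_selection(raw: str, top_k: int) -> list:
    """Return a list of human-readable problems; empty means the reply passes."""
    errors = []
    tasks = json.loads(raw).get("selected_tasks", [])
    if len(tasks) != top_k:
        errors.append(f"expected {top_k} tasks, got {len(tasks)}")
    seen = set()
    for i, task in enumerate(tasks):
        missing = [f for f in REQUIRED_FIELDS if f not in task]
        if missing:
            errors.append(f"task {i}: missing fields {missing}")
            continue
        dim = task["dimension"]
        if dim not in ALLOWED_DIMENSIONS:
            errors.append(f"task {i}: unknown dimension {dim!r}")
        elif dim in seen:
            errors.append(f"task {i}: duplicate dimension {dim!r}")
        seen.add(dim)
        if PLACEHOLDER_RE.search(task["edit_instruction"]):
            errors.append(f"task {i}: unfilled placeholder in edit_instruction")
    return errors
```

A non-empty result could be used to reject the reply or to re-prompt; which of those the pipeline does is not stated in the source.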

select_du_node du_task_selection.txt Generator

Analyzes the image and code, then selects and instantiates the Top-K understanding questions from du_questions.yaml

# Task Description

Analyze the provided **Scientific Diagram** and its corresponding **Source Code**. Your goal is to select and instantiate the **Top {TOP_K}** most relevant Question Templates from the provided **Candidate Question List**.

You must generate a valid JSON object of instantiated questions that assess **Diagram Understanding (DU)** capabilities, including both Regular Questions (Q-type) and What-If Questions (W-type).

# Input Data

**1. Target Image:**
*(Image is provided separately)*

**2. Source Code:**
```latex
{IMAGE_CODE}
```

**3. Diagram Category:**
{DIAGRAM_CATEGORY}

**4. Candidate Question List:**
{QUESTION_CANDIDATES}
*(Each template includes an ID, a Template Text with placeholders, and a required Output Instruction)*

# Execution Guidelines

1. **Cross-Modal Grounding**:
   - Use the **Source Code** to identify specific entity names, labels, coordinates, and values (e.g., finding that the red node is labeled `Node_A` or the x-axis is labeled `Time (s)`).
   - Use the **Image** to confirm visual attributes (colors, layout, shapes).

2. **Instantiation (Critical)**:
   - You MUST fill in ALL placeholders (e.g., `[NODE_LABEL]`, `[SCALE_FACTOR]`, `[TARGET_ELEMENT]`) with concrete content found in the code/image.
   - **Do not** leave any brackets in the final `instantiated_question`.
   - *Bad*: "What is the value of [BAR_LABEL]?"
   - *Good*: "What is the value of the 'Revenue 2024' bar?"
   - *Bad*: "If [NODE] is removed..."
   - *Good*: "If the node labeled 'Server 1' is removed..."

3. **Selection Criteria (Top {TOP_K})**:
   - **(a) Validity**: Ensure the target elements actually exist in the diagram. For example, do not ask to "Calculate the volume of the cylinder" if the image shows a cube.
   - **(b) Diversity**: Select questions from different templates (Q and W types) when possible.
   - **(c) Answerability**: The question must be answerable based on visible/inferable information in the diagram.
   - **(d) Scientific Utility**: Prioritize questions that test meaningful understanding of the diagram content.

4. **What-If Questions**: For W-type (What-If) questions, select modifications that are physically or logically possible for this specific diagram type (e.g., removing a node in a graph, shorting a resistor in a circuit). The hypothetical modification should reference a valid DE task type (T2, T4, T5, T7, T9) and use realistic values.

# Output Format (STRICTLY follow this JSON format)

```json
{
  "selected_questions": [
    {
      "question_id": "Q1",
      "question_type": "regular",
      "instantiated_question": "Complete question text with ALL placeholders filled with specific details",
      "output_instruction": "OI-NUM",
      "expected_answer_type": "numerical value with units",
      "reasoning": "Brief explanation of why this question is valid for this specific image/code"
    }
  ]
}
```

IMPORTANT:
- The "selected_questions" array MUST contain exactly {TOP_K} items
- Each item MUST have all required fields: question_id, question_type, instantiated_question, output_instruction, expected_answer_type, reasoning
- question_id must match the ID from candidate list (e.g., "Q1", "Q2", "W1", "W2")
- question_type must be either "regular" (for Q-type) or "what_if" (for W-type)
- output_instruction must be one of: "OI-NUM", "OI-TERM", "OI-LIST"
- Do NOT include markdown code blocks inside the JSON string
- Return ONLY the JSON object, no other text
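
Two of the constraints above are easy to get wrong in a reply: the `question_type` must agree with the ID prefix, and `output_instruction` must be one of three fixed codes. A minimal check, with the hypothetical helper name `check_du_selection`, might look like this:

```python
import json

ALLOWED_OI = {"OI-NUM", "OI-TERM", "OI-LIST"}
# Q-type IDs are regular questions, W-type IDs are what-if questions.
TYPE_BY_PREFIX = {"Q": "regular", "W": "what_if"}


def check_du_selection(raw: str, top_k: int) -> list:
    """Return a list of problems found in a select_du reply."""
    questions = json.loads(raw)["selected_questions"]
    errors = []
    if len(questions) != top_k:
        errors.append(f"expected {top_k} questions, got {len(questions)}")
    for i, q in enumerate(questions):
        qid = q.get("question_id", "")
        expected_type = TYPE_BY_PREFIX.get(qid[:1])
        if expected_type is None:
            errors.append(f"question {i}: unexpected id {qid!r}")
        elif q.get("question_type") != expected_type:
            errors.append(f"question {i}: {qid} should have type {expected_type!r}")
        if q.get("output_instruction") not in ALLOWED_OI:
            errors.append(f"question {i}: invalid output_instruction")
    return errors
```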

DE Subgraph Nodes (editing + compile verification + Vision Check)

generate_code de_answer.txt Generator

Generates the complete modified LaTeX code from the edit instruction, the original code, and the image

# Task: Diagram Editing Answer Generation

You are given a scientific diagram, its corresponding LaTeX source code, and an editing instruction. Your task is to apply the specified modification and output the complete modified code.

# Input

**Original Source Code:**
```latex
{SOURCE_CODE}
```

**Editing Instruction:**
{EDIT_INSTRUCTION}

# Requirements

1. **Precise Modification**: Apply ONLY the change specified in the editing instruction
2. **Code Integrity**: Preserve all other parts of the original code unchanged
3. **Compilability**: The modified code must remain compilable
4. **Complete Output**: Output the entire modified code from `\documentclass` to `\end{document}`

# Output Format (STRICTLY follow this JSON format)

```json
{
  "code": "<complete modified LaTeX code from \\documentclass to \\end{document}>",
  "changes_made": ["<description of change 1>", "<description of change 2>", "..."]
}
```

IMPORTANT:
- The "code" field MUST contain the complete, compilable LaTeX code
- The "code" field MUST start with `\documentclass` and end with `\end{document}`
- The "changes_made" field lists all modifications applied (usually just one)
- Escape backslashes in JSON strings (use `\\` for `\`)
- Do NOT include markdown code blocks inside the JSON string
- Return ONLY the JSON object, no other text

# Example

**Editing Instruction:** Change the color of node A to red.

**Output:**
```json
{
  "code": "\\documentclass{standalone}\n\\usepackage{tikz}\n\\begin{document}\n\\begin{tikzpicture}\n\\node[draw, fill=red] (A) {A};\n\\end{tikzpicture}\n\\end{document}",
  "changes_made": ["Changed fill color of node A from blue to red"]
}
```
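
Before the code goes to the compiler, the reply can be parsed and checked against the two structural rules stated above (starts with `\documentclass`, ends with `\end{document}`). The sketch below also tolerates a reply wrapped in a markdown fence, in case a model adds one despite the instruction. `parse_code_reply` is a hypothetical helper.

```python
import json
import re

# Optional markdown fence around the JSON object ("`{3}" means three backticks).
FENCE_RE = re.compile(r"^`{3}(?:json)?\s*\n(.*?)\n`{3}\s*$", re.DOTALL)


def parse_code_reply(reply: str) -> dict:
    """Parse a generate_code / fix_code reply and validate its 'code' field."""
    text = reply.strip()
    fenced = FENCE_RE.match(text)
    if fenced:
        text = fenced.group(1)
    data = json.loads(text)  # json.JSONDecodeError is a ValueError subclass
    code = data["code"].strip()
    if not code.startswith("\\documentclass"):
        raise ValueError("code must start with \\documentclass")
    if not code.endswith("\\end{document}"):
        raise ValueError("code must end with \\end{document}")
    return data
```

The same check applies to the two fix_code templates, since they share this output schema.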

vision_check de_vision_check.txt Validator (Vision)

Compares the original image with the rendered edited image to verify that the edit instruction was applied correctly

# Role
You are a visual verification expert for diagram editing tasks.

# Task
You are given two images and an editing instruction:
- **Image 1 (Original)**: The diagram BEFORE the edit
- **Image 2 (Edited)**: The diagram AFTER applying the edit instruction

Determine whether the editing instruction was correctly and completely applied.

# Edit Instruction
{EDIT_INSTRUCTION}

# Evaluation Criteria

1. **Applied**: The specified change is visually present in the edited image
   - Color changes: The target element's color has visibly changed
   - Text changes: The text content has been updated
   - Layout changes: Elements have moved/resized as instructed
   - Type changes: Element shapes/styles have been replaced

2. **Complete**: The ENTIRE edit was applied, not just partially
   - If "change all labels", ALL labels must be changed
   - If "move node A to the right", node A must have actually moved

3. **Preserved**: All OTHER elements remain unchanged
   - No unintended side effects (missing elements, broken layout)

4. **Tolerable**: Minor rendering differences (anti-aliasing, slight spacing) are acceptable and must not cause the edit to be judged as failed

# Output Format (STRICTLY follow this JSON format)

If the edit was correctly applied:
```json
{
  "edit_applied": true,
  "confidence": "high",
  "reason": "The edit was correctly applied: [describe what changed]"
}
```

If the edit was NOT correctly applied:
```json
{
  "edit_applied": false,
  "confidence": "high",
  "reason": "[describe specifically what is wrong, e.g., 'The node color remains blue instead of changing to red', 'The label text was not updated']"
}
```

IMPORTANT:
- Return ONLY the JSON object
- "edit_applied" must be boolean
- "confidence" must be "high", "medium", or "low"
- "reason" must describe specific visual observations

fix_code (compile) de_fix_compile.txt Generator / Fallback

Fixes the code based on compilation errors (preserving the edit intent)

# Role
You are a LaTeX debugger. Fix the modified code based on compilation errors.

# Context
The code below was modified to apply an editing instruction to a scientific diagram, but it failed to compile. Fix the compilation errors while preserving the intended edit.

# Inputs

## 1. Original Source Code (Reference)
```latex
{ORIGINAL_CODE}
```

## 2. Failed Modified Code
```latex
{MODIFIED_CODE}
```

## 3. Edit Instruction (What the code should achieve)
{EDIT_INSTRUCTION}

## 4. Compilation Error
```
{COMPILE_ERROR}
```

# Fixing Strategy

1. **Read the error message carefully** to identify the exact problem
2. **Common fixes**:
   - Missing package: restore `\usepackage{}` from original
   - Undefined command: restore `\newcommand` or `\def` from original
   - Syntax error: fix braces, brackets, or command usage
   - Missing library: restore `\usetikzlibrary{}` from original
3. **Preserve the edit**: The fix must still implement the editing instruction
4. **Minimal changes**: Only fix what is broken, do not rewrite unnecessarily

# Output Format (STRICTLY follow this JSON format)

```json
{
  "code": "<complete fixed LaTeX code from \\documentclass to \\end{document}>",
  "changes_made": ["<what was fixed>"]
}
```

IMPORTANT:
- The "code" field MUST be complete and compilable
- Must start with `\documentclass` and end with `\end{document}`
- Escape backslashes in JSON strings
- Return ONLY the JSON object

fix_code (vision) de_fix_vision.txt Generator / Fallback

Fixes the code based on Vision Check feedback (the edit was not correctly reflected)

# Role
You are a LaTeX editor. Fix the code based on visual verification feedback.

# Context
The code below was supposed to apply an editing instruction to a scientific diagram. It compiles successfully, but visual verification found that the edit was NOT correctly applied. Fix the code so that it correctly implements the edit.

# Inputs

## 1. Original Source Code (Reference)
```latex
{ORIGINAL_CODE}
```

## 2. Current Modified Code (Compiles but edit incorrect)
```latex
{MODIFIED_CODE}
```

## 3. Edit Instruction (What the code should achieve)
{EDIT_INSTRUCTION}

## 4. Visual Verification Feedback (Why the edit was judged incorrect)
{VISION_FEEDBACK}

# Fixing Strategy

1. **Read the visual feedback carefully** to understand exactly what went wrong
2. **Common issues**:
   - Changed the wrong element (e.g., changed node B instead of node A)
   - Used incorrect value (e.g., wrong color name, wrong position)
   - Change was overridden by later style definitions
   - Change was applied to a non-visible property
3. **Re-examine the original code** to locate the correct element to modify
4. **Apply the edit precisely** to the correct element with the correct value
5. **Preserve all other parts** unchanged

# Output Format (STRICTLY follow this JSON format)

```json
{
  "code": "<complete fixed LaTeX code from \\documentclass to \\end{document}>",
  "changes_made": ["<what was fixed based on vision feedback>"]
}
```

IMPORTANT:
- The "code" field MUST be complete and compilable
- Must start with `\documentclass` and end with `\end{document}`
- Escape backslashes in JSON strings
- Return ONLY the JSON object
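
The single fix_code node is listed with two templates, one per failure kind. A dispatch along the following lines would pick between them; the state keys and the `choose_fix_prompt` helper are assumptions, while the file names and placeholder names come from the cards above.

```python
def choose_fix_prompt(state: dict) -> tuple:
    """Return (template_file, placeholder_values) for the current failure."""
    common = {
        "ORIGINAL_CODE": state["original_code"],
        "MODIFIED_CODE": state["modified_code"],
        "EDIT_INSTRUCTION": state["edit_instruction"],
    }
    if not state["compile_ok"]:
        # Compilation failed: feed the compiler log back to the model.
        return "de_fix_compile.txt", {**common, "COMPILE_ERROR": state["compile_error"]}
    # Compiled, but the vision check rejected the rendered edit.
    return "de_fix_vision.txt", {**common, "VISION_FEEDBACK": state["vision_feedback"]}
```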

DU Subgraph Nodes (understanding + cross-validation)

generate_answer du_answer.txt DU Generator

Analyzes the image, code, and question to generate the ground-truth answer

# Task: Diagram Understanding Answer Generation

You are given a **Scientific Diagram**, its corresponding **LaTeX Source Code**, and a **Question** about the diagram. Your task is to analyze the diagram carefully and provide an accurate answer that can serve as ground truth for evaluation.

# Input

**Source Code:**
```latex
{SOURCE_CODE}
```

**Question:**
{QUESTION_TEXT}

**Output Instruction:**
{OUTPUT_INSTRUCTION}

# Output Instruction Reference

- **OI-NUM**: Output only the final numerical value with standard units if applicable (e.g., "5V", "30 deg", "42"). If the information is not present, output "NOT_PRESENT".
- **OI-TERM**: Output the answer using standard scientific terminology or exact label text from the diagram. If not visible, output "NOT_PRESENT".
- **OI-LIST**: Output a Python list of strings, sorted alphabetically (e.g., ["Node A", "Node B"]). If no such elements exist, output [].

# Requirements

1. **Cross-Modal Analysis**: Use both the visual information from the diagram AND the source code to derive the accurate answer
2. **Code Verification**: When possible, verify your answer by checking the corresponding values in the source code
3. **Precise Format**: Follow the output instruction format exactly
4. **Scientific Accuracy**: Ensure answers use correct scientific terminology and units

# Output Format (STRICTLY follow this JSON format)

```json
{
  "answer": "<your answer following the output instruction format>",
  "reasoning": "<step-by-step explanation of how you derived the answer>",
  "code_reference": "<relevant code snippet or line that supports the answer, if applicable>"
}
```

IMPORTANT:
- The "answer" field MUST follow the format specified in the Output Instruction
- The "reasoning" field should explain your analysis process step by step
- The "code_reference" field should quote the relevant source code that supports your answer (can be empty string if not applicable)
- For What-If questions, show your calculation or reasoning for the hypothetical modification
- Return ONLY the JSON object, no other text
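
The three output instructions lend themselves to a light format check on the `answer` field. The sketch below is an approximation rather than the project's own validator: `check_answer_format` is hypothetical, OI-NUM only requires a leading number, and "sorted alphabetically" is interpreted as Python's default string ordering.

```python
import ast
import re

# A leading signed integer or decimal, optionally followed by units.
NUM_RE = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)")


def check_answer_format(answer, output_instruction: str) -> bool:
    """Return True if `answer` looks valid for OI-NUM, OI-TERM, or OI-LIST."""
    if output_instruction == "OI-NUM":
        if not isinstance(answer, str):
            return False
        text = answer.strip()
        return text == "NOT_PRESENT" or NUM_RE.match(text) is not None
    if output_instruction == "OI-TERM":
        return isinstance(answer, str) and bool(answer.strip())
    if output_instruction == "OI-LIST":
        if isinstance(answer, str):
            try:
                answer = ast.literal_eval(answer)  # e.g. '["Node A", "Node B"]'
            except (ValueError, SyntaxError):
                return False
        return (
            isinstance(answer, list)
            and all(isinstance(item, str) for item in answer)
            and answer == sorted(answer)
        )
    raise ValueError(f"unknown output instruction: {output_instruction}")
```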

validate_answer du_answer_validation.txt DU Validator

Independently verifies whether the candidate answer is correct (cross-validation)

# Role
You are a rigorous answer validator for scientific diagram understanding.

# Task
You are given a scientific diagram (image), its LaTeX source code, a question, and a candidate answer. Verify whether the candidate answer is correct.

# Input

**Source Code:**
```latex
{SOURCE_CODE}
```

**Question:**
{QUESTION_TEXT}

**Output Instruction:**
{OUTPUT_INSTRUCTION}

**Candidate Answer:**
{CANDIDATE_ANSWER}

**Candidate Reasoning:**
{CANDIDATE_REASONING}

# Validation Process

1. **Independent Analysis**: First, analyze the diagram and source code YOURSELF to determine the correct answer. Do NOT be biased by the candidate answer.
2. **Cross-Modal Verification**: Check your answer against both the visual diagram AND the source code.
3. **Compare**: Compare your answer with the candidate answer.
4. **Judge**: Determine if the candidate answer is correct.

# Judgment Criteria

- **Correct**: The answer matches your independent analysis (minor format differences like "5V" vs "5 V" are acceptable)
- **Incorrect**: The answer contradicts the diagram or source code, or uses wrong values/terminology
- **Partially Correct**: Some elements are right but key details are wrong (judge as incorrect)

# Output Format (STRICTLY follow this JSON format)

If the answer is correct:
```json
{
  "is_correct": true,
  "confidence": "high",
  "feedback": "The answer is correct."
}
```

If the answer is incorrect:
```json
{
  "is_correct": false,
  "confidence": "high",
  "feedback": "The answer is incorrect. [explain what is wrong]. The correct answer should be: [your answer]"
}
```

IMPORTANT:
- Return ONLY the JSON object
- "is_correct" must be boolean
- "confidence" must be "high", "medium", or "low"
- If incorrect, "feedback" MUST include the correct answer
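
Because the feedback of an incorrect verdict must contain the correct answer, a resolve step can recover it from the fixed phrase in the format above. This is a sketch that assumes the validator follows the "The correct answer should be: ..." wording exactly; `validator_answer` is a hypothetical helper and returns None when the phrase is missing.

```python
import re
from typing import Optional

CORRECT_RE = re.compile(r"The correct answer should be:\s*(.+?)\s*$", re.DOTALL)


def validator_answer(validation: dict) -> Optional[str]:
    """Extract the validator's own answer from an 'incorrect' verdict."""
    if validation.get("is_correct"):
        return None  # nothing to resolve, the candidate answer stands
    match = CORRECT_RE.search(validation.get("feedback", ""))
    return match.group(1) if match else None
```

Parsing free text is fragile, so a dedicated field for the validator's answer would be a more robust design; the template shown here only provides `feedback`.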

regenerate du_answer_retry.txt DU Generator / Fallback

Regenerates the answer based on validation feedback

# Task: Diagram Understanding Answer Generation (Retry)

You are given a scientific diagram, its LaTeX source code, and a question. A previous answer attempt was judged INCORRECT by a verification model. Use the feedback to generate a corrected answer.

# Input

**Source Code:**
```latex
{SOURCE_CODE}
```

**Question:**
{QUESTION_TEXT}

**Output Instruction:**
{OUTPUT_INSTRUCTION}

# Previous Attempt

**Your Previous Answer:**
{PREVIOUS_ANSWER}

**Verification Feedback (Why it was judged incorrect):**
{VALIDATION_FEEDBACK}

# Instructions

1. **Read the feedback carefully** to understand what was wrong with your previous answer
2. **Re-analyze the diagram and source code** with fresh eyes
3. **Pay special attention** to the specific issue pointed out in the feedback
4. **Verify your new answer** against both the diagram and the code before answering
5. **Follow the output instruction format** exactly

# Output Format (STRICTLY follow this JSON format)

```json
{
  "answer": "<corrected answer following the output instruction format>",
  "reasoning": "<step-by-step explanation, addressing the feedback>",
  "code_reference": "<relevant code snippet supporting the corrected answer>"
}
```

IMPORTANT:
- The "answer" field MUST follow the Output Instruction format
- The "reasoning" field should explain how you addressed the verification feedback
- Return ONLY the JSON object