Documentation
Complete guide to using ARK — from installation to advanced configuration. ARK automates the full research lifecycle: literature survey, experiment design, execution, paper writing, and iterative review.
Installation
Clone the repository and install ARK in editable mode:
git clone https://github.com/kaust-ark/ARK.git
cd ARK
pip install -e .
For research phase features (Deep Research integration), install with the research extra:
pip install -e ".[research]"
Prerequisites
- Python 3.9+
- At least one AI backend, installed and authenticated:
  - Claude Code CLI (`claude`) — recommended
  - Gemini CLI (`gemini`)
  - Codex CLI (`codex`)
- LaTeX toolchain (optional) — `pdflatex` and `bibtex` for PDF compilation
- PyYAML — installed automatically with ARK
Quick Start
Create a new project, provide your paper title and idea, then let ARK handle the rest:
# Create a new project (interactive wizard)
ark new my-project
# Run the full pipeline
ark run my-project
# Check progress anytime
ark status my-project
The ark new wizard will ask for:
- Paper title — e.g., "Efficient Cache-Aware Matrix Multiplication on Modern CPUs"
- Research idea — a paragraph describing the core contribution and approach
- Target venue — NeurIPS, ICML, ICLR, ACL, IEEE, etc.
- Code directory — where experiment code lives
- Model backend — Claude, Gemini, or Codex
You can also bootstrap from an existing proposal PDF:
ark new my-project --from-pdf proposal.pdf
ark new
Create a new research project with an interactive wizard.
ark new <project-name> [options]
| Option | Description |
|---|---|
| `--from-pdf <path>` | Bootstrap project from a proposal PDF |
| `--venue <venue>` | Set target venue (e.g., neurips, icml) |
| `--model <model>` | AI backend: claude, gemini, or codex |
This creates ARK/projects/<project-name>/ with:
- `config.yaml` — project configuration
- `hooks.py` — custom logic for research/figure generation
- `agents/` — prompt files for each agent role
ark run
Start the autonomous research pipeline for a project.
ark run <project-name> [options]
| Option | Description |
|---|---|
| `--mode paper\|research` | Pipeline mode (default: paper) |
| `--max-iterations <n>` | Maximum review iterations (default: 14) |
| `--max-days <n>` | Maximum days for research mode |
In paper mode, ARK runs the full 3-phase pipeline (research → development → review). In research mode, it focuses on experiment design and execution without paper writing.
You can also invoke the orchestrator directly for more control:
# Paper mode with specific model and iteration limit
python -m ark.orchestrator --project my-project --mode paper --model claude --max-iterations 20
# Research mode with day limit
python -m ark.orchestrator --project my-project --mode research --model claude --max-days 3
# Background execution with logging
nohup python -m ark.orchestrator --project my-project --mode paper --model claude \
> auto_research/logs/orchestrator.log 2>&1 &
ark status
Check the current progress of a project.
ark status <project-name>
Shows: current iteration, reviewer score, phase, cost breakdown, and recent agent actions.
ark monitor
Real-time monitoring of the pipeline — streams agent output and state changes.
ark monitor <project-name>
ark update
Send a mid-run instruction to the pipeline. Useful for steering the agents without stopping.
ark update <project-name>
# Example instructions:
# "Focus on the related work section"
# "Add a comparison with TransformerFAM"
# "Increase the font size in Figure 3"
ark stop
Gracefully stop a running pipeline. State is checkpointed so you can resume later.
ark stop <project-name>
ark delete
Remove a project and all its configuration.
ark delete <project-name>
This removes only the project directory under ARK/projects/. Your code directory and generated papers are not affected.
Project Configuration
Each project has a config.yaml that controls all aspects of the pipeline:
# ARK/projects/my-project/config.yaml
code_dir: /home/user/my-research # Where experiment code lives
venue: NeurIPS # Target venue
venue_format: neurips # LaTeX template format
venue_pages: 9 # Main content page limit
title: "My Paper Title"
model: claude # AI backend (claude/gemini/codex)
# Directory structure
latex_dir: paper # LaTeX source directory (relative to code_dir)
figures_dir: paper/figures # Generated figures
scripts_dir: code # Experiment scripts
# Figure generation
create_figures_script: code/create_paper_figures.py
# Quality threshold — stop when reviewer score hits this
paper_accept_threshold: 8
# Goal Anchor — keeps agents focused across iterations
goal_anchor: |
## Goal Anchor
**Paper Title**: My Paper Title
**Target Venue**: NeurIPS 2025
**Core Contributions**:
1. First contribution...
2. Second contribution...
# Compute settings (optional)
use_slurm: false
slurm_job_prefix: EXP_
conda_env: my-env
# Telegram notifications (optional)
telegram_bot_token: "YOUR_BOT_TOKEN"
telegram_chat_id: "YOUR_CHAT_ID"
Venue Presets
ARK includes LaTeX geometry presets for 11+ venues, so figures and tables are sized to each template's text block:
| Venue | Format Key | Text Width | Pages |
|---|---|---|---|
| NeurIPS | neurips | 5.5in | 9 |
| ICML | icml | 6.75in | 8 |
| ICLR | iclr | 6.0in | 9 |
| AAAI | aaai | 7.0in | 7 |
| ACL | acl | 6.3in | 8 |
| IEEE | ieee | 7.0in | 10 |
| ACM SIGPLAN | sigplan | 5.5in | 6 |
| LNCS | lncs | 4.8in | 12 |
| USENIX | usenix | 7.0in | 12 |
| CVPR | cvpr | 6.875in | 8 |
| ECCV | eccv | 4.8in | 14 |
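The text widths above can be used directly when sizing figures in scripts. A hypothetical helper (the dict copies three values from the table; the function is illustrative, not an ARK API):

```python
# Text widths in inches, copied from the venue preset table above.
TEXT_WIDTH_IN = {"neurips": 5.5, "icml": 6.75, "iclr": 6.0}

def fig_width(venue, fraction=1.0):
    """Width in inches for a figure spanning `fraction` of the text block."""
    return TEXT_WIDTH_IN[venue] * fraction
```

For example, passing the result to Matplotlib's figsize keeps a full-width NeurIPS figure at exactly 5.5in.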
Agent Prompts
Each agent has a .prompt file under projects/<name>/agents/. You can customize these to change agent behavior:
| File | Agent | Role |
|---|---|---|
| `reviewer.prompt` | Reviewer | Scores the paper (1–10) and identifies specific issues |
| `planner.prompt` | Planner | Converts review into a prioritized YAML action plan |
| `writer.prompt` | Writer | Drafts and revises LaTeX sections |
| `experimenter.prompt` | Experimenter | Designs and runs experiments |
| `researcher.prompt` | Researcher | Literature survey and result analysis |
| `validator.prompt` | Validator | Verifies changes were correctly applied |
| `figure_fixer.prompt` | Visualizer | Fixes figure and plot issues |
| `literature.prompt` | Literature | Deep literature search and summarization |
| `meta_debugger.prompt` | Meta-Debugger | System-level diagnosis when pipeline stalls |
Custom Hooks
Each project can define a hooks.py for custom logic. This lets you inject project-specific behavior into the pipeline:
# ARK/projects/my-project/hooks.py
def pre_experiment(config, state):
"""Called before each experiment run."""
pass
def post_experiment(config, state, results):
"""Called after experiment completes. Process results here."""
pass
def create_figures(config, state):
"""Custom figure generation logic."""
pass
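A filled-in post_experiment hook might aggregate run results so later agents can cite them. The shapes of `state` and `results` below are assumptions for illustration, not ARK's documented API:

```python
def post_experiment(config, state, results):
    """Record a compact summary of this experiment batch.

    Assumes `results` is a list of dicts with a numeric "score"
    and `state` is a mutable dict — both shapes are illustrative.
    """
    summary = {
        "n_runs": len(results),
        "best_score": max(r["score"] for r in results),
    }
    # Accumulate summaries so subsequent iterations can reference them.
    state.setdefault("experiment_summaries", []).append(summary)
    return summary
```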
Phase 1: Research
The research phase uses Deep Research to gather background knowledge:
- Takes your paper title and idea as input
- Performs a comprehensive literature survey
- Identifies related work, gaps, and positioning
- Outputs a structured research summary to auto_research/state/

Install with `pip install -e ".[research]"` to enable the Deep Research integration. Without it, you can skip this phase and provide your own literature survey.
Phase 2: Development
The development phase handles experiment design and execution:
- Planner creates an experiment plan based on the research summary
- Experimenter writes and submits experiment scripts (supports Slurm, local, cloud)
- Researcher analyzes results and identifies completeness gaps
- Writer produces the initial paper draft in LaTeX
Phase 3: Review (Iterative Loop)
The review phase is where the magic happens. Each iteration:
- Compile — LaTeX → PDF, generate page images for visual review
- Review — Reviewer agent scores the paper (1–10) and writes detailed feedback
- Plan & Execute — Planner converts feedback into tasks; writer/experimenter/researcher execute
- Visualize — Fix figures, recompile, validate changes
The loop continues until:
- The reviewer score reaches `paper_accept_threshold` (default: 8)
- The maximum iteration count is reached
- You manually stop with `ark stop`
When the score threshold is reached, ARK runs one final cleanup iteration for any remaining minor issues before producing the final PDF.
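Schematically, the loop reduces to something like the sketch below (illustrative only; `review` and `execute_plan` stand in for the real agent calls):

```python
def review_loop(review, execute_plan, threshold=8, max_iterations=14):
    """Run review → plan → execute cycles until the score clears
    the threshold or the iteration budget is spent."""
    score = None
    for _ in range(max_iterations):
        score = review()          # reviewer agent scores the paper
        if score >= threshold:
            break                 # accept threshold reached
        execute_plan()            # planner/writer/experimenter act on feedback
    return score
```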
Memory System
ARK tracks progress across iterations to prevent loops and detect stagnation:
# Stored in auto_research/state/memory.yaml
scores: [5.0, 5.5, 6.0, 6.2, 6.5, 7.0, 7.2]
best_score: 7.2
stagnation_count: 0
Score Tracking
Maintains a history of reviewer scores (last 20 iterations). Used to measure progress and trigger interventions.
Stagnation Detection
If the score doesn't improve for several consecutive iterations, ARK:
- Flags the stagnation to the planner
- Triggers the meta-debugger agent for system-level diagnosis
- Sends a Telegram notification (if configured) requesting human intervention
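The trigger condition can be pictured with this sketch (the window and threshold values are made up; ARK's real heuristic is not shown here):

```python
def is_stagnant(scores, window=4, min_delta=0.05):
    """True if the best recent score beats the pre-window baseline
    by less than `min_delta` — i.e., no real progress."""
    if len(scores) <= window:
        return False  # not enough history to judge
    baseline = scores[-window - 1]
    return max(scores[-window:]) - baseline < min_delta
```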
Issue Repeat Tracking
Counts how many times each issue reappears across reviews. If a fix keeps failing, ARK bans the ineffective strategy and tries alternative approaches.
Goal Anchor
Every agent invocation includes the Goal Anchor — a constant description of the project's core objectives. This prevents agents from drifting off-topic over many iterations.
Telegram Integration
ARK can send real-time notifications and receive mid-run instructions via Telegram:
Setup
# Interactive setup
ark setup-bot
# Or manually add to config.yaml:
telegram_bot_token: "YOUR_BOT_TOKEN"
telegram_chat_id: "YOUR_CHAT_ID"
Features
- Iteration notifications — score updates after each review cycle
- PDF delivery — request the current paper PDF anytime
- Mid-run instructions — send text to steer the agents
- Stagnation alerts — automatic notification when progress stalls
- Proactive confirmations — agents ask before making risky changes
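Under the hood, notifications go through Telegram's public Bot API. A minimal sketch of a sendMessage call (ARK's own notifier code is not shown; only the Bot API endpoint and fields are real):

```python
import json
import urllib.request

SEND_URL = "https://api.telegram.org/bot{token}/sendMessage"

def build_payload(chat_id, text):
    """JSON body for the Bot API's sendMessage method."""
    return json.dumps({"chat_id": chat_id, "text": text}).encode()

def notify(token, chat_id, text):
    req = urllib.request.Request(
        SEND_URL.format(token=token),
        data=build_payload(chat_id, text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # needs network access
        return json.load(resp)
```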
Slurm / HPC
ARK supports running experiments on Slurm-managed HPC clusters:
# config.yaml
use_slurm: true
slurm_job_prefix: EXP_ # Prefix for job names
conda_env: my-env # Conda environment to activate
When use_slurm: true, the experimenter agent:
- Generates Slurm batch scripts with appropriate resource requests
- Submits jobs via `sbatch`
- Monitors job status with `squeue`
- Collects results when jobs complete
For non-Slurm environments, experiments run locally or on any SSH-accessible server.
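The generated batch scripts look roughly like the output of this sketch (the template and directive set are illustrative; only the `#SBATCH` syntax and the config keys come from this page):

```python
def make_sbatch(job_name, script, conda_env, hours=4):
    """Render a minimal Slurm batch script for one experiment."""
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",   # prefixed via slurm_job_prefix
        f"#SBATCH --time={hours}:00:00",
        f"source activate {conda_env}",      # conda_env from config.yaml
        f"python {script}",
    ])
```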
Multi-Model Support
ARK supports three AI backends. Set the model in config.yaml or via CLI:
| Model | CLI | Best For |
|---|---|---|
| Claude Code | claude | Best overall quality, recommended for paper writing |
| Gemini CLI | gemini | Research phase, literature survey |
| Codex CLI | codex | Code-heavy experiments |
# Switch model for a specific run
python -m ark.orchestrator --project my-project --model gemini
State Management
All runtime state lives under <code_dir>/auto_research/:
auto_research/
├── state/
│ ├── paper_state.yaml # Current paper metadata
│ ├── action_plan.yaml # Current action plan from planner
│ ├── latest_review.md # Most recent review output
│ ├── memory.yaml # Score history, stagnation tracking
│ ├── checkpoint.yaml # Resume checkpoint
│ └── findings.yaml # Accumulated research findings
└── logs/
└── *.log # Per-iteration agent logs
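Since everything is YAML, state is easy to inspect outside of `ark status`. A small sketch (field names follow memory.yaml above; the helper itself is hypothetical):

```python
def load_memory(path="auto_research/state/memory.yaml"):
    import yaml  # PyYAML, installed automatically with ARK
    with open(path) as f:
        return yaml.safe_load(f)

def progress_line(memory):
    """One-line progress summary from a loaded memory.yaml dict."""
    scores = memory.get("scores", [])
    return f"iterations={len(scores)} last={scores[-1]} best={memory['best_score']}"
```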
Checkpointing & Resume
ARK automatically checkpoints after each iteration. If a run is interrupted, simply run ark run again — it will resume from the last checkpoint.
Fresh Start
To restart from scratch, clear the state files:
rm -f auto_research/state/checkpoint.yaml \
auto_research/state/paper_state.yaml \
auto_research/state/action_plan.yaml \
auto_research/state/latest_review.md \
auto_research/state/memory.yaml
Troubleshooting
LaTeX compilation fails
Ensure pdflatex and bibtex are installed. On Ubuntu:
sudo apt install texlive-full
Agent times out or fails
Check the agent logs under auto_research/logs/. Common causes:
- Claude Code CLI not authenticated — run
claudeand complete login - Rate limiting — ARK has built-in retry with backoff
- Insufficient context — try simplifying the goal anchor
Score stuck / stagnation
If the reviewer score plateaus:
- Check `auto_research/state/latest_review.md` for the specific issues
- Send a targeted instruction via `ark update` or Telegram
- Consider adjusting agent prompts in `projects/<name>/agents/`
Pipeline won't start
- Verify `config.yaml` paths are absolute and exist
- Check that the selected model CLI is installed: `which claude`
- Run `ark status <project>` for diagnostic output