App 04 — Studio

Prompts deserve version control.

Versioned templates, A/B experiments, model benchmarks, and snapshot tests. Prompt engineering as infrastructure, not guesswork.

Prompt Templates: 4 active

Experiment: code-review (Running)
v6 vs v7: stricter security checks
247 completions · 72% traffic to v7
v6: 3.2 avg score · v7: 4.1 avg score

Benchmark: summarize-doc (Complete)
gpt-4o: 4.3 quality, $0.018/call, 1.2s
claude-sonnet: 4.5 quality, $0.024/call, 1.8s
gpt-4o-mini: 3.8 quality, $0.002/call, 0.4s

Prompt Templates

Store prompts as named, versioned templates with variables. Roll back to any version. Deploy a specific version to production while testing the next.
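To make the versioning model concrete, here is a minimal in-memory sketch of a named template with an append-only version list, a deployed pointer, and `{{variable}}` substitution. The `Template` class and its method names are hypothetical and only illustrate the idea; they are not Studio's implementation or API.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Matches {{name}} placeholders, allowing optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


@dataclass
class Template:
    """Hypothetical model: a named prompt with an append-only version history."""
    slug: str
    versions: list = field(default_factory=list)  # versions[0] is v1
    deployed: Optional[int] = None                # 1-based version in production

    def add_version(self, body: str) -> int:
        """Append a new version and return its 1-based number."""
        self.versions.append(body)
        return len(self.versions)

    def deploy(self, number: int) -> None:
        """Point production at an existing version; rollback is the same call."""
        if not 1 <= number <= len(self.versions):
            raise ValueError(f"{self.slug} has no version v{number}")
        self.deployed = number

    def render(self, variables: dict, number: Optional[int] = None) -> str:
        """Fill placeholders in the given version, or the deployed one."""
        chosen = number if number is not None else self.deployed
        if chosen is None or not 1 <= chosen <= len(self.versions):
            raise ValueError(f"{self.slug} has no usable version selected")
        body = self.versions[chosen - 1]
        # A missing variable raises KeyError rather than rendering a gap.
        return PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), body)


tpl = Template("summarize-doc")
v1 = tpl.add_version("Summarize: {{text}}")
v2 = tpl.add_version("Summarize in 3 bullets: {{text}}")
tpl.deploy(v1)  # production serves v1 while v2 is still under test
```

Because versions are never edited in place, rolling back is just another `deploy` call with an older number.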

A/B Experiments

Split traffic between template versions. Measure quality scores, latency, and cost per variant. Promote the winner with one API call.
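One common way to split traffic is to hash each request ID into a bucket from 0 to 99 and map bucket ranges to variants. The sketch below assumes that approach; whether Studio routes this way is not stated here, and `assign_variant` and `pick_winner` are hypothetical helper names.

```python
import hashlib
from statistics import mean


def assign_variant(request_id: str, variants: list, split: list) -> str:
    """Deterministically map a request to a variant using percentage weights."""
    if len(variants) != len(split) or sum(split) != 100:
        raise ValueError("split needs one weight per variant, summing to 100")
    # Hash the id into a stable bucket in [0, 100).
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    upper = 0
    for variant, weight in zip(variants, split):
        upper += weight
        if bucket < upper:
            return variant
    raise AssertionError("unreachable: weights sum to 100")


def pick_winner(scores: dict) -> str:
    """Return the variant with the highest mean quality score."""
    averages = {variant: mean(values) for variant, values in scores.items()}
    return max(averages, key=averages.get)


# Route 1,000 synthetic request ids with a 30/70 split between v6 and v7.
counts = {"v6": 0, "v7": 0}
for i in range(1000):
    counts[assign_variant(f"req-{i}", ["v6", "v7"], [30, 70])] += 1
```

Hashing keeps assignment stable, so a retried request lands on the same variant, and the observed share only approximates the configured split on small samples.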

Model Benchmarks

Run the same prompt across multiple models. Compare quality, cost, and speed side by side. Find the best model for each use case.
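Choosing "the best model" usually means the highest quality that fits a cost or latency budget. The sketch below applies that rule to the summarize-doc figures shown in the example panel near the top of this page; the `Result` type and `best_model` function are hypothetical and not part of Studio.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Result:
    model: str
    quality: float        # average quality score
    cost_per_call: float  # US dollars
    latency_s: float      # seconds


# Figures from the summarize-doc benchmark example on this page.
RESULTS = [
    Result("gpt-4o", 4.3, 0.018, 1.2),
    Result("claude-sonnet", 4.5, 0.024, 1.8),
    Result("gpt-4o-mini", 3.8, 0.002, 0.4),
]


def best_model(results, max_cost: Optional[float] = None,
               max_latency: Optional[float] = None) -> Optional[Result]:
    """Highest-quality result within the optional budgets, or None."""
    eligible = [
        r for r in results
        if (max_cost is None or r.cost_per_call <= max_cost)
        and (max_latency is None or r.latency_s <= max_latency)
    ]
    return max(eligible, key=lambda r: r.quality, default=None)
```

With no constraints the rule picks claude-sonnet; capping cost at $0.02 per call selects gpt-4o, and capping latency at 0.5s selects gpt-4o-mini.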

Snapshot Tests

Capture a prompt's output at a point in time. Detect regressions when models update. Assert that changes don't break existing behavior.
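A snapshot test boils down to storing a baseline output and comparing later outputs against it. The sketch below uses Python's `difflib.SequenceMatcher` with an adjustable similarity threshold, since model output is rarely byte-identical between runs. `SnapshotStore` is a hypothetical class, and how Studio actually scores regressions is not specified here.

```python
import difflib


class SnapshotStore:
    """Hypothetical in-memory store keyed by (template, version, tag)."""

    def __init__(self):
        self._snapshots = {}

    def capture(self, template: str, version: str, tag: str, output: str) -> None:
        """Record the output produced at this point in time."""
        self._snapshots[(template, version, tag)] = output

    def matches(self, template: str, version: str, tag: str,
                new_output: str, min_similarity: float = 1.0) -> bool:
        """True if new_output is at least min_similarity close to the baseline."""
        baseline = self._snapshots[(template, version, tag)]
        ratio = difflib.SequenceMatcher(None, baseline, new_output).ratio()
        return ratio >= min_similarity


store = SnapshotStore()
store.capture("extract-entities", "v5", "pre-model-update",
              "Entities: Acme Corp, Berlin")
```

The default threshold of 1.0 demands an exact match; lowering it tolerates cosmetic drift such as trailing punctuation while still flagging substantially different output.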

Diff Viewer

Compare any two template versions, model outputs, or experiment results side by side. Split or unified view with inline highlighting.
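For a sense of what the unified view contains, the standard library's `difflib.unified_diff` produces the same kind of output for two template bodies. This is only an illustration of the format, not Studio's diff engine; `unified` is a hypothetical helper, and the split view and inline highlighting are not shown.

```python
import difflib


def unified(old: str, new: str, old_label: str, new_label: str) -> str:
    """Return a unified diff of two texts, or an empty string if identical."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",  # inputs have no trailing newlines, so add none
    )
    return "\n".join(lines)


diff = unified(
    "Summarize: {{text}}",
    "Summarize in 3 bullets: {{text}}",
    "v1",
    "v2",
)
```

Printing `diff` shows the `--- v1` and `+++ v2` headers followed by the removed line prefixed with `-` and the added line prefixed with `+`.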

Zero Config

Studio ships in the same binary. No separate service, no database migration, no SDK. Create your first template with one curl command.

The API

Templates, experiments, benchmarks, and snapshots are all first-class REST resources. Version your prompts the same way you version your code.

# Create a template
curl -X POST /api/studio/templates \
  -d '{"slug":"summarize-doc", "body":"Summarize: {{text}}", "model":"gpt-4o", "variables":["text"]}'

# Add a new version
curl -X POST /api/studio/templates/summarize-doc/versions \
  -d '{"body":"Summarize in 3 bullets: {{text}}", "note":"tighter format"}'

# Run an A/B experiment
curl -X POST /api/studio/experiments \
  -d '{"template":"code-review", "variants":["v6","v7"], "split":[30,70], "metric":"quality"}'

# Benchmark across models
curl -X POST /api/studio/benchmarks/run \
  -d '{"template":"summarize-doc", "models":["gpt-4o","claude-sonnet-4-20250514","gpt-4o-mini"]}'

# Take a snapshot
curl -X POST /api/studio/snapshots \
  -d '{"template":"extract-entities", "version":"v5", "tag":"pre-model-update"}'
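The same calls can be made from a script. As a rough sketch, the first request could be built with Python's standard library as shown below. `BASE_URL` is a placeholder for wherever your instance is reachable, and the JSON `Content-Type` header is an assumption about what the server expects; only the path and payload come from the examples above.

```python
import json
import urllib.request

# Placeholder address; substitute your own instance's host and port.
BASE_URL = "http://localhost:8080"


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a Studio endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed requirement
        method="POST",
    )


create_template = build_post("/api/studio/templates", {
    "slug": "summarize-doc",
    "body": "Summarize: {{text}}",
    "model": "gpt-4o",
    "variables": ["text"],
})
```

Passing `create_template` to `urllib.request.urlopen` would send it; the sketch stops short of that so it runs without a live instance.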

Stop guessing. Start testing.

Studio ships with every Stockyard instance. Self-hosted or Cloud.
