App 04 — Studio

Prompts deserve version control.

Versioned templates, A/B experiments, model benchmarks, and snapshot tests. Prompt engineering as infrastructure, not guesswork.


Prompt Templates · 4 active

summarize-doc · v3 · Summarize uploaded documents into bullet points · 2h ago
code-review · v7 · Review PR diffs for bugs, style, and security · 1d ago
customer-reply · v2 · Generate support ticket responses with context · 3d ago
extract-entities · v5 · Pull names, dates, amounts from unstructured text · 5d ago

Experiment: code-review · Running

v6 vs v7: stricter security checks

247 completions · 72% traffic to v7

v6: 3.2 avg score · v7: 4.1 avg score

Benchmark: summarize-doc · Complete

gpt-4o — 4.3 quality, $0.018/call, 1.2s

claude-sonnet — 4.5 quality, $0.024/call, 1.8s

gpt-4o-mini — 3.8 quality, $0.002/call, 0.4s

What Studio does

Prompt Templates

Store prompts as named, versioned templates with variables. Roll back to any version. Deploy a specific version to production while testing the next.
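
To make the versioning model concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not Studio's implementation: the Template class and its add_version, deploy, and render methods are hypothetical names. It assumes that versions are numbered from v1 and that deploying is independent of adding a version.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Template:
    """Hypothetical in-memory model of a versioned prompt template."""

    slug: str
    versions: list = field(default_factory=list)  # index 0 holds v1
    live: int = 0  # version number currently deployed; 0 means none

    def add_version(self, body: str) -> int:
        """Store a new body and return its version number (does not deploy it)."""
        self.versions.append(body)
        return len(self.versions)

    def deploy(self, version: int) -> None:
        """Point production at any stored version; a rollback is the same call."""
        if not 1 <= version <= len(self.versions):
            raise ValueError(f"unknown version v{version}")
        self.live = version

    def render(self, **variables: str) -> str:
        """Fill {{name}} placeholders in the deployed version."""
        if self.live == 0:
            raise RuntimeError("no version deployed")
        body = self.versions[self.live - 1]
        return re.sub(r"\{\{(\w+)\}\}", lambda m: variables[m.group(1)], body)


t = Template("summarize-doc")
t.deploy(t.add_version("Summarize: {{text}}"))     # v1 goes live
t.add_version("Summarize in 3 bullets: {{text}}")  # v2 stored, v1 still live
print(t.render(text="Q3 report"))                  # prints "Summarize: Q3 report"
```

Keeping "which version is live" separate from "which versions exist" is what lets one version serve production while the next is still being tested.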

A/B Experiments

Split traffic between template versions. Measure quality scores, latency, and cost per variant. Promote the winner with one API call.
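
One common way to split traffic is to hash a stable request or user ID into a bucket, so the same caller always sees the same variant. The sketch below shows that technique in Python. It is an assumption about how a split could work, not a description of Studio's internals, and assign_variant is a hypothetical helper.

```python
import hashlib


def assign_variant(request_id: str, variants: list, split: list) -> str:
    """Deterministically map an ID to a variant using percentage weights."""
    if len(variants) != len(split) or sum(split) != 100:
        raise ValueError("split needs one weight per variant and must sum to 100")
    # Hash the ID to a stable bucket in [0, 100).
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    threshold = 0
    for variant, weight in zip(variants, split):
        threshold += weight
        if bucket < threshold:
            return variant
    raise AssertionError("unreachable: weights sum to 100")


# Same variants and split as the code-review experiment in the API examples.
ids = [f"req-{i}" for i in range(1000)]
picks = [assign_variant(i, ["v6", "v7"], [30, 70]) for i in ids]
print(f"v7 share: {picks.count('v7') / len(picks):.0%}")  # near 70%, not exact
```

Because assignment depends only on the ID, no per-user state has to be stored, and the observed share drifts a little from the configured split on small samples.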

Model Benchmarks

Run the same prompt across multiple models. Compare quality, cost, and speed side by side. Find the best model for each use case.
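
"Best" usually means the highest quality that fits a cost and latency budget. As a rough sketch, the Python below applies that rule to the summarize-doc benchmark figures shown earlier on this page. The Result type and best_model function are hypothetical, and the selection rule is one reasonable choice, not Studio's own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    model: str
    quality: float    # benchmark quality score
    cost_usd: float   # cost per call
    latency_s: float  # seconds per call


# Figures from the summarize-doc benchmark panel on this page.
RESULTS = [
    Result("gpt-4o", 4.3, 0.018, 1.2),
    Result("claude-sonnet", 4.5, 0.024, 1.8),
    Result("gpt-4o-mini", 3.8, 0.002, 0.4),
]


def best_model(results, max_cost_usd, max_latency_s):
    """Return the highest-quality result within both budgets, or None."""
    eligible = [
        r for r in results
        if r.cost_usd <= max_cost_usd and r.latency_s <= max_latency_s
    ]
    return max(eligible, key=lambda r: r.quality, default=None)


print(best_model(RESULTS, max_cost_usd=0.02, max_latency_s=1.5).model)  # gpt-4o
print(best_model(RESULTS, max_cost_usd=0.01, max_latency_s=1.0).model)  # gpt-4o-mini
```

With no budget at all the rule picks claude-sonnet on quality alone; tightening either limit changes the answer, which is why the three numbers are worth seeing side by side.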

Snapshot Tests

Capture a prompt's output at a point in time. Detect regressions when models update. Assert that changes don't break existing behavior.
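
A snapshot check boils down to comparing a stored output with a fresh one. The Python sketch below uses difflib from the standard library to score similarity. The check_snapshot helper and its min_similarity threshold are assumptions for illustration; Studio's actual comparison rule is not specified here.

```python
import difflib


def check_snapshot(snapshot: str, current: str, min_similarity: float = 1.0):
    """Return (passed, similarity) for a fresh output against a stored one.

    The default threshold of 1.0 demands an exact match; a lower value
    tolerates small wording changes in model output.
    """
    similarity = difflib.SequenceMatcher(None, snapshot, current).ratio()
    return similarity >= min_similarity, similarity


stored = "- Revenue up\n- Costs flat\n- Hiring paused"
fresh = "- Revenue up\n- Costs flat\n- Hiring frozen"

passed, score = check_snapshot(stored, fresh)
print(passed)  # False: the outputs differ, so an exact-match check fails

passed, score = check_snapshot(stored, fresh, min_similarity=0.8)
print(passed, round(score, 2))
```

Exact matching suits structured outputs such as extracted entities; free-form text usually needs a tolerance, or a comparison on parsed fields instead of raw strings.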

Diff Viewer

Compare any two template versions, model outputs, or experiment results side by side. Split or unified view with inline highlighting.
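
For a sense of what a unified view contains, the Python sketch below diffs the two summarize-doc bodies from the API examples using difflib.unified_diff. The unified helper and the v1 and v2 labels are illustrative assumptions; this is not the Diff Viewer's own code.

```python
import difflib


def unified(old_body: str, new_body: str, old_label: str, new_label: str) -> str:
    """Return a unified diff of two template bodies as a single string."""
    lines = difflib.unified_diff(
        old_body.splitlines(),
        new_body.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",  # inputs have no trailing newlines, so add none
    )
    return "\n".join(lines)


old = "Summarize: {{text}}"
new = "Summarize in 3 bullets: {{text}}"
print(unified(old, new, "summarize-doc@v1", "summarize-doc@v2"))
```

Removed lines are prefixed with a minus sign and added lines with a plus sign, which is the same information a split view shows in two columns.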

Zero Config

Studio ships in the same binary. No separate service, no database migration, no SDK. Create your first template with one curl command.

The API

Templates, experiments, benchmarks, and snapshots are all first-class REST resources. Version your prompts the same way you version your code.

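# Placeholder: set BASE to the address of your Stockyard instance
BASE="https://your-stockyard-host"

# Create a template
# (the Content-Type header is assumed here because the body is JSON)
curl -X POST "$BASE/api/studio/templates" \
  -H 'Content-Type: application/json' \
  -d '{"slug":"summarize-doc", "body":"Summarize: {{text}}",
       "model":"gpt-4o", "variables":["text"]}'

# Add a new version
curl -X POST "$BASE/api/studio/templates/summarize-doc/versions" \
  -H 'Content-Type: application/json' \
  -d '{"body":"Summarize in 3 bullets: {{text}}", "note":"tighter format"}'

# Run an A/B experiment
curl -X POST "$BASE/api/studio/experiments" \
  -H 'Content-Type: application/json' \
  -d '{"template":"code-review", "variants":["v6","v7"],
       "split":[30,70], "metric":"quality"}'

# Benchmark across models
curl -X POST "$BASE/api/studio/benchmarks/run" \
  -H 'Content-Type: application/json' \
  -d '{"template":"summarize-doc",
       "models":["gpt-4o","claude-sonnet-4-20250514","gpt-4o-mini"]}'

# Take a snapshot
curl -X POST "$BASE/api/studio/snapshots" \
  -H 'Content-Type: application/json' \
  -d '{"template":"extract-entities", "version":"v5", "tag":"pre-model-update"}'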
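
The same calls can be made from any HTTP client. As a sketch, the Python below builds the create-template request with the standard library only. The host http://stockyard.example is a placeholder, the build_request helper is hypothetical, and the JSON Content-Type header is an assumption; only the path and payload come from the examples above.

```python
import json
import urllib.request

BASE = "http://stockyard.example"  # placeholder: use your instance's address


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a Studio endpoint."""
    return urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


req = build_request(
    "/api/studio/templates",
    {
        "slug": "summarize-doc",
        "body": "Summarize: {{text}}",
        "model": "gpt-4o",
        "variables": ["text"],
    },
)
print(req.get_method(), req.full_url)

# To send it against a real instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

Building the request separately from sending it keeps the sketch runnable without a server; the response format is not documented on this page, so none is shown.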
  

Stop guessing. Start testing.

Studio ships with every Stockyard instance. Self-hosted or Cloud.
