Documentation
Six apps. One binary. Everything you need to know.
Install & Quickstart
Get Stockyard running in under a minute.
```bash
# Install the binary
curl -sSL stockyard.dev/install | sh

# Start the platform (all 6 apps on port 4200)
stockyard

# Point your app at the proxy
export OPENAI_BASE_URL=http://localhost:4200/v1

# Make a request (goes through the full middleware chain)
curl http://localhost:4200/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
```
Open http://localhost:4200/ui for the web console. Open http://localhost:4200/api/apps to see all registered apps.
Stockyard auto-detects your API key provider from its prefix (sk- → OpenAI, sk-ant- → Anthropic, gsk_ → Groq, etc.) and routes to the correct upstream. No extra configuration needed.
Playground
Try Stockyard instantly at /playground. Paste any provider API key, pick a model, and start chatting. The playground includes real-time trace viewing, live middleware toggles, and a built-in A/B model comparison tool. No signup required.
Authentication
Stockyard has a built-in auth system for multi-user deployments. Users get sk-sy- API keys that route requests through the proxy with per-user provider key resolution.
```bash
# Create a user (returns an API key)
curl -X POST http://localhost:4200/api/auth/signup \
  -d '{"email":"alice@example.com","name":"Alice"}'

# Use the key to make requests
curl http://localhost:4200/v1/chat/completions \
  -H "Authorization: Bearer sk-sy-..." \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'

# Add a provider key to your account
curl -X PUT http://localhost:4200/api/auth/me/providers/openai \
  -H "Authorization: Bearer sk-sy-..." \
  -d '{"api_key":"sk-proj-..."}'
```
Key resolution order: the user's own provider key first, then the global provider key. Users can bring their own keys without sharing them.
Configuration
Stockyard works with zero configuration. All 58 middleware modules are registered on boot with sensible defaults. You can toggle modules on/off at runtime via the API:
```bash
# List all modules
curl http://localhost:4200/api/proxy/modules

# Disable a module
curl -X PUT http://localhost:4200/api/proxy/modules/toxicfilter \
  -d '{"enabled": false}'

# Re-enable it
curl -X PUT http://localhost:4200/api/proxy/modules/toxicfilter \
  -d '{"enabled": true}'
```
Module state persists in the embedded SQLite database. Changes take effect immediately — no restart needed.
License & Pro
Stockyard Community is free for up to 10,000 requests/month and 3 users. To remove all limits, upgrade to Pro ($9.99/month) by setting a license key:
```bash
# Set your license key
export STOCKYARD_LICENSE_KEY="SY-eyJ..."

# Restart Stockyard (or set the env var before first start)
stockyard

# Verify your tier and usage
curl http://localhost:4200/api/license
```
License keys are available at stockyard.dev/pricing. Pro and Enterprise keys unlock unlimited requests, unlimited users, and extended retention. All features and modules are available on every tier.
Console
The web console is served at /ui and gives you a real-time view of all six apps. It shows modules, providers, routes, traces, costs, alerts, templates, workflows, packs, and pack installation status.
The console is built with Preact and has zero external dependencies. It's embedded in the binary and served alongside the API.
Proxy
The proxy is the gateway layer. Every LLM request passes through a middleware chain of 58 toggleable modules. The proxy is OpenAI API-compatible — any SDK that talks to /v1/chat/completions works out of the box.
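Because the proxy speaks the standard /v1/chat/completions wire format, any OpenAI SDK works by pointing its base URL at the proxy. As a dependency-free sketch, here is the same request assembled with stdlib urllib (the payload shape is the standard chat-completions body; the placeholder key is just that):

```python
import json
import urllib.request

def chat_request(prompt: str, model: str = "gpt-4o-mini",
                 base_url: str = "http://localhost:4200/v1",
                 api_key: str = "sk-...") -> urllib.request.Request:
    """Build (but don't send) a chat-completions request aimed at the proxy."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# To actually send it (with Stockyard running):
#   with urllib.request.urlopen(chat_request("Hello")) as resp:
#       print(json.load(resp))
```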
Modules
Modules are organized by category:
| Category | Modules | Purpose |
|---|---|---|
| routing | fallbackrouter, modelswitch, regionroute, localsync, abrouter | Provider failover, model aliasing, geo routing |
| caching | cachelayer, embedcache, semanticcache | Response and embedding caching |
| cost | costcap, tierdrop, idlekill, outputcap, usagepulse, rateshield | Spending limits, rate limiting, usage reporting |
| safety | promptguard, toxicfilter, guardrail, agegate, hallucicheck, secretscan, agentguard | Content moderation, injection detection, PII |
| transform | promptslim, tokentrim, contextpack, chatmem, langbridge, voicebridge | Prompt compression, context management |
| validate | structuredshield, evalgate, codefence | JSON validation, quality gating |
| shims | anthrofit, geminishim | Use Claude/Gemini with OpenAI SDK |
| observe | llmtap, tracelink, alertpulse, driftwatch | Logging, tracing, alerting, drift detection |
Every module is wrapped with toggle.Wrap and checks its enabled state on every request. Disable a module and it's bypassed instantly.
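The wrapping pattern looks roughly like this. This is a toy sketch of the idea, not Stockyard's Go implementation; the handler signatures and the in-memory ENABLED map are stand-ins for the persisted toggle state:

```python
ENABLED = {"toxicfilter": True}  # stand-in for the persisted toggle state

def toggle_wrap(name, middleware):
    """Wrap a middleware so its enabled flag is consulted per request;
    a disabled module becomes a pure pass-through."""
    def wrapped(request, next_handler):
        if not ENABLED.get(name, True):
            return next_handler(request)      # bypassed instantly
        return middleware(request, next_handler)
    return wrapped

def toxicfilter(request, next_handler):
    # Toy filter: block requests containing a flagged word.
    if "badword" in request.get("prompt", ""):
        return {"error": "blocked"}
    return next_handler(request)

handler = toggle_wrap("toxicfilter", toxicfilter)
```

Because the flag is read inside the wrapper on every call, flipping it via the API takes effect on the very next request, with no chain rebuild.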
Providers
Stockyard supports 16 LLM providers out of the box. Set an environment variable and the provider is auto-configured on boot:
| Provider | Env Var | Models |
|---|---|---|
| OpenAI | OPENAI_API_KEY | gpt-4o, gpt-4.1, o3-mini, etc. |
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-5, claude-haiku-4-5 |
| Google Gemini | GEMINI_API_KEY | gemini-2.5-pro, gemini-2.0-flash |
| Groq | GROQ_API_KEY | llama-3.3-70b, mixtral-8x7b |
| Mistral | MISTRAL_API_KEY | mistral-large, codestral |
| DeepSeek | DEEPSEEK_API_KEY | deepseek-chat, deepseek-reasoner |
| Together | TOGETHER_API_KEY | Llama 3.1, Qwen 2.5 |
| Fireworks | FIREWORKS_API_KEY | Llama 3.3, Qwen 2.5 |
| Perplexity | PERPLEXITY_API_KEY | sonar-pro, sonar |
| xAI | XAI_API_KEY | grok-3, grok-2 |
| Cohere | COHERE_API_KEY | command-r-plus, command-a |
| OpenRouter | OPENROUTER_API_KEY | Any model via OpenRouter |
| Replicate | REPLICATE_API_TOKEN | Any model via Replicate |
| Azure OpenAI | AZURE_OPENAI_API_KEY | Azure-hosted models |
| Ollama | (auto at :11434) | Any local model |
| LM Studio | (auto at :1234) | Any local model |
Any OpenAI-compatible API works as a custom provider via the user settings or config file.
Routes
Routes map model patterns to providers. When a request comes in for gpt-4o, the router checks the routes table and sends it to the matching provider.
```bash
# List routes
curl http://localhost:4200/api/proxy/routes
```
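First-match routing over a pattern table can be sketched as below. The docs don't specify Stockyard's pattern syntax, so shell-style globs via fnmatch are an assumption, and the table contents are illustrative:

```python
from fnmatch import fnmatch

ROUTES = [
    ("gpt-*", "openai"),
    ("claude-*", "anthropic"),
    ("*", "openrouter"),   # catch-all
]

def route(model: str) -> str:
    """Return the provider for the first pattern that matches the model."""
    for pattern, provider in ROUTES:
        if fnmatch(model, pattern):
            return provider
    raise LookupError(f"no route for {model}")
```

Order matters: specific patterns must precede the catch-all, or everything lands on the fallback provider.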
Observe
Every proxy request is automatically traced. Observe records model, tokens, cost, latency, and status for every call.
```bash
# Get recent traces
curl http://localhost:4200/api/observe/traces?limit=20

# Daily cost breakdown
curl http://localhost:4200/api/observe/costs

# Create an alert
curl -X POST http://localhost:4200/api/observe/alerts \
  -d '{"name":"cost-spike","metric":"cost","condition":"gt","threshold":10}'
```
Trust
Trust maintains a hash-chained audit ledger. Every proxy request gets an entry with a SHA-256 hash linking it to the previous entry. This creates a tamper-evident log.
```bash
# View the audit ledger
curl http://localhost:4200/api/trust/ledger?limit=10

# List policies
curl http://localhost:4200/api/trust/policies
```
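The hash-chaining scheme is worth seeing concretely. This sketch follows the description above (each entry's SHA-256 hash covers its payload plus the previous entry's hash); the field layout is ours, not Stockyard's schema:

```python
import hashlib
import json

GENESIS = "0" * 64

def append_entry(ledger: list, payload: dict) -> None:
    """Append an entry whose hash chains to the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    body = json.dumps(payload, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    ledger.append({"payload": payload, "prev_hash": prev_hash, "hash": entry_hash})

def verify(ledger: list) -> bool:
    """Recompute every link; editing any historical entry breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        body = json.dumps(entry["payload"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

This is what makes the log tamper-evident rather than tamper-proof: entries can still be altered on disk, but any alteration invalidates every hash after it.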
Studio
Studio manages versioned prompt templates, A/B experiments, and benchmarks.
```bash
# Run an A/B test across models
curl -X POST http://localhost:4200/api/studio/experiments/run \
  -d '{
    "name": "speed-vs-quality",
    "prompt": "Explain quantum computing in one paragraph",
    "models": ["gpt-4o","claude-sonnet-4-5-20250929","gemini-2.0-flash"],
    "runs": 3,
    "eval": "length"
  }'
```
Eval methods: length (longer = better), concise (shorter = better), json (valid JSON), contains (keyword match), or empty for cost comparison.
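The four scoring methods can be sketched as below (higher score = better). The exact scoring Stockyard applies isn't documented here, so these are plausible stand-ins that match the one-line descriptions above:

```python
import json

def score(output: str, method: str, arg: str = "") -> float:
    if method == "length":          # longer = better
        return float(len(output))
    if method == "concise":         # shorter = better
        return -float(len(output))
    if method == "json":            # 1.0 if the output parses as JSON
        try:
            json.loads(output)
            return 1.0
        except ValueError:
            return 0.0
    if method == "contains":        # 1.0 if the keyword appears
        return 1.0 if arg in output else 0.0
    raise ValueError(f"unknown eval method: {method}")
```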
```bash
# Run a multi-prompt benchmark
curl -X POST http://localhost:4200/api/studio/benchmarks/run \
  -d '{
    "name": "model-eval-q1",
    "models": ["gpt-4o-mini","deepseek-chat"],
    "prompts": [
      {"name":"summarize","prompt":"Summarize: ...","eval":"concise"},
      {"name":"code","prompt":"Write fizzbuzz","eval":"contains","eval_arg":"for"}
    ],
    "runs": 3
  }'

# Create a versioned prompt template
curl -X POST http://localhost:4200/api/studio/templates \
  -d '{"slug":"summarizer","name":"Summarizer","content":"Summarize: {{text}}"}'
```
Forge
Forge is a DAG workflow engine. Define multi-step workflows where each step can be an LLM call, a transform, or a tool call. Steps declare dependencies and execute in topological order.
```bash
# Create a workflow
curl -X POST http://localhost:4200/api/forge/workflows \
  -d '{
    "slug": "draft-and-critique",
    "name": "Draft + Critique",
    "steps": [
      {"id":"draft","type":"llm","config":{"model":"gpt-4o-mini","prompt":"Write about {{input}}"}},
      {"id":"critique","type":"llm","depends_on":["draft"],
       "config":{"prompt":"Critique: {{steps.draft.output}}"}},
      {"id":"final","type":"transform","depends_on":["draft","critique"],
       "config":{"expression":"concat"}}
    ]
  }'

# Run it
curl -X POST http://localhost:4200/api/forge/workflows/draft-and-critique/run \
  -d '{"input": "the future of AI"}'

# Check status
curl http://localhost:4200/api/forge/runs/{run_id}
```
Step types: llm (calls the proxy), transform (concat, extract_json, first_line), tool (external tool calls).
Template variables: {{input}} for the run input, {{steps.step_id.output}} for dependency outputs.
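The execution model above (topological ordering plus template substitution) can be sketched with the stdlib's graphlib. The step handlers here are toy stand-ins that just tag the rendered prompt; real steps would call the proxy or a transform:

```python
import re
from graphlib import TopologicalSorter

def render(template: str, run_input: str, outputs: dict) -> str:
    """Fill {{input}} and {{steps.<id>.output}} placeholders."""
    def sub(m):
        name = m.group(1)
        if name == "input":
            return run_input
        _, step_id, _ = name.split(".")       # steps.<id>.output
        return outputs[step_id]
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", sub, template)

def run_workflow(steps: list, run_input: str) -> dict:
    graph = {s["id"]: s.get("depends_on", []) for s in steps}
    by_id = {s["id"]: s for s in steps}
    outputs = {}
    # static_order() yields each step only after all its dependencies.
    for step_id in TopologicalSorter(graph).static_order():
        step = by_id[step_id]
        prompt = render(step["config"]["prompt"], run_input, outputs)
        outputs[step_id] = f"[{step['type']}] {prompt}"   # toy handler
    return outputs
```

TopologicalSorter also raises CycleError on circular depends_on declarations, which is the natural place for a workflow engine to reject an invalid DAG.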
Exchange
Exchange is a config pack marketplace. Packs bundle providers, modules, routes, workflows, templates, policies, and alerts into one installable unit.
```bash
# List available packs
curl http://localhost:4200/api/exchange/packs

# Install a pack (applies all contents to the system)
curl -X POST http://localhost:4200/api/exchange/packs/safety-essentials/install

# View installed packs
curl http://localhost:4200/api/exchange/installed

# Uninstall (cleanly removes everything the pack added)
curl -X DELETE http://localhost:4200/api/exchange/installed/1
```
6 packs ship by default: Safety Essentials, Cost Control, OpenAI Quickstart, Anthropic Quickstart, Multi-Provider Failover, and Evaluation Suite.
API Reference
All endpoints are served on the same port as the proxy. Full OpenAPI 3.1 spec available at /api/openapi.json.
| Method | Path | Description |
|---|---|---|
| GET | /api/openapi.json | OpenAPI 3.1 specification |
| GET | /api/apps | List registered apps |
| GET | /health | Health check |
| POST | /v1/chat/completions | Proxy LLM request |
| GET | /api/proxy/modules | List modules |
| PUT | /api/proxy/modules/{name} | Toggle module |
| GET | /api/proxy/providers | List providers |
| GET | /api/proxy/routes | List routes |
| GET | /api/observe/traces | Recent traces |
| GET | /api/observe/costs | Cost rollups |
| GET | /api/observe/alerts | Alert rules |
| POST | /api/observe/alerts | Create alert |
| GET | /api/observe/anomalies | Detected anomalies |
| GET | /api/trust/ledger | Audit ledger |
| GET | /api/trust/policies | Trust policies |
| POST | /api/trust/policies | Create policy |
| GET | /api/trust/evidence | Evidence packs |
| GET | /api/studio/templates | Prompt templates |
| POST | /api/studio/templates | Create template |
| GET | /api/studio/experiments | Experiments |
| POST | /api/studio/experiments/run | Run A/B test |
| GET | /api/studio/experiments/{id} | Get experiment results |
| POST | /api/studio/benchmarks/run | Run benchmark suite |
| POST | /api/auth/signup | Create user + API key |
| GET | /api/auth/me | Current user info |
| GET | /api/auth/me/usage | My usage stats (requests, cost, tokens) |
| PUT | /api/auth/me/providers/{name} | Add provider key |
| GET | /api/plans | Pricing plans |
| GET | /api/forge/workflows | List workflows |
| POST | /api/forge/workflows | Create workflow |
| POST | /api/forge/workflows/{slug}/run | Run workflow |
| GET | /api/forge/runs/{id} | Get run status |
| GET | /api/forge/tools | Tool registry |
| GET | /api/exchange/packs | Available packs |
| GET | /api/exchange/packs/{slug} | Pack detail |
| POST | /api/exchange/packs/{slug}/install | Install pack |
| GET | /api/exchange/installed | Installed packs |
| DELETE | /api/exchange/installed/{id} | Uninstall pack |
| GET | /api/exchange/environments | Environments |
| POST | /api/exchange/environments/{name}/sync | Sync environment |