Models
The AI models available on Levain, grouped by provider, with token prices.
Levain routes every agent's LLM traffic for you, so you can pick the model that fits each task. Claude models are first-class on every agent; other providers are available too — some out of the box, others when you bring your own key.
Anthropic
Frontier reasoning and agentic-coding leader; long-horizon autonomy and up to 1M-token context (200K on some models). First-class on every Levain lane.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Claude Fable 5 | claude-fable-5 | 1M | $10.00 | $50.00 | Available |
| Claude Opus 4.8 | claude-opus-4-8 | 1M | $5.00 | $25.00 | Available |
| Claude Sonnet 5 | claude-sonnet-5 | 1M | $3.00 | $15.00 | Available |
| Claude Haiku 4.5 | claude-haiku-4-5 | 200K | $1.00 | $5.00 | Available |
| Claude Opus 4.1 | claude-opus-4-1 | 200K | $15.00 | $75.00 | Available |
| Claude Opus 4.5 | claude-opus-4-5 | 200K | $5.00 | $25.00 | Available |
| Claude Opus 4.6 | claude-opus-4-6-v1 | 1M | $5.00 | $25.00 | Available |
| Claude Opus 4.7 | claude-opus-4-7 | 1M | $5.00 | $25.00 | Available |
| Claude Sonnet 4.5 | claude-sonnet-4-5 | 200K | $3.00 | $15.00 | Available |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | 1M | $3.00 | $15.00 | Available |
OpenAI
Broad ecosystem; strong general-purpose flagships.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| GPT OSS Safeguard 120B | openai.gpt-oss-safeguard-120b | 128K | $0.15 | $0.60 | Available |
| GPT OSS Safeguard 20B | openai.gpt-oss-safeguard-20b | 128K | $0.070 | $0.20 | Available |
| gpt-oss-120b | openai.gpt-oss-120b-1:0 | 128K | $0.15 | $0.60 | Available |
| gpt-oss-20b | openai.gpt-oss-20b-1:0 | 128K | $0.070 | $0.30 | Available |
| GPT-5.5 | gpt-5.5 | 1M | $5.00 | $30.00 | Your own key |
| GPT-5.4 | gpt-5.4 | 1M | $2.50 | $15.00 | Your own key |
Google
Gemma open models run by default; Gemini flagships run with your own Google key.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Gemma 3 12B IT | google.gemma-3-12b-it | 128K | $0.090 | $0.29 | Available |
| Gemma 3 27B IT | google.gemma-3-27b-it | 128K | $0.23 | $0.38 | Available |
| Gemma 3 4B IT | google.gemma-3-4b-it | 128K | $0.040 | $0.080 | Available |
| Gemini 3 Pro | gemini-3-pro-preview | 1M | $2.00 | $12.00 | Your own key |
| Gemini 3 Flash | gemini-3-flash-preview | 1M | $0.50 | $3.00 | Your own key |
Meta
Self-hostable open-weight Llama models.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Llama 3 70B Instruct | meta.llama3-70b-instruct-v1:0 | 8K | $2.65 | $3.50 | Available |
| Llama 3 8B Instruct | meta.llama3-8b-instruct-v1:0 | 8K | $0.30 | $0.60 | Available |
| Llama 3.1 70B Instruct | meta.llama3-1-70b-instruct-v1:0 | 128K | $0.99 | $0.99 | Available |
| Llama 3.1 8B Instruct | meta.llama3-1-8b-instruct-v1:0 | 128K | $0.22 | $0.22 | Available |
| Llama 3.3 70B Instruct | meta.llama3-3-70b-instruct-v1:0 | 128K | $0.72 | $0.72 | Available |
| Llama 4 Maverick 17B Instruct | meta.llama4-maverick-17b-instruct-v1:0 | 128K | $0.24 | $0.97 | Available |
| Llama 4 Scout 17B Instruct | meta.llama4-scout-17b-instruct-v1:0 | 128K | $0.17 | $0.66 | Available |
Mistral
European provider; Devstral targets coding.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Devstral 2 123B | mistral.devstral-2-123b | 256K | $0.40 | $2.00 | Available |
| Magistral Small 2509 | mistral.magistral-small-2509 | 128K | $0.50 | $1.50 | Available |
| Ministral 14B 3.0 | mistral.ministral-3-14b-instruct | 128K | $0.20 | $0.20 | Available |
| Ministral 3 8B | mistral.ministral-3-8b-instruct | 128K | $0.15 | $0.15 | Available |
| Ministral 3B | mistral.ministral-3-3b-instruct | 128K | $0.10 | $0.10 | Available |
| Mistral 7B Instruct | mistral.mistral-7b-instruct-v0:2 | 32K | $0.15 | $0.20 | Available |
| Mistral Large (24.02) | mistral.mistral-large-2402-v1:0 | 32K | $8.00 | $24.00 | Available |
| Mistral Large 3 | mistral.mistral-large-3-675b-instruct | 256K | $0.50 | $1.50 | Available |
| Mistral Small (24.02) | mistral.mistral-small-2402-v1:0 | 32K | $1.00 | $3.00 | Available |
| Mixtral 8x7B Instruct | mistral.mixtral-8x7b-instruct-v0:1 | 32K | $0.45 | $0.70 | Available |
| Pixtral Large (25.02) | mistral.pixtral-large-2502-v1:0 | 128K | $2.00 | $6.00 | Available |
| Voxtral Mini 3B 2507 | mistral.voxtral-mini-3b-2507 | 128K | $0.040 | $0.040 | Available |
| Voxtral Small 24B 2507 | mistral.voxtral-small-24b-2507 | 128K | $0.10 | $0.30 | Available |
DeepSeek
Extreme price/performance for extraction and coding tiers; open weights.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| DeepSeek V3.2 | deepseek.v3.2 | 164K | $0.62 | $1.85 | Available |
| DeepSeek-R1 | deepseek.r1-v1:0 | 128K | $1.35 | $5.40 | Available |
Alibaba (Qwen)
Open-weight coder and vision lineup; budget tier.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Qwen3 32B (dense) | qwen.qwen3-32b-v1:0 | 131K | $0.15 | $0.60 | Available |
| Qwen3 Coder Next | qwen.qwen3-coder-next | 262K | $0.50 | $1.20 | Available |
| Qwen3 Next 80B A3B | qwen.qwen3-next-80b-a3b | 128K | $0.15 | $1.20 | Available |
| Qwen3 VL 235B A22B | qwen.qwen3-vl-235b-a22b | 128K | $0.53 | $2.66 | Available |
| Qwen3-Coder-30B-A3B-Instruct | qwen.qwen3-coder-30b-a3b-v1:0 | 262K | $0.15 | $0.60 | Available |
xAI
The Grok family; runs with your own xAI key.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Grok 4 | xai/grok-4 | 256K | $3.00 | $15.00 | Your own key |
MiniMax
Low-cost outlier for agentic coding.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| MiniMax M2 | minimax.minimax-m2 | 128K | $0.30 | $1.20 | Available |
| MiniMax M2.1 | minimax.minimax-m2.1 | 196K | $0.30 | $1.20 | Available |
| MiniMax M2.5 | minimax.minimax-m2.5 | 1M | $0.30 | $1.20 | Available |
Amazon
The Nova family — fast, low-cost multimodal models.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Nova 2 Lite | amazon.nova-2-lite-v1:0 | 1M | $0.30 | $2.50 | Available |
| Nova Lite | amazon.nova-lite-v1:0 | 300K | $0.060 | $0.24 | Available |
| Nova Micro | amazon.nova-micro-v1:0 | 128K | $0.035 | $0.14 | Available |
| Nova Pro | amazon.nova-pro-v1:0 | 300K | $0.80 | $3.20 | Available |
Z.AI
The GLM family; open-weight reasoning and coding.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| GLM 4.7 | zai.glm-4.7 | 200K | $0.60 | $2.20 | Available |
| GLM 4.7 Flash | zai.glm-4.7-flash | 200K | $0.070 | $0.40 | Available |
| GLM 5 | zai.glm-5 | 200K | $1.00 | $3.20 | Available |
Moonshot AI
The Kimi family; long-context agentic models.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Kimi K2 Thinking | moonshot.kimi-k2-thinking | 128K | $0.60 | $2.50 | Available |
| Kimi K2.5 | moonshot.kimi-k2.5 | 262K | $0.60 | $3.00 | Available |
NVIDIA
The Nemotron family; open reasoning models.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| NVIDIA Nemotron 3 Super 120B A12B | nvidia.nemotron-super-3-120b | 256K | $0.15 | $0.65 | Available |
| NVIDIA Nemotron Nano 12B v2 VL BF16 | nvidia.nemotron-nano-12b-v2 | 128K | $0.20 | $0.60 | Available |
| NVIDIA Nemotron Nano 9B v2 | nvidia.nemotron-nano-9b-v2 | 128K | $0.060 | $0.23 | Available |
| Nemotron Nano 3 30B | nvidia.nemotron-nano-3-30b | 262K | $0.060 | $0.24 | Available |
Writer
The Palmyra family; enterprise and domain models.
| Model | Model ID | Context | Input / 1M | Output / 1M | Availability |
|---|---|---|---|---|---|
| Palmyra X4 | writer.palmyra-x4-v1:0 | 128K | $2.50 | $10.00 | Available |
| Palmyra X5 | writer.palmyra-x5-v1:0 | 1M | $0.60 | $6.00 | Available |
Choosing a model
Model choice is per node in a recipe. A few rules of thumb:
- Default to the cheapest model that survives the task. Reserve the top tier for deep reasoning, planning, and long-horizon autonomy.
- Tune effort before jumping tiers. A smaller model at higher effort often matches a bigger one at a fraction of the cost.
- Cost concentrates in loops. A model choice on a node that runs once is a rounding error; the same choice inside a per-item or retry loop dominates the run's cost.
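
To make the loop rule concrete, here is a rough sketch of the arithmetic, not a Levain API. It takes the per-1M prices for two models from the Anthropic table and compares a node that runs once with the same node inside a per-item loop. The token counts and the 10,000-item loop size are hypothetical, `call_cost` is an illustrative helper, and the result covers provider cost only, with no platform fee.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Price:
    """USD per 1M tokens, copied from the tables on this page."""
    input_per_m: float
    output_per_m: float


# Two rows from the Anthropic table.
PRICES = {
    "claude-opus-4-8": Price(input_per_m=5.00, output_per_m=25.00),
    "claude-haiku-4-5": Price(input_per_m=1.00, output_per_m=5.00),
}


def call_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Provider cost in USD for one LLM call (illustrative helper, no platform fee)."""
    price = PRICES[model_id]
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / TOKENS_PER_MILLION


# Hypothetical workload: 4,000 input and 1,000 output tokens per call.
IN_TOK, OUT_TOK = 4_000, 1_000
LOOP_ITEMS = 10_000  # hypothetical per-item loop size

once_opus = call_cost("claude-opus-4-8", IN_TOK, OUT_TOK)
once_haiku = call_cost("claude-haiku-4-5", IN_TOK, OUT_TOK)
loop_opus = LOOP_ITEMS * once_opus
loop_haiku = LOOP_ITEMS * once_haiku

print(f"runs once:     opus ${once_opus:.3f}  vs  haiku ${once_haiku:.3f}")
print(f"10,000 items:  opus ${loop_opus:.2f}  vs  haiku ${loop_haiku:.2f}")
```

Under these assumptions the single run differs by a few cents ($0.045 against $0.009), while the looped node differs by $360 ($450 against $90). The price ratio is the same in both cases; only the multiplier changes, which is why the model choice inside a loop deserves the most scrutiny.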
For how billing works — provider cost plus a flat platform fee, or margin-only when you bring your own key — see Pricing and Bring Your Own Key.