# AI Models
TaskPilot supports three AI providers. Select a provider and model from the session toolbar before starting a session.
## Claude (Anthropic) — Default Provider
Claude models excel at coding, analysis, and following complex instructions. Default to Sonnet 4.6 for 80%+ of tasks — it offers the best quality-to-cost ratio.
| Model | Context | Input / Output (per 1M) | Best For |
|---|---|---|---|
| Haiku 4.5 | 200K | $1 / $5 | High-volume simple tasks — classification, extraction, formatting. 12x cheaper than Sonnet. |
| Opus 4.6 | 1M | $15 / $75 | Deep scientific reasoning, complex multi-file refactors, agent teams. 91.3% GPQA Diamond. |
| Sonnet 4.6 (default) | 200K | $3 / $15 | General coding, most tasks — best balance of quality and cost. 79.6% SWE-bench. |
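Per-token prices make per-request costs easy to estimate. The sketch below applies the Claude rates from the table above; the model keys and the example request size are illustrative, not TaskPilot identifiers.

```python
# Estimate the cost of a single request from per-1M-token prices.
# Rates below are the Claude prices from the table (USD per 1M tokens).
PRICES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.6": (15.00, 75.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 20K-token prompt with a 2K-token reply on Sonnet 4.6.
print(f"${request_cost('sonnet-4.6', 20_000, 2_000):.3f}")  # → $0.090
```

The same request on Opus 4.6 costs five times as much, which is why the guidance above reserves Opus for tasks that need it.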
## OpenAI
OpenAI's GPT-5.4 family delivers strong reasoning and coding performance. GPT-5.4 Mini is the recommended default — it approaches full GPT-5.4 quality at a fraction of the cost.
| Model | Context | Input / Output (per 1M) | Best For |
|---|---|---|---|
| GPT-5.4 | 1M | $2.50 / $15 | Most demanding reasoning and professional tasks. |
| GPT-5.4 Mini (default) | 400K | $0.75 / $4.50 | General coding, high-volume workloads — 2x faster than full GPT-5.4, approaches its performance. |
| GPT-5.4 Nano | 400K | $0.20 / $1.25 | Classification, data extraction, ranking, lightweight sub-agents. Cheapest OpenAI option. |
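To make the tier trade-off concrete, here is a sketch comparing what one request costs on each tier at the prices in the table above. The 8K-input / 1K-output request shape is an illustrative assumption, not a TaskPilot figure.

```python
# Cost of one request on each OpenAI tier, from the per-1M-token prices above.
TIERS = {
    "GPT-5.4":      (2.50, 15.00),
    "GPT-5.4 Mini": (0.75, 4.50),
    "GPT-5.4 Nano": (0.20, 1.25),
}

def tier_cost(in_price: float, out_price: float,
              input_tokens: int = 8_000, output_tokens: int = 1_000) -> float:
    """USD cost of a single request at the given per-1M-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

for name, (inp, outp) in TIERS.items():
    print(f"{name}: ${tier_cost(inp, outp):.4f}")
```

Note that Mini prices at 30% of full GPT-5.4 on both input and output, so routing general work to Mini cuts spend by roughly 70% regardless of the input/output split.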
## Groq (Ultra-Fast Inference)
All Groq models run on Groq's custom LPU hardware for extremely fast inference. Choose Groq when speed matters more than frontier intelligence.
| Model | Context | Cost (per 1M) | Best For |
|---|---|---|---|
| GPT-OSS 120B | 131K | ~$1.20 | Highest quality on Groq — OpenAI's open-weight model with built-in search and code execution. |
| Kimi K2 | 262K | ~$1.50 | Agentic tasks, tool use, coding benchmarks. Largest context on Groq. |
| Llama 3.1 8B | 128K | ~$0.06 | Ultra-cheap simple tasks. Fastest and cheapest option. |
| Llama 3.3 70B (default) | 128K | ~$0.60 | General-purpose, proven reliability. Best default for Groq. |
| Llama 4 Scout | 10M | Low | Massive context window (10M tokens), blazing-fast speed (2600 tok/s). |
| Qwen 3 32B | 128K | ~$0.30 | Reasoning and dialogue with thinking/non-thinking modes. |
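When speed is the deciding factor, response time is roughly output length divided by decode rate. The 2,600 tok/s figure for Llama 4 Scout comes from the table above; rates for other models would need to be measured, and this simple division ignores queueing and time-to-first-token.

```python
# Rough time to stream a response at a sustained generation speed.
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Seconds to generate `output_tokens` at a steady decode rate."""
    return output_tokens / tokens_per_second

# A 1,300-token answer on Llama 4 Scout at 2,600 tok/s:
print(generation_seconds(1_300, 2_600))  # → 0.5 seconds
```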
## Choosing a Model
- **Most tasks:** Use Claude Sonnet 4.6 — best quality-to-cost ratio for coding.
- **Deep reasoning:** Use Opus 4.6 for multi-file refactors, scientific analysis, or 1M context.
- **Budget-conscious:** Use Haiku 4.5 or GPT-5.4 Nano for high-volume simple work.
- **OpenAI:** Use GPT-5.4 for demanding tasks or GPT-5.4 Mini for faster, cheaper general work.
- **Speed-critical:** Use Groq — all models benefit from LPU hardware acceleration.
- **Massive context:** Use Llama 4 Scout on Groq (10M tokens) or Opus 4.6 (1M tokens).
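The rules above can be sketched as a simple lookup. The task labels here are invented for illustration — TaskPilot itself exposes no such routing API; this only restates the guidance in code form.

```python
# Hypothetical task-to-model routing table mirroring the guidance above.
RECOMMENDED = {
    "general-coding": "Claude Sonnet 4.6",
    "deep-reasoning": "Claude Opus 4.6",
    "high-volume-simple": "Claude Haiku 4.5",   # or GPT-5.4 Nano
    "speed-critical": "Llama 3.3 70B (Groq)",
    "massive-context": "Llama 4 Scout (Groq)",  # or Opus 4.6 at 1M context
}

def pick_model(task: str) -> str:
    # Fall back to the overall default (Sonnet 4.6) for unlisted task types.
    return RECOMMENDED.get(task, "Claude Sonnet 4.6")

print(pick_model("deep-reasoning"))  # → Claude Opus 4.6
print(pick_model("unknown-task"))    # → Claude Sonnet 4.6
```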
Prices shown are per 1 million tokens. Actual costs depend on task complexity and token usage. Model selection is available in the session toolbar dropdown. Switching providers resets to that provider's default model.