TaskPilot

AI Models

TaskPilot supports three AI providers. Select a provider and model from the session toolbar before starting a session.

Claude (Anthropic) — Default Provider

Claude models excel at coding, analysis, and following complex instructions. Default to Sonnet 4.6 for 80%+ of tasks — it offers the best quality-to-cost ratio.

| Model | Context | Input / Output (per 1M) | Best For |
|---|---|---|---|
| Haiku 4.5 | 200K | $1 / $5 | High-volume simple tasks — classification, extraction, formatting. 3x cheaper than Sonnet. |
| Opus 4.6 | 1M | $15 / $75 | Deep scientific reasoning, complex multi-file refactors, agent teams. 91.3% GPQA Diamond. |
| Sonnet 4.6 (default) | 200K | $3 / $15 | General coding, most tasks — best balance of quality and cost. 79.6% SWE-bench. |

OpenAI

OpenAI's GPT-5.4 family delivers strong reasoning and coding performance. GPT-5.4 is the default; GPT-5.4 Mini approaches full GPT-5.4 quality at a fraction of the cost and is a strong choice for everyday work.

| Model | Context | Input / Output (per 1M) | Best For |
|---|---|---|---|
| GPT-5.4 (default) | 1M | $2.50 / $15 | Most demanding reasoning and professional tasks. |
| GPT-5.4 Mini | 400K | $0.75 / $4.50 | General coding, high-volume workloads — 2x faster than full GPT-5.4, approaches its performance. |
| GPT-5.4 Nano | 400K | $0.20 / $1.25 | Classification, data extraction, ranking, lightweight sub-agents. Cheapest OpenAI option. |

Groq (Ultra-Fast Inference)

All Groq models run on Groq's custom LPU hardware for extremely fast inference. Choose Groq when speed matters more than frontier intelligence.

| Model | Context | Cost (per 1M) | Best For |
|---|---|---|---|
| GPT-OSS 120B | 131K | ~$1.20 | Highest quality on Groq — OpenAI's open-weight model with built-in search and code execution. |
| Kimi K2 | 262K | ~$1.50 | Agentic tasks, tool use, coding benchmarks. |
| Llama 3.1 8B | 128K | ~$0.06 | Ultra-cheap simple tasks. Fastest and cheapest option. |
| Llama 3.3 70B (default) | 128K | ~$0.60 | General-purpose, proven reliability. Best default for Groq. |
| Llama 4 Scout | 10M | Low | Massive context window (10M tokens), blazing-fast speed (2,600 tok/s). |
| Qwen 3 32B | 128K | ~$0.30 | Reasoning and dialogue with thinking/non-thinking modes. |
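For latency budgeting, throughput figures like the ~2,600 tok/s quoted for Llama 4 Scout translate directly into wall-clock streaming time. A minimal sketch (the function name is ours, and the estimate ignores time-to-first-token):

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Approximate time to stream a full response at a given throughput.

    Ignores network latency and time-to-first-token, so treat the
    result as a lower bound.
    """
    return output_tokens / tokens_per_second

# A 5,000-token response at ~2,600 tok/s streams in roughly 1.9 seconds.
print(round(generation_seconds(5_000, 2_600), 1))
```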

Choosing a Model

- Most tasks: Use Claude Sonnet 4.6 — best quality-to-cost ratio for coding.
- Deep reasoning: Use Opus 4.6 for multi-file refactors, scientific analysis, or 1M context.
- Budget-conscious: Use Haiku 4.5 or GPT-5.4 Nano for high-volume simple work.
- OpenAI default: Use GPT-5.4 for demanding tasks or GPT-5.4 Mini for faster, cheaper general work.
- Speed-critical: Use Groq — all models benefit from LPU hardware acceleration.
- Massive context: Use Llama 4 Scout on Groq (10M tokens) or Opus 4.6 (1M tokens).
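The guidance above can be sketched as a simple lookup. The slug strings below (e.g. "anthropic/claude-sonnet-4.6") are illustrative assumptions for this sketch, not TaskPilot's actual model identifiers:

```python
# Model-selection guidance as a lookup table. The slugs are
# hypothetical placeholders, not TaskPilot's real identifiers.
RECOMMENDATIONS = {
    "general": "anthropic/claude-sonnet-4.6",    # best quality-to-cost for coding
    "deep-reasoning": "anthropic/claude-opus-4.6",
    "budget": "anthropic/claude-haiku-4.5",      # or openai/gpt-5.4-nano
    "speed": "groq/llama-3.3-70b",               # Groq default, LPU-accelerated
    "massive-context": "groq/llama-4-scout",     # 10M-token window
}

def recommend(task_profile: str) -> str:
    """Return the suggested model slug, falling back to the general default."""
    return RECOMMENDATIONS.get(task_profile, RECOMMENDATIONS["general"])
```

Unknown profiles fall through to the Sonnet default, mirroring the "most tasks" advice above.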

Prices shown are per 1 million tokens. Actual costs depend on task complexity and token usage. Model selection is available in the session toolbar dropdown. Switching providers resets to that provider's default model.
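Per-call cost follows directly from the per-1M-token prices in the tables above: tokens divided by one million, times the rate, summed over input and output. A minimal sketch (the price dictionary copies a few rows from the tables; the function name is ours):

```python
# Per-1M-token prices copied from the tables above: (input $, output $).
PRICES = {
    "claude-sonnet-4.6": (3.00, 15.00),
    "claude-haiku-4.5": (1.00, 5.00),
    "gpt-5.4-mini": (0.75, 4.50),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call: tokens / 1M, times the per-1M rate."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# A 200K-token prompt with a 50K-token response on Sonnet 4.6:
# 0.2 * $3 + 0.05 * $15 = $1.35.
print(estimate_cost("claude-sonnet-4.6", 200_000, 50_000))
```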