Here’s a dirty secret about building AI-powered tools: no single model is best at everything.

GPT-4o is great at vision but expensive for bulk text. Claude Sonnet writes better analysis but can’t transcribe audio. Gemini Flash is fast and cheap but less precise on diagram extraction. Whisper is unbeatable for transcription and can even run locally, no API key required.

Most tools pick one provider and call it a day. PlanOpticon picks the right one for each task.

Auto-discovery

When PlanOpticon starts, it checks which API keys you have set — OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY — and queries each provider’s API to discover available models and their capabilities. No configuration file. No manual model selection. Just set your keys and go.

```bash
planopticon list-models
```

This shows you exactly what’s available across all your configured providers, grouped by capability: vision, chat, and audio.
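Under the hood, the discovery step boils down to checking the environment and asking each unlocked provider what it offers. Here’s a minimal sketch of that idea; the helper names are illustrative, not PlanOpticon’s actual API:

```python
import os

# Map each provider to the environment variable that unlocks it.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
}

def discover_providers() -> dict[str, list[str]]:
    """Return provider -> capabilities for every provider with a key set."""
    available = {}
    for provider, env_var in PROVIDER_KEYS.items():
        if os.environ.get(env_var):
            # The real tool queries the provider's model-listing endpoint here;
            # this placeholder just records that the provider is usable.
            available[provider] = query_capabilities(provider)
    return available

def query_capabilities(provider: str) -> list[str]:
    # Placeholder: a real implementation would classify each returned
    # model as vision, chat, or audio based on the provider's API response.
    return ["vision", "chat", "audio"]
```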

Task routing

Each step in the pipeline has different requirements:

| Task | Best at | Fallback |
| --- | --- | --- |
| Transcription | Whisper-1 (or local Whisper) | Gemini Flash |
| Frame classification | Gemini Flash (fast + cheap) | GPT-4o |
| Diagram analysis | GPT-4o (detailed vision) | Claude Sonnet |
| Content analysis | Claude Sonnet (nuanced writing) | GPT-4o |
| KG extraction | Gemini Flash (bulk processing) | Claude Sonnet |

PlanOpticon’s ProviderManager resolves the best available model for each capability based on what’s actually accessible. If you only have an Anthropic key, everything routes through Claude. If you have all three, each task goes to whoever’s best at it.
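One way to picture that resolution step: a preference list per task, filtered by whichever providers actually have keys configured. This is a hedged sketch of the pattern, not the ProviderManager’s real code; the names and orderings are assumptions taken from the table above:

```python
# Preferred (provider, model) per task, in priority order (from the table above).
TASK_PREFERENCES = {
    "transcription": [("openai", "whisper-1"), ("gemini", "gemini-2.5-flash")],
    "frame_classification": [("gemini", "gemini-2.5-flash"), ("openai", "gpt-4o")],
    "diagram_analysis": [("openai", "gpt-4o"), ("anthropic", "claude-sonnet")],
    "content_analysis": [("anthropic", "claude-sonnet"), ("openai", "gpt-4o")],
    "kg_extraction": [("gemini", "gemini-2.5-flash"), ("anthropic", "claude-sonnet")],
}

def resolve(task: str, available_providers: set[str]) -> tuple[str, str]:
    """Pick the highest-priority (provider, model) pair that is actually usable."""
    for provider, model in TASK_PREFERENCES[task]:
        if provider in available_providers:
            return provider, model
    raise RuntimeError(f"No configured provider can handle {task!r}")

# With only an Anthropic key, analysis work routes to Claude:
print(resolve("content_analysis", {"anthropic"}))  # ('anthropic', 'claude-sonnet')
```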

Why this matters

A full video analysis makes dozens of API calls across different task types. Without smart routing, you either overpay (using GPT-4o for everything) or underperform (using a cheap model for tasks that need precision).

The provider system is also how PlanOpticon survives failures. When we ran out of Anthropic credits mid-analysis, we added a Gemini key and re-ran. The checkpoint system skipped completed steps, and the provider manager routed remaining work to Gemini. No code changes, no config edits.
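The combination of checkpoints and routing is what makes that kind of mid-run recovery cheap. A rough sketch of the resume loop, assuming a hypothetical checkpoint store and reusing the resolve() helper from the earlier sketch; this is not the actual implementation:

```python
def run_pipeline(steps, checkpoints, providers):
    """Run each pipeline step, skipping work that already completed."""
    for step in steps:
        if checkpoints.is_done(step.name):
            continue  # finished before the failure; nothing to redo
        # Re-resolve against whatever keys are configured *now*,
        # so newly added providers pick up the remaining work.
        provider, model = resolve(step.task, providers)
        step.run(provider, model)
        checkpoints.mark_done(step.name)
```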

Override when you want to

```bash
planopticon analyze -i video.mp4 -o ./output \
  --provider gemini \
  --vision-model gemini-2.5-flash \
  --chat-model gemini-2.5-flash
```

Pin to a specific provider, override individual models, or let auto-routing handle it. Your choice.

The architecture is designed around the assumption that the AI landscape keeps changing. New models ship weekly. Providers raise and lower prices. The tool that locks you into one vendor is the tool you’ll replace in six months.

GitHub · Docs · PyPI

Written by Leo M.