Search-Agent Evaluation Harness
/
live
LLM Tool-Selection
Benchmark
☾
North-star:
CPC (Cost Per Correct)
= total run cost / tasks with correct retrieval strategy · Priority:
Quality → Latency → Dollar cost
OpenAI
Anthropic
Ollama (local, $0)