Evaluating AI Agents in Security Operations
Cotool threw frontier LLMs at real-world SecOps tasks using Splunk's BOTSv3 dataset. GPT-5 topped the chart in accuracy (62.7%) and gave the best results per dollar. Claude Haiku-4.5 blazed through tasks fastest, just 240 seconds on average, maxing out tool integrations. Gemini-2.5-pro flopped on both acc..
















