Can LLMs replace on-call SREs today?
ClickHouse ran five LLMs through an autonomous root cause gauntlet using OpenTelemetry data. None nailed it solo. OpenAI's o3 and Claude Sonnet 4 came closest. GPT-4.1 was the cheapest brain on the block. Things got weird under the hood. Token usage spiked unpredictably. Queries slammed observabili..