Posts from @tomweiwei..
@kaptain shared a link, 1 week, 1 day ago
FAUN.dev()

Auto-Diagnosing Kubernetes Alerts with HolmesGPT and CNCF Tools

STCLab built an AI investigation pipeline with HolmesGPT, a 200-line Python playbook, and OpenTelemetry. It streamed Mimir, Loki, and Tempo into Slack threads. Metadata-driven markdown runbooks limited tools per namespace, cut wasted tool calls from 16 to 2, and let the same model resolve alerts faster.

@kaptain shared a link, 1 week, 1 day ago
FAUN.dev()

v1.36: Staleness Mitigation and Observability for Controllers

Kubernetes v1.36 ships client-go atomic FIFO processing and cache-introspection APIs. Controllers detect stale informer state and skip acting on it. kube-controller-manager enables the capability by default for four high-contention pod controllers. It adds alpha metrics for skipped syncs and informer resou..

@kala shared a link, 1 week, 1 day ago
FAUN.dev()

An open-weights Chinese model just beat Claude, GPT-5.5, and Gemini in a programming challenge

The AI Coding Contest Day 12 matched ten models on a sliding‑letter puzzle. Open‑weights Kimi K2.6 took first: 22 match points (7‑1‑0). MiMo V2‑Pro scored second by blasting claims for intact ≥7‑letter seeds (43 points). GPT‑5.5 and Claude Opus 4.7 landed third and fifth. Grids ran from 10×10 to 30×30. Heavy scrambl..

@kala shared a link, 1 week, 1 day ago
FAUN.dev()

Introducing the Agent Readiness score. Check to see if your site is agent-ready

Cloudflare launched IsItAgentReady. It scans 200k domains, scores agent readiness, publishes weekly adoption charts, and exposes results via an API. It checks robots.txt, llms.txt, content negotiation via Accept: text/markdown, API Catalog, .well-known/mcp.json, OAuth discovery, and x402 payments. Cloudflare ov..
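
As a rough illustration of a few of the checks listed in this summary, the sketch below builds the requests a readiness probe might send for one domain. It is not Cloudflare's scanner: the `Probe` type and `readiness_probes` function are hypothetical names, the root-level locations of robots.txt and llms.txt are assumed from their usual conventions, and only four of the listed checks are covered.

```python
from typing import NamedTuple


class Probe(NamedTuple):
    """One HTTP GET a readiness check might issue (hypothetical type)."""
    name: str
    url: str
    headers: dict


def readiness_probes(domain: str) -> list[Probe]:
    """Build probe requests for a domain. No network calls are made here."""
    base = f"https://{domain}"
    return [
        # Crawler rules, served from the site root.
        Probe("robots.txt", f"{base}/robots.txt", {}),
        # LLM-oriented site summary, assumed to live at the site root.
        Probe("llms.txt", f"{base}/llms.txt", {}),
        # Content negotiation: ask the home page for a markdown rendering.
        Probe("markdown negotiation", f"{base}/", {"Accept": "text/markdown"}),
        # MCP discovery document at the path named in the summary.
        Probe("mcp.json", f"{base}/.well-known/mcp.json", {}),
    ]


if __name__ == "__main__":
    for probe in readiness_probes("example.com"):
        print(probe.name, probe.url, probe.headers)
```

Each probe can then be passed to any HTTP client; a site would count as supporting a check when the corresponding request succeeds with the expected content type.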

@kala shared a link, 1 week, 1 day ago
FAUN.dev()

Monitoring LLM behavior: Drift, retries, and refusal patterns

Traditional software is predictable due to determinism, while generative AI is unpredictable. Engineers need a new infrastructure layer, the AI Evaluation Stack, to ship enterprise-ready AI products. The stack includes deterministic assertions and model-based assertions to ensure structural integrit..
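
To make the idea of a deterministic assertion concrete, here is a minimal sketch that checks the structure of a model response without calling another model. The function name `structural_failures` and the expected schema are assumptions made for illustration; they do not come from the linked article.

```python
import json


def structural_failures(raw: str, required: dict[str, type]) -> list[str]:
    """Return a list of structural problems in a JSON model response.

    An empty list means the response passed every deterministic check.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    failures = []
    for key, expected in required.items():
        if key not in data:
            failures.append(f"missing key: {key}")
        elif not isinstance(data[key], expected):
            failures.append(f"wrong type for {key}: expected {expected.__name__}")
    return failures


if __name__ == "__main__":
    schema = {"answer": str, "confidence": float}  # hypothetical schema
    print(structural_failures('{"answer": "ok", "confidence": 0.9}', schema))
    print(structural_failures('{"answer": 3}', schema))
```

Checks like this give the same verdict on every run, so they can gate a pipeline cheaply; a model-based assertion would be layered on afterwards to judge qualities such as tone or relevance that a schema cannot capture.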

@kala shared a link, 1 week, 1 day ago
FAUN.dev()

Multi-Agent System Reliability

LLMs are unreliable out of the box, but multi-agent systems can improve by dividing work among specialized agents. Building robust systems involves leveraging human system patterns like hierarchy, consensus, adversarial debate, and knock-out in a multi-agent architecture to ensure correctness and re..
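
Of the patterns named here, consensus is the simplest to sketch. The example below assumes several agents have each returned a short text answer and accepts one only when a strict majority agrees. The `consensus` function and its `quorum` parameter are hypothetical and are not taken from the linked article.

```python
from collections import Counter
from typing import Optional


def consensus(answers: list[str], quorum: float = 0.5) -> Optional[str]:
    """Return the answer backed by more than `quorum` of agents, else None."""
    if not answers:
        return None
    # Normalize so trivial formatting differences do not split the vote.
    normalized = [a.strip().lower() for a in answers]
    winner, votes = Counter(normalized).most_common(1)[0]
    return winner if votes / len(normalized) > quorum else None


if __name__ == "__main__":
    print(consensus(["42", "42 ", "41"]))  # two of three agents agree
    print(consensus(["a", "b"]))           # no strict majority
```

Returning `None` instead of guessing lets the caller escalate, for example to a debate round or a human reviewer, when the agents disagree.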

@kala shared a link, 1 week, 1 day ago
FAUN.dev()

The AI engineering stack we built internally - on the platform we ship

Cloudflare wired AI into the engineering stack. LLM traffic funnels through a proxy Worker and AI Gateway. It shipped Workers AI and the Agents SDK. Daily users hit 3,683 (93% R&D). MR throughput climbed to ~10,952/week. Workers AI handled 51B input tokens and cut a security agent's inference spend by 77%.

@devopslinks shared a link, 1 week, 1 day ago
FAUN.dev()

How incidents can teach us about what’s already working well

A famous optical illusion developed by Edward H. Adelson shows that two squares, despite appearing different in shade, are actually the same gray. This illusion demonstrates how the brain processes light, shadow, and objects when interpreting visual signals from the optic nerve. Studying such illusi..

@devopslinks shared a link, 1 week, 1 day ago
FAUN.dev()

The Software Development Lifecycle Is Dead

AI agents collapse the classic SDLC (requirements, design, implementation, testing, review, deployment) into an intent-driven loop. They generate code, tests, and pipelines together. They commit to main. Automated verification runs. Deployment and release split with feature flags.

@devopslinks shared a link, 1 week, 1 day ago
FAUN.dev()

The Silent Failure of Reliability Metrics at Scale: Lessons Learned from a Decade of Broken Metrics

At scale, observability breaks when SLIs and metrics mix different behaviors and lose clear meaning. Complexity grows: more event types, extra labels, and rising cardinality. That bloats queries, slows evaluation pipelines, and distorts Prometheus, PromQL, and Elastic metrics. Why this matters: Teams must t..
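
A small worked example may help show how an SLI that mixes behaviors loses meaning. The counts below are invented for illustration, and the function names are hypothetical: a high-volume, healthy event type masks a low-volume one that fails half the time.

```python
def blended_sli(counts: dict[str, tuple[int, int]]) -> float:
    """Success ratio across all event types combined: sum(good) / sum(total)."""
    good = sum(g for g, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return good / total if total else float("nan")


def sli_by_type(counts: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Success ratio per event type, skipping types with no traffic."""
    return {name: g / t for name, (g, t) in counts.items() if t}


if __name__ == "__main__":
    # Invented (good, total) counts per event type.
    counts = {"read": (9900, 10000), "write": (50, 100)}
    print(f"blended: {blended_sli(counts):.4f}")
    for name, ratio in sli_by_type(counts).items():
        print(f"{name}: {ratio:.4f}")
```

With these numbers the blended ratio stays above 98% while writes succeed only half the time, which is the kind of distortion a per-behavior SLI would surface.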
