Posts from the community...
Link
@faun shared a link, 2 weeks, 1 day ago

Google Cloud unveils AI-focused updates to Kubernetes Engine

Meet the Cluster Director for GKE. This beast masters GPU/TPU clusters seamlessly, herding them with Kubernetes APIs like a rodeo star. Meanwhile, the GKE Inference Gateway ramps up AI model performance. It's like magic but real: Serving costs tumble by up to 30%. Tail latency? Chopped by up to 60%...

Link
@faun shared a link, 2 weeks, 1 day ago

Optimize Gemma 3 Inference: vLLM on GKE 🏎️💨

GKE Autopilot's GPU means business: AI inference tasks don’t stand a chance. Just two arguments and, bam, you’ve unleashed Google's beastly Gemma 3 27B model, which chugs a massive 46.4GB VRAM. ⚡️ Meanwhile, vLLM squeezes the models with bf16 precision, though optimization requires wrestling with algor..

Link
@faun shared a link, 2 weeks, 1 day ago

Kubernetes 1.33 – What you need to know

Kubernetes 1.33 shakes things up with game-changing updates. LIST streaming encoding trims down API Server memory like a chef with a sharp knife. Deliberate deletion orders lock down security tighter than a drum. And get this: in-place updates for Pod resources ditch those annoying restarts! Finally, us..
