vLLM

vLLM is a high-performance open-source inference and serving engine for large language models (LLMs), designed to maximize throughput and efficiency through optimized memory management and scheduling.

Updates and recent posts about vLLM

@varbear shared a link, 3 months, 1 week ago

FAUN.dev()

The Story of Wall Street Raider

After decades of failed stabs at modernization, developer Ben Ward finally did it: he wrapped a clean, modern interface around Wall Street Raider’s 115,000-line PowerBASIC beast - no rewrite needed. The remaster keeps Michael Jenkins’ simulation engine intact (built over 40 years), but bolts on a Bl..

@varbear shared a link, 3 months, 1 week ago

Understanding the Go Compiler: The Linker

Go’s linker stitches together object files from each package, wires up symbols across imports, lays out memory, and patches relocations. It strips dead code, merges duplicate data by content hash, and spits out binaries that boot clean - with W^X memory segments and hooks into the runtime...

@varbear shared a link, 3 months, 1 week ago

Why I’m not worried about AI job loss

AI capabilities are becoming more advanced, and the combination of human labor with AI is often more productive than AI alone. Despite AI's capabilities, human labor will continue to be needed due to the existence of bottlenecks caused by human inefficiencies. The demand for goods and services create..

@kaptain shared a link, 3 months, 1 week ago

LLMs on Kubernetes: Same Cluster, Different Threat Model

Running LLMs on Kubernetes opens up a new can of worms - stuff infra hardening won’t catch. You need a policy-smart gateway to vet inputs, lock down tool use, and whitelist models. No shortcuts. This post drops a reference gateway build using mirrord (for fast, in-cluster tinkering) and Cloudsmith (to t..

@kaptain shared a link, 3 months, 1 week ago

The State of Java on Kubernetes 2026: Why Defaults are Killing Your Performance

Akamas just dropped fresh numbers: over 60% of Java apps running on Kubernetes stick with default JVM settings. That means sluggish memory use, GC thrash, and CPUs getting choked out. Even with "container-friendly" Java builds out there, most teams still skip setting GC types or heap sizes. Kubernetes..

@kaptain shared a link, 3 months, 1 week ago

Migrating from Slurm to Kubernetes

SkyPilot drops a clean interface that blends Slurm with Kubernetes. AI/ML teams get to keep their Slurm-style comforts - job scripts, gang scheduling, GPU guarantees, interactive workflows - but pick up Kubernetes perks like container isolation and rich ecosystem hooks. It handles the messy bits: pods,..

@kaptain shared a link, 3 months, 1 week ago

Zero-Downtime Ingress Controller Migration in Kubernetes

Ingress-nginx is heading for the exits - end-of-life drops March 2026. That puts Kubernetes operators on the hook to swap in a new ingress controller. The migration path? Run both old and new in parallel. Use DNS cutover. Point explicitly with Ingress classes. Done right, the switchover hits zero dow..

@kala shared a link, 3 months, 1 week ago

YOLO Mode: Hidden Risks in Claude Code Permissions

A scrape of 18,470 Claude Code configs on GitHub shows a pattern: developers are handing their AI agents the keys to the castle. Unrestricted file, shell, and network access is common. Among them:
- 21.3% let Claude run curl
- 14.5% allow arbitrary Python execution
- 19.7% give it git push privileges

Tha..

@kala shared a link, 3 months, 1 week ago

Adventures in Neural Rendering

A graphics dev took a swing at encoding rendering signals - radiance, irradiance, depth, AO, BRDFs - using tight MLPs in HLSL. They benchmarked size, storage, and runtime cost. Turns out, MLPs beat L2 spherical harmonics for packing radiance. But they stumble on irradiance and specular BRDFs. Bring in R..

@kala shared a link, 3 months, 1 week ago

Why Trying to Secure OpenClaw is Ridiculous

OpenClaw, an open-source autonomous AI agent with full device access, racked up 179K GitHub stars - and walked straight into a security nightmare. It shipped wide open: default ports exposed to the internet, its plugin hub laced with malicious packages. Slapped-on fixes followed, warning labels, Vir..

vLLM is an open-source framework for serving and running large language models efficiently at scale. Developed by researchers and engineers from UC Berkeley and adopted widely across the AI industry, vLLM optimizes inference performance through its PagedAttention mechanism, a memory management system that stores the attention key-value (KV) cache in fixed-size blocks and so achieves near-zero waste of GPU memory. It supports continuous batching of incoming requests and model parallelism (including tensor parallelism) across GPUs, which suits it to real-world deployment of foundation models. vLLM integrates with Hugging Face Transformers models, exposes an OpenAI-compatible API, and works with popular orchestration tools like Ray Serve and Kubernetes. Its design allows developers and enterprises to host LLMs with reduced latency, lower hardware costs, and increased throughput, powering everything from chatbots to enterprise-scale AI services.
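
To make the memory-management idea behind PagedAttention more concrete, here is a toy sketch in Python. It assumes the KV cache is split into fixed-size blocks and that each sequence keeps a table of the blocks it owns, so a new block is claimed only when the previous one is full. The class name PagedKVCache, its methods, and the block size of 16 are hypothetical names and values chosen for illustration; this is not vLLM's API or its actual implementation.

```python
class OutOfBlocks(RuntimeError):
    """Raised when the pool has no free block left."""


class PagedKVCache:
    """Toy block-table allocator (hypothetical; not vLLM's real code)."""

    def __init__(self, num_blocks: int, block_size: int = 16):
        self.block_size = block_size                # tokens per block (illustrative)
        self.free_blocks = list(range(num_blocks))  # ids of unused physical blocks
        self.block_tables = {}                      # seq_id -> list of block ids
        self.seq_lens = {}                          # seq_id -> tokens stored so far

    def append_token(self, seq_id: str) -> None:
        """Reserve room for one more token; allocate a block only when needed."""
        n = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if n == len(table) * self.block_size:       # no block yet, or last one is full
            if not self.free_blocks:
                raise OutOfBlocks(seq_id)
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1

    def free(self, seq_id: str) -> None:
        """Return a finished sequence's blocks to the shared pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)

    def wasted_slots(self) -> int:
        """Allocated but unused token slots (under one block per sequence)."""
        allocated = sum(len(t) for t in self.block_tables.values()) * self.block_size
        return allocated - sum(self.seq_lens.values())


if __name__ == "__main__":
    cache = PagedKVCache(num_blocks=4, block_size=16)
    for _ in range(17):                             # 17 tokens need two 16-token blocks
        cache.append_token("request-a")
    print(cache.block_tables["request-a"], cache.wasted_slots())
```

In this sketch, waste is bounded by the unused tail of each sequence's last block, whereas reserving one contiguous region sized for the maximum possible sequence length would leave most of that region idle for short requests. Blocks released by free() can be reused immediately by other sequences, which is the property that lets a server keep many requests in flight at once.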