Unsloth

Unsloth is an open-source library for fine-tuning large language models faster and with far less memory. It patches the training stack with optimized Triton kernels and a manual backprop path, so you…

Content

Updates and recent posts about Unsloth.

Posts
Description

Link

@kaptain shared a link, 1 month, 1 week ago

FAUN.dev()

How Cloud Native Infrastructure Powers AI on Kubernetes

A vendor piece from Mirantis arguing that GPU multi-tenancy on Kubernetes is widely misrepresented, with most platforms shipping namespace-based isolation while production GPU clouds require hardware-enforced separation through MIG partitioning, cluster-per-tenant architecture, and DPU-based network.. read more

Link

@kaptain shared a link, 1 month, 1 week ago

FAUN.dev()

v1.36: Moving Volume Group Snapshots to GA

Volume group snapshots reached GA in Kubernetes v1.36, with the API promoted to groupsnapshot.storage.k8s.io/v1. The feature lets a VolumeGroupSnapshot object take crash-consistent snapshots across multiple PVCs selected by label, removing the need to quiesce applications that span separate data and log v.. read more

Link

@kaptain shared a link, 1 month, 1 week ago

FAUN.dev()

v1.36: Declarative Validation Graduates to GA

Declarative validation graduated to GA in Kubernetes v1.36, replacing handwritten Go validation with +k8s: marker tags on field definitions... read more

Link

@kaptain shared a link, 1 month, 1 week ago

FAUN.dev()

v1.36: Server-Side Sharded List and Watch

Alpha in v1.36, server-side sharded list and watch adds a shardSelector field to ListOptions so the API server uses an FNV-1a hash on metadata.uid or metadata.namespace to send each controller replica only its slice of the resource collection. This eliminates the cost of every replica deserializing the full .. read more

Link

@kala shared a link, 1 month, 1 week ago

FAUN.dev()

Orchestrating AI Code Review at scale

Cloudflare engineers built an AI code review platform on OpenCode. They split GitLab integration, model providers, prompts, and policy into separate plugins. A coordinator assigns up to seven domain reviewers across security, performance, code quality, documentation, release checks, and AGENTS.md co.. read more

Link

@kala shared a link, 1 month, 1 week ago

FAUN.dev()

How We Built an AI Second Brain for 60K Knowledge Workers

Meta built an AI agent system internally called the AI Second Brain that now has over 63,000 installs and ~10,000 daily active users across engineering, PM, design, legal, finance, comms, and sales, growing from zero in roughly three months after a non-technical PM's adoption post. The architecture .. read more

Link

@kala shared a link, 1 month, 1 week ago

FAUN.dev()

Democratizing Machine Learning at Netflix: Building the Model Lifecycle Graph

Netflix's Saish Sali, Nipun Kumar, and Sura Elamurugu describe the Metadata Service (MDS), a graph layer built to connect siloed ML tooling (model registry, pipeline orchestrator, experimentation platform, feature store, dataset platform, identity) across personalization, studio, payments, and ads. .. read more

Link

@kala shared a link, 1 month, 1 week ago

FAUN.dev()

The AWS MCP Server is now generally available

AWS now offers AWS MCP Server as a managed remote MCP server in US East (N. Virginia) and Europe (Frankfurt). MCP-compatible clients can use existing IAM credentials to access more than 15,000 AWS API operations. For GA, AWS added IAM context keys, documentation retrieval without authentication, low.. read more

Link

@kala shared a link, 1 month, 1 week ago

FAUN.dev()

Running local models on an M4 with 24GB memory

Local LLMs work best as supervised coding assistants. The writer ran Qwen 3.5 9B (Q4) in LM Studio on a 24GB MacBook Pro and got about 40 tokens per second, with thinking mode, tool use, and a 128K context window. The author saw mixed results: Qwen helped with simple Elixir linter edits, then failed.. read more

Link

@devopslinks shared a link, 1 month, 1 week ago

FAUN.dev()

S3 Files and the changing face of S3

AWS launched S3 Files, an EFS-backed feature that mounts any S3 bucket or prefix as an NFS filesystem on EC2, containers, or Lambda, with changes batched back to S3 roughly every 60 seconds. Rather than collapsing file and object semantics into a single model (an early design attempt called "EFS3" th.. read more

Unsloth is an open-source toolkit for training and fine-tuning large language models faster and with less memory than a standard Hugging Face stack. Its core library replaces PyTorch's default autograd with custom backpropagation kernels written in OpenAI's Triton language, which is where most of its speed and memory savings come from. It supports LoRA, QLoRA, full fine-tuning, reinforcement learning, pretraining, and 4-bit, 16-bit, and FP8 training, across more than 500 text, vision, audio, and embedding models.
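
To make the idea of a hand-written backward pass concrete, here is a minimal sketch in plain Python. It is not Unsloth's code and involves no Triton: it takes the SiLU activation as a stand-in operation, derives its gradient by hand instead of relying on autograd, and checks the result against a finite-difference estimate. The function names are hypothetical and chosen only for this illustration.

```python
import math


def silu_forward(x: float) -> float:
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def silu_backward(x: float, grad_out: float) -> float:
    """Hand-derived gradient of SiLU.

    With s = sigmoid(x): d/dx [x * s] = s + x * s * (1 - s) = s * (1 + x * (1 - s)).
    """
    s = 1.0 / (1.0 + math.exp(-x))
    return grad_out * s * (1.0 + x * (1.0 - s))


def numeric_grad(f, x: float, eps: float = 1e-6) -> float:
    """Central finite difference, used only to check the manual gradient."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def max_grad_error(points) -> float:
    """Largest gap between the manual and the numeric gradient over the given points."""
    return max(
        abs(silu_backward(x, 1.0) - numeric_grad(silu_forward, x)) for x in points
    )


if __name__ == "__main__":
    print(max_grad_error([-3.0, -1.0, 0.0, 0.5, 2.0, 4.0]))
```

Writing the backward pass by hand lets an implementation decide what to recompute and what to keep in memory, which is one plausible reason a custom path can save both time and VRAM compared with a generic autograd graph.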

The practical draw is hardware reach. QLoRA workflows in Unsloth let you fine-tune an 8B model on a single 12 GB consumer GPU, and the project headlines roughly 2x faster training with about 70 percent less VRAM versus baseline implementations, though the exact figures vary by model, GPU, and config. A 2026 update added faster mixture-of-experts training, with models like Qwen3-30B-A3B fine-tunable on about 17.5 GB of VRAM. It runs on NVIDIA (including Blackwell and DGX Spark), AMD, and Intel GPUs, with free Colab and Kaggle notebooks for trying it without local hardware.
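
A rough back-of-the-envelope calculation shows why 4-bit quantization matters for a 12 GB card. The sketch below counts only the memory for the weights themselves and the number of trainable parameters LoRA adds to one matrix. It ignores quantization constants, activations, gradients, and optimizer state, and the 4096 x 4096 matrix size and rank 16 are assumptions picked for illustration, not figures from the Unsloth project.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank).
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    n = 8e9  # an 8B-parameter model
    print(f"16-bit weights: {weight_memory_gb(n, 16):.1f} GB")
    print(f" 4-bit weights: {weight_memory_gb(n, 4):.1f} GB")

    # Hypothetical layer shape and rank, for illustration only.
    full = 4096 * 4096
    adapter = lora_params(4096, 4096, 16)
    print(f"LoRA trains {adapter} of {full} parameters in that matrix")
```

Under these assumptions the 16-bit weights alone (16 GB) would not fit on a 12 GB GPU, while the 4-bit weights (4 GB) leave room for adapters, activations, and optimizer state.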

Unsloth fits cleanly into the local-AI workflow. It integrates with Hugging Face transformers and TRL, and uses llama.cpp to save and run models, exporting to GGUF for Ollama or LM Studio as well as to safetensors. As of 2026 it also ships Unsloth Studio, a local no-code GUI that covers the full lifecycle from dataset creation to training to running and comparing GGUF and safetensors models. Studio includes tool-calling, web search, and an OpenAI-compatible API, and runs offline on Mac and Windows. The core library is licensed under Apache 2.0.
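
As a small illustration of the GGUF-to-Ollama handoff, the sketch below renders an Ollama Modelfile that points at an exported GGUF file. The helper name, the file path, and the default temperature are hypothetical. Only the `FROM`, `PARAMETER`, and `SYSTEM` instructions are taken from Ollama's Modelfile format, and nothing here calls Unsloth itself.

```python
def render_modelfile(
    gguf_path: str, temperature: float = 0.7, system_prompt: str = ""
) -> str:
    """Build the text of an Ollama Modelfile for a local GGUF file."""
    lines = [
        f"FROM {gguf_path}",  # path to the exported GGUF weights
        f"PARAMETER temperature {temperature}",  # sampling temperature
    ]
    if system_prompt:
        # Triple quotes let the system prompt span several lines.
        lines.append(f'SYSTEM """{system_prompt}"""')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "./model.gguf" is a placeholder path, not a file Unsloth is known to produce.
    print(render_modelfile("./model.gguf", system_prompt="You are a concise assistant."))
```

Saving the output as `Modelfile` and running `ollama create my-model -f Modelfile` would then register the model with a local Ollama install, where `my-model` is any name you choose.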