Content

Updates and recent posts about Pelagia.
@faun shared a link, 3 weeks, 2 days ago

Becoming a Research Engineer at a Big LLM Lab - 18 Months of Strategic Career Development

To land a role at a big lab like Mistral, mix efficient **tactical** moves (like LeetCode practice) with **strategic** ones, like building a strong portfolio and a solid network. Balance is key; aim to impress and prepare well without overlooking the power of strategy in shaping a successful career...

@faun shared a link, 3 weeks, 2 days ago

Jupyter Agents: training LLMs to reason with notebooks

Hugging Face dropped an open pipeline and dataset for training small models—think **Qwen3-4B**—into sharp **Jupyter-native data science agents**. They pulled curated Kaggle notebooks, whipped up synthetic QA pairs, added lightweight **scaffolding**, and went full fine-tune. Net result? A **36% jump ..

@faun shared a link, 3 weeks, 2 days ago

Building a Natural Language Interface for Apache Pinot with LLM Agents

MiQ plugged **Google’s Agent Development Kit** into their stack to spin up **LLM agents** that turn plain English into clean, validated SQL. These agents speak directly to **Apache Pinot**, firing off real-time queries without the usual parsing pain. Behind the scenes, it’s a slick handoff: NL2SQL ..

@faun shared a link, 3 weeks, 2 days ago

The productivity paradox of AI coding assistants

A July 2025 METR trial dropped a twist: seasoned devs using Cursor with Claude 3.5/3.7 moved **19% slower** - while thinking they were **20% faster**. Chalk it up to AI-induced confidence inflation. Faros AI tracked over **10,000 developers**. More AI didn’t mean more done. It meant more juggling, ..

@faun shared a link, 3 weeks, 2 days ago

Inside NVIDIA GPUs: Anatomy of high performance matmul kernels

NVIDIA Hopper packs serious architectural tricks. At the core: **Tensor Memory Accelerator (TMA)**, **tensor cores**, and **swizzling**—the trio behind async, cache-friendly matmul kernels that flirt with peak throughput. But folks aren't stopping at cuBLAS. They're stacking new tactics: **warp-gro..

@faun shared a link, 3 weeks, 2 days ago

5 Free AI Courses from Hugging Face

Hugging Face just rolled out a sharp set of free AI courses. Real topics, real tools—think **AI agents, LLMs, diffusion models, deep RL**, and more. It’s hands-on from the jump, packed with frameworks like LangGraph, Diffusers, and Stable Baselines3. You don’t just read about models—you build ‘em i..

@faun shared a link, 3 weeks, 2 days ago

Implementing Vector Search from Scratch: A Step-by-Step Tutorial

Search is a fundamental problem in computing, and vector search aims to match meanings rather than exact words. By converting queries and documents into numerical vectors and calculating similarity, vector search retrieves contextually relevant results. In this tutorial, a vector search system is bu..

@faun shared a link, 3 weeks, 2 days ago

Shai-Hulud npm Supply Chain Attack

Malicious npm packages just leveled up: this one dropped a self-spreading worm that hijacks repos and leaks secrets the moment it lands. It abuses `postinstall` scripts to run TruffleHog and swipe tokens straight from your codebase. Then it uses GitHub Actions to exfiltrate the loot and auto-publis..

@faun shared a link, 3 weeks, 2 days ago

Observability for the Invisible: Tracing Message Drops in Kafka Pipelines

When an event drops silently in a distributed system, it is not a bug, it is an architectural blind spot. Detect, debug, and prevent message loss in Kafka-based streaming pipelines using tools like OpenTelemetry, Fluent Bit, Jaeger, and dead-letter queues. Make sure observability gaps in event strea..

@faun shared a link, 3 weeks, 2 days ago

Demystifying Log Retention in Azure

Azure logs come in three flavors: **Activity Logs**, **Diagnostic Logs**, and **Log Analytics**. Each with its own rules for retention and billing. The catch? Those differences aren’t quirks—they’re baked in...

Pelagia is a Kubernetes controller that provides all-in-one management for Ceph clusters installed by Rook. It delivers two main features:

- Aggregates all Rook Custom Resources (CRs) into a single CephDeployment resource, simplifying the management of Ceph clusters.
- Provides automated lifecycle management (LCM) of Rook Ceph OSD nodes for bare-metal clusters. Automated LCM is managed by the special CephOsdRemoveTask resource.

It is designed to simplify the management of Ceph clusters in Kubernetes installed by Rook.

As long-time Rook users, we had dozens of Rook CRs to manage. So one day we decided to create a single resource that would aggregate all Rook CRs and deliver a smoother LCM experience. This is how Pelagia was born.

It supports almost all of the Rook CR APIs, including CephCluster, CephBlockPool, CephFilesystem, CephObjectStore, and others, aggregating them into a single specification. We continuously work on improving Pelagia's API, adding new features, and enhancing existing ones.
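To make the aggregation idea concrete, here is a minimal Python sketch of how one combined specification could be fanned out into per-kind Rook manifests. It is an illustration only, not Pelagia's implementation: the input field names (`cluster`, `blockPools`, `filesystems`, `objectStores`) and the function `expand_deployment` are hypothetical, and the real CephDeployment schema may look quite different. The only detail taken from Rook itself is the `ceph.rook.io/v1` API version and the CR kind names.

```python
from typing import Any

ROOK_API_VERSION = "ceph.rook.io/v1"  # API group/version used by Rook CRs

# Hypothetical mapping from fields of the aggregated spec to Rook CR kinds.
# These field names are assumptions made for this sketch.
LIST_FIELDS = (
    ("blockPools", "CephBlockPool"),
    ("filesystems", "CephFilesystem"),
    ("objectStores", "CephObjectStore"),
)


def expand_deployment(deployment: dict[str, Any]) -> list[dict[str, Any]]:
    """Fan one aggregated spec out into a list of Rook-style manifests."""
    meta = deployment["metadata"]
    spec = deployment["spec"]
    namespace = meta["namespace"]

    def manifest(kind: str, name: str, body: dict[str, Any]) -> dict[str, Any]:
        return {
            "apiVersion": ROOK_API_VERSION,
            "kind": kind,
            "metadata": {"name": name, "namespace": namespace},
            "spec": body,
        }

    # Exactly one CephCluster, named after the aggregated resource.
    manifests = [manifest("CephCluster", meta["name"], spec.get("cluster", {}))]

    # Zero or more CRs of each list-valued kind, in a stable order.
    for field, kind in LIST_FIELDS:
        for item in spec.get(field, []):
            manifests.append(manifest(kind, item["name"], item.get("spec", {})))
    return manifests
```

The point of the sketch is the shape of the problem: an operator edits one document, and a controller derives the many Rook objects from it, so the individual CRs never have to be kept in sync by hand.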

Pelagia collects the Ceph cluster state and the statuses of all Rook CRs into a single CephDeploymentHealth CR. This resource highlights any issues with the Ceph cluster and the Rook APIs.
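The roll-up described above can be pictured with a short Python sketch. It assumes the inputs are already collected: a Ceph health string (`HEALTH_OK`, `HEALTH_WARN`, or `HEALTH_ERR`) and a mapping of CR names to their reported phase. The function `summarize_health` and the output fields `state` and `issues` are hypothetical and are not the actual CephDeploymentHealth schema.

```python
from typing import Any


def summarize_health(ceph_health: str, cr_phases: dict[str, str]) -> dict[str, Any]:
    """Combine Ceph health and per-CR phases into one summary (illustrative)."""
    issues: list[str] = []

    # Anything other than HEALTH_OK is surfaced as an issue.
    if ceph_health != "HEALTH_OK":
        issues.append(f"Ceph cluster reports {ceph_health}")

    # Assumption for this sketch: a healthy CR reports the phase "Ready".
    for name, phase in sorted(cr_phases.items()):
        if phase != "Ready":
            issues.append(f"{name} is in phase {phase}")

    return {"state": "Ok" if not issues else "Failed", "issues": issues}
```

A single summary like this lets an operator check one object instead of walking every Rook resource to find the one that is unhappy.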

Another important feature of Pelagia is the automated lifecycle management of Rook Ceph OSD nodes for bare-metal clusters. This feature is delivered by the CephOsdRemoveTask resource, which automates the process of removing OSD disks and nodes from the cluster. We use this feature in our daily day-2 operations.
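One plausible way to think about automated OSD removal is as a diff between what the specification declares and what the cluster actually runs. The Python sketch below shows only that diffing step, under stated assumptions: the data shapes and the function `plan_osd_removal` are hypothetical, and nothing here is taken from how CephOsdRemoveTask is really implemented.

```python
def plan_osd_removal(
    desired: dict[str, set[str]],
    observed: dict[int, tuple[str, str]],
) -> list[int]:
    """Return IDs of OSDs whose node or device is no longer declared.

    desired:  node name -> set of device names declared in the spec
    observed: OSD id -> (node name, device name) as seen in the cluster
    Both shapes are assumptions made for this illustration.
    """
    stale: list[int] = []
    for osd_id, (node, device) in sorted(observed.items()):
        # A missing node yields an empty set, so all of its OSDs are stale.
        if device not in desired.get(node, set()):
            stale.append(osd_id)
    return stale
```

A real removal workflow would also have to drain data and clean up the disk before deleting anything; the sketch covers only the decision of which OSDs are candidates.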