
Updates and recent posts about Pelagia.
@faun shared a link, 1 month, 4 weeks ago

AWS, Microsoft and Google unite behind Linux Foundation DocumentDB database to cut enterprise costs and limit vendor lock-in

Document databases are crucial for AI apps in the gen AI era. Microsoft's open-source DocumentDB project, based on PostgreSQL, is moving to the Linux Foundation, offering a vendor-neutral, open-source alternative to MongoDB. DocumentDB's compatibility with MongoDB drivers and open source governance ..

@faun shared a link, 1 month, 4 weeks ago

Measuring Developer Productivity with Amazon Q Developer and Jellyfish

Amazon Q Developer now plugs into Jellyfish. Teams get a clearer view of how AI fits into the real flow of work—prompt usage, code adoption, PR throughput. Not just surface stats. The setup pipes data from AWS S3 straight into Jellyfish’s analytics engine. It tags AI users, tracks velocity gains, an..

@faun shared a link, 1 month, 4 weeks ago

Which LLM writes the best analytical SQL?

Tinybird threw 19 top LLMs at a 200M-row GitHub dataset, testing how well they could turn plain English into solid SQL. Most models kept their syntax clean—but when it came to writing SQL that actually ran well and returned the right results, they lagged behind human pros. Messy schemas or tricky pr..

@faun shared a link, 1 month, 4 weeks ago

Being on the Same Page During an Incident: Not Actually Telepathy

Collaboration in incident response is crucial for effective resolution, starting with establishing a basic compact among responders. Grounding is a process that ensures alignment and common ground is maintained throughout an incident, encompassing initial common ground, public events so far, and the..

@faun shared a link, 1 month, 4 weeks ago

Container Logs in Kubernetes: How to View and Collect Them

This guide shows how to wrangle container logs in Kubernetes using kubectl, shell tools, structured logging, and the Kubernetes Dashboard. It covers the basics and dives into how to scale up log collection and make observability less painful across clusters...

@faun shared a link, 1 month, 4 weeks ago

v1.34: DRA has graduated to GA

Kubernetes 1.34 turns Dynamic Resource Allocation (DRA) loose into General Availability, enabled by default. That cements native support for high-maintenance gear like GPUs, FPGAs, and any other quirky hardware your workloads need. The release also packs a fresh mix of alpha/beta features: tighter admi..

@faun shared a link, 1 month, 4 weeks ago

Building a Scalable, Flexible, Cloud-Native GenAI Platform with Open Source Solutions

A fresh reference architecture built with Envoy AI Gateway and KServe brings order to the GenAI chaos. One clean interface to route requests across internal and external LLMs, locked down with policies. It’s called a Two-Tier Gateway Architecture. Think of it like a split-brain: external API traffic goes..

@faun shared a link, 1 month, 4 weeks ago

v1.34: Introducing CPU Manager Static Policy Option for Uncore Cache Alignment

Kubernetes 1.34 bumps the CPU Manager uncore-cache alignment policy to beta. It’s aimed at nodes with split uncore cache architectures. The policy groups all a container’s CPUs under the same uncore cache, cutting latency and easing contention for workloads that hate waiting. System shift: Kubernetes kee..

@faun shared a link, 1 month, 4 weeks ago

v1.34: Service Account Token Integration for Image Pulls Graduates to Beta

Kubernetes v1.34 bumps ServiceAccount token integration for Kubelet Credential Providers to beta. That means image pulls can now ditch long-lived secrets for workload-scoped tokens. Cleaner, safer, and more locked down per ServiceAccount...

@faun shared a link, 1 month, 4 weeks ago

v1.34: Pod Replacement Policy for Jobs Goes GA

The Pod replacement policy in Kubernetes v1.34 just hit GA. Jobs can now hold off on spinning up new Pods until the old ones are fully gone. No more duplicates per index. No more blowing through quotas or stalling schedulers, a big win for workloads like ML training. System shift: This rewires how Jobs hand..

Pelagia is a Kubernetes controller that provides all-in-one management for Ceph clusters installed by Rook. It delivers two main features:

Aggregates all Rook Custom Resources (CRs) into a single CephDeployment resource, simplifying the management of Ceph clusters.
Provides automated lifecycle management (LCM) of Rook Ceph OSD nodes for bare-metal clusters. Automated LCM is managed by the special CephOsdRemoveTask resource.

It is designed to simplify the management of Rook-installed Ceph clusters in Kubernetes.

As committed Rook users, we had dozens of Rook CRs to manage. So one day we decided to create a single resource that would aggregate all of them and deliver a smoother LCM experience. This is how Pelagia was born.

It supports almost the entire Rook CR API, including CephCluster, CephBlockPool, CephFilesystem, CephObjectStore, and others, aggregating them into a single specification. We continuously work on improving Pelagia's API, adding new features, and enhancing existing ones.
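To make the aggregation idea concrete, here is a minimal Python sketch of fanning one combined specification out into per-kind Rook-style manifests. The layout of `aggregated`, the `SECTION_TO_KIND` mapping, and the `fan_out` helper are all hypothetical and are not Pelagia's actual CephDeployment schema; only the Rook kind names and the `ceph.rook.io/v1` API group are taken from Rook itself.

```python
# Illustrative only: the "aggregated" layout is an assumption,
# not the real CephDeployment schema.
ROOK_API_VERSION = "ceph.rook.io/v1"

# Hypothetical mapping from sections of the combined spec to Rook kinds.
SECTION_TO_KIND = {
    "pools": "CephBlockPool",
    "filesystems": "CephFilesystem",
    "objectStores": "CephObjectStore",
}


def fan_out(aggregated, namespace="rook-ceph"):
    """Expand one combined spec into a list of per-kind Rook-style manifests."""
    manifests = []
    for section, kind in SECTION_TO_KIND.items():
        for item in aggregated.get(section, []):
            manifests.append({
                "apiVersion": ROOK_API_VERSION,
                "kind": kind,
                "metadata": {"name": item["name"], "namespace": namespace},
                "spec": item.get("spec", {}),
            })
    return manifests


# A made-up combined spec with one block pool and one filesystem.
aggregated = {
    "pools": [{"name": "kubernetes", "spec": {"replicated": {"size": 3}}}],
    "filesystems": [{"name": "cephfs", "spec": {}}],
}

manifests = fan_out(aggregated)
for m in manifests:
    print(m["kind"], m["metadata"]["name"])
```

The point of the sketch is the shape of the problem: one document in, many resources out, so an operator edits a single place instead of dozens of CRs.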

Pelagia collects the Ceph cluster state and the statuses of all Rook CRs into a single CephDeploymentHealth CR. This resource highlights any issues with the Ceph cluster and Rook APIs.
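A health roll-up of this kind could look roughly like the Python sketch below. It assumes the Ceph health string (`HEALTH_OK`, `HEALTH_WARN`, `HEALTH_ERR`) and a per-CR phase have already been read from the cluster; the `summarize_health` function, its input shape, and the `Ok`/`Failed` states are hypothetical and do not reflect the real CephDeploymentHealth fields.

```python
def summarize_health(ceph_health, cr_phases):
    """Collect every non-healthy signal into one summary dict.

    ceph_health: Ceph's overall health string, e.g. "HEALTH_OK".
    cr_phases:   hypothetical mapping of "Kind/name" -> phase string,
                 where "Ready" is treated as healthy (an assumption).
    """
    issues = []
    if ceph_health != "HEALTH_OK":
        issues.append(f"ceph cluster reports {ceph_health}")
    # Sort so the issue list is stable regardless of input order.
    for name, phase in sorted(cr_phases.items()):
        if phase != "Ready":
            issues.append(f"{name} is in phase {phase}")
    return {"state": "Ok" if not issues else "Failed", "issues": issues}


summary = summarize_health(
    "HEALTH_WARN",
    {"CephBlockPool/kubernetes": "Ready", "CephFilesystem/cephfs": "Failure"},
)
print(summary["state"])
for issue in summary["issues"]:
    print("-", issue)
```

Keeping the issues as a flat list means a single `kubectl get` on one resource is enough to see what is wrong, which is the convenience the aggregated health resource is described as providing.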

Another important feature of Pelagia is the automated lifecycle management of Rook Ceph OSD nodes for bare-metal clusters. It is delivered by the CephOsdRemoveTask resource, which automates the process of removing OSD disks and nodes from the cluster. We use this feature routinely in our day-2 operations.