
Updates and recent posts about Pelagia.
@faun shared a link, 9 months ago
FAUN.dev()

Why "What Happened First?" Is One of the Hardest Questions in Large-Scale Systems

Logical clocks track event order in distributed systems, with no need for synced wall clocks. Each node keeps a counter. On every event: tick it. On every message: tack on your counter. When you receive one? Merge and bump. This flips the script. Instead of chasing global time, distributed systems lean int.. read more

@faun shared a link, 9 months ago
FAUN.dev()

Paused Kubernetes project finds path forward

The External Secrets Operator (ESO) is moving again. After hitting pause from maintainer burnout, it’s back under CNCF incubation, with a rebooted structure in place. New governance, clear contributor paths, and support tracks for CI, core dev, and testing are all in. But don’t expect fresh releases ju.. read more

@faun shared a link, 9 months ago
FAUN.dev()

The Hidden AWS Cost Traps No One Warns You About (and How I Avoid Them)

Calling out five sneaky AWS cost traps, the kind that creep in through overlooked defaults and quiet misconfigs, then blow up your bill while no one's watching... read more

@faun shared a link, 9 months ago
FAUN.dev()

Easy will always trump simple

Rich Hickey’s classic “Simple Made Easy” talk is making the rounds again, as a mirror held up to dev culture under pressure. The punchline: we keep picking solutions that are easy but tangled, instead of simple and sane. The essay draws a sharp line between that habit and a concept from biology: exapt.. read more

@faun shared a link, 9 months ago
FAUN.dev()

Subverting code integrity checks to locally backdoor Signal, 1Password, Slack, and more

A fresh CVE (2025-55305) just put Electron apps in the hot seat. The bug? Chromium-based apps fail to treat V8 heap snapshot files as potential attack vectors. That crack lets unsigned JavaScript slip past code signing and run inside heavyweight targets like Slack, 1Password, and Signal. The heart of.. read more

@faun shared a link, 9 months ago
FAUN.dev()

Pooling Connections with RDS Proxy at Klaviyo

Klaviyo replaced ProxySQL on EC2 and moved to AWS RDS Proxy. Why? Less overhead. Simpler failovers. Smarter pooling. RDS Proxy handles multiplexing, packing thousands of client queries into way fewer DB connections. IAM access and built-in failover routing sweeten the deal... read more

@faun shared a link, 9 months ago
FAUN.dev()

Kubernetes VPA: Limitations, Best Practices, and the Future of Pod Rightsizing

Kubernetes' Vertical Pod Autoscaler (VPA) tries to be helpful by tweaking CPU and memory requests on the fly. Problem is, it needs to bounce your pods to do it. And if you're also running Horizontal Pod Autoscaler (HPA) on the same metrics? Now they're fighting over control. VPA sees a narrow slice of .. read more

@faun shared a link, 9 months ago
FAUN.dev()

Kubernetes DNS Exploit Enables Git Credential Theft from ArgoCD

A new attack chain messes with Kubernetes DNS resolution and ArgoCD’s certificate injection to swipe GitHub credentials. With the right permissions, a user inside the cluster can reroute GitOps traffic to a fake internal service, sniff auth headers, and quietly walk off with tokens. What’s broken: GitOp.. read more

@faun shared a link, 9 months ago
FAUN.dev()

Dynamic Kubernetes request right sizing with Kubecost

Kubecost’s Amazon EKS add-on now handles automated container request right-sizing. That means teams can tweak CPU and memory requests based on actual usage, either once or on a recurring schedule. Optimization profiles are customizable, and resizing can be baked into cluster setup using Helm. Yes, that mean.. read more

@faun shared a link, 9 months ago
FAUN.dev()

The Quiet Revolution in Kubernetes Security

Nigel Douglas discusses the challenges of security in Kubernetes, particularly with traditional base operating systems. Talos Linux offers a different approach with a secure-by-default, API-driven model specifically for Kubernetes. CISOs play a critical role in guiding organizations through the shif.. read more  

Pelagia is a Kubernetes controller that provides all-in-one management for Ceph clusters installed by Rook. It delivers two main features:

Aggregates all Rook Custom Resources (CRs) into a single CephDeployment resource, simplifying the management of Ceph clusters.
Provides automated lifecycle management (LCM) of Rook Ceph OSD nodes for bare-metal clusters. Automated LCM is managed by the special CephOsdRemoveTask resource.

It is designed to simplify the management of Ceph clusters in Kubernetes installed by Rook.

As heavy Rook users, we had dozens of Rook CRs to manage. So we decided to create a single resource that would aggregate all Rook CRs and deliver a smoother LCM experience. This is how Pelagia was born.

Pelagia supports almost all of the Rook CR APIs, including CephCluster, CephBlockPool, CephFilesystem, CephObjectStore, and others, aggregating them into a single specification. We continuously work on improving Pelagia's API, adding new features, and enhancing existing ones.
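
To make the aggregation idea concrete, here is a minimal Python sketch that fans one aggregated specification out into separate Rook manifests. The `ceph.rook.io/v1` API group and the kind names come from Rook; the layout of the aggregated spec (`cluster`, `blockPools`, `filesystems`, `objectStores`) and the `expand_deployment` helper are hypothetical, chosen only for illustration, and do not reflect Pelagia's actual CephDeployment schema.

```python
ROOK_API_VERSION = "ceph.rook.io/v1"

# Hypothetical mapping: section of the aggregated spec -> Rook kind.
SECTION_TO_KIND = {
    "blockPools": "CephBlockPool",
    "filesystems": "CephFilesystem",
    "objectStores": "CephObjectStore",
}


def _manifest(kind, name, namespace, spec):
    """Build one Kubernetes-style manifest as a plain dict."""
    return {
        "apiVersion": ROOK_API_VERSION,
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


def expand_deployment(deployment):
    """Split one aggregated spec (hypothetical shape) into Rook CR manifests."""
    namespace = deployment["namespace"]
    manifests = [
        _manifest("CephCluster", deployment["name"], namespace, deployment["cluster"])
    ]
    for section, kind in SECTION_TO_KIND.items():
        for item in deployment.get(section, []):
            manifests.append(_manifest(kind, item["name"], namespace, item["spec"]))
    return manifests


# Illustrative input only; the values are not taken from a real cluster.
example = {
    "name": "ceph",
    "namespace": "rook-ceph",
    "cluster": {"mon": {"count": 3}},
    "blockPools": [{"name": "replicapool", "spec": {"replicated": {"size": 3}}}],
    "filesystems": [{"name": "myfs", "spec": {}}],
}

kinds = [m["kind"] for m in expand_deployment(example)]
print(kinds)  # ['CephCluster', 'CephBlockPool', 'CephFilesystem']
```

The point of the sketch is the direction of the data flow: the operator edits one document, and the controller derives the individual Rook resources from it.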

Pelagia collects the Ceph cluster state and the statuses of all Rook CRs into a single CephDeploymentHealth CR. This resource highlights any issues with the Ceph cluster or the Rook APIs.
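
As a rough illustration of that kind of roll-up, the Python sketch below collapses a Ceph health string and a set of per-resource phases into one report. `HEALTH_OK` is Ceph's own healthy status; the `summarize_health` function, the report fields (`state`, `issues`), and the rule that only the `Ready` phase counts as healthy are assumptions made for this example, not Pelagia's CephDeploymentHealth format.

```python
def summarize_health(ceph_health, cr_phases):
    """Collapse the Ceph health string and per-CR phases into one report.

    `cr_phases` maps "Kind/name" to a phase string. Treating "Ready" as the
    only healthy phase is an assumption made for this sketch.
    """
    issues = []
    if ceph_health != "HEALTH_OK":
        issues.append(f"ceph cluster reports {ceph_health}")
    # Sort so the report is stable regardless of input ordering.
    for resource, phase in sorted(cr_phases.items()):
        if phase != "Ready":
            issues.append(f"{resource} is in phase {phase}")
    return {"state": "Ok" if not issues else "Failed", "issues": issues}


# Illustrative input only.
report = summarize_health(
    "HEALTH_WARN",
    {"CephBlockPool/replicapool": "Ready", "CephFilesystem/myfs": "Failure"},
)
print(report["state"])
for issue in report["issues"]:
    print("-", issue)
```

A single report like this lets an operator check one object instead of inspecting each Rook resource in turn.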

Another important feature of Pelagia is the automated lifecycle management of Rook Ceph OSD nodes for bare-metal clusters. This feature is delivered by the CephOsdRemoveTask resource, which automates the process of removing OSD disks and nodes from the cluster. We use it routinely in our day-2 operations.