Updates and recent posts about Slurm.
Link
@kaptain shared a link, 5 months, 3 weeks ago
FAUN.dev()

Kubernetes Configuration Good Practices

Stripped down and sharp, the blog lays out Kubernetes config best practices: keep YAML manifests in version control, use Deployments (not raw Pods), and label like you mean it - semantically, not just alphabet soup. It digs into sneaky pain points too, like how YAML mangles booleans (yes≠true), and .. read more  

Link
@kala shared a link, 5 months, 3 weeks ago
FAUN.dev()

How I Built a 100% Offline “Second Brain” for Engineering Docs using Docker & Llama 3 (No OpenAI)

A Senior Automation Engineer built an offline RAG system for technical documents using Ollama, Llama 3, and ChromaDB in a Dockerized microservices architecture. The system enables efficient retrieval and generation of information from PDFs with a streamlined UI. The deployment package, including compl.. read more

Link
@kala shared a link, 5 months, 3 weeks ago
FAUN.dev()

How to Evaluate LLMs Without Opening Your Wallet

A new mock-based framework lets QA and automation folks stress-test LLM outputs - no API calls, no surprise charges. It runs entirely local, using pytest fixtures, structured test flows, and JSON schema checks to keep things tight. Test logic stays modular. Cross-validation’s baked in. And if you nee.. read more

Link
@kala shared a link, 5 months, 3 weeks ago
FAUN.dev()

I tested ChatGPT’s backend API using RENTGEN, and found more issues than expected

A closer look at OpenAI’s API uncovers some shaky ground: misconfigured CORS headers, missing X-Frame-Options, no input validation, and borked HTTP status handling. Large uploads? Boom.. crash! CORS preflight requests? Straight-up denied. So much for smooth browser support... read more

Link
@kala shared a link, 5 months, 3 weeks ago
FAUN.dev()

Writing a good CLAUDE.md

Anthropic’s Claude Code now deprioritizes parts of the root context file it sees as irrelevant. It still reads the file every session, but won’t waste cycles on side quests. The message to devs: stop stuffing it with catch-all instructions. Instead, use modular context that unfolds as needed - think.. read more  

Link
@kala shared a link, 5 months, 3 weeks ago
FAUN.dev()

1,500+ PRs Later: Spotify’s Journey with Our Background Coding Agent

Spotify just gave its internal Fleet Management tooling a serious brain upgrade. They've wired in AI coding agents that now handle source-to-source transformations across repos - automatically. So far? Over 1,500 AI-generated PRs pushed. Not just lint fixes - these include heavy-duty migrations. They'.. read more

Link
@kala shared a link, 5 months, 3 weeks ago
FAUN.dev()

AI and QE: Patterns and Anti-Patterns

The author shared insights on how AI can be leveraged as a QE and highlighted potential dangers to watch out for, drawing parallels with misuse of positive behaviors or characteristics taken out of context. The post outlined anti-patterns related to automating tasks, stimulating thinking, and tailor.. read more  

Link
@kala shared a link, 5 months, 3 weeks ago
FAUN.dev()

Cato CTRL™ Threat Research: HashJack - Novel Indirect Prompt Injection Against AI Browser Assistants

A new attack method - HashJack - shows how AI browsers can be tricked with nothing more than a URL fragment. It works like this: drop malicious instructions after the # in a link, and AI copilots like Comet, Copilot for Edge, and Gemini for Chrome might swallow them whole. No need to hack the site. The LLM .. read more

Link
@devopslinks shared a link, 5 months, 3 weeks ago
FAUN.dev()

How when AWS was down, we were not

During the AWS us-east-1 meltdown - when DynamoDB, IAM, and other key services went dark - Authress kept the lights on. Their trick? A ruthless edge-first, multi-region setup built for failure. They didn’t hope DNS would save them. They wired in automated failover, rolled their own health checks, an.. read more  

Link
@devopslinks shared a link, 5 months, 3 weeks ago
FAUN.dev()

Collaborating with Terraform: How Teams Can Work Together Without Breaking Things

Teams working with Terraform commonly run into state locking, version mismatches, untracked local applies, and a lack of transparency. Atlantis is an open-source tool that can help streamline collaboration by automatically running Terraform commands based on GitHub .. read more

Slurm Workload Manager is an open-source, fault-tolerant, and highly scalable cluster management and scheduling system widely used in high-performance computing (HPC). Designed to operate without kernel modifications, Slurm coordinates thousands of compute nodes by allocating resources, launching and monitoring jobs, and managing contention through its flexible scheduling queue.

At its core, Slurm uses a centralized controller (slurmctld) to track cluster state and assign work, while lightweight daemons (slurmd) on each node execute tasks and communicate hierarchically for fault tolerance. Optional components like slurmdbd and slurmrestd extend Slurm with accounting and REST APIs. A rich set of commands—such as srun, squeue, scancel, and sinfo—gives users and administrators full visibility and control.
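As a rough illustration of the visibility these commands give, the sketch below asks `squeue` for pipe-separated output and parses it into records. The `Job` type, the helper names, and the sample rows are hypothetical and made up for this example; only the `squeue` options (`--noheader`, `--format` with `%i`, `%P`, `%j`, `%u`, `%T`) come from Slurm itself. It assumes job names do not contain a `|` character.

```python
import subprocess
from typing import NamedTuple


class Job(NamedTuple):
    """One row of squeue output (hypothetical helper type)."""
    job_id: str
    partition: str
    name: str
    user: str
    state: str


# squeue format string: job ID, partition, job name, user, state.
SQUEUE_FORMAT = "%i|%P|%j|%u|%T"


def parse_squeue(text: str) -> list[Job]:
    """Parse pipe-separated, header-less squeue output into Job records."""
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split("|")
        if len(fields) != 5:
            continue  # skip lines that do not match the expected layout
        jobs.append(Job(*fields))
    return jobs


def list_jobs() -> list[Job]:
    """Run squeue (requires a machine with Slurm installed) and parse it."""
    result = subprocess.run(
        ["squeue", "--noheader", f"--format={SQUEUE_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return parse_squeue(result.stdout)


# Hypothetical sample output, invented for illustration only.
SAMPLE = """\
1001|gpu|train_model|alice|RUNNING
1002|batch|preprocess|bob|PENDING
"""

if __name__ == "__main__":
    for job in parse_squeue(SAMPLE):
        print(job.job_id, job.partition, job.state)
```

Keeping the parser separate from the `subprocess` call means the parsing logic can be exercised on a laptop without a Slurm installation.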

Slurm’s modular plugin architecture supports nearly every aspect of cluster operation, including authentication, MPI integration, container runtimes, resource limits, energy accounting, topology-aware scheduling, preemption, and GPU management via Generic Resources (GRES). Nodes are organized into partitions, enabling sophisticated policies for job size, priority, fairness, oversubscription, reservation, and resource exclusivity.
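To make partitions and GRES a little more concrete, here is a minimal sketch that renders a batch script requesting a partition, a node count, a time limit, and GPUs through `--gres`. The function name, its parameters, and the example values (partition `gpu`, command `python train.py`) are assumptions for illustration; the partitions and GRES types actually available depend on how a given cluster is configured.

```python
def render_batch_script(
    job_name: str,
    partition: str,
    nodes: int,
    gpus_per_node: int,
    time_limit: str,
    command: str,
) -> str:
    """Build the text of a Slurm batch script (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus_per_node > 0:
        # GRES requests are counted per node.
        lines.append(f"#SBATCH --gres=gpu:{gpus_per_node}")
    lines.append("")
    # srun launches the command as a job step inside the allocation.
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are made up; adjust them to the target cluster.
    print(render_batch_script("demo", "gpu", 2, 4, "01:00:00", "python train.py"))
```

The resulting text could be written to a file and submitted with `sbatch`; whether the request is accepted depends on the limits the partition enforces.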

Widely adopted across academia, research labs, and enterprise HPC environments, Slurm serves as the backbone for many of the world’s top supercomputers, offering a battle-tested, flexible, and highly configurable framework for large-scale distributed computing.