Posts from @madhan2409..
Story
@squadcast shared a post, 1 year, 3 months ago

Best 19 Observability tools for DevOps Engineers and SREs

The blog provides a comprehensive overview of the best observability tools for DevOps engineers and SREs. It covers a wide range of tools, including log aggregation, application performance monitoring (APM), distributed tracing, time series databases, and metrics collection. The blog also offers guidance on choosing the right tools based on your specific needs and deployment model. By leveraging these tools, you can gain valuable insights into your system's performance, identify and resolve issues quickly, and optimize your operations for maximum efficiency.

Story
@adammetis shared a post, 1 year, 3 months ago
DevRel, Metis

What Are Logging Levels

Logging is one of the most important parts of distributed systems. Many things can break, but when logging breaks, we are completely lost. In this blog post, we will look at log levels and how to log efficiently in distributed systems.
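
As a rough illustration of what a log level does, here is a minimal sketch using Python's standard `logging` module. The logger name `orders` and the messages are made up for the example and do not come from the linked post. A logger set to `WARNING` drops `DEBUG` and `INFO` records and keeps everything at `WARNING` or above.

```python
import io
import logging


def build_logger(name: str, level: int, stream: io.StringIO) -> logging.Logger:
    """Return a logger that writes '<LEVEL> <message>' lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)        # records below this level are discarded
    logger.propagate = False      # keep output out of the root logger
    logger.handlers.clear()       # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger("orders", logging.WARNING, buffer)  # hypothetical service name

log.debug("cache miss for key=42")     # dropped: DEBUG < WARNING
log.info("order accepted")             # dropped: INFO < WARNING
log.warning("retrying payment call")   # kept
log.error("payment call failed")       # kept

emitted = buffer.getvalue().splitlines()
print(emitted)
```

Lowering the threshold to `logging.DEBUG` would keep all four lines, which is the usual trade-off between detail and log volume.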

Story
@squadcast shared a post, 1 year, 3 months ago

On-Call Rotations: A Guide to Efficient Incident Response

The blog provides a comprehensive guide to on-call rotations, which are essential for ensuring service reliability and availability. It covers key aspects such as scheduling, handover procedures, escalation plans, and team training.

Key Points:

Scheduling: Effective on-call rotations require careful scheduling to distribute workload fairly and accommodate personal time off.

Handover Procedures: Clear procedures for transferring information between on-call engineers are crucial for smooth transitions.

Escalation Plans: Defining a clear escalation chain helps ensure that incidents are handled efficiently, regardless of complexity.

Pager Duty Optimization: Minimizing unnecessary pages is essential for reducing alert fatigue and improving response times.

Runbook Maintenance: Up-to-date runbooks provide step-by-step instructions for common troubleshooting tasks, saving time and effort.

Change Management: Integrating on-call processes with change management workflows helps prevent disruptions caused by deployments.

Training and Documentation: Comprehensive training and documentation ensure that engineers have the necessary knowledge and skills to handle on-call responsibilities effectively.

By following these best practices, organizations can establish efficient on-call rotations that contribute to overall service reliability and team effectiveness.
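
The scheduling and escalation points above can be sketched in a few lines. This is only an illustration under simple assumptions: a weekly round-robin rotation and an escalation order that follows the roster. The engineer names, the start date, and the helper names `weekly_rotation` and `escalation_chain` are hypothetical and are not taken from the linked post.

```python
from datetime import date, timedelta
from itertools import cycle


def weekly_rotation(engineers: list[str], start: date, weeks: int) -> list[tuple[date, str]]:
    """Assign one primary on-call engineer per week, in round-robin order."""
    rota = cycle(engineers)
    return [(start + timedelta(weeks=i), next(rota)) for i in range(weeks)]


def escalation_chain(engineers: list[str], primary: str) -> list[str]:
    """Return the page order: the primary first, then the rest of the roster in order."""
    i = engineers.index(primary)
    return engineers[i:] + engineers[:i]


team = ["ana", "ben", "chi"]  # hypothetical roster
schedule = weekly_rotation(team, date(2024, 8, 19), weeks=4)
chain = escalation_chain(team, "ben")

for week_start, engineer in schedule:
    print(week_start.isoformat(), engineer)
print("escalation when ben is primary:", chain)
```

A real schedule would also need to handle time off and shift swaps, which this sketch leaves out.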

Story
@laura_garcia shared a post, 1 year, 3 months ago
Software Developer, RELIANOID

CISO Singapore 2024

🚨 Join us at CISO Singapore! 🚨 🗓 August 20th to 21st, 2024 📍 Singapore. Put security at the heart of your corporate strategy by attending CISO Singapore. This event is a must-attend for professionals looking to deepen their knowledge in governance, risk management, information security program manage..

Link
@anjali shared a link, 1 year, 3 months ago
Customer Marketing Manager, Last9

OpenTelemetry vs. Traditional APM Tools: A Comparative Analysis

This article compares OpenTelemetry and traditional APM tools with their strengths, weaknesses, and ideal use cases to help you choose the right solution for your application performance monitoring needs.

Story
@laura_garcia shared a post, 1 year, 3 months ago
Software Developer, RELIANOID

Highly Available Data Centers: GSLB

Downtime isn't an option in today's competitive market. A robust disaster recovery plan, including Global Service Load Balancing (GSLB), is essential to ensure continuous service availability and customer satisfaction. GSLB dynamically routes traffic across multiple data centers, providing high avai..

Story
@adammetis shared a post, 1 year, 3 months ago
DevRel, Metis

Configuring a Connection Pool

A connection pooler is a software component that manages database connections. It can improve resource utilization, help with load balancing or failover, and greatly reduce transaction times. In this blog post, we’re going to see what a connection pooler is and how to configure it.
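
To make the idea concrete, here is a minimal sketch of a fixed-size pool. It is not the pooler or configuration discussed in the linked post: it uses Python's built-in `sqlite3` as a stand-in database, and the class name `ConnectionPool` and its `size` and `timeout` parameters are assumptions for illustration. Connections are opened once, handed out on request, and returned for reuse instead of being reopened for every query.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool that reuses already-open connections."""

    def __init__(self, database: str, size: int, timeout: float = 1.0):
        self._timeout = timeout
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(sqlite3.connect(database))  # open all connections up front

    @contextmanager
    def connection(self):
        # Blocks up to `timeout` seconds; raises queue.Empty if the pool is exhausted.
        conn = self._idle.get(timeout=self._timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back

    def idle_count(self) -> int:
        return self._idle.qsize()


pool = ConnectionPool(":memory:", size=2)

with pool.connection() as conn:
    result = conn.execute("SELECT 1 + 1").fetchone()[0]
    idle_while_borrowed = pool.idle_count()

print(result, idle_while_borrowed, pool.idle_count())
```

The pool size and the wait timeout are the two settings most poolers expose in some form; the values used here are arbitrary.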

Link
@faun shared a link, 1 year, 3 months ago
FAUN.dev()

The Open Model Initiative joins the Linux Foundation

The Open Model Initiative (OMI) has joined the Linux Foundation to promote open standards for AI model development. OMI's focus is on shared standards, a governance framework, and open source models as alternatives to proprietary AI tech. The initiative aims for true open source AI models and resist..

Link
@faun shared a link, 1 year, 3 months ago
FAUN.dev()

How We Migrated onto K8s in Less Than 12 months

Figma migrated core services from ECS to Kubernetes to enhance platform reliability and scalability. ECS's limitations, such as lack of StatefulSets and robust auto-scaling, prompted the switch to Kubernetes. The transition improved cost-efficiency, reduced downtime, and simplified service managemen..

Link
@faun shared a link, 1 year, 3 months ago
FAUN.dev()

Fine-tuning now available for GPT-4o

Fine-tune custom versions of GPT-4o to increase performance and accuracy for your applications. Developers can now fine-tune GPT-4o with custom datasets to get higher performance at a lower cost for their specific use cases. Fine-tuning enables customization of structure and tone of responses, or fo..