Do You Use Monitoring? So Brave of You

Do you rely on your monitoring solutions to let you know when things are wrong? That is so brave of you! On a more serious note, please think twice. Monitoring is not enough! It can’t explain why things happen the way they do (because it doesn’t see the past beyond metrics) and it doesn’t tell you what is going to happen (so it can’t predict the future). This is a serious problem and we need a solution (spoiler alert: we need database guardrails). Let’s read on to understand why.

Recommended reading: Observability vs Monitoring: Key Differences & How They Pair

Monitoring Misses the Point

Current monitoring solutions typically focus on the following:

  • Quantity over quality: Monitoring solutions capture as many metrics as possible, especially generic metrics that are easy to capture: host-level metrics (CPU, memory), OS-level metrics (number of processes, threads, files), runtime-level metrics (number of GC executions or number of hanging tasks)

  • Possibilities instead of relevance: Monitoring solutions let you access all metrics and signals instead of showing only the relevant ones

  • Issues instead of solutions: They show what is wrong but do not present how to make it right

  • Seeing instead of understanding: Monitoring solutions swamp you with raw data instead of filtering irrelevant signals

You may think that your monitoring solutions let you keep things in shape and react easily to problems. Unfortunately, that is not true. They show you that “your CPU spiked” in one place and then let you watch how this spike affects other systems. You can see where the fire started and how the world burns. However, they don’t tell you why the fire started, and they don’t show you how to fix it.

Monitoring solutions help you put out the fire but they don’t help you prevent the fire in the first place.

It’s great that you can see where the fire started, and slice and dice metrics to figure out what happened. But what you really need is to know how to fix the issues. Once you know the fix, you can put out the fire and prevent it from recurring. Monitoring solutions can swamp you with signals but can’t do the work for you.

Monitoring Doesn’t Understand the Metrics

Your monitoring solutions do not understand what they’re collecting. They go after quantity instead of quality.

You may think that you have plenty of metrics but in fact, you’re missing your KPIs.

It’s easy to capture infrastructure metrics. Wikipedia mentions that Windows comes out of the box with over 350 performance counters. Your monitoring solutions can capture them immediately and make it look like you have tons of metrics. Add runtime metrics to that (that is, metrics emitted by your JVM, Node, Docker, web servers, databases, etc.), multiply by the number of aggregates and dimensions, and you can easily have tens of thousands of metrics out of the box. That’s a lot.
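
To make the multiplication concrete, here is a rough back-of-the-envelope sketch. Only the 350 counters come from the text above; the number of runtime metrics, aggregates, and dimensions are assumptions picked purely for illustration, and `total_series` is a hypothetical helper name.

```python
# Only the 350 OS counters come from the article; everything else is assumed.
base_counters = 350      # Windows performance counters cited above
runtime_metrics = 150    # assumed: JVM, web server, database metrics, etc.
aggregates = 5           # assumed: min, max, avg, p95, p99
dimensions = 10          # assumed: e.g. 10 hosts or regions

def total_series(counters, runtime, aggregates, dimensions):
    """Count distinct time series after fanning out by aggregate and dimension."""
    return (counters + runtime) * aggregates * dimensions

total = total_series(base_counters, runtime_metrics, aggregates, dimensions)
print(total)
```

Even with these modest assumed numbers the count lands in the tens of thousands, and none of those series is a business KPI.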

Recommended reading: Database Monitoring Metrics - Key Indicators for Performance Analysis

Unfortunately, this doesn’t include your business metrics. You don’t see the forest for the trees. Monitoring solutions don’t understand what happens in your databases and why. They can’t tell you that your CPU spiked because of extension misconfiguration or because of the deployment you had two days ago. They see that the CPU load goes up, but they can’t tell if that’s expected or if that’s an issue.

Monitoring Doesn’t Fix the Problems

Since the monitoring solutions do not understand what’s going on, they don’t know how to fix the problems. They can’t suggest configuration improvements, query changes, or code modifications. They can’t automate your work. They can alert you when something breaks but they can’t prevent it from happening.

Monitoring solutions expect you to do the hard work.

It’s great to have tools that can show us problems. It’s even better when the tools can fix the problems for us and prevent them from happening. Monitoring solutions don’t do that.

You must be very brave to rely on monitoring solutions only.

However, there is a better approach. You can use database guardrails to make your life easier.

Use Database Guardrails And Prevent Issues Automatically

Database guardrails do the following:

  • They analyze your code before the deployment to check if it’s safe to be deployed (if it’s scalable, won’t cause issues with indexing, will not take your database down, etc.)

  • They verify your database metrics, configurations, extensions, indexes, schemas, and more to assess if your database performs well

  • They can connect the dots and explain that your CPU load increases because of your code changes or different traffic patterns

  • They can fix issues automatically or submit automated code changes that you just need to approve
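
As a rough illustration of the first point, checking a query for indexing issues before it is deployed, here is a minimal sketch. It is not how any particular guardrails product works. It assumes SQLite as the database and uses a hypothetical `orders` table, a hypothetical index name, and a hypothetical helper `find_full_scans`. It relies on SQLite's `EXPLAIN QUERY PLAN`, whose detail text starts with `SCAN` for a full scan and `SEARCH` for an index lookup.

```python
import sqlite3

def find_full_scans(conn, query):
    """Return the plan steps where SQLite scans instead of doing an index lookup."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    # Each row is (id, parent, notused, detail). The detail text begins with
    # "SCAN" for a full scan and "SEARCH" for an index lookup.
    return [row[3] for row in rows if row[3].startswith("SCAN")]

# Hypothetical schema used only for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

query = "SELECT total FROM orders WHERE customer_id = 42"

# Without an index on customer_id the query has to read the whole table.
before = find_full_scans(conn, query)

# After adding the index the same query becomes an index lookup.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = find_full_scans(conn, query)

print("full scans before index:", before)
print("full scans after index:", after)
```

A check like this could run in a CI pipeline and block a deployment when a new query would scan a large table, which is the kind of feedback a metrics dashboard only gives you after the fact.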

Database guardrails do the work so you don’t have to.

Database guardrails understand what happens in your databases. They can show you meaningful performance indicators and business metrics, tune your configurations to match your workflow, and fix issues automatically. They can alert you when things require your business decisions and tune alarms automatically based on anomaly detection.
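
To give a feel for alarms based on anomaly detection, here is a minimal sketch that uses a simple z-score check. The z-score is a stand-in technique chosen for brevity, not necessarily what any guardrails tool uses, and the function name, threshold, and sample CPU readings are all assumptions.

```python
from statistics import mean, stdev

def is_anomalous(history, value, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu  # flat history: any change is unusual
    return abs(value - mu) / sigma > threshold

# Hypothetical CPU load samples (percent) from a quiet period.
cpu_history = [41.0, 39.5, 40.2, 42.1, 40.8, 39.9, 41.3]

print(is_anomalous(cpu_history, 42.0))  # within normal variation
print(is_anomalous(cpu_history, 85.0))  # far outside recent behavior
```

The point of the design is that the threshold adapts to the recent behavior of the metric, so nobody has to hand-pick a static limit such as "alert above 80% CPU".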

Monitoring is not enough. Use database guardrails and let them do your work.


Adam Furmanek

DevRel, Metis

@adammetis
I have worked in the area of databases, troubleshooting, query optimization, and organization architecture for the last 15 years. I have spoken about that at many conferences. I have published in DZone and InfoQ, written a couple of books, and I generally know stuff.