How we reduced our Prometheus infrastructure footprint by a third

This article discusses sharding in Prometheus, a technique for distributing metric-collection load across multiple instances. It describes a problem where the number of scraped metrics was growing non-linearly, driving up memory and CPU costs.
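Sharding in Prometheus is commonly implemented with `hashmod` relabeling, where each instance keeps only the targets whose address hashes to its own shard. A minimal sketch of that pattern (the job name, shard count, and shard index are illustrative assumptions, not taken from the article):

```yaml
# Hypothetical sharded scrape config: each Prometheus replica hashes
# every target address modulo the shard count and keeps only the
# targets matching its own shard index.
scrape_configs:
  - job_name: kubernetes-pods       # illustrative job name
    relabel_configs:
      - source_labels: [__address__]
        modulus: 3                  # total number of shards (assumed)
        target_label: __tmp_shard
        action: hashmod
      - source_labels: [__tmp_shard]
        regex: "0"                  # this replica's shard index (assumed)
        action: keep
```

Each replica runs the same config with a different `regex` value, so the target set is partitioned roughly evenly without central coordination.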

The root cause was identified as a scalability limit of `drop` rules in Prometheus's `metric_relabel_configs`, which run only after every sample has already been scraped and parsed. To address this, the team moved the filtering into the scrape itself, so unwanted metrics are never collected, reducing the overall footprint of their Prometheus setup.
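The article does not reproduce the team's exact configuration, but the contrast can be sketched as follows. The metric prefix, collector list, and target are illustrative assumptions; the `collect[]` URL parameter is a real node_exporter feature, though not every exporter supports an equivalent:

```yaml
scrape_configs:
  # Post-scrape filtering: every sample is still scraped and parsed,
  # then discarded, so cost grows with the full metric set.
  - job_name: node
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: "node_netstat_.*"        # illustrative metric prefix
        action: drop

  # Scrape-time filtering: ask the exporter for less. node_exporter
  # honors collect[] parameters, so unwanted series are never
  # produced or transferred in the first place.
  - job_name: node-filtered
    params:
      collect[]: [cpu, meminfo, filesystem]
    static_configs:
      - targets: ["node-exporter:9100"]  # placeholder target
```

The design difference is where the work happens: a `drop` rule pays scrape, transfer, and parse costs for every series before discarding it, while scrape-time filtering shifts the reduction to the exporter side.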

This change yielded significant memory and CPU savings and improved the overall performance of the system.

