Spring Boot Actuator Metrics 性能监控 和 Prometheus配置

2020-04-24 0 By admin

一、Prometheus 和 Grafana

1.1、Prometheus

Prometheus是一个开源的监控系统,起源于SoundCloud。它由以下几个核心组件构成:

  1. 数据爬虫:根据配置的时间定期的通过HTTP抓去metrics数据。
  2. time-series 数据库:存储所有的metrics数据。
  3. 简单的用户交互接口:可视化、查询和监控所有的metrics。

1.2、Grafana

Grafana使你能够把来自不同数据源比如Elasticsearch, Prometheus, Graphite, influxDB等多样的数据以绚丽的图标展示出来。
它也能基于你的metrics数据发出告警。当一个告警状态改变时,它能通知你通过email,slack或者其他途径。
值得注意的是,Prometheus仪表盘也有简单的图标。但是Grafana的图表表现的更好。

二、增加Micrometer Prometheus Registry到Spring Boot应用

Spring Boot使用Micrometer,一个应用metrics组件,将actuator metrics整合到外部监控系统中。
它支持很多种监控系统,比如Netflix Atalas, AWS Cloudwatch, Datadog, InfluxData, SignalFx, Graphite, Wavefront和Prometheus等。
为了整合Prometheus,你需要增加micrometer-registry-prometheus依赖:

<!-- Micrometer Prometheus registry  -->
<dependency>
	<groupId>io.micrometer</groupId>
	<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

一旦你增加上述的依赖,Spring Boot会自动配置一个PrometheusMeterRegistry和CollectorRegistry来收集和输出格式化的metrics数据,使得Prometheus服务器可以爬取。
所有应用的metrics数据是根据一个叫/prometheus的endpoint来设置是否可用。Prometheus服务器可以周期性的爬取这个endpoint来获取metrics数据。

三、解析Spring Boot Actuator的/prometheus Endpoint

首先,你可以通过actuator endpoint-discovery页面(http://localhost:8080/actuator)来看一下prometheus endpoint。

"prometheus": {
"href": "http://127.0.0.1:8080/actuator/prometheus",
"templated": false
}

prometheus endpoint暴露了格式化的metrics数据给Prometheus服务器。你可以通过prometheus endpoint(http://localhost:8080/actuator/prometheus)看到被暴露的metrics数据:

# HELP jvm_memory_committed_bytes The amount of memory in bytes that is committed for  the Java virtual machine to use
# TYPE jvm_memory_committed_bytes gauge
jvm_memory_committed_bytes{area="nonheap",id="Code Cache",} 9830400.0
jvm_memory_committed_bytes{area="nonheap",id="Metaspace",} 4.3032576E7
jvm_memory_committed_bytes{area="nonheap",id="Compressed Class Space",} 6070272.0
jvm_memory_committed_bytes{area="heap",id="PS Eden Space",} 2.63192576E8
jvm_memory_committed_bytes{area="heap",id="PS Survivor Space",} 1.2058624E7
jvm_memory_committed_bytes{area="heap",id="PS Old Gen",} 1.96608E8
# HELP logback_events_total Number of error level events that made it to the logs
# TYPE logback_events_total counter
logback_events_total{level="error",} 0.0
logback_events_total{level="warn",} 0.0
logback_events_total{level="info",} 42.0
logback_events_total{level="debug",} 0.0
logback_events_total{level="trace",} 0.0
...

四、Prometheus配置(prometheus.yml)

我们需要配置Prometheus来抓取Spring Boot Actuator的/prometheus endpoint中的metrics数据。
创建一个prometheus.yml的文件,填入以下内容:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['127.0.0.1:9090']

  - job_name: 'spring-actuator'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    static_configs:
    - targets: ['HOST_IP:8080']

在Prometheus文档中,上面的配置文件是basic configuration file的扩展。

上面中比较重要的配置项是spring-actuator job中的scrape_configs选项。

metrics_path是Actuator中prometheus endpoint中的路径。targets包含了Spring Boot应用的HOST和PORT。