[监控] - 在Prometheus上配置报警

48人浏览 / 0人评论

Doris是基于Prometheus监控的, 下面例举一些常用报警表达式:

Doris容量报警

(sum(doris_be_disks_data_used_capacity{job="Doris_CLUSTER"})  / sum(doris_be_disks_total_capacity{job="Doris_CLUSTER"})) * 100 > 85
	```
	
### Doris BE宕机报警

(count(up{group="be",job="Doris_CLUSTER"}) - count(up{group="be",job="Doris_CLUSTER"} == 1)) >= 1


### Doris BE Exporter宕机报警

up{group="be",job="Doris_CLUSTER"} == 0


### Doris FE Exporter宕机报警

up{group="fe",job="Doris_CLUSTER"} == 0


### Doris FE节点宕机报警

(count(up{group="fe",job="Doris_CLUSTER"}) - count(up{group="fe",job="Doris_CLUSTER"} == 1)) >= 1


### FE gc报警

increase(jvm_old_gc{group="fe",job="Doris_CLUSTER",type="count"}[1m]) >= 2

全部评论