Monitoring MySQL with Prometheus

1. Prerequisites

  • Install prometheus, alertmanager, grafana, and mysql_exporter (detailed steps are omitted here; the sections below list the commands used).

2. Alertmanager

[root@centos ~]# useradd -s /sbin/nologin prometheus -M
[root@centos ~]# mkdir -p /data/alertmanager/{base,conf,data,software}
[root@centos ~]# cd /data/alertmanager/software
[root@centos software]# wget https://github.com/prometheus/alertmanager/releases/download/v0.21.0/alertmanager-0.21.0.linux-amd64.tar.gz
[root@centos software]# tar xf alertmanager-0.21.0.linux-amd64.tar.gz -C /data/alertmanager/base
[root@centos software]# mv /data/alertmanager/base/alertmanager-0.21.0.linux-amd64 /data/alertmanager/base/0.21.0
[root@centos software]# mv /data/alertmanager/base/0.21.0/alertmanager.yml /data/alertmanager/conf/alertmanager.yml
[root@centos software]# chown -R prometheus:prometheus /data/alertmanager
  • Create an alertmanager.service unit file so that systemd manages the process, which reduces operational overhead a little.
[root@centos ~]# vim /etc/systemd/system/alertmanager.service
[Unit]
Description=Alertmanager
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/data/alertmanager/base/0.21.0/alertmanager --config.file=/data/alertmanager/conf/alertmanager.yml --storage.path=/data/alertmanager/data
Restart=on-failure
[Install]
WantedBy=multi-user.target

[root@centos ~]# systemctl start alertmanager.service
[root@centos ~]# systemctl enable alertmanager.service
  • Create a WeChat alert message template
[root@centos ~]# mkdir -p /data/alertmanager/conf/templates
[root@centos ~]# vim /data/alertmanager/conf/templates/wechat.tmpl
{{ define "wechat.default.message" }}
{{ range $i, $alert := .Alerts }}
Status: {{ $alert.Status }}
Severity: {{ $alert.Labels.severity }}
Alert name: {{ $alert.Labels.alertname }}
Host: {{ $alert.Labels.instance }}
Summary: {{ $alert.Annotations.summary }}
Details: {{ $alert.Annotations.description }}
Started at: {{ $alert.StartsAt.Format "2006-01-02 15:04:05" }}
{{ end }}
{{ end }}
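
To preview what the template produces without sending a real notification, the layout can be approximated in a few lines of Python. This is only a sketch: it does not run Go templating, the field labels are placeholder English text (substitute whatever labels your wechat.tmpl uses), and the alert dictionary below is made-up sample data. It does show that Go's reference layout "2006-01-02 15:04:05" corresponds to the strftime pattern "%Y-%m-%d %H:%M:%S".

```python
from datetime import datetime

# Go's reference layout "2006-01-02 15:04:05" maps to this strftime pattern.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def render_alert(alert: dict) -> str:
    """Approximate the per-alert layout of wechat.tmpl (not real Go templating)."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    lines = [
        f"Status: {alert['status']}",
        f"Severity: {labels.get('severity', '')}",
        f"Alert name: {labels.get('alertname', '')}",
        f"Host: {labels.get('instance', '')}",
        f"Summary: {annotations.get('summary', '')}",
        f"Details: {annotations.get('description', '')}",
        f"Started at: {alert['startsAt'].strftime(TIME_FORMAT)}",
    ]
    return "\n".join(lines)


# Hypothetical sample alert, for illustration only.
sample_alert = {
    "status": "firing",
    "labels": {
        "severity": "disaster",
        "alertname": "mysql is down",
        "instance": "10.25.1.101:3306",
    },
    "annotations": {
        "summary": "10.25.1.101:3306 is stopped",
        "description": "Instance: 10.25.1.101:3306 has stopped serving",
    },
    "startsAt": datetime(2020, 7, 1, 8, 30, 0),
}

preview = render_alert(sample_alert)
```

Printing `preview` gives one block of seven lines per alert, which is roughly what each entry in the WeChat message should look like.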
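  • Parameters:
    • corp_id: the unique ID of the WeChat Work (enterprise WeChat) account, shown under "My Company".
    • to_user: which users in the group receive the alert.
    • to_party: the department (party) to send the alert to.
    • agent_id: the ID of the third-party enterprise application, shown on the detail page of the application you created.
    • api_secret: the secret of the third-party enterprise application, shown on the same detail page.
    • repeat_interval: 30m sets how long to wait before re-sending the same alert.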
[root@centos ~]# vim /data/alertmanager/conf/alertmanager.yml
global:
    resolve_timeout: 3m
    wechat_api_url: "https://qyapi.weixin.qq.com/cgi-bin/"
    wechat_api_secret: ""
    wechat_api_corp_id: ""

templates:
  - './templates/wechat.tmpl'

route:
  group_by: ['alertname']
  group_wait: 3s
  group_interval: 1m
  repeat_interval: 30m
  receiver: wsp_wechat

receivers:
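  - name: "wsp_wechat"       # the name apparently cannot be set to "wechat"; alerts failed to send with that name
    wechat_configs:
      - send_resolved: true
        to_user: "@all" # send to everyone in the department
        to_party: "1" # send to a specific department
        agent_id: "" # the application to deliver to; available from the WeChat administrator
        corp_id: "" # corp ID of the enterprise account; available from the administrator, same as the global setting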
  • Restart Alertmanager so that it loads the new configuration file
[root@centos ~]# chown -R prometheus:prometheus /data/alertmanager
[root@centos ~]# systemctl restart alertmanager.service && systemctl status alertmanager.service
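
Once Alertmanager is running again, one way to check the WeChat route end to end is to push a hand-made alert into Alertmanager's v2 HTTP API (POST /api/v2/alerts). The sketch below is an assumption-laden example rather than part of the original setup: the base URL reuses the 10.25.1.100:9093 address from the Prometheus configuration, and the alert name wechat_test and its annotations are invented for the test.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumption: Alertmanager listens on the address used in prometheus.yml.
ALERTMANAGER_URL = "http://10.25.1.100:9093"
RFC3339 = "%Y-%m-%dT%H:%M:%SZ"


def build_test_alert(instance: str, minutes: int = 5) -> list:
    """Build a one-element alert list in the shape /api/v2/alerts expects."""
    now = datetime.now(timezone.utc)
    return [{
        "labels": {
            "alertname": "wechat_test",  # hypothetical name, test only
            "severity": "warning",
            "instance": instance,
        },
        "annotations": {
            "summary": "test alert",
            "description": "manual test of the WeChat receiver",
        },
        "startsAt": now.strftime(RFC3339),
        # After endsAt passes, Alertmanager treats the alert as resolved.
        "endsAt": (now + timedelta(minutes=minutes)).strftime(RFC3339),
    }]


def send_alerts(alerts: list, base_url: str = ALERTMANAGER_URL) -> int:
    """POST the alerts to Alertmanager and return the HTTP status code."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `send_alerts(build_test_alert("10.25.1.101:3306"))` from a host that can reach Alertmanager should produce a WeChat message after group_wait has elapsed, followed by a resolved notification because send_resolved is enabled.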

3. Prometheus

[root@centos ~]# useradd -s /sbin/nologin prometheus -M
[root@centos ~]# mkdir -p /data/prometheus/{base,conf,data,software}
[root@centos ~]# cd /data/prometheus/software/
[root@centos software]# wget https://github.com/prometheus/prometheus/releases/download/v2.19.2/prometheus-2.19.2.linux-amd64.tar.gz
[root@centos software]# tar xf prometheus-2.19.2.linux-amd64.tar.gz -C /data/prometheus/base
[root@centos software]# mv /data/prometheus/base/prometheus-2.19.2.linux-amd64 /data/prometheus/base/2.19.2
[root@centos software]# chown -R prometheus:prometheus /data/prometheus
  • Create a prometheus.service unit file so that systemd manages the process, which reduces operational overhead a little.
[root@centos ~]# vim /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/data/prometheus/base/2.19.2/prometheus --config.file=/data/prometheus/conf/prometheus.yml --storage.tsdb.path=/data/prometheus/data
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
[Install]
WantedBy=multi-user.target

[root@centos ~]# systemctl start prometheus.service
[root@centos ~]# systemctl enable prometheus.service
  • Write the configuration file that scrapes Prometheus itself and MySQL, points Prometheus at Alertmanager, and sets the path of the alert rule files
[root@centos ~]# mkdir /data/prometheus/conf/rules
[root@centos ~]# vim /data/prometheus/conf/prometheus.yml 
global:
  scrape_interval: 30s

  external_labels:
    monitor: 'codelab-monitor'

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['10.25.1.100:9093']

rule_files:
  - './rules/*.yml'

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ['10.25.1.100:9090']

  - job_name: "mysql"
    static_configs:

      - targets: ['10.25.1.101:9104']
        labels:
          group: "mysql-group-1"
          instance: "10.25.1.101:3306"

      - targets: ['10.25.1.102:9104']
        labels:
          group: "mysql-group-1"
          instance: "10.25.1.102:3306"
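
Each MySQL target above scrapes the exporter on port 9104 but overrides the instance label with the database address on port 3306, so alerts name the database rather than the exporter. When the host list grows, the repeated entries can be generated instead of typed. The following is a small sketch under that assumption; the helper names are hypothetical and the output simply mimics the layout used in this file.

```python
def mysql_static_configs(hosts, group, exporter_port=9104, mysql_port=3306):
    """Return one static_configs entry (as a dict) per MySQL host."""
    return [
        {
            "targets": [f"{host}:{exporter_port}"],
            "labels": {"group": group, "instance": f"{host}:{mysql_port}"},
        }
        for host in hosts
    ]


def to_yaml(entries, indent=6):
    """Format the entries in the same style as the prometheus.yml above."""
    pad = " " * indent
    lines = []
    for entry in entries:
        lines.append(f"{pad}- targets: ['{entry['targets'][0]}']")
        lines.append(f"{pad}  labels:")
        for key, value in entry["labels"].items():
            lines.append(f'{pad}    {key}: "{value}"')
    return "\n".join(lines)


entries = mysql_static_configs(["10.25.1.101", "10.25.1.102"], "mysql-group-1")
snippet = to_yaml(entries)
```

`snippet` can be pasted under `static_configs:` of the mysql job; review the result before reloading Prometheus.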
  • Define the alert rules
    • Rules can be split across separate yml files; Prometheus loads every file under rules/ that matches the rule_files pattern.
[root@centos ~]# vim /data/prometheus/conf/rules/mysql.yml
groups:
- name: 'mysql'
  rules:
  - alert: mysql is down
    expr: mysql_up == 0
    for: 15s
    labels:
      severity: disaster
    annotations:
      summary: "{{ $labels.instance }} is stopped"
      description: "Instance: {{ $labels.instance }} has stopped serving"

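  # Note: threads_running counts threads that are actively executing;
  # mysql_global_status_threads_connected counts open connections.
  - alert: Connection greater than 50
    expr: mysql_global_status_threads_running > 50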
    for: 15s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} Connect > 50"
      description: "Instance: {{ $labels.instance }} | Value: {{ $value }}"

  - alert: Connection greater than 100
    expr: mysql_global_status_threads_running > 100
    for: 15s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} Connect > 100"
      description: "Instance: {{ $labels.instance }} | Value: {{ $value }}"

  - alert: Connection greater than 200
    expr: mysql_global_status_threads_running > 200
    for: 15s
    labels:
      severity: serious
    annotations:
      summary: "{{ $labels.instance }} Connect > 200"
      description: "Instance: {{ $labels.instance }} | Value: {{ $value }}"

  - alert: Database IO Thread is Stopped
    expr: mysql_slave_status_slave_io_running == 0
    for: 15s
    labels:
      severity: serious
    annotations:
      summary: "{{ $labels.instance }} Slave IO Thread stopped"
      description: "Instance: {{ $labels.instance }} replica IO thread has failed"

  - alert: Database SQL Thread is Stopped
    expr: mysql_slave_status_slave_sql_running == 0
    for: 15s
    labels:
      severity: serious
    annotations:
      summary: "{{ $labels.instance }} Slave SQL Thread stopped"
      description: "Instance: {{ $labels.instance }} replica SQL thread has failed"

  - alert: database latency is greater than 600
    expr: mysql_slave_status_seconds_behind_master > 600
    for: 15s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} Slave delay"
      description: "Instance: {{ $labels.instance }} | Value: {{ $value }}"

  - alert: database latency is greater than 1200
    expr: mysql_slave_status_seconds_behind_master > 1200
    for: 15s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} Slave delay"
      description: "Instance: {{ $labels.instance }} | Value: {{ $value }}"

  - alert: hit rate of database buffer pool is lower than 80%
    expr: 100 * ((mysql_global_status_innodb_buffer_pool_read_requests - mysql_global_status_innodb_buffer_pool_reads) / mysql_global_status_innodb_buffer_pool_read_requests) < 80
    for: 15s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} Buffer Pool hit rate lower"
      description: "Instance: {{ $labels.instance }} | Value: {{ $value }}"

  - alert: hit rate of database buffer pool is lower than 60%
    expr: 100 * ((mysql_global_status_innodb_buffer_pool_read_requests - mysql_global_status_innodb_buffer_pool_reads) / mysql_global_status_innodb_buffer_pool_read_requests) < 60
    for: 15s
    labels:
      severity: serious
    annotations:
      summary: "{{ $labels.instance }} Buffer Pool hit rate lower"
      description: "Instance: {{ $labels.instance }} | Value: {{ $value }}"
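
The two buffer pool rules compute the hit rate as 100 * (read_requests - reads) / read_requests and compare it against 80 and 60. As a worked example of that arithmetic, the sketch below evaluates the same formula on plain numbers. The function names are hypothetical, and the severity helper returns only the highest matching level, whereas in Prometheus both rules would fire when the rate drops below 60.

```python
from typing import Optional


def buffer_pool_hit_rate(read_requests: float, reads: float) -> Optional[float]:
    """Hit rate in percent, using the same formula as the alert expression."""
    if read_requests == 0:
        # PromQL yields NaN here, which never satisfies "< 80" or "< 60".
        return None
    return 100 * (read_requests - reads) / read_requests


def hit_rate_severity(hit_rate: Optional[float]) -> Optional[str]:
    """Map a hit rate to the highest severity used by the rules above."""
    if hit_rate is None:
        return None
    if hit_rate < 60:
        return "serious"
    if hit_rate < 80:
        return "warning"
    return None


# 1000 logical read requests, 250 of which had to go to disk.
rate = buffer_pool_hit_rate(1000, 250)
level = hit_rate_severity(rate)
```

Here `rate` is 75.0 and `level` is "warning", since 75 is below 80 but not below 60.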
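  • Reload Prometheus so that it picks up the configuration and rule files (either reload or restart works; if the service failed to start earlier because prometheus.yml did not exist yet, use restart)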
[root@centos ~]# systemctl reload prometheus.service && systemctl status prometheus.service
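
After the reload, a quick way to confirm that the mysql job is being scraped is to ask the Prometheus HTTP API for mysql_up through /api/v1/query. The sketch below assumes Prometheus is reachable at the 10.25.1.100:9090 address used in the scrape configuration; the helper names and the sample response are illustrative only.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus listens on the address used in prometheus.yml.
PROMETHEUS_URL = "http://10.25.1.100:9090"


def instant_query(expr: str, base_url: str = PROMETHEUS_URL) -> dict:
    """Run an instant query and return the decoded JSON response."""
    url = base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def down_instances(response: dict) -> list:
    """Return the instance labels whose sample value is 0."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    down = []
    for sample in response["data"]["result"]:
        _timestamp, value = sample["value"]  # value arrives as a string
        if float(value) == 0:
            down.append(sample["metric"].get("instance", "<unknown>"))
    return sorted(down)


# Hand-written sample in the shape of an instant-query response.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "mysql_up", "instance": "10.25.1.101:3306"},
             "value": [1593592200.0, "1"]},
            {"metric": {"__name__": "mysql_up", "instance": "10.25.1.102:3306"},
             "value": [1593592200.0, "0"]},
        ],
    },
}
down = down_instances(sample_response)
```

Against a live server, `down_instances(instant_query("mysql_up"))` returns the instances that the "mysql is down" rule would alert on; an empty result list means no mysql_up series exists yet, which usually points at the scrape configuration.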

4. Grafana

[root@centos ~]# wget https://dl.grafana.com/oss/release/grafana-7.0.5-1.x86_64.rpm
[root@centos ~]# yum -y localinstall grafana-7.0.5-1.x86_64.rpm

[root@centos ~]# systemctl start grafana-server.service && systemctl enable grafana-server.service && systemctl status grafana-server.service
  • Add Prometheus as a data source
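
The data source is normally added in the Grafana web UI. As an alternative, it can be created through Grafana's HTTP API (POST /api/datasources). The sketch below is one possible way to do that, not the method used in the original setup: the Grafana URL, the data source name, and the admin credentials are assumptions that must be replaced with real values.

```python
import base64
import json
import urllib.request


def datasource_payload(prometheus_url: str, name: str = "Prometheus") -> dict:
    """JSON body describing a Prometheus data source proxied by Grafana."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",
        "isDefault": True,
    }


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def create_datasource(grafana_url: str, payload: dict, user: str, password: str) -> int:
    """POST the data source definition to Grafana and return the HTTP status."""
    request = urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": basic_auth_header(user, password),
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


payload = datasource_payload("http://10.25.1.100:9090")
```

A call such as `create_datasource("http://<grafana-host>:3000", payload, "<user>", "<password>")` would then register the data source; the host and credentials are placeholders.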

5. mysqld_exporter

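  • Create a monitoring user in the database (GRANT ... IDENTIFIED BY works on MySQL 5.7, where it produces the deprecation warning shown below; MySQL 8.0 requires a separate CREATE USER statement first)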
mysql> grant process,select,replication client on *.* to prometheus@'%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
  • This installation uses version 0.12.1 as an example
[root@centos ~]# mkdir -p /data/mysql_exporter/{base,conf,software}
[root@centos ~]# cd /data/mysql_exporter/software
[root@centos software]# wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.12.1/mysqld_exporter-0.12.1.linux-amd64.tar.gz
[root@centos software]# tar xf mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /data/mysql_exporter/base
[root@centos software]# mv /data/mysql_exporter/base/mysqld_exporter-0.12.1.linux-amd64 /data/mysql_exporter/base/0.12.1
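  • Edit the configuration file used to connect to MySQL
    • host: IP address of the database
    • port: port of the database
    • user: the monitoring user
    • password: password of the monitoring user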
[root@centos ~]# vim /data/mysql_exporter/conf/mysql_exporter.cnf
[client]
host=10.186.60.102
port=3306
user=prometheus
password=123456
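
A missing or misspelled key in this file is a common reason for the exporter to fail at startup, so it can be worth checking the file before starting the service. The sketch below uses Python's configparser to verify that the [client] section contains the four keys described above; the function name is hypothetical and the check is deliberately minimal.

```python
import configparser

REQUIRED_KEYS = ("host", "port", "user", "password")


def missing_client_keys(text: str) -> list:
    """Return the required [client] keys that are absent or empty."""
    # Interpolation is disabled so that a '%' in the password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("client"):
        return list(REQUIRED_KEYS)
    return [
        key for key in REQUIRED_KEYS
        if not parser.get("client", key, fallback="").strip()
    ]


sample_cnf = """\
[client]
host=10.186.60.102
port=3306
user=prometheus
password=123456
"""
missing = missing_client_keys(sample_cnf)
```

For the file shown above `missing` is an empty list; to check the real file, pass the result of `open("/data/mysql_exporter/conf/mysql_exporter.cnf").read()` instead of the sample string.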
  • Write the systemd unit file
[root@centos ~]# useradd -s /sbin/nologin prometheus -M
[root@centos ~]# chown -R prometheus:prometheus /data/mysql_exporter
[root@centos ~]# vim /etc/systemd/system/mysql_exporter.service
[Unit]
Description=mysqld_exporter
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/data/mysql_exporter/base/0.12.1/mysqld_exporter --config.my-cnf=/data/mysql_exporter/conf/mysql_exporter.cnf
Restart=on-failure

[Install]
WantedBy=multi-user.target
  • Start the mysql_exporter collector
[root@centos ~]# systemctl start mysql_exporter.service
[root@centos ~]# systemctl enable mysql_exporter.service
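
Once the exporter is running, its /metrics endpoint on port 9104 should report mysql_up 1 when the connection to MySQL works. The sketch below fetches that page and reads a single unlabelled metric from the text exposition format. The exporter address is an assumption taken from the scrape targets, and the parser is intentionally simple: it ignores comment lines and does not handle metrics that carry labels.

```python
import urllib.request
from typing import Optional

# Assumption: the exporter runs on one of the hosts listed as scrape targets.
EXPORTER_URL = "http://10.25.1.101:9104/metrics"


def fetch_metrics(url: str = EXPORTER_URL) -> str:
    """Download the raw text exposition output of the exporter."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def parse_metric(text: str, name: str) -> Optional[float]:
    """Return the value of an unlabelled metric, or None if it is absent."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


# Hand-written sample in the exposition format, for illustration only.
sample_metrics = """\
# HELP mysql_up Whether the MySQL server is up.
# TYPE mysql_up gauge
mysql_up 1
"""
up = parse_metric(sample_metrics, "mysql_up")
```

With a live exporter, `parse_metric(fetch_metrics(), "mysql_up")` returning 0.0 usually means the credentials or host in mysql_exporter.cnf are wrong, while None means the endpoint did not expose the metric at all.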