
Introduction to Prometheus

wxchong 2024-10-15 17:20:42 Open Source Technology

A monitoring system well suited to k8s and Docker

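1: Introduction

Prometheus is an open-source combination of monitoring, alerting, and a time series database. It was originally developed at SoundCloud.

The basic principle of Prometheus is to periodically scrape the state of monitored components over HTTP. The benefit is that any component can join the monitoring system as long as it exposes an HTTP endpoint that serves its metrics in the Prometheus text format; no SDK or other integration process is required.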
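To make the pull model concrete, here is a minimal sketch of a component that exposes such an HTTP endpoint using only the Python standard library. The metric name demo_scrapes_total, the port 8000, and the helper names are assumptions made up for illustration; a real service would normally use an official client library instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process state: how many times /metrics has been scraped.
SCRAPES = {"count": 0}


def render_metrics(scrapes_total: int) -> str:
    """Render a single counter in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_scrapes_total Number of times /metrics was scraped.",
        "# TYPE demo_scrapes_total counter",
        f"demo_scrapes_total {scrapes_total}",
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; every other path returns 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        SCRAPES["count"] += 1
        body = render_metrics(SCRAPES["count"]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, answering scrapes on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; adding host:8000 as a target in a scrape job would then let Prometheus collect the counter on every scrape interval.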

As k8s has grown in popularity, Prometheus has become an increasingly popular monitoring tool.

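What Prometheus can do

As an instrumentation system at the business layer: Prometheus supports the mainstream programming languages (official client libraries exist for Go, Java, Python, and Ruby; other languages have third-party open-source clients). The client libraries make it easy to instrument core business logic.

As an application monitoring system at the application layer: for many mainstream applications, such as Redis and MySQL, official or third-party exporters collect the core metrics.

As a system monitoring system at the system layer: besides common software, Prometheus also has system-level and network-level exporters for monitoring servers and networks.

Integrating other monitoring systems: through various exporters, Prometheus can also integrate with other monitoring systems and collect their data.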

What not to use Prometheus for

Prometheus offers tools such as the Grok exporter that can read logs, but Prometheus is a monitoring system, not a logging system.

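Drawbacks of Prometheus

1. Single-node limits. Storage on a single machine is limited, so the volume of metrics you collect constrains how long you can retain data.

2. High memory usage. Prometheus 1.x embedded LevelDB, a database with efficient inserts, for its indexes (Prometheus 2.x replaced it with its own TSDB engine); I/O usage is fairly high even on SSDs. A large amount of data may also accumulate in memory, although this is configurable.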
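The first drawback can be made tangible with a rough capacity estimate. The sketch below follows the sizing rule from the Prometheus 2.x storage documentation (retention time × ingested samples per second × bytes per sample); the function name and the default of 2 bytes per sample are assumptions for illustration, and real usage depends on how well your data compresses.

```python
def estimate_disk_bytes(
    active_series: int,
    scrape_interval_s: float,
    retention_days: float,
    bytes_per_sample: float = 2.0,  # assumed average; often quoted as 1 to 2 bytes
) -> float:
    """Estimate the disk space needed to retain the given series for the given time."""
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample
```

As a usage example, 100,000 active series scraped every 15 seconds and kept for 15 days come out at roughly 17 GB under these assumptions, which shows how quickly the monitoring volume eats into the retention a single disk can offer.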

Visualization

Prometheus is usually paired with Grafana as the front end for dashboards.

2: Architecture

3: Monitoring k8s resources and components

Approach

Metric	Implementation	Example
Pod performance	cAdvisor	Container CPU and memory utilization
Node performance	node-exporter	Node CPU and memory utilization
K8s resource objects	kube-state-metrics	Pod/Deployment/Service

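Notes:

node-exporter is deployed as a DaemonSet with hostNetwork enabled, so it can be reached directly at the host IP on port 9100. I therefore believe the Service does not need to be created, and testing confirmed that this works.

node-exporter needs to collect from every node in the cluster, but master nodes carry a NoSchedule taint by default, so ordinary pods are not scheduled onto them. We therefore need to set a toleration so that node-exporter is also deployed on the master nodes.

Add the annotation prometheus.io/scrape: 'true' for automatic discovery. This annotation is a convention: it only takes effect if the Prometheus scrape configuration contains relabeling rules that honor it.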

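Monitoring K8s resource objects

kube-state-metrics is a simple service that listens to the Kubernetes API server and generates metrics about the objects it sees. It does not focus on the health of individual Kubernetes components, but on the state of the objects inside the cluster (such as deployments, nodes, and pods).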

# 1. Download the deployment manifests

https://github.com/kubernetes/kube-state-metrics/tree/master/examples/standard

# 2. Deploy

kubectl apply -f service-account.yaml

kubectl apply -f cluster-role.yaml

kubectl apply -f cluster-role-binding.yaml

kubectl apply -f deployment.yaml

kubectl apply -f service.yaml

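4: Deploying the node-exporter agent

node-exporter can be installed directly on each physical node. Here we deploy it to every node with a DaemonSet, set hostNetwork: true and hostPID: true so that it can read the node's physical metrics, and configure tolerations so that a pod is also started on the master nodes.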

#node-exporter.yaml

apiVersion: apps/v1  # apps/v1beta2 was removed in Kubernetes 1.16
kind: DaemonSet
metadata: 
  labels:
    app: node-exporter
  name: node-exporter
  namespace: ns-monitor
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
        - name: node-exporter
          image: prom/node-exporter:v0.16.0
          ports:
            - containerPort: 9100
              protocol: TCP
              name: http
      hostNetwork: true
      hostPID: true
      tolerations:
        - effect: NoSchedule
          operator: Exists
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: node-exporter
  name: node-exporter-service
  namespace: ns-monitor
spec:
  ports:
    - name: http
      port: 9100
      nodePort: 31672
      protocol: TCP
  type: NodePort
  selector:
    app: node-exporter

kubectl apply -f node-exporter.yaml

Verify that node-exporter is running successfully on every node

http://IP:31672/metrics
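The same check can be scripted. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and the parser assumes samples carry no trailing timestamp, which is an assumption about the endpoint's output rather than a guarantee of the format.

```python
from urllib.request import urlopen


def parse_samples(text: str) -> dict:
    """Map each sample name (including any labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumes no timestamp, so the value is the last space-separated field.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the raw metrics text from an exporter endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def node_exporter_ok(text: str) -> bool:
    """Treat the endpoint as healthy if it reports at least one node_* sample."""
    return any(name.startswith("node_") for name in parse_samples(text))
```

For example, node_exporter_ok(fetch_metrics("http://IP:31672/metrics")), with IP replaced by a node address, should return True on each node where the pod is running.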

5: Example prometheus.yml configuration file

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    #- targets:
    - targets: ["localhost:9093"]
      # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "node_down.yml"
  - "memory_over.yml"

# Scrape configurations: Prometheus itself, followed by the
# node-exporter and cAdvisor endpoints on each node.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.


    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'node_10.41'
    static_configs:
      - targets: ['localhost:9100']

  - job_name: 'node_10.38'
    static_configs:
      - targets: ['192.168.10.38:31672']

  - job_name: 'node_10.39'
    static_configs:
      - targets: ['192.168.10.39:31672']

  - job_name: 'cadvisor_10.38'
    static_configs:
      - targets: ['192.168.10.38:4194']

  - job_name: 'cadvisor_10.39'
    static_configs:
      - targets: ['192.168.10.39:4194']

6: Simple container-based deployment of Prometheus

Deploy the Prometheus service, again using a container:

docker run \
-p 9090:9090 \
--log-driver none \
-v /data/prometheus/etc/:/etc/prometheus/ \
-v /data/prometheus/data/:/prometheus/ \
-v /etc/localtime:/etc/localtime \
--name prometheus \
prom/prometheus

Create the configuration file /data/prometheus/etc/prometheus.yml

# my global config
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'container-monitor'
# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
   - "/etc/prometheus/rules/common.rules"
# A scrape configuration for the cAdvisor endpoints.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'container'

    static_configs:
    - targets: ['10.2.90.129:9090','10.2.90.12:9090','10.2.90.19:9090','10.2.90.29:9090']

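In the configuration file, fill in the endpoints under - targets with the actual IP addresses and exposed ports of your cAdvisor instances. Once Prometheus has started correctly, visit ip:9090 to query the data.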
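Besides the web UI, the data can also be queried over the Prometheus HTTP API at /api/v1/query. The sketch below uses only the Python standard library; the function names and the base URL in the usage note are assumptions for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url: str, promql: str) -> str:
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def instant_query(base_url: str, promql: str, timeout: float = 5.0) -> list:
    """Run an instant query and return the list of result series."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        payload = json.load(resp)
    # The API wraps every response in a status field.
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]
```

For example, instant_query("http://ip:9090", "up") returns one entry per scrape target, where a value of 1 means the last scrape succeeded and 0 means it failed.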
