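1. Alert template

To explain Alertmanager alert templates, we take the template from the previous article, "Configuring Prometheus and using Alertmanager to send alerts to WeChat Work", as an example and walk through it: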

[root@centos74 home]# cat /usr/local/prometheus/alertmanager/wechat.tmpl
{{ define "wechat.default.message" }}
{{- if gt (len .Alerts.Firing) 0 -}}
{{- range $index, $alert := .Alerts.Firing -}}
======== Alert firing ========
Alert name: {{ $alert.Labels.alertname }}
Severity: {{ $alert.Labels.severity }}
Instance: {{ $alert.Labels.instance }} {{ $alert.Labels.device }}
Details: {{ $alert.Annotations.summary }}
Alert time: {{ $alert.StartsAt.Format "2006-01-02 15:04:05" }}
========== END ==========
{{- end }}
{{- end }}
{{- if gt (len .Alerts.Resolved) 0 -}}
{{- range $index, $alert := .Alerts.Resolved -}}
======== Alert resolved ========
Alert name: {{ $alert.Labels.alertname }}
Severity: {{ $alert.Labels.severity }}
Instance: {{ $alert.Labels.instance }}
Details: {{ $alert.Annotations.summary }}
Alert time: {{ $alert.StartsAt.Format "2006-01-02 15:04:05" }}
Resolved time: {{ $alert.EndsAt.Format "2006-01-02 15:04:05" }}
========== END ==========
{{- end }}
{{- end }}
{{- end }}
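
The firing and resolved sections of such a template depend on each alert's status, and the template itself is looked up by the name given in define. The sketch below reproduces that mechanism with Go's standard text/template package only. The alert and alerts types, the template name "demo.message", and the renderMessage helper are hypothetical stand-ins for illustration; Alertmanager ships its own types, and the Firing and Resolved methods here only imitate the helpers of the same name that the template calls on .Alerts.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// alert and alerts are hypothetical stand-ins, not Alertmanager's real types.
type alert struct {
	Status string
	Labels map[string]string
}

type alerts []alert

// filter returns the alerts whose Status equals status.
func (as alerts) filter(status string) alerts {
	var out alerts
	for _, a := range as {
		if a.Status == status {
			out = append(out, a)
		}
	}
	return out
}

// Firing and Resolved imitate the helper methods the template calls.
func (as alerts) Firing() alerts   { return as.filter("firing") }
func (as alerts) Resolved() alerts { return as.filter("resolved") }

// src defines a named template, as wechat.tmpl does with define.
const src = `{{ define "demo.message" }}
{{- range .Alerts.Firing }}FIRING {{ .Labels.alertname }}
{{ end }}
{{- range .Alerts.Resolved }}RESOLVED {{ .Labels.alertname }}
{{ end }}
{{- end }}`

// renderMessage executes the named template "demo.message" against as.
func renderMessage(as alerts) (string, error) {
	tmpl, err := template.New("demo").Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	err = tmpl.ExecuteTemplate(&sb, "demo.message", struct{ Alerts alerts }{as})
	return sb.String(), err
}

func main() {
	out, err := renderMessage(alerts{
		{Status: "firing", Labels: map[string]string{"alertname": "HighMemoryUsage"}},
		{Status: "resolved", Labels: map[string]string{"alertname": "DiskFull"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Each range loop only sees the alerts returned by the method it calls, so a resolved alert never appears under the firing heading.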

2. Main syntax

The templates are based on Go's text/template package: https://golang.org/pkg/text/template/

2.1 Text and spaces

By default, all text between actions is copied verbatim when the template
is executed. For example, in the Go documentation's sample template
"{{.Count}} items are made of {{.Material}}", the string " items are made of "
appears on standard output unchanged when the program is run.

However, to aid in formatting template source code, if an action's left 
delimiter (by default "{{") is followed immediately by a minus sign and 
ASCII space character ("{{- "), all trailing white space is trimmed from 
the immediately preceding text. Similarly, if the right delimiter ("}}") 
is preceded by a space and minus sign (" -}}"), all leading white space 
is trimmed from the immediately following text. In these trim markers, 
the ASCII space must be present; "{{-3}}" parses as an action containing 
the number -3.

For instance, when executing the template whose source is

"{{23 -}} < {{- 45}}"

the generated output would be

"23<45"

For this trimming, the definition of white space characters is the same 
as in Go: space, horizontal tab, carriage return, and newline.
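
To see the trim markers in action, here is a small sketch that runs the template from the text with and without them. The render helper is a hypothetical convenience function written for this example, not part of text/template.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render parses src as a text/template and executes it with data.
func render(src string, data interface{}) (string, error) {
	tmpl, err := template.New("demo").Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	sources := []string{
		"{{23}} < {{45}}",     // no trim markers: spaces are kept
		"{{23 -}} < {{- 45}}", // trim markers: spaces around "<" are removed
	}
	for _, src := range sources {
		out, err := render(src, nil)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q -> %q\n", src, out)
	}
}
```

The first template yields "23 < 45" and the second yields "23<45".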

2.2 Actions

Here is the list of actions. "Arguments" and "pipelines" are evaluations 
of data, defined in detail in the corresponding sections that follow.

{{/* a comment */}}
{{- /* a comment with white space trimmed from preceding and following text */ -}}
	A comment; discarded. May contain newlines.
	Comments do not nest and must start and end at the
	delimiters, as shown here.

{{pipeline}}
	The default textual representation (the same as would be
	printed by fmt.Print) of the value of the pipeline is copied
	to the output.

{{if pipeline}} T1 {{end}}
	If the value of the pipeline is empty, no output is generated;
	otherwise, T1 is executed. The empty values are false, 0, any
	nil pointer or interface value, and any array, slice, map, or
	string of length zero.
	Dot is unaffected.

{{if pipeline}} T1 {{else}} T0 {{end}}
	If the value of the pipeline is empty, T0 is executed;
	otherwise, T1 is executed. Dot is unaffected.

{{if pipeline}} T1 {{else if pipeline}} T0 {{end}}
	To simplify the appearance of if-else chains, the else action
	of an if may include another if directly; the effect is exactly
	the same as writing
		{{if pipeline}} T1 {{else}}{{if pipeline}} T0 {{end}}{{end}}

{{range pipeline}} T1 {{end}}
	The value of the pipeline must be an array, slice, map, or channel.
	If the value of the pipeline has length zero, nothing is output;
	otherwise, dot is set to the successive elements of the array,
	slice, or map and T1 is executed. If the value is a map and the
	keys are of basic type with a defined order, the elements will be
	visited in sorted key order.

{{range pipeline}} T1 {{else}} T0 {{end}}
	The value of the pipeline must be an array, slice, map, or channel.
	If the value of the pipeline has length zero, dot is unaffected and
	T0 is executed; otherwise, dot is set to the successive elements
	of the array, slice, or map and T1 is executed.

{{template "name"}}
	The template with the specified name is executed with nil data.

{{template "name" pipeline}}
	The template with the specified name is executed with dot set
	to the value of the pipeline.

{{block "name" pipeline}} T1 {{end}}
	A block is shorthand for defining a template
		{{define "name"}} T1 {{end}}
	and then executing it in place
		{{template "name" pipeline}}
	The typical use is to define a set of root templates that are
	then customized by redefining the block templates within.

{{with pipeline}} T1 {{end}}
	If the value of the pipeline is empty, no output is generated;
	otherwise, dot is set to the value of the pipeline and T1 is
	executed.

{{with pipeline}} T1 {{else}} T0 {{end}}
	If the value of the pipeline is empty, dot is unaffected and T0
	is executed; otherwise, dot is set to the value of the pipeline
	and T1 is executed.
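
The if, range, and with actions listed above, together with their else branches, can be tried out with a short program. This is only a sketch: the page type, its field names, and the render helper are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// page is a hypothetical data type used only for this demonstration.
type page struct {
	Alerts []string
	Owner  string
}

// src exercises if/else, range/else and with/else in turn.
const src = `{{if .Alerts}}count={{len .Alerts}}{{else}}no alerts{{end}}
{{range .Alerts}}- {{.}}
{{else}}(empty)
{{end}}{{with .Owner}}owner={{.}}{{else}}owner unknown{{end}}`

// render executes src with p as dot.
func render(p page) (string, error) {
	tmpl, err := template.New("actions").Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, p := range []page{
		{Alerts: []string{"cpu", "disk"}, Owner: "ops"},
		{}, // empty slice and empty string take the else branches
	} {
		out, err := render(p)
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
		fmt.Println("---")
	}
}
```

Inside range and with, dot is rebound to the current element or to the pipeline value, which is why {{.}} prints the alert name in one place and the owner in the other.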

3. Alert data structure

The main fields of an alert are as follows:

Name          Type       Notes
Status        string     Defines whether or not the alert is resolved or currently firing.
Labels        KV         A set of labels to be attached to the alert.
Annotations   KV         A set of annotations for the alert.
StartsAt      time.Time  The time the alert started firing. If omitted, the current time is assigned by the Alertmanager.
EndsAt        time.Time  Only set if the end time of an alert is known. Otherwise set to a configurable timeout period from the time since the last alert was received.
GeneratorURL  string     A backlink which identifies the causing entity of this alert.
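
As a rough illustration of how a template reaches these fields, the sketch below mirrors the table with a Go struct and renders one line from it. The alert type, the describe helper, and all sample values (instance address, URL) are assumptions made for the example; KV is modelled here as a plain map[string]string.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
	"time"
)

// alert is a hypothetical mirror of the table above, for illustration only.
type alert struct {
	Status       string
	Labels       map[string]string // KV in the table
	Annotations  map[string]string // KV in the table
	StartsAt     time.Time
	EndsAt       time.Time
	GeneratorURL string
}

// line reads plain fields and map entries with the same dot syntax.
const line = `[{{ .Status }}] {{ .Labels.alertname }} on {{ .Labels.instance }}: {{ .Annotations.summary }} ({{ .GeneratorURL }})`

// describe renders a one-line summary of a.
func describe(a alert) (string, error) {
	tmpl, err := template.New("line").Parse(line)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, a); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	a := alert{
		Status:       "firing",
		Labels:       map[string]string{"alertname": "HighMemoryUsage", "instance": "10.0.0.1:9100"},
		Annotations:  map[string]string{"summary": "High memory usage"},
		StartsAt:     time.Date(2020, 12, 16, 14, 35, 33, 0, time.UTC),
		GeneratorURL: "http://prometheus.example:9090/graph",
	}
	out, err := describe(a)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```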

3.1 Labels

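Labels are the labels shown for the alert on the Prometheus web UI, such as the alertname, severity, and instance labels used in the template above.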

3.2 Annotations

Annotations are the annotations field that the user defines in the alerting rule, for example:

groups:
- name: node_health
  rules:
  - alert: HighMemoryUsage
    expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: High memory usage

3.3 StartsAt and EndsAt

StartsAt is the time the alert started firing, and EndsAt is the time the alert was resolved.

If we use $alert.StartsAt directly in the alert template, the time is printed in the following format:

Alert time: 2020-12-16 22:35:33.676515606 +0800 CST

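This time follows the time zone of the machine, so we need to make sure the machine is set to the time zone we want.

The tzselect command can help pick the zone name, but it only prints the TZ value to use and does not change the system setting by itself. To apply the zone, export TZ accordingly or, on systemd-based systems such as CentOS 7, run timedatectl set-timezone with the zone name. Afterwards it is best to reboot and check the timestamps in /var/log/messages to confirm the time is what we expect. Only then will the alert times we receive match our own time zone.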

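Our alerts do not need nanosecond precision, nor do they need to show the time zone, so the time has to be formatted. That is why the template calls $alert.StartsAt.Format. The string "2006-01-02 15:04:05" is the layout. Its components can be rearranged or dropped, but they must use exactly these values and cannot be replaced with some other date, because Go defines layouts in terms of one fixed reference time (Mon Jan 2 15:04:05 MST 2006). Think of it as a magic value; that is simply how Go's designers chose to do it.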

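For email alerts, the following form is commonly used:

Alert time: {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}

Here Add 28800e9 adds 8 hours to the original time; 28800e9 is 8 hours expressed in nanoseconds (8 × 3600 × 10^9). This converts a UTC time to Beijing time (UTC+8), so it only gives the right result when StartsAt is actually in UTC.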
