Kibana visualizations, Filebeat multiline matching, troubleshooting tips, and the ELFK architecture

一. Review of Yesterday and Preview of Today

1. Review of yesterday

- EFK architecture data flow
- ElasticSearch
- Filebeat
- Kibana

- ES environment deployment
- ES single-node deployment
- ES cluster deployment

- Kibana environment deployment

- Filebeat environment deployment
- input | output
- modules | output
- Multi-instance

- EFK analysis of nginx logs

2. Preview of today

- Kibana visualizations

- Filebeat multiline matching

- Troubleshooting tips

- ELFK architecture

二. Kibana Visualizations

1. Analyzing PV (page views) with Kibana

1.1 Open the visualization library

As shown in the figure above, click through the options in order.

1.2 Aggregation based

1.3 Metric

1.4 Select the index

1.5 Count PV

2. Analyzing client IPs with Kibana

2.1 Aggregation based

2.2 Metric

2.3 Select the index pattern

2.4 Select the target field

3. Analyzing device types with Kibana

3.1 Choose Lens

3.2 Pick the field to analyze

4. Analyzing OS share of users with Kibana

4.1 Select the field

As shown above, we query on the user_agent.os.full field.

4.2 View the list of saved charts

As shown above, four charts have been created so far.

5. Analyzing traffic/bandwidth with Kibana

5.1 Modify an existing chart in place

5.2 Save as a new chart

5.3 Modify the nginx access log and verify the data is picked up

As shown above, we can edit the nginx access log to change the response size, let Filebeat ship the data to ES, and watch the chart update in Kibana.

root@elk91:/var/log/nginx# echo 10*1024*1024*1024 | bc
10737418240
root@elk91:/var/log/nginx# vim access.log
Edit the last line of the log:
10.0.0.1 - - [14/Mar/2026:12:35:28 +0000] "GET /files/ HTTP/1.1" 200 10737418240 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ( KHTML, like Gecko) Chrome/146.0.0.0 Safari/537.36"

6. Global user distribution map

6.1 Choose Maps

6.2 Add a layer

6.3 Select the geo-point field

6.4 Save the map

7. Building a Dashboard

7.1 Create a new dashboard

7.2 Add panels from the library

7.3 Display the data and save the Dashboard

三. Exercise: Collecting Web-Cluster Logs with the EFK Architecture

1. Requirements

Architecture diagram: analyzing a web cluster with EFK

Requirements:
- Deploy tomcat on elk91 and elk93, with different home pages;
- Deploy nginx on the elk92 node as the unified entry point, such that:
- 1. Requests to the path /oldboyedu are forwarded to the elk91 node;
- 2. Requests to the path /linux are forwarded to the elk93 node;
- Use the EFK stack to analyze the access logs of the whole web cluster.

2. Reference solution

2.1 Deploy tomcat

1. Download the package
wget https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.13/bin/apache-tomcat-11.0.13.tar.gz


SVIP:
[root@elk91 ~]# wget http://192.168.14.253/Resources/ElasticStack/softwares/tomcat/apache-tomcat-11.0.13.tar.gz



2. Extract the package
[root@elk91 ~]# tar xf apache-tomcat-11.0.13.tar.gz -C /usr/local/
[root@elk91 ~]# ll /usr/local/apache-tomcat-11.0.13/
total 160
drwxr-xr-x 9 root root 4096 Oct 30 15:05 ./
drwxr-xr-x 11 root root 4096 Oct 30 15:05 ../
drwxr-x--- 2 root root 4096 Oct 30 15:05 bin/
-rw-r----- 1 root root 24262 Oct 10 21:57 BUILDING.txt
drwx------ 2 root root 4096 Oct 10 21:57 conf/
-rw-r----- 1 root root 6096 Oct 10 21:57 CONTRIBUTING.md
drwxr-x--- 2 root root 4096 Oct 30 15:05 lib/
-rw-r----- 1 root root 60517 Oct 10 21:57 LICENSE
drwxr-x--- 2 root root 4096 Oct 10 21:57 logs/
-rw-r----- 1 root root 2333 Oct 10 21:57 NOTICE
-rw-r----- 1 root root 3291 Oct 10 21:57 README.md
-rw-r----- 1 root root 6470 Oct 10 21:57 RELEASE-NOTES
-rw-r----- 1 root root 16114 Oct 10 21:57 RUNNING.txt
drwxr-x--- 2 root root 4096 Oct 30 15:05 temp/
drwxr-x--- 7 root root 4096 Oct 10 21:57 webapps/
drwxr-x--- 2 root root 4096 Oct 10 21:57 work/
[root@elk91 ~]#

3. Configure environment variables
[root@elk91 ~]# cat /etc/profile.d/tomcat.sh
#!/bin/bash

export JAVA_HOME=/usr/share/elasticsearch/jdk
export TOMCAT_HOME=/usr/local/apache-tomcat-11.0.13
export PATH=$PATH:$JAVA_HOME/bin:$TOMCAT_HOME/bin
[root@elk91 ~]#
[root@elk91 ~]# source /etc/profile.d/tomcat.sh
[root@elk91 ~]#
[root@elk91 ~]# java --version
openjdk 22.0.2 2024-07-16
OpenJDK Runtime Environment (build 22.0.2+9-70)
OpenJDK 64-Bit Server VM (build 22.0.2+9-70, mixed mode, sharing)
[root@elk91 ~]#
[root@elk91 ~]#


4. Start the service
[root@elk91 ~]# startup.sh
Using CATALINA_BASE: /usr/local/apache-tomcat-11.0.13
Using CATALINA_HOME: /usr/local/apache-tomcat-11.0.13
Using CATALINA_TMPDIR: /usr/local/apache-tomcat-11.0.13/temp
Using JRE_HOME: /usr/share/elasticsearch/jdk
Using CLASSPATH: /usr/local/apache-tomcat-11.0.13/bin/bootstrap.jar:/usr/local/apache-tomcat-11.0.13/bin/tomcat-juli.jar
Using CATALINA_OPTS:
Tomcat started.
[root@elk91 ~]#
[root@elk91 ~]# ss -ntl | grep 8080
LISTEN 0 100 *:8080 *:*
[root@elk91 ~]#


5. Visit the tomcat WebUI
http://10.0.0.91:8080/


6. Check the tomcat access log
[root@elk91 ~]# cat /usr/local/apache-tomcat-11.0.13/logs/localhost_access_log.2025-10-30.txt
10.0.0.1 - - [30/Oct/2025:15:08:15 +0800] "GET / HTTP/1.1" 200 11237
10.0.0.1 - - [30/Oct/2025:15:08:16 +0800] "GET /tomcat.css HTTP/1.1" 200 5584
10.0.0.1 - - [30/Oct/2025:15:08:16 +0800] "GET /tomcat.svg HTTP/1.1" 200 67795
10.0.0.1 - - [30/Oct/2025:15:08:16 +0800] "GET /asf-logo-wide.svg HTTP/1.1" 200 7089
10.0.0.1 - - [30/Oct/2025:15:08:16 +0800] "GET /bg-nav.png HTTP/1.1" 200 1401
10.0.0.1 - - [30/Oct/2025:15:08:16 +0800] "GET /bg-upper.png HTTP/1.1" 200 3103
10.0.0.1 - - [30/Oct/2025:15:08:16 +0800] "GET /bg-middle.png HTTP/1.1" 200 1918
10.0.0.1 - - [30/Oct/2025:15:08:16 +0800] "GET /bg-button.png HTTP/1.1" 200 713
10.0.0.1 - - [30/Oct/2025:15:08:16 +0800] "GET /favicon.ico HTTP/1.1" 200 21630
[root@elk91 ~]#

7. Repeat the steps above on the elk93 node
Omitted; see the video.

2.2 Prepare the tomcat home pages

1. Prepare tomcat instance 1
[root@elk91 ~]# rm -rf /usr/local/apache-tomcat-11.0.13/webapps/ROOT/*
[root@elk91 ~]#
[root@elk91 ~]# echo "<h1 style='color: red;'>www.oldboyedu.com</h1>" > /usr/local/apache-tomcat-11.0.13/webapps/ROOT/index.html
[root@elk91 ~]#


2. Prepare tomcat instance 2
[root@elk93 ~]# rm -rf /usr/local/apache-tomcat-11.0.13/webapps/ROOT/*
[root@elk93 ~]#
[root@elk93 ~]# echo "<h1 style='color: green;'>Linux 2026 hahaha~</h1>" > /usr/local/apache-tomcat-11.0.13/webapps/ROOT/index.html
[root@elk93 ~]#


3. Verify
[root@elk92 ~]# curl http://10.0.0.93:8080/
<h1 style='color: green;'>Linux 2026 hahaha~</h1>
[root@elk92 ~]#
[root@elk92 ~]# curl http://10.0.0.91:8080/
<h1 style='color: red;'>www.oldboyedu.com</h1>
[root@elk92 ~]#

2.3 Configure nginx to proxy the tomcat applications

1. Edit the nginx configuration file
[root@elk92 ~]# cat /etc/nginx/conf.d/lb-tomcat.conf
server {
    listen 80;
    server_name localhost;

    location /oldboyedu/ {
        proxy_pass http://10.0.0.91:8080/;
    }

    location /linux/ {
        proxy_pass http://10.0.0.93:8080/;
    }

    location / {
        return 404;
    }
}
[root@elk92 ~]#


2. Reload the nginx service
[root@elk92 ~]# rm -f /etc/nginx/sites-enabled/default
[root@elk92 ~]#
[root@elk92 ~]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
[root@elk92 ~]#
[root@elk92 ~]# nginx -s reload
[root@elk92 ~]#



3. Verify
[root@elk92 ~]# curl 10.0.0.92/oldboyedu/
<h1 style='color: red;'>www.oldboyedu.com</h1>
[root@elk92 ~]#
[root@elk92 ~]# curl 10.0.0.92/linux/
<h1 style='color: green;'>Linux 2026 hahaha~</h1>
[root@elk92 ~]#
[root@elk92 ~]# curl 10.0.0.92
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.18.0 (Ubuntu)</center>
</body>
</html>
[root@elk92 ~]#


Tip:
Think about how to make the tomcat access log record the real client IP address.
Reference blog:
https://www.cnblogs.com/pangguoping/p/5748783.html

2.4 Collect the web-cluster logs with Filebeat

1. Filebeat collects the nginx access log
[root@elk92 ~]# cat /etc/filebeat/config/06-efk-nginx-to-es.yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log

output.elasticsearch:
  hosts: ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
  index: "oldboyedu-linux-efk-nginx-%{+yyyy.MM.dd}"


setup.ilm.enabled: false
setup.template.name: "oldboyedu-linux"
setup.template.pattern: "oldboyedu-linux-efk*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
[root@elk92 ~]#
[root@elk92 ~]# rm -rf /var/lib/filebeat/
[root@elk92 ~]#
[root@elk92 ~]# filebeat -e -c /etc/filebeat/config/06-efk-nginx-to-es.yaml


2. Filebeat collects the tomcat data on elk91
[root@elk91 ~]# cat /tmp/tomcat-to-es.yaml
filebeat.inputs:
- type: log
  paths:
    - /usr/local/apache-tomcat-11.0.13/logs/*.txt

output.elasticsearch:
  hosts: ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
  index: "oldboyedu-linux-efk-tomcat-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "oldboyedu-linux"
setup.template.pattern: "oldboyedu-linux-efk*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 8
  index.number_of_replicas: 0
[root@elk91 ~]#
[root@elk91 ~]# filebeat -e -c /tmp/tomcat-to-es.yaml


3. Filebeat collects the tomcat data on elk93
[root@elk93 ~]# cat /tmp/tomcat-to-es.yaml
filebeat.inputs:
- type: log
  paths:
    - /usr/local/apache-tomcat-11.0.13/logs/*.txt

output.elasticsearch:
  hosts: ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
  index: "oldboyedu-linux-efk-tomcat-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "oldboyedu-linux"
setup.template.pattern: "oldboyedu-linux-efk*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 8
  index.number_of_replicas: 0
[root@elk93 ~]#
[root@elk93 ~]# filebeat -e -c /tmp/tomcat-to-es.yaml

2.5 Visualize the data

As shown above, we can analyze the data metrics.

Relevant fields:
- message
- host.name

KQL queries:
- message : 200 and host.name : "elk93"
- message : 404

四. Filebeat Multiline Matching

1. filestream replaces the log input type

1.1 Write the configuration file

[root@elk91 ~]# cat /tmp/filestream_tomcat-to-es.yaml
filebeat.inputs:
- type: filestream
  paths:
    - /usr/local/apache-tomcat-11.0.13/logs/catalina.out

output.elasticsearch:
  hosts: ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
  index: "oldboyedu-linux-filebeat-tomcat-catalina-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "oldboyedu-linux"
setup.template.pattern: "oldboyedu-linux-filebeat*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
[root@elk91 ~]#
[root@elk91 ~]#

1.2 Start the instance

[root@elk91 ~]# filebeat -e -c /tmp/filestream_tomcat-to-es.yaml

1.3 View the data in Kibana

2. Multiline matching example

2.1 Prepare the test environment

1. Stop the tomcat service
[root@elk91 ~]# shutdown.sh
Using CATALINA_BASE: /usr/local/apache-tomcat-11.0.13
Using CATALINA_HOME: /usr/local/apache-tomcat-11.0.13
Using CATALINA_TMPDIR: /usr/local/apache-tomcat-11.0.13/temp
Using JRE_HOME: /usr/share/elasticsearch/jdk
Using CLASSPATH: /usr/local/apache-tomcat-11.0.13/bin/bootstrap.jar:/usr/local/apache-tomcat-11.0.13/bin/tomcat-juli.jar
Using CATALINA_OPTS:
[root@elk91 ~]#


2. Modify the tomcat configuration file (introduce a deliberate error)
[root@elk91 ~]# vim /usr/local/apache-tomcat-11.0.13/conf/server.xml
...
151 </Host666666666>
....


3. Start tomcat
[root@elk91 ~]# startup.sh
Using CATALINA_BASE: /usr/local/apache-tomcat-11.0.13
Using CATALINA_HOME: /usr/local/apache-tomcat-11.0.13
Using CATALINA_TMPDIR: /usr/local/apache-tomcat-11.0.13/temp
Using JRE_HOME: /usr/share/elasticsearch/jdk
Using CLASSPATH: /usr/local/apache-tomcat-11.0.13/bin/bootstrap.jar:/usr/local/apache-tomcat-11.0.13/bin/tomcat-juli.jar
Using CATALINA_OPTS:
Tomcat started.
[root@elk91 ~]#
[root@elk91 ~]# ss -ntl | grep 8080
[root@elk91 ~]#


4. Check the log
[root@elk91 ~]# tail -f /usr/local/apache-tomcat-11.0.13/logs/catalina.out
at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at org.apache.tomcat.util.digester.Digester.parse(Digester.java:1506)
at org.apache.catalina.startup.Catalina.parseServerXml(Catalina.java:602)
at org.apache.catalina.startup.Catalina.load(Catalina.java:692)
at org.apache.catalina.startup.Catalina.load(Catalina.java:730)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.apache.catalina.startup.Bootstrap.load(Bootstrap.java:296)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:469)
30-Oct-2025 15:47:13.275 SEVERE [main] org.apache.catalina.startup.Catalina.start Cannot start server, server instance is not configured

2.2 Multiline matching configuration

1. Prepare the configuration file
[root@elk91 ~]# cat /tmp/filestream_tomcat-to-es.yaml
filebeat.inputs:
- type: filestream
  paths:
    - /usr/local/apache-tomcat-11.0.13/logs/catalina.out
  # Define the parsers
  parsers:
    # Configure multiline matching
    - multiline:
        # Multiline mode; valid values: pattern, count.
        type: pattern
        # When the type is pattern, a regular expression is required.
        pattern: '^\d'
        # For the next two fields, see the official docs: https://www.elastic.co/guide/en/beats/filebeat/7.17/multiline-examples.html
        negate: true
        match: after


output.elasticsearch:
  hosts: ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
  index: "oldboyedu-linux-filebeat-tomcat-catalina-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "oldboyedu-linux"
setup.template.pattern: "oldboyedu-linux-filebeat*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
[root@elk91 ~]#

2. Start the instance
[root@elk91 ~]# rm -rf /var/lib/filebeat/
[root@elk91 ~]# filebeat -e -c /tmp/filestream_tomcat-to-es.yaml


3. View the data in Kibana
As shown in the figure above.
Why multiline matching is needed
A Tomcat error log looks like this; one error spans many lines:
2025-03-13 09:12:34 ERROR [main] - NullPointerException    ← line 1
    at com.oldboyedu.Main.run(Main.java:42)                ← line 2
    at com.oldboyedu.Main.main(Main.java:18)               ← line 3
    at java.lang.Thread.run(Thread.java:748)               ← line 4
2025-03-13 09:12:35 INFO [main] - Service started          ← a new log entry
Without multiline matching, Filebeat ships every line to ES as a separate event, and the log entries end up fragmented.

Line-by-line explanation of the config
parsers:
  - multiline:
      type: pattern    # use a regex pattern to decide the multiline boundary
      pattern: '^\d'   # regex: matches lines that start with a digit
      negate: true     # invert: act on the lines that do NOT match the pattern
      match: after     # merge those lines onto the previous line

---

How it executes

Input lines:

line 1: 2025-03-13 09:12:34 ERROR - NullPointerException   ← starts with a digit ✅ matches the pattern
line 2:     at com.oldboyedu.Main.run(Main.java:42)        ← no leading digit ❌ does not match
line 3:     at com.oldboyedu.Main.main(Main.java:18)       ← no leading digit ❌ does not match
line 4:     at java.lang.Thread.run(Thread.java:748)       ← no leading digit ❌ does not match
line 5: 2025-03-13 09:12:35 INFO - Service started         ← starts with a digit ✅ matches the pattern


negate: true → the non-matching lines (2, 3, 4) are the ones to merge
match: after → they are appended after the previous line


Final output (merged into one event):

2025-03-13 09:12:34 ERROR - NullPointerException
    at com.oldboyedu.Main.run(Main.java:42)
    at com.oldboyedu.Main.main(Main.java:18)
    at java.lang.Thread.run(Thread.java:748)

---

negate and match combinations

negate: false + match: after  → lines that match the pattern are appended to the previous line
negate: false + match: before → lines that match the pattern are prepended to the next line
negate: true  + match: after  → lines that do NOT match are appended to the previous line ← this config
negate: true  + match: before → lines that do NOT match are prepended to the next line

---

This configuration in one sentence

Whenever a line does not start with a digit, append it to the previous line;
only a new line starting with a digit ends the previous log entry.
In other words, a leading timestamp (digit) = the start of a new log entry, which matches the Tomcat log format exactly.
Understanding negate:
negate literally means "to invert / to negate"

negate: false → no inversion → keep as-is   → act on the lines that match the pattern
negate: true  → invert       → flip result  → act on the lines that do NOT match the pattern
Breaking down ^\d:
^  → matches the start of the line
\d → matches any digit (0-9)

Together → matches lines that start with a digit
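The merge logic described above can be sketched in a few lines of Python. This is an illustrative re-implementation of the negate=true/match=after behavior, not Filebeat's actual code:

```python
import re

def merge_multiline(lines, pattern=r'^\d', negate=True):
    """Group raw lines into events the way Filebeat's multiline parser does
    with negate: true, match: after -- a line matching the pattern starts a
    new event; every non-matching line is appended to the previous event."""
    regex = re.compile(pattern)
    events = []
    for line in lines:
        matched = bool(regex.match(line))
        # With negate=True a MATCHING line is an event boundary;
        # with negate=False a NON-matching line is the boundary.
        is_boundary = matched if negate else not matched
        if is_boundary or not events:
            events.append(line)           # start a new event
        else:
            events[-1] += "\n" + line     # glue the continuation line on

    return events

lines = [
    "2025-03-13 09:12:34 ERROR - NullPointerException",
    "    at com.oldboyedu.Main.run(Main.java:42)",
    "2025-03-13 09:12:35 INFO - service started",
]
print(len(merge_multiline(lines)))  # → 2
```

The stack-trace line is folded into the first event, so ES receives two events instead of three.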

五. ELFK Architecture in Practice

1. The ELFK architecture

ELFK architecture diagram

2. Deploy logstash

1. Download logstash
wget https://artifacts.elastic.co/downloads/logstash/logstash-7.17.29-amd64.deb


SVIP:
[root@elk93 ~]# wget http://192.168.14.253/Resources/ElasticStack/softwares/ES7/7.17.29/logstash-7.17.29-amd64.deb



2. Install logstash
[root@elk93 ~]# dpkg -i logstash-7.17.29-amd64.deb


3. Add a symlink
[root@elk93 ~]# ln -svf /usr/share/logstash/bin/logstash /usr/local/bin/
'/usr/local/bin/logstash' -> '/usr/share/logstash/bin/logstash'
[root@elk93 ~]#


4. Test from the command line
[root@elk93 ~]# logstash -e "input { stdin {} } output { stdout {} }"
...
The stdin plugin is now waiting for input:
www.oldboyedu.com                              # input typed by the user
{
      "message" => "www.oldboyedu.com",        # the collected data
     "@version" => "1",
         "host" => "elk93",
   "@timestamp" => 2025-10-31T01:16:35.198Z
}

5. Test with a configuration file
[root@elk93 ~]# cat /etc/logstash/conf.d/01-stdin-to-stdout.conf
input {
  stdin {}
}

output {
  stdout {}
}
[root@elk93 ~]#
[root@elk93 ~]# logstash -f /etc/logstash/conf.d/01-stdin-to-stdout.conf
...
The stdin plugin is now waiting for input:
...
Linux100 HAHAHA~                               # input typed by the user
{
      "message" => "Linux100 HAHAHA~",
   "@timestamp" => 2025-10-31T01:19:52.277Z,
     "@version" => "1",
         "host" => "elk93"
}




Tip:
- If you run into JDK errors, simply point JAVA_HOME at the JDK bundled with logstash:
[root@elk93 ~]# cat /etc/profile.d/tomcat.sh
#!/bin/bash

# export JAVA_HOME=/usr/share/elasticsearch/jdk
export JAVA_HOME=/usr/share/logstash/jdk
export TOMCAT_HOME=/usr/local/apache-tomcat-11.0.13
export PATH=$PATH:$JAVA_HOME/bin:$TOMCAT_HOME/bin
[root@elk93 ~]#
[root@elk93 ~]# source /etc/profile.d/tomcat.sh
[root@elk93 ~]#
[root@elk93 ~]# java -version
openjdk version "11.0.26" 2025-01-21
OpenJDK Runtime Environment Temurin-11.0.26+4 (build 11.0.26+4)
OpenJDK 64-Bit Server VM Temurin-11.0.26+4 (build 11.0.26+4, mixed mode)
[root@elk93 ~]#
[root@elk93 ~]# echo $JAVA_HOME
/usr/share/logstash/jdk
[root@elk93 ~]#

3. ELK example: collecting a text file

1. Prepare the configuration file
[root@elk93 ~]# cat /etc/logstash/conf.d/02-file-to-es.conf
input {
  file {
    # Path(s) of the file(s) to collect
    path => ["/tmp/*.log"]

    # Where to start reading on the first collection; valid values: beginning, end
    start_position => "beginning"
  }
}

output {
  # stdout {
  #   # Output codec; defaults to rubydebug if unset
  #   # codec => "json"
  # }

  elasticsearch {
    # ES cluster addresses
    hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
    # ES index name
    index => "oldboyedu-linux-logstash-%{+YYYY.MM.dd}"
  }
}
[root@elk93 ~]#

2. Start logstash
[root@elk93 ~]# logstash -rf /etc/logstash/conf.d/02-file-to-es.conf

-r → --config.reload.automatic: hot-reload the configuration (changes take effect without a restart)
-f → --path.config: path to the configuration file
Combined: -rf = -r -f

Without -r → after editing the config you must restart Logstash for it to take effect
With -r    → Logstash detects the change and reloads automatically, no restart needed

3. Send test data
[root@elk93 ~]# echo 666666666666666666 >> /tmp/haha.log


4. Visualize in Kibana
As shown in the figure above.


Tip:
- logstash caches its file-read progress in a ".sincedb" file under its data directory:
[root@elk93 ~]# ll /usr/share/logstash/data/plugins/inputs/file/
total 12
drwxr-xr-x 2 root root 4096 Oct 31 10:34 ./
drwxr-xr-x 3 root root 4096 Oct 31 10:05 ../
-rw-r--r-- 1 root root 100 Oct 31 10:34 .sincedb_f673f6cfb25c903a69af206b740195d3
[root@elk93 ~]#
The .sincedb file explained
.sincedb is Logstash's record of file-read progress.

What it does after a Logstash restart:

No .sincedb      → the file is read from the beginning; every log line is re-sent to ES
.sincedb present → reading resumes from the last recorded position; nothing is re-sent

So logs are not sent twice after a service restart.
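The idea behind .sincedb can be sketched like this. This is a simplified illustration (the real file also tracks inode and device numbers, not just a path); the state-file path is made up for the demo:

```python
import json
import os

STATE = "/tmp/sincedb_demo.json"  # hypothetical state file for this demo

def read_new_lines(path):
    """Read only the lines appended since the last call, persisting the
    byte offset between runs -- the same idea as Logstash's .sincedb."""
    offsets = {}
    if os.path.exists(STATE):
        with open(STATE) as f:
            offsets = json.load(f)

    with open(path) as f:
        f.seek(offsets.get(path, 0))   # resume from the saved offset
        lines = f.readlines()
        offsets[path] = f.tell()       # remember how far we have read

    with open(STATE, "w") as f:
        json.dump(offsets, f)
    return lines
```

The first call returns the whole file; every later call returns only what was appended since, which is why a restart does not re-send old logs.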

4. Connecting Filebeat to logstash

4.1 Simulate an app generating test logs

1. Generate test data
[root@elk91 ~]# cat > generate_log.py <<EOF
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# @author : Jason Yin

import datetime
import random
import logging
import time
import sys

LOG_FORMAT = "%(levelname)s %(asctime)s [com.oldboyedu.%(module)s] - %(message)s "
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

# Basic configuration for the root logging.Logger instance
logging.basicConfig(level=logging.INFO, format=LOG_FORMAT, datefmt=DATE_FORMAT, filename=sys.argv[1], filemode='a')
actions = ["浏览页面", "评论商品", "加入收藏", "加入购物车", "提交订单", "使用优惠券", "领取优惠券",
           "搜索", "查看订单", "付款", "清空购物车"]

while True:
    time.sleep(random.randint(1, 5))
    user_id = random.randint(1, 10000)
    # Round the generated float to 2 decimal places
    price = round(random.uniform(15000, 30000), 2)
    action = random.choice(actions)
    svip = random.choice([0, 1, 2])
    logging.info("DAU|{0}|{1}|{2}|{3}".format(user_id, action, svip, price))
EOF

[root@elk91 ~]# python3 generate_log.py /tmp/apps.log


2. Inspect the test data
[root@elk91 ~]# tail -f /tmp/apps.log
INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU|9677|加入购物车|2|27052.06
INFO 2025-09-01 09:01:49 [com.oldboyedu.generate_log] - DAU|4032|查看订单|0|26870.7
INFO 2025-09-01 09:01:53 [com.oldboyedu.generate_log] - DAU|8366|加入购物车|1|25298.89
INFO 2025-09-01 09:01:54 [com.oldboyedu.generate_log] - DAU|651|浏览页面|0|16631.32
INFO 2025-09-01 09:01:58 [com.oldboyedu.generate_log] - DAU|5629|加入购物车|2|16103.85
INFO 2025-09-01 09:02:01 [com.oldboyedu.generate_log] - DAU|7435|加入购物车|2|18304.75
INFO 2025-09-01 09:02:03 [com.oldboyedu.generate_log] - DAU|2907|付款|0|15682.89
INFO 2025-09-01 09:02:05 [com.oldboyedu.generate_log] - DAU|6663|加入收藏|0|22213.47
INFO 2025-09-01 09:02:08 [com.oldboyedu.generate_log] - DAU|1810|浏览页面|1|20675.82
INFO 2025-09-01 09:02:13 [com.oldboyedu.generate_log] - DAU|368|清空购物车|2|22214.21
INFO 2025-09-01 09:02:17 [com.oldboyedu.generate_log] - DAU|9063|搜索|2|26552.32
...

4.2 Filebeat ships data to logstash

1. Write the logstash configuration file
[root@elk93 ~]# cat /etc/logstash/conf.d/03-beats-to-es.conf
input {
  beats {
    # Port to listen on
    port => "6666"
  }
}

filter {
  mutate {
    remove_field => [ "@version","agent","log","ecs","tags","input" ]
  }
}

output {
  stdout {
    codec => "rubydebug"
  }

  # elasticsearch {
  #   hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
  #   index => "oldboyedu-linux-logstash-apps-%{+YYYY.MM.dd}"
  # }
}
[root@elk93 ~]#


2. Start logstash
[root@elk93 ~]# logstash -rf /etc/logstash/conf.d/03-beats-to-es.conf


3. Write the Filebeat configuration file
[root@elk91 ~]# cat /tmp/apps-to-logstash.yaml
filebeat.inputs:
- type: filestream
  paths:
    - /tmp/apps.log

output.logstash:
  hosts: ["10.0.0.93:6666"]
[root@elk91 ~]#

4. Start the Filebeat instance
[root@elk91 ~]# filebeat -e -c /tmp/apps-to-logstash.yaml


5. Check the terminal output
As shown in the figure above.

Recap:

What this whole pipeline does

End-to-end path:

/tmp/apps.log on elk91
        ↓
Filebeat (on elk91) collects the log
        ↓ sends to port 6666
Logstash (on elk93) receives, filters, and processes
        ↓ prints to the terminal (debug stage)
stdout output (not written to ES yet)

Step by step

Step 1: the Logstash configuration file

input {
  beats {
    port => "6666"   # open port 6666 and wait for data from Filebeat
  }
}

filter {
  mutate {
    remove_field => ["@version","agent","log","ecs","tags","input"]
    # drop the useless fields Filebeat adds automatically, keeping the data clean
  }
}

output {
  stdout {
    codec => "rubydebug"   # print received events to the terminal for debugging
  }
  # elasticsearch {        # ES output commented out until debugging is done
  #   ...
  # }
}

Step 2: start Logstash

logstash -rf /etc/logstash/conf.d/03-beats-to-es.conf
#        ↑
# hot reload enabled; config changes take effect automatically
# Logstash now listens on port 6666 on elk93, waiting for data

Step 3: the Filebeat configuration file

filebeat.inputs:
- type: filestream
  paths:
    - /tmp/apps.log            # watch this log file

output.logstash:
  hosts: ["10.0.0.93:6666"]    # ship the collected data to port 6666 on elk93

Step 4: start Filebeat

filebeat -e -c /tmp/apps-to-logstash.yaml
#        ↑
# -e  print logs to the terminal (foreground, easy to debug)
# -c  specify the configuration file

What Logstash does here

In this step, Logstash does three things:

1. Receive data → the beats input plugin listens on port 6666 for data from Filebeat

2. Clean data → the mutate filter removes useless fields
   The raw events carry many fields Filebeat adds automatically:
   @version, agent, log, ecs, tags, input
   They are useless for the business, so we drop them to keep the data clean

3. Output data → stdout prints everything to the terminal
   The ES output is commented out on purpose: first verify the data is correct,
   then uncomment it and write to ES

Why debug with stdout first

Write straight to ES → a bad data format is hard to spot, and troubleshooting is painful
stdout first → the data is printed to the terminal, so every field can be checked at a glance

Debug flow:
Stage 1: output stdout — confirm the data format is correct ← we are here
Stage 2: output elasticsearch — once confirmed, write to ES

The whole flow in one sentence

This step verifies that:
Filebeat can correctly collect the log on elk91
and ship it to the Logstash instance on elk93,
and that Logstash can correctly receive and clean the data.

Get the pipeline working first, then write to ES.

5. Analyzing the in-house app log with logstash

5.1 Processing data with mutate

1. Write the configuration file
[root@elk93 ~]# cat /etc/logstash/conf.d/03-beats-to-es.conf
input {
  beats {
    # Port to listen on
    port => "6666"
  }
}

filter {
  mutate {
    split => { "message" => "|" }

    add_field => {
      "other"  => "%{[message][0]}"
      "userid" => "%{[message][1]}"
      "action" => "%{[message][2]}"
      "svip"   => "%{[message][3]}"
      "price"  => "%{[message][4]}"
    }
  }

  mutate {
    split => { "other" => " " }

    add_field => {
      "dt" => "%{[other][1]} %{[other][2]}"
    }

    remove_field => [ "@version","agent","log","ecs","tags","input","message","host","other" ]
  }

  mutate {
    convert => {
      "userid" => "integer"
      "price"  => "float"
    }
  }
}

output {
  stdout {
    codec => "rubydebug"
  }

  # elasticsearch {
  #   hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
  #   index => "oldboyedu-linux100-logstash-apps-%{+YYYY.MM.dd}"
  # }
}
[root@elk93 ~]#
[root@elk93 ~]#


2. Start the instance
[root@elk93 ~]# logstash -rf /etc/logstash/conf.d/03-beats-to-es.conf

Understanding it:

First, what the raw data looks like:

INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU|9677|加入购物车|2|27052.06

The goal is to break this single-line string into independent fields before storing it in ES.


First mutate: split on |

mutate {
  split => { "message" => "|" }
  # split message into an array on "|"

  add_field => {
    "other"  => "%{[message][0]}"   # INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU
    "userid" => "%{[message][1]}"   # 9677
    "action" => "%{[message][2]}"   # 加入购物车
    "svip"   => "%{[message][3]}"   # 2
    "price"  => "%{[message][4]}"   # 27052.06
  }
}

The split, step by step:

Original message:
"INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU|9677|加入购物车|2|27052.06"

After splitting on | it becomes an array:
[0] = "INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU"
[1] = "9677"
[2] = "加入购物车"
[3] = "2"
[4] = "27052.06"

↓ each element is assigned to a new field

other  = "INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU"
userid = "9677"
action = "加入购物车"
svip   = "2"
price  = "27052.06"

Second mutate: extract the time from other

mutate {
  split => { "other" => " " }
  # split other again, this time on spaces

  add_field => {
    "dt" => "%{[other][1]} %{[other][2]}"
    # join the date and the time
  }

  remove_field => ["@version","agent","log","ecs","tags",
                   "input","message","host","other"]
  # drop all the useless fields
}

The split, step by step:

other:
"INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU"

After splitting on spaces it becomes an array:
[0] = "INFO"
[1] = "2025-09-01"
[2] = "09:01:44"
[3] = "[com.oldboyedu.generate_log]"
[4] = "-"
[5] = "DAU"

↓ join [1] and [2]

dt = "2025-09-01 09:01:44"

Third mutate: type conversion

mutate {
  convert => {
    "userid" => "integer"   # "9677" → 9677, string to integer
    "price"  => "float"     # "27052.06" → 27052.06, string to float
  }
}

Why convert the types:

Without conversion: userid = "9677" (string)  → ES cannot do numeric computation or range queries
With conversion:    userid = 9677   (integer) → ES can sort, aggregate, and range-query

The same goes for price: only as a float can it be used for
average price, min/max, and other statistics

Final result after the three mutates

Before (one messy string):
message = "INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU|9677|加入购物车|2|27052.06"

After (structured fields):
dt     = "2025-09-01 09:01:44"   ← time
userid = 9677                    ← user ID (integer)
action = "加入购物车"             ← action
svip   = "2"                     ← membership level
price  = 27052.06                ← price (float)

Why three separate mutates

First mutate:  split message on | to extract the business fields

    the split must run first so that the other field exists

Second mutate: split other on spaces to extract the time field, and drop the useless fields

    the userid and price fields must exist first

Third mutate:  convert the types of the now-existing fields

The overall goal in one sentence

Take one line of unstructured raw log text,
split, clean, and convert it into structured fields,
so that statistics can be run on it once it is stored in ES.
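The three mutate stages can be mirrored in plain Python. This is an illustrative re-implementation of the same splits and conversions, not Logstash itself:

```python
def parse_app_log(message):
    """Replicate the three mutate filters: split on '|', split the head on
    spaces to build dt, then convert userid/price to numeric types."""
    parts = message.split("|")                  # first mutate: split on |
    other, userid, action, svip, price = parts
    words = other.split(" ")                    # second mutate: split on spaces
    dt = f"{words[1]} {words[2]}"               # join date + time, like %{[other][1]} %{[other][2]}
    return {
        "dt": dt,
        "userid": int(userid),                  # third mutate: string → integer
        "action": action,
        "svip": svip,
        "price": float(price),                  # third mutate: string → float
    }

event = parse_app_log(
    "INFO 2025-09-01 09:01:44 [com.oldboyedu.generate_log] - DAU|9677|加入购物车|2|27052.06"
)
print(event["dt"], event["userid"], event["price"])  # → 2025-09-01 09:01:44 9677 27052.06
```

Running a sample line through it produces exactly the structured fields shown above.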

5.2 Handling dates with the date plugin

1. Write the configuration file
[root@elk93 ~]# cat /etc/logstash/conf.d/03-beats-to-es.conf
input {
  beats {
    # Port to listen on
    port => "6666"
  }
}

filter {
  mutate {
    split => { "message" => "|" }

    add_field => {
      "other"  => "%{[message][0]}"
      "userid" => "%{[message][1]}"
      "action" => "%{[message][2]}"
      "svip"   => "%{[message][3]}"
      "price"  => "%{[message][4]}"
    }
  }

  mutate {
    split => { "other" => " " }

    add_field => {
      "dt" => "%{[other][1]} %{[other][2]}"
    }

    remove_field => [ "@version","agent","log","ecs","tags","input","message","host","other" ]
  }

  mutate {
    convert => {
      "userid" => "integer"
      "price"  => "float"
    }
  }

  date {
    # "2026-03-03 16:00:01"
    match => [ "dt", "yyyy-MM-dd HH:mm:ss" ]

    # Store the parsed datetime in the given field; if unset, the original "@timestamp" field is overwritten.
    # target => "oldboyedu-linux-dt"
  }
}

output {
  stdout {
    codec => "rubydebug"
  }

  elasticsearch {
    hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
    index => "oldboyedu-linux-logstash-apps-%{+YYYY.MM.dd}"
  }
}
[root@elk93 ~]#


2. Start the logstash instance
[root@elk93 ~]# logstash -rf /etc/logstash/conf.d/03-beats-to-es.conf


3. Visualize in Kibana
As shown in the figure above.

This step builds on the previous one and adds two features: time parsing and writing to ES.


New feature 1: the date plugin

date {
  match => [ "dt", "yyyy-MM-dd HH:mm:ss" ]
  # target => "oldboyedu-linux-dt"   ← commented out
}

Why the date plugin is needed

Without the date plugin:

@timestamp = the time Logstash received the event   ← processing time, not when the log was produced
dt = "2025-09-01 09:01:44"                          ← the time the log was actually produced

The mismatch between the two times causes:

Log actually produced at: 2025-09-01 09:01:44
Logstash processing time: 2025-03-13 10:30:22   ← months apart!

Searching logs by time in Kibana → nothing found, because @timestamp does not match

What the date plugin does

match => [ "dt", "yyyy-MM-dd HH:mm:ss" ]
#            ↑             ↑
#      source field    time format

# Take the dt field value "2025-09-01 09:01:44",
# parse it against the "yyyy-MM-dd HH:mm:ss" format,
# and overwrite the @timestamp field with the result

Effect:

Before:
dt = "2025-09-01 09:01:44"
@timestamp = "2025-03-13T10:30:22.000Z"   ← Logstash processing time

After:
dt = "2025-09-01 09:01:44"
@timestamp = "2025-09-01T09:01:44.000Z"   ← replaced with the real log time

Why target is commented out

# target => "oldboyedu-linux-dt"   ← commented out

# Commented out = the parsed result overwrites the default @timestamp field
# Uncommented   = the result is stored in the custom field oldboyedu-linux-dt, and @timestamp stays unchanged

# It is usually left commented out so that @timestamp matches the log time;
# Kibana sorts and filters on @timestamp by default
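What match => [ "dt", "yyyy-MM-dd HH:mm:ss" ] does is essentially a strptime call. In Python terms (illustrative only; timezone handling is simplified, while Logstash applies its own timezone logic):

```python
from datetime import datetime, timezone

def parse_event_time(dt):
    """Parse the dt field the way the date filter's "yyyy-MM-dd HH:mm:ss"
    pattern does, producing a UTC, @timestamp-style value.
    (Simplified: the real filter honors a configurable timezone.)"""
    parsed = datetime.strptime(dt, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).isoformat()

print(parse_event_time("2025-09-01 09:01:44"))  # → 2025-09-01T09:01:44+00:00
```

If the string does not match the format, strptime raises an error, much like the date filter tagging the event with a parse failure.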

New feature 2: writing to ES

output {
  stdout {
    codec => "rubydebug"   # keep printing to the terminal (debugging aid)
  }

  elasticsearch {          # ← new: also write to ES
    hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
    index => "oldboyedu-linux-logstash-apps-%{+YYYY.MM.dd}"
    #          ↑
    # the index name carries the date,
    # so a new index is created every day
  }
}

Index names are generated automatically by date:
oldboyedu-linux-logstash-apps-2025.09.01
oldboyedu-linux-logstash-apps-2025.09.02
oldboyedu-linux-logstash-apps-2025.09.03
...

Full comparison with the previous step

Previous step:
✅ split fields on |
✅ extract time, user ID, action, price
✅ type conversion
❌ inaccurate time (@timestamp is the processing time)
❌ printed to the terminal only, not written to ES

This step:
✅ split fields on |
✅ extract time, user ID, action, price
✅ type conversion
✅ the date plugin corrects @timestamp to the real log time ← new
✅ write to ES ← new

Final data structure written to ES:
{
"@timestamp" : "2025-09-01T09:01:44.000Z",
"dt" : "2025-09-01 09:01:44",
"userid" : 9677,
"action" : "加入购物车",
"svip" : "2",
"price" : 27052.06
}

With this, Kibana can:

Filter logs by their real time          ← @timestamp is corrected
Count daily active users                ← userid is an integer and can be aggregated
Break down the share of each action     ← the action field
Compare spending across SVIP levels     ← the svip + price fields
Compute the average order amount        ← price is a float and can be computed

6. Analyzing the app's business metrics

6.1 PV count

6.2 SVIP user share

6.3 User action analysis

6.4 User UV analysis

6.5 Platform transaction volume

[root@elk93 ~]# cat /etc/logstash/conf.d/03-beats-to-es.conf
input {
  beats {
    # Port to listen on
    port => "6666"
  }
}

filter {
  mutate {
    split => { "message" => "|" }

    add_field => {
      "other"  => "%{[message][0]}"
      "userid" => "%{[message][1]}"
      "action" => "%{[message][2]}"
      "svip"   => "%{[message][3]}"
      "price"  => "%{[message][4]}"
    }
  }

  mutate {
    split => { "other" => " " }

    add_field => {
      "dt" => "%{[other][1]} %{[other][2]}"
    }

    remove_field => [ "@version","agent","log","ecs","tags","input","message","host","other" ]
  }

  mutate {
    convert => {
      "price" => "integer"
    }
  }

  date {
    # "2025-10-31 11:37:43"
    match => [ "dt", "yyyy-MM-dd HH:mm:ss" ]

    # Store the parsed datetime in the given field; if unset, the original "@timestamp" field is overwritten.
    # target => "oldboyedu-linux-dt"
  }
}

output {
  # stdout {
  #   codec => "rubydebug"
  # }

  elasticsearch {
    hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
    index => "oldboyedu-linux100-logstash-apps-%{+YYYY.MM.dd}"
  }
}
[root@elk93 ~]#
[root@elk93 ~]# logstash -rf /etc/logstash/conf.d/03-beats-to-es.conf

6.6 Build the Dashboard

六. ElasticStack Troubleshooting Guide

1. EFK troubleshooting tips

EFK troubleshooting checklist

If Kibana shows no data, what could be wrong?
- 1. The source data itself may be empty;
- 2. The filebeat process is not running;
- 3. The filebeat configuration file is wrong;
- 4. The ES cluster is down (if so, Kibana itself becomes unreachable);
- 5. Filebeat's local data directory has recorded an offset, so the data is not re-shipped to the ES cluster;
- 6. The Kibana query time range does not match;
- 7. ES cluster split-brain: the ES cluster Kibana connects to is not the one filebeat writes to.

2. ELFK troubleshooting tips

ELFK troubleshooting checklist

If Kibana shows no data, what could be the cause?
- 1. The source data file itself is empty;
- 2. The filebeat instance has died;
- 3. Filebeat loaded a wrong configuration, e.g. data is not being written to Logstash or ES, or the source is not being collected;
- 4. The Logstash instance has died;
- 5. Logstash loaded a wrong configuration, e.g. data is not being written to the ES cluster;
- 6. The Kibana index pattern does not correctly match the ES indices, so the query logic finds nothing;
- 7. The Kibana query time range is wrong;
- 8. With a file-type input in the ELK architecture, the recorded offset may cause reading to start from the wrong position;
- 9. ES has the data but Kibana cannot chart it: likely a data-type problem; fix the mapping via an index template.

What is an offset?

offset = the file-read offset = a record of the byte position reached on the last read

七. Review and Homework

1. Today in review

- Kibana visualizations		*****
- PV
- IP
- bandwidth
- device types
- operating systems
- global user distribution
- Dashboard

- EFK analysis of a web cluster ****

- Filebeat multiline matching ***


- ELFK analysis of the in-house app log ****
- mutate
- date

- Cluster troubleshooting *****

2.今日作业

- 完成课堂的所有练习并整理思维导图;

- 编写程序模拟nginx访问日志10w条记录,并使用EFK架构分析全球用户分布图,设备类型等信息。

- 使用ansible Playbook一键部署ELFK架构;
- 编写程序模拟nginx访问日志10w条记录,并使用EFK架构分析全球用户分布图,设备类型等信息。

Python代码(AI生成)

root@elk91:~# cat generate_nginx_log_V2.py 
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
# generate_nginx_log.py
# 生成模拟 Nginx 访问日志 10万条

import random
import time
import datetime
import sys
import os

# ── 全球各大洲真实 IP 段 ──────────────────────────────────────
IP_RANGES = {
    "asia": [
        ("1.0.1.0", "1.0.3.255"),          # 中国
        ("1.180.0.0", "1.180.255.255"),    # 中国
        ("14.0.0.0", "14.127.255.255"),    # 中国
        ("27.0.0.0", "27.63.255.255"),     # 中国
        ("36.0.0.0", "36.63.255.255"),     # 中国
        ("49.0.0.0", "49.127.255.255"),    # 中国
        ("58.0.0.0", "58.255.255.255"),    # 中国
        ("59.0.0.0", "59.63.255.255"),     # 中国
        ("60.0.0.0", "60.255.255.255"),    # 中国
        ("61.0.0.0", "61.63.255.255"),     # 中国
        ("101.0.0.0", "101.127.255.255"),  # 中国
        ("103.0.0.0", "103.63.255.255"),   # 印度
        ("110.0.0.0", "110.127.255.255"),  # 日本/韩国
        ("111.0.0.0", "111.63.255.255"),   # 日本
        ("112.0.0.0", "112.127.255.255"),  # 中国
        ("113.0.0.0", "113.127.255.255"),  # 中国
        ("114.0.0.0", "114.127.255.255"),  # 中国
        ("115.0.0.0", "115.127.255.255"),  # 中国
        ("116.0.0.0", "116.127.255.255"),  # 中国
        ("117.0.0.0", "117.127.255.255"),  # 中国
        ("118.0.0.0", "118.63.255.255"),   # 中国
        ("119.0.0.0", "119.63.255.255"),   # 中国
        ("120.0.0.0", "120.63.255.255"),   # 中国
        ("121.0.0.0", "121.63.255.255"),   # 中国
        ("122.0.0.0", "122.127.255.255"),  # 中国
        ("123.0.0.0", "123.127.255.255"),  # 中国
        ("124.0.0.0", "124.127.255.255"),  # 中国
        ("125.0.0.0", "125.127.255.255"),  # 中国
        ("126.0.0.0", "126.127.255.255"),  # 日本
        ("175.0.0.0", "175.63.255.255"),   # 中国
        ("180.0.0.0", "180.127.255.255"),  # 中国
        ("182.0.0.0", "182.127.255.255"),  # 中国
        ("183.0.0.0", "183.63.255.255"),   # 中国
        ("202.0.0.0", "202.127.255.255"),  # 亚洲混合
        ("203.0.0.0", "203.127.255.255"),  # 亚洲混合
        ("210.0.0.0", "210.127.255.255"),  # 日本/韩国
        ("211.0.0.0", "211.63.255.255"),   # 中国
        ("218.0.0.0", "218.127.255.255"),  # 中国
        ("219.0.0.0", "219.63.255.255"),   # 中国
        ("220.0.0.0", "220.127.255.255"),  # 中国
        ("221.0.0.0", "221.127.255.255"),  # 中国
        ("222.0.0.0", "222.127.255.255"),  # 中国
        ("223.0.0.0", "223.63.255.255"),   # 中国
    ],
    "north_america": [
        ("3.0.0.0", "3.255.255.255"),      # AWS 美国
        ("4.0.0.0", "4.255.255.255"),      # 美国
        ("8.0.0.0", "8.255.255.255"),      # 美国 Level3
        ("12.0.0.0", "12.255.255.255"),    # AT&T 美国
        ("13.0.0.0", "13.255.255.255"),    # 微软 Azure 美国
        ("15.0.0.0", "15.255.255.255"),    # HP 美国
        ("17.0.0.0", "17.255.255.255"),    # Apple 美国
        ("18.0.0.0", "18.255.255.255"),    # MIT/Amazon
        ("23.0.0.0", "23.255.255.255"),    # Akamai 美国
        ("34.0.0.0", "34.255.255.255"),    # Google Cloud
        ("35.0.0.0", "35.255.255.255"),    # Google
        ("40.0.0.0", "40.255.255.255"),    # 微软
        ("44.0.0.0", "44.255.255.255"),    # 美国业余无线
        ("45.0.0.0", "45.63.255.255"),     # 美国
        ("52.0.0.0", "52.255.255.255"),    # Amazon AWS
        ("54.0.0.0", "54.255.255.255"),    # Amazon AWS
        ("66.0.0.0", "66.255.255.255"),    # 美国各ISP
        ("67.0.0.0", "67.255.255.255"),    # 美国
        ("68.0.0.0", "68.255.255.255"),    # 美国
        ("69.0.0.0", "69.255.255.255"),    # 美国
        ("70.0.0.0", "70.255.255.255"),    # 美国
        ("71.0.0.0", "71.255.255.255"),    # 美国
        ("72.0.0.0", "72.255.255.255"),    # 美国
        ("73.0.0.0", "73.255.255.255"),    # 美国 Comcast
        ("74.0.0.0", "74.255.255.255"),    # 美国
        ("75.0.0.0", "75.255.255.255"),    # 美国
        ("76.0.0.0", "76.255.255.255"),    # 美国
        ("96.0.0.0", "96.255.255.255"),    # 美国/加拿大
        ("97.0.0.0", "97.255.255.255"),    # 美国
        ("98.0.0.0", "98.255.255.255"),    # 美国
        ("99.0.0.0", "99.255.255.255"),    # 美国 Amazon
        ("107.0.0.0", "107.255.255.255"),  # 美国
        ("108.0.0.0", "108.255.255.255"),  # 美国
        ("173.0.0.0", "173.255.255.255"),  # 美国
        ("174.0.0.0", "174.255.255.255"),  # 美国/加拿大
        ("184.0.0.0", "184.255.255.255"),  # 美国/加拿大
        ("199.0.0.0", "199.255.255.255"),  # 美国
        ("204.0.0.0", "204.255.255.255"),  # 美国
        ("205.0.0.0", "205.255.255.255"),  # 美国/加拿大
        ("206.0.0.0", "206.255.255.255"),  # 美国/加拿大
        ("207.0.0.0", "207.255.255.255"),  # 美国
        ("208.0.0.0", "208.255.255.255"),  # 美国
        ("209.0.0.0", "209.255.255.255"),  # 美国
        ("216.0.0.0", "216.255.255.255"),  # 美国
    ],
    "europe": [
        ("2.0.0.0", "2.255.255.255"),      # 欧洲混合
        ("5.0.0.0", "5.255.255.255"),      # 欧洲混合
        ("31.0.0.0", "31.255.255.255"),    # 欧洲
        ("37.0.0.0", "37.255.255.255"),    # 欧洲
        ("46.0.0.0", "46.255.255.255"),    # 欧洲
        ("51.0.0.0", "51.255.255.255"),    # 英国/微软
        ("62.0.0.0", "62.255.255.255"),    # 欧洲
        ("77.0.0.0", "77.255.255.255"),    # 欧洲
        ("78.0.0.0", "78.255.255.255"),    # 欧洲
        ("79.0.0.0", "79.255.255.255"),    # 欧洲
        ("80.0.0.0", "80.255.255.255"),    # 欧洲
        ("81.0.0.0", "81.255.255.255"),    # 欧洲
        ("82.0.0.0", "82.255.255.255"),    # 欧洲
        ("83.0.0.0", "83.255.255.255"),    # 欧洲
        ("84.0.0.0", "84.255.255.255"),    # 欧洲
        ("85.0.0.0", "85.255.255.255"),    # 欧洲
        ("86.0.0.0", "86.255.255.255"),    # 欧洲
        ("87.0.0.0", "87.255.255.255"),    # 欧洲
        ("88.0.0.0", "88.255.255.255"),    # 欧洲
        ("89.0.0.0", "89.255.255.255"),    # 欧洲
        ("90.0.0.0", "90.255.255.255"),    # 欧洲
        ("91.0.0.0", "91.255.255.255"),    # 欧洲
        ("92.0.0.0", "92.255.255.255"),    # 欧洲
        ("93.0.0.0", "93.255.255.255"),    # 欧洲
        ("94.0.0.0", "94.255.255.255"),    # 欧洲
        ("95.0.0.0", "95.255.255.255"),    # 欧洲
        ("176.0.0.0", "176.255.255.255"),  # 欧洲
        ("178.0.0.0", "178.255.255.255"),  # 欧洲
        ("185.0.0.0", "185.255.255.255"),  # 欧洲
        ("188.0.0.0", "188.255.255.255"),  # 欧洲
        ("193.0.0.0", "193.255.255.255"),  # 欧洲
        ("194.0.0.0", "194.255.255.255"),  # 欧洲
        ("195.0.0.0", "195.255.255.255"),  # 欧洲
        ("213.0.0.0", "213.255.255.255"),  # 欧洲
        ("217.0.0.0", "217.255.255.255"),  # 欧洲
    ],
    "south_america": [
        ("177.0.0.0", "177.255.255.255"),  # 巴西
        ("179.0.0.0", "179.255.255.255"),  # 巴西
        ("181.0.0.0", "181.255.255.255"),  # 南美
        ("186.0.0.0", "186.255.255.255"),  # 南美
        ("187.0.0.0", "187.255.255.255"),  # 南美
        ("189.0.0.0", "189.255.255.255"),  # 巴西
        ("190.0.0.0", "190.255.255.255"),  # 南美
        ("191.0.0.0", "191.255.255.255"),  # 南美
        ("200.0.0.0", "200.255.255.255"),  # 南美
        ("201.0.0.0", "201.255.255.255"),  # 南美
    ],
    "africa": [
        ("41.0.0.0", "41.255.255.255"),    # 非洲
        ("102.0.0.0", "102.255.255.255"),  # 非洲
        ("105.0.0.0", "105.255.255.255"),  # 非洲
        ("154.0.0.0", "154.255.255.255"),  # 非洲
        ("196.0.0.0", "196.255.255.255"),  # 非洲
        ("197.0.0.0", "197.255.255.255"),  # 非洲
    ],
    "oceania": [
        ("1.120.0.0", "1.159.255.255"),      # 澳大利亚
        ("14.128.0.0", "14.191.255.255"),    # 澳大利亚
        ("27.96.0.0", "27.127.255.255"),     # 澳大利亚
        ("49.128.0.0", "49.191.255.255"),    # 澳大利亚
        ("58.96.0.0", "58.111.255.255"),     # 澳大利亚
        ("101.160.0.0", "101.191.255.255"),  # 澳大利亚
        ("110.136.0.0", "110.143.255.255"),  # 澳大利亚
        ("121.200.0.0", "121.215.255.255"),  # 澳大利亚
        ("124.168.0.0", "124.191.255.255"),  # 澳大利亚
        ("139.130.0.0", "139.130.255.255"),  # 澳大利亚
        ("144.136.0.0", "144.137.255.255"),  # 澳大利亚
        ("150.101.0.0", "150.101.255.255"),  # 澳大利亚
        ("175.100.0.0", "175.111.255.255"),  # 新西兰
        ("202.0.0.0", "202.63.255.255"),     # 澳洲/太平洋
        ("203.96.0.0", "203.127.255.255"),   # 澳洲
    ],
    "middle_east": [
        ("5.0.0.0", "5.63.255.255"),       # 中东
        ("37.0.0.0", "37.63.255.255"),     # 中东
        ("46.0.0.0", "46.63.255.255"),     # 中东
        ("78.0.0.0", "78.63.255.255"),     # 中东
        ("82.0.0.0", "82.63.255.255"),     # 中东
        ("86.0.0.0", "86.63.255.255"),     # 伊朗
        ("91.0.0.0", "91.63.255.255"),     # 中东
        ("94.0.0.0", "94.63.255.255"),     # 中东
        ("176.0.0.0", "176.63.255.255"),   # 中东
        ("178.0.0.0", "178.63.255.255"),   # 中东
        ("185.0.0.0", "185.63.255.255"),   # 中东
        ("188.0.0.0", "188.63.255.255"),   # 中东
        ("213.0.0.0", "213.63.255.255"),   # 中东
    ],
}

# 各地区访问权重(模拟真实业务场景:亚洲流量最多)
REGION_WEIGHTS = {
    "asia": 45,
    "north_america": 25,
    "europe": 18,
    "south_america": 5,
    "oceania": 3,
    "africa": 2,
    "middle_east": 2,
}

# ── User-Agent 池(真实 UA 字符串)────────────────────────────
USER_AGENTS = {
    "desktop_chrome_windows": [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Windows NT 11.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36",
    ],
    "desktop_chrome_mac": [
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
    ],
    "desktop_firefox_windows": [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:119.0) Gecko/20100101 Firefox/119.0",
    ],
    "desktop_firefox_linux": [
        "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
        "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
    ],
    "desktop_safari_mac": [
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Safari/605.1.15",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Safari/605.1.15",
    ],
    "desktop_edge_windows": [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.0.0",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0",
    ],
    "mobile_chrome_android": [
        "Mozilla/5.0 (Linux; Android 14; Pixel 8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.6099.144 Mobile Safari/537.36",
        "Mozilla/5.0 (Linux; Android 13; SM-S918B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.6099.144 Mobile Safari/537.36",
        "Mozilla/5.0 (Linux; Android 14; SM-A546B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.6099.144 Mobile Safari/537.36",
        "Mozilla/5.0 (Linux; Android 12; Redmi Note 11) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.6099.144 Mobile Safari/537.36",
        "Mozilla/5.0 (Linux; Android 13; HUAWEI P60 Pro) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.6099.144 Mobile Safari/537.36",
        "Mozilla/5.0 (Linux; Android 14; OnePlus 12) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.6099.144 Mobile Safari/537.36",
    ],
    "mobile_safari_iphone": [
        "Mozilla/5.0 (iPhone; CPU iPhone OS 17_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Mobile/15E148 Safari/604.1",
        "Mozilla/5.0 (iPhone; CPU iPhone OS 17_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Mobile/15E148 Safari/604.1",
        "Mozilla/5.0 (iPhone; CPU iPhone OS 16_7 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1",
        "Mozilla/5.0 (iPhone; CPU iPhone OS 15_8 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1",
    ],
    "tablet_ipad": [
        "Mozilla/5.0 (iPad; CPU OS 17_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Mobile/15E148 Safari/604.1",
        "Mozilla/5.0 (iPad; CPU OS 16_7 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1",
    ],
    "tablet_android": [
        "Mozilla/5.0 (Linux; Android 13; SM-X906C) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.6099.144 Safari/537.36",
        "Mozilla/5.0 (Linux; Android 12; SM-T870) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.6099.144 Safari/537.36",
    ],
    "bot": [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
        "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
        "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)",
    ],
}

# UA 权重
UA_WEIGHTS = {
    "desktop_chrome_windows": 28,
    "desktop_chrome_mac": 10,
    "desktop_firefox_windows": 6,
    "desktop_firefox_linux": 3,
    "desktop_safari_mac": 7,
    "desktop_edge_windows": 5,
    "mobile_chrome_android": 22,
    "mobile_safari_iphone": 13,
    "tablet_ipad": 3,
    "tablet_android": 2,
    "bot": 1,
}

# ── 请求路径 ──────────────────────────────────────────────────
URIS = {
    "/": 10,
    "/index.html": 8,
    "/about": 3,
    "/contact": 2,
    "/products": 6,
    "/products/list": 5,
    "/products/detail": 5,
    "/cart": 4,
    "/checkout": 3,
    "/order/confirm": 2,
    "/user/login": 4,
    "/user/register": 3,
    "/user/profile": 3,
    "/user/orders": 2,
    "/search": 5,
    "/api/v1/products": 4,
    "/api/v1/user/info": 3,
    "/api/v1/cart/add": 3,
    "/api/v1/order/submit": 2,
    "/api/v1/search": 3,
    "/static/css/main.css": 6,
    "/static/js/app.js": 6,
    "/static/img/banner.jpg": 5,
    "/static/img/logo.png": 4,
    "/favicon.ico": 4,
    "/robots.txt": 2,
    "/sitemap.xml": 1,
    "/admin": 1,
    "/wp-login.php": 1,   # 模拟爬虫/攻击
    "/phpmyadmin": 1,     # 模拟扫描
}

# HTTP 方法
METHODS = {"GET": 78, "POST": 18, "PUT": 2, "DELETE": 1, "HEAD": 1}

# HTTP 状态码权重
STATUS_CODES = {
    200: 65,
    206: 2,
    301: 3,
    302: 5,
    304: 10,
    400: 2,
    401: 1,
    403: 2,
    404: 6,
    408: 1,
    429: 1,
    500: 1,
    502: 1,
}

# HTTP 协议版本
HTTP_VERSIONS = {"HTTP/1.1": 60, "HTTP/2.0": 35, "HTTP/1.0": 5}

# Referer 列表
REFERERS = [
    "-",
    "https://www.google.com/",
    "https://www.baidu.com/",
    "https://www.bing.com/",
    "https://www.google.co.jp/",
    "https://search.yahoo.com/",
    "https://www.facebook.com/",
    "https://t.co/",
    "https://www.instagram.com/",
    "https://www.linkedin.com/",
    "https://www.zhihu.com/",
    "https://weibo.com/",
]

# ── 工具函数 ──────────────────────────────────────────────────

def weighted_choice(weight_dict):
    """按权重随机选择"""
    keys = list(weight_dict.keys())
    weights = list(weight_dict.values())
    return random.choices(keys, weights=weights, k=1)[0]


def ip_to_int(ip):
    parts = list(map(int, ip.split(".")))
    return (parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]


def int_to_ip(n):
    return ".".join([
        str((n >> 24) & 0xFF),
        str((n >> 16) & 0xFF),
        str((n >> 8) & 0xFF),
        str(n & 0xFF),
    ])


def random_ip():
    """按地区权重生成随机 IP"""
    region = weighted_choice(REGION_WEIGHTS)
    ip_range = random.choice(IP_RANGES[region])
    start = ip_to_int(ip_range[0])
    end = ip_to_int(ip_range[1])
    return int_to_ip(random.randint(start, end))


def random_ua():
    """按权重选择 User-Agent"""
    ua_type = weighted_choice(UA_WEIGHTS)
    return random.choice(USER_AGENTS[ua_type])


def random_time(start_days_ago=30):
    """生成过去 N 天内的随机时间(模拟真实时间分布)"""
    now = datetime.datetime.now()
    delta = datetime.timedelta(
        days=random.randint(0, start_days_ago),
        hours=random.randint(0, 23),
        minutes=random.randint(0, 59),
        seconds=random.randint(0, 59),
    )
    t = now - delta
    # 模拟高峰期:白天流量高,凌晨流量低
    hour = t.hour
    if 0 <= hour < 6:
        if random.random() < 0.7:  # 70% 概率跳过凌晨低谷(减少凌晨日志)
            t += datetime.timedelta(hours=random.randint(6, 12))
    return t.strftime("%d/%b/%Y:%H:%M:%S +0800")


def random_bytes(status):
    """根据状态码生成合理的响应大小"""
    if status in (301, 302, 304):
        return random.randint(0, 256)
    elif status in (400, 401, 403, 404):
        return random.randint(256, 2048)
    elif status == 500:
        return random.randint(512, 4096)
    else:
        return random.randint(1024, 102400)


def random_request_time(status):
    """生成请求响应时间(秒)"""
    if status in (301, 302, 304):
        return round(random.uniform(0.001, 0.01), 3)
    elif status == 500:
        return round(random.uniform(1.0, 30.0), 3)
    elif status == 502:
        return round(random.uniform(5.0, 60.0), 3)
    else:
        return round(random.uniform(0.01, 2.0), 3)


def generate_log_line():
    """生成一条 Nginx Combined 格式日志"""
    remote_addr = random_ip()
    remote_user = "-"
    time_local = random_time()
    method = weighted_choice(METHODS)
    uri = weighted_choice(URIS)
    http_version = weighted_choice(HTTP_VERSIONS)
    status = weighted_choice(STATUS_CODES)
    body_bytes_sent = random_bytes(status)
    http_referer = random.choice(REFERERS)
    http_user_agent = random_ua()
    request_time = random_request_time(status)

    # Nginx Combined Log Format + request_time(与真实生产环境一致)
    return (
        f'{remote_addr} - {remote_user} [{time_local}] '
        f'"{method} {uri} {http_version}" '
        f'{status} {body_bytes_sent} '
        f'"{http_referer}" '
        f'"{http_user_agent}" '
        f'{request_time}'
    )


# ── 主程序 ────────────────────────────────────────────────────

def main():
    output_file = sys.argv[1] if len(sys.argv) > 1 else "/tmp/nginx_sim.log"
    total = int(sys.argv[2]) if len(sys.argv) > 2 else 100000

    print(f"[*] 开始生成 {total:,} 条 Nginx 模拟日志")
    print(f"[*] 输出文件: {output_file}")

    os.makedirs(os.path.dirname(os.path.abspath(output_file)), exist_ok=True)

    with open(output_file, "w", encoding="utf-8") as f:
        for i in range(1, total + 1):
            f.write(generate_log_line() + "\n")
            if i % 10000 == 0:
                pct = i / total * 100
                print(f"    进度: {i:>7,} / {total:,} ({pct:.0f}%)")

    size_mb = os.path.getsize(output_file) / 1024 / 1024
    print(f"[✓] 生成完成!文件大小: {size_mb:.1f} MB")


if __name__ == "__main__":
    main()

执行,生成日志 ( elk91 的 /tmp/nginx_sim.log )

[root@elk91 ~]# python3 generate_nginx_log_V2.py /tmp/nginx_sim.log 100000

Filebeat(elk91)采集日志 → 发送到 6666 端口

root@elk91:~# cat /tmp/nginx-sim-to-logstash.yaml
filebeat.inputs:
- type: filestream
  paths:
    - /tmp/nginx_sim.log

output.logstash:
  hosts: ["10.0.0.93:6666"]

Logstash(elk93)接收、过滤、处理, 写入ES

root@elk93:/etc/logstash/conf.d# cat /etc/logstash/conf.d/04-beats-nginx-to-es.conf 
input {
  beats {
    port => "6666"
  }
}

filter {

  # ── 第一步:grok 解析 Nginx 日志格式 ──────────────────────
  grok {
    match => {
      "message" => '%{IPORHOST:remote_addr} - - \[%{HTTPDATE:time_local}\] \"%{WORD:method} %{URIPATHPARAM:request_uri} HTTP/%{NUMBER:http_version}\" %{NUMBER:status:int} %{NUMBER:body_bytes_sent:int} \"%{DATA:http_referer}\" \"%{DATA:http_user_agent}\" %{NUMBER:request_time:float}'
    }
    tag_on_failure => ["_grok_failure"]
  }

  # ── 第二步:修正时间戳 ────────────────────────────────────
  date {
    match => ["time_local", "dd/MMM/yyyy:HH:mm:ss Z"]
    target => "@timestamp"
    timezone => "Asia/Shanghai"
    remove_field => ["time_local"]
  }

  # ── 第三步:GeoIP 解析(国家、城市、经纬度)──────────────
  geoip {
    source => "remote_addr"
    target => "geoip"
    fields => ["city_name", "country_name", "country_code2",
               "continent_code", "location", "region_name"]
    tag_on_failure => ["_geoip_failure"]
  }

  # ── 第四步:User-Agent 解析 ───────────────────────────────
  useragent {
    source => "http_user_agent"
    target => "ua"
  }

  # ── 第五步:设备类型标准化 ────────────────────────────────
  if [ua][device][name] == "iPhone" or [ua][os_name] == "Android" {
    mutate { add_field => { "device_type" => "Mobile" } }
  } else if [ua][device][name] == "iPad" or [ua][device][name] =~ /(?i)tablet/ {
    mutate { add_field => { "device_type" => "Tablet" } }
  } else if [ua][name] =~ /(?i)(bot|spider|crawler)/ {
    mutate { add_field => { "device_type" => "Bot" } }
  } else {
    mutate { add_field => { "device_type" => "Desktop" } }
  }

  # ── 第六步:HTTP 状态码分类 ───────────────────────────────
  if [status] >= 500 {
    mutate { add_field => { "status_type" => "5xx Server Error" } }
  } else if [status] >= 400 {
    mutate { add_field => { "status_type" => "4xx Client Error" } }
  } else if [status] >= 300 {
    mutate { add_field => { "status_type" => "3xx Redirect" } }
  } else {
    mutate { add_field => { "status_type" => "2xx Success" } }
  }

  # ── 第七步:流量来源分类 ──────────────────────────────────
  if [http_referer] == "-" or [http_referer] == "" {
    mutate { add_field => { "traffic_source" => "Direct" } }
  } else if [http_referer] =~ /google/ {
    mutate { add_field => { "traffic_source" => "Google" } }
  } else if [http_referer] =~ /baidu/ {
    mutate { add_field => { "traffic_source" => "Baidu" } }
  } else if [http_referer] =~ /bing/ {
    mutate { add_field => { "traffic_source" => "Bing" } }
  } else if [http_referer] =~ /facebook|twitter|instagram|linkedin|weibo|zhihu/ {
    mutate { add_field => { "traffic_source" => "Social" } }
  } else {
    mutate { add_field => { "traffic_source" => "Other Referral" } }
  }

  # ── 第八步:清理无用字段 ──────────────────────────────────
  mutate {
    remove_field => ["message", "@version", "agent", "log",
                     "ecs", "tags", "input", "host"]
  }

}

output {
  # 调试阶段先开 stdout,确认数据格式正确后注释掉
  stdout {
    codec => "rubydebug"
  }

  elasticsearch {
    hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
    index => "oldboyedu-linux-nginx-sim-%{+YYYY.MM.dd}"
  }
}
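把日志写入 ES 之前,可以先用一段 Python 正则粗略自测 grok 模式能否匹配脚本生成的日志行。下面的正则是 grok 模式的简化等价写法(仅演示匹配逻辑,并非 Logstash 的内部实现),样例日志为手工构造:

```python
import re

# grok 模式的简化版正则:命名分组对应 grok 里的字段名
NGINX_RE = re.compile(
    r'(?P<remote_addr>\S+) - - \[(?P<time_local>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<request_uri>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<status>\d+) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)" (?P<request_time>[\d.]+)'
)

# 手工构造的一条样例日志(与生成脚本的输出格式一致)
sample = ('203.0.113.10 - - [14/Mar/2026:12:35:28 +0800] '
          '"GET /index.html HTTP/1.1" 200 5326 '
          '"https://www.google.com/" "Mozilla/5.0" 0.123')

m = NGINX_RE.match(sample)
print(m.group("status"), m.group("request_uri"))   # → 200 /index.html
```

如果这里能匹配,但 Logstash 里仍打上 _grok_failure 标签,再回头排查 grok 模式本身的转义问题。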

Kibana数据可视化,创建索引模式

遇到的问题:索引模式不包括任何地理字段

尝试解决:克隆老师之前的索引模板,在映射中点击 添加字段

添加 location 字段:

字段名:geoip.location
字段类型:geo_point ← 这个很关键

创建模板,并让模板的索引匹配模式(index_patterns)能匹配到自己之后要写入的索引名
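上述界面操作也可以改用 API 完成。下面是一个通过 curl 创建索引模板、把 geoip.location 声明为 geo_point 的草稿(模板名和 index_patterns 是假设值,需按实际索引名调整):

```shell
# 创建索引模板,提前把 geoip.location 映射为 geo_point,
# 后续写入的 oldboyedu-linux-nginx-sim-* 索引会自动应用该映射
curl -X PUT "10.0.0.91:9200/_index_template/oldboyedu-nginx-sim" \
  -H "Content-Type: application/json" -d '
{
  "index_patterns": ["oldboyedu-linux-nginx-sim-*"],
  "template": {
    "mappings": {
      "properties": {
        "geoip": {
          "properties": {
            "location": { "type": "geo_point" }
          }
        }
      }
    }
  }
}'
```

注意模板只对新创建的索引生效,已存在的索引需要删除重建(或 reindex)后映射才会变成 geo_point。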

同学分享:该作业可以利用Day1的filebeat的module模块实现

- 使用ansible Playbook一键部署ELFK架构;

在Day1的扩展作业上增加利用ansible部署Logstash

通过命令生成名为logstash的角色,后面用于执行playbook
ansible-galaxy init logstash

将前面准备好的安装包和配置文件都放到/etc/ansible/roles/logstash/files目录下
[root@m01 files]#ls
01-stdin-to-stdout.conf logstash-7.17.29-x86_64.rpm

在/etc/ansible/roles/logstash/tasks下的main.yml书写剧本
[root@m01 tasks]# cat main.yml
---
# tasks file for logstash
- name: copy logstash package
  copy:
    src: logstash-7.17.29-x86_64.rpm
    dest: /root/logstash-7.17.29-x86_64.rpm

- name: install logstash
  yum:
    name: /root/logstash-7.17.29-x86_64.rpm
    disable_gpg_check: yes
    state: present

- name: copy logstash config
  copy:
    src: 01-stdin-to-stdout.conf
    dest: /etc/logstash/conf.d/01-stdin-to-stdout.conf

- name: create logstash symlink
  file:
    src: /usr/share/logstash/bin/logstash
    dest: /usr/local/bin/logstash
    state: link
    force: yes
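角色写好后,还需要一个入口 playbook 调用它才能执行(下面的 site.yml 文件名和 logstash 主机组名是假设值,按自己的 inventory 调整):

```shell
# 编写入口 playbook,调用上面生成的 logstash 角色
cat > /etc/ansible/site.yml <<'EOF'
---
- hosts: logstash
  roles:
    - logstash
EOF

# 先做语法检查,再正式执行
ansible-playbook --syntax-check /etc/ansible/site.yml
ansible-playbook /etc/ansible/site.yml
```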

3.扩展作业

- 调研Loki+Grafana轻量级日志采集方案,并给出测试样例。

参考博客:
https://www.cnblogs.com/xiangpeng/p/18127120

其中promtail的配置文件的修改:

[root@elk93 promtail]$ cat promtail.yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: ./positions.yaml

clients:
  - url: http://10.0.0.91:8094/loki/api/v1/push   # 日志服务器loki地址和端口

scrape_configs:
  # ucenter1
  - job_name: oldboyedu-elk93
    static_configs:
      - targets:
          - localhost
        labels:                      # labels与targets同属一个static_config条目,前面不能再加"-"
          job: oldboyedu-elk93
          host: localhost
          __path__: /tmp/haha.log    # 本机日志路径

八.自己整理

vi/vim多行注释和取消注释
多行注释:
  1. 在普通模式下按 Ctrl + v 进入 Visual Block 模式,选择需要注释的行

  2. 按大写字母 I 进入行首插入,再输入注释符,例如 # 或 // 等注释符号

  3. 按 Esc 键,所有选中的行都会添加注释

取消多行注释:
  1. 在普通模式下按 Ctrl + v 进入 Visual Block 模式

  2. 选中注释符号

  3. 按 d 键删除
  
vim +行号 文件名
打开文件的同时直接定位到指定行,例如 vim +100 nginx.conf
省去了打开后再手动跳行的步骤,配置文件很长时非常实用


短选项 → -f 单横线+字母 写起来快,适合命令行日常使用
长选项 → --path.config 双横线+单词 写起来清晰,适合脚本和文档

九.思维导图

Kibana出图展示,Filebeat多行匹配,故障排查技巧及ELFK架构