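A Simple Investigation of Abnormal nginx Traffic
Symptoms
I woke up to find that Alibaba Cloud had sent me dozens of traffic alert emails.
The server monitoring showed the number of non-established TCP connections spiking repeatedly and then dropping back to normal. Server traffic also jumped to 10+ MB/s in bursts and returned to normal after a while.
Investigation
I ran sudo nethogs -d 1 to watch per-process traffic in real time. After some waiting, it pointed to nginx: worker process as the source. When the alerts fired, both sent and received traffic exceeded 10 MB/s: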
# sudo nethogs -d 1
PID USER PROGRAM DEV SENT RECEIVED
392225 www-da.. nginx: worker process ztphij 10490.422 12648.170 KB/sec
... ...
Next, I used tail -f to check the nginx access log:
# tail -f /var/log/nginx/access.log
45.148.10.21 - - [27/Apr/2026:09:05:39 +0800] "GET /wp-json/gravitysmtp/v1/tests/mock-data?page=gravitysmtp-settings HTTP/1.1" 301 178 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
45.148.10.21 - - [27/Apr/2026:09:07:06 +0800] "GET /wp-json/gravitysmtp/v1/tests/mock-data?page=gravitysmtp-settings HTTP/1.1" 301 178 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
74.7.228.2 - - [27/Apr/2026:09:08:41 +0800] "GET /robots.txt HTTP/1.1" 301 178 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36; compatible; OAI-SearchBot/1.3; robots.txt; +https://openai.com/searchbot"
74.7.228.2 - - [27/Apr/2026:09:08:42 +0800] "GET /robots.txt HTTP/2.0" 404 138 "http://zerotier.yuany3721.site/robots.txt" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36; compatible; OAI-SearchBot/1.3; robots.txt; +https://openai.com/searchbot"
... ...
# These are all low-frequency requests. Some of them are unwelcome, but they are harmless.
# tail -f /var/log/nginx/stream.log
216.73.216.76 [27/Apr/2026:09:41:46 +0800] TCP 200 2421 222 0.264
216.73.216.76 [27/Apr/2026:09:41:46 +0800] TCP 200 2421 222 0.271
216.73.216.76 [27/Apr/2026:09:41:46 +0800] TCP 200 2418 222 0.291
216.73.216.76 [27/Apr/2026:09:41:46 +0800] TCP 200 2421 222 0.275
216.73.216.76 [27/Apr/2026:09:41:46 +0800] TCP 200 2419 222 0.266
216.73.216.76 [27/Apr/2026:09:41:46 +0800] TCP 200 2421 222 0.292
216.73.216.76 [27/Apr/2026:09:41:46 +0800] TCP 200 2420 222 0.290
216.73.216.76 [27/Apr/2026:09:41:47 +0800] TCP 200 3746 426 19.624
# Aha, found the culprit! This IP was firing off hundreds of requests per second. The vast majority sent 222 bytes, and the sessions were very short.
Solution
Temporary fix
Add a deny list to nginx.
Tip
If the deny list grows large, blocking at the firewall with iptables + ipset is a better fit.
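As a rough illustration of the iptables + ipset approach mentioned in the tip, the sketch below converts nginx-style `deny <addr>;` lines into ipset commands. The helper name `deny_to_ipset` and the set name `blocklist` are assumptions made for this example and are not part of the original setup. The function only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: turn nginx "deny <addr>;" lines (read from stdin)
# into ipset/iptables commands. Nothing is executed; commands are printed.
# The set name is passed as $1 ("blocklist" below is an assumed name).
deny_to_ipset() {
    set_name="$1"
    # hash:net accepts both single addresses and CIDR ranges.
    echo "ipset create $set_name hash:net -exist"
    # Keep only "deny <addr>;" lines, skip "deny all;", strip the semicolon.
    awk -v name="$set_name" '$1 == "deny" && $2 != "all;" {
        sub(/;$/, "", $2)
        print "ipset add " name " " $2 " -exist"
    }'
    # A single iptables rule then covers every address in the set.
    echo "iptables -I INPUT -m set --match-set $set_name src -j DROP"
}

# Example: deny_to_ipset blocklist < /etc/nginx/deny.conf
```

With this approach the kernel drops the packets before nginx ever sees the connection, and one iptables rule covers the whole set however many addresses it holds.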
# sudo vi /etc/nginx/deny.conf
deny 216.73.216.76;
# Multiple deny entries can be listed here without touching the main nginx configuration again
# Add the following inside the server block in nginx.conf
include /etc/nginx/deny.conf;
allow all;
# Reload the configuration
# sudo nginx -s reload
Once the configuration takes effect, check the logs again:
# tail -f /var/log/nginx/stream.log
# Everything from this IP is now rejected with 403
216.73.216.76 [27/Apr/2026:09:48:59 +0800] TCP 403 0 0 0.000
216.73.216.76 [27/Apr/2026:09:48:59 +0800] TCP 403 0 0 0.000
216.73.216.76 [27/Apr/2026:09:49:00 +0800] TCP 403 0 0 0.000
216.73.216.76 [27/Apr/2026:09:49:00 +0800] TCP 403 0 0 0.000
216.73.216.76 [27/Apr/2026:09:49:00 +0800] TCP 403 0 0 0.000
216.73.216.76 [27/Apr/2026:09:49:00 +0800] TCP 403 0 0 0.000
# tail -f /var/log/nginx/error.log
# Expected messages for blocked connections
2026/04/27 09:58:40 [error] 486051#486051: *247585 access forbidden by rule while initializing session, client: 216.73.216.76, server: 0.0.0.0:443
2026/04/27 09:58:40 [error] 486051#486051: *247586 access forbidden by rule while initializing session, client: 216.73.216.76, server: 0.0.0.0:443
2026/04/27 09:58:40 [error] 486051#486051: *247587 access forbidden by rule while initializing session, client: 216.73.216.76, server: 0.0.0.0:443
2026/04/27 09:58:40 [error] 486051#486051: *247588 access forbidden by rule while initializing session, client: 216.73.216.76, server: 0.0.0.0:443
fail2ban
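The temporary fix does not hold for long: the attacker simply switches to another IP and carries on.
fail2ban works by monitoring specified log files in real time (for example, the nginx access or error log) and matching abnormal behavior with regular expressions (such as a flood of 404s or repeated hits on a login endpoint). Once an IP reaches the configured threshold within the configured time window, fail2ban automatically calls the system firewall (such as iptables) to ban it according to the specified rules.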
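The threshold idea can be approximated offline to see which IPs would trip a given limit. The sketch below is a simplification, not how fail2ban itself is implemented: it counts lines per IP in fixed one-second buckets instead of a sliding window, and it assumes the stream log format shown earlier in this post (`<IP> [time zone] TCP ...`). The function name `busy_ips` is hypothetical.

```shell
#!/bin/sh
# Rough offline approximation of a threshold check over the stream log.
# Reads log lines from stdin and prints "<IP> <peak>" for every IP whose
# busiest single second had at least $1 connections.
busy_ips() {
    threshold="$1"
    awk -v limit="$threshold" '
        # Key each line by IP plus timestamp (one-second resolution).
        { count[$1 " " $2]++ }
        END {
            # Find the busiest second for each IP.
            for (key in count) {
                split(key, parts, " ")
                if (count[key] > peak[parts[1]]) peak[parts[1]] = count[key]
            }
            # Report only the IPs that reach the limit.
            for (ip in peak)
                if (peak[ip] >= limit) print ip, peak[ip]
        }' | sort
}

# Example: busy_ips 30 < /var/log/nginx/stream.log
```

Running this against an existing log before choosing a threshold gives a sense of how far apart normal clients and the attacker are.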
sudo apt install fail2ban
sudo systemctl start fail2ban
sudo systemctl enable fail2ban
Configure fail2ban:
# Create jail.local, whose settings override /etc/fail2ban/jail.conf
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
sudo vi /etc/fail2ban/jail.local
[DEFAULT]
ignoreip = 172.30.0.0/16
bantime = -1
[nginx-stream-attack]
enabled = true
filter = nginx-stream-attack
logpath = /var/log/nginx/stream.log
# Trigger after 30 matches
maxretry = 30
# within a 5-second window
findtime = 5
# Ban permanently
bantime = -1
action = iptables-allports[name=StreamAttacker, protocol=tcp]
sudo vi /etc/fail2ban/filter.d/nginx-stream-attack.conf
[Definition]
# Match stream log lines
# Log format: <IP> [time] TCP xxx ...
failregex = ^<HOST> .* TCP \d{3} .*$
# Do not ignore any lines
ignoreregex =
Restart fail2ban and check the status:
sudo systemctl restart fail2ban
sudo fail2ban-client status nginx-stream-attack
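Before relying on the jail, it may be worth checking that the filter pattern matches stream log lines and nothing else. The sketch below approximates the failregex with `grep -E`. It substitutes an IPv4-only pattern for fail2ban's `<HOST>` tag and `[0-9]` for `\d`; both substitutions are assumptions made for illustration, and `matching_lines` is a hypothetical helper.

```shell
#!/bin/sh
# Count how many lines on stdin would match an approximation of the filter:
#   failregex = ^<HOST> .* TCP \d{3} .*$
# <HOST> is replaced here by a simple IPv4 pattern (an assumption; the real
# tag also covers IPv6 and hostnames), and \d by [0-9] for POSIX ERE.
matching_lines() {
    grep -Ec '^[0-9]{1,3}(\.[0-9]{1,3}){3} .* TCP [0-9]{3} .*$'
}

# Example: matching_lines < /var/log/nginx/stream.log
```

fail2ban also ships a fail2ban-regex tool that tests a filter file against a real log file, which is the more faithful check. Note that this pattern matches every stream connection regardless of status, so with the settings above any client that opens 30 connections within 5 seconds would be banned permanently.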