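Shortly after six in the morning I saw an alert on my phone. I opened the admin panel link in Safari on the phone and got no response. I got up, switched on the computer, and tried again in a desktop browser; the status bar at the bottom just showed Waiting for ......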
Dec 18 2025 - 07:05:00
WHM/cPanel and FTP logins all worked normally, but the website would not load.
Dec 18 2025 - 07:09:30
error code 522
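522: the connection to the origin timed out (network path, firewall dropping packets, or high load on the origin)
See the screenshot below, although I captured that one today.
SSH login to the server was smooth.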
curl -I http://127.0.0.1/
curl -Ik https://127.0.0.1/
The local checks came back normal.
ss -lntp | egrep ':80|:443'
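The output revealed the anomaly.
*:80 shows LISTEN 0 128 (normal)
*:443 shows LISTEN 129 128 (abnormal)
In ss -lntp output, Recv-Q on a LISTEN line is the number of connections queued and waiting to be accepted, and Send-Q is the backlog limit. The accept queue on 443 was full (Linux lets the queue hold backlog + 1 entries, which is why it reads 129 against a limit of 128). The result: new connections from Cloudflare cannot get in or wait too long in the queue → connect timeout → Cloudflare reports 522.
So the question is no longer whether 80/443 are listening or whether the machine is overloaded, but why connections on 443 are not being accepted fast enough (all workers busy, KeepAlive too long, a stuck backend, or a connection flood/scan).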
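The Recv-Q versus Send-Q comparison described above can be automated. The following is a minimal sketch, not the command used during the incident: `full_listen_queues` is a hypothetical helper name, and the sample input simply mirrors the two listeners quoted in this post so the snippet can be run anywhere.

```shell
#!/bin/sh
# full_listen_queues (hypothetical helper, not part of iproute2):
# reads `ss -lnt` style output on stdin and reports listening sockets
# whose accept queue (Recv-Q, column 2) has reached the backlog
# limit (Send-Q, column 3).
full_listen_queues() {
    awk '$1 == "LISTEN" && $3 > 0 && $2 >= $3 {
        printf "%s accept queue full: %d/%d\n", $4, $2, $3
    }'
}

# Canned sample that mirrors the two listeners from the post.
sample='State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      128    *:80               *:*
LISTEN 129    128    *:443              *:*'

printf '%s\n' "$sample" | full_listen_queues

# On a live server, feed it real data instead:
#   ss -lnt | full_listen_queues
```

With the sample input only the 443 listener is reported, because port 80 has an empty queue.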
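In fact, ChatGPT pointed out where the anomaly was within a few minutes.
Simply restarting brought the site back, so I left it at that. In reality it failed again within a few minutes; the problem had not been solved.
Today the alerts got worse, so it had to be fixed properly.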
~# date
Fri Dec 19 11:54:35 CST 2025
~# echo "== listen queue 443 =="; ss -lnt | awk '$4 ~ /:443$/ {print $2 "/" $3}'
== listen queue 443 ==
1025/1024
~# echo "== estab 443 =="; ss -ntH state established '( sport = :443 )' | wc -l
== estab 443 ==
443
~# echo "== syn-recv 443 =="; ss -ntH state syn-recv '( sport = :443 )' | wc -l
== syn-recv 443 ==
911
~# echo "== scoreboard counts ==";
== scoreboard counts ==
~# curl -s --max-time 2 http://127.0.0.1/server-status?auto \
> | awk -F': ' '/^Scoreboard:/{s=$2} END{for(i=1;i<=length(s);i++)c[substr(s,i,1)]++; for(k in c)print k,c[k]}' \
> | sort -k2 -nr || echo "server-status timeout"
_ 100
W 150
. 256
~# echo "== busy/idle =="
== busy/idle ==
~# curl -s --max-time 2 http://127.0.0.1/server-status?auto | egrep 'ServerMPM|BusyWorkers|IdleWorkers' || true
ServerMPM: prefork
ParentServerMPMGeneration: 1
BusyWorkers: 150
IdleWorkers: 0
~# echo "== top states of httpd =="
== top states of httpd ==
~# ps -C httpd --no-headers | wc -l
151
~# top -bn1 | head -n 25
top - 11:54:38 up 1 day, 4:49, 1 user, load average: 0.57, 3.77, 4.56
Tasks: 563 total, 1 running, 562 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.5 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 65589344 total, 570792 free, 3288612 used, 61729940 buff/cache
KiB Swap: 4194300 total, 3053052 free, 1141248 used. 61686688 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4967 root 20 0 160600 2312 1492 R 5.9 0.0 0:00.03 top
1 root 20 0 194192 5624 3160 S 0.0 0.0 1:34.17 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.51 kthreadd
4 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:+
6 root 20 0 0 0 0 S 0.0 0.0 0:04.68 ksoftirqd/0
7 root rt 0 0 0 0 S 0.0 0.0 0:00.66 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 S 0.0 0.0 8:21.46 rcu_sched
10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-dr+
11 root rt 0 0 0 0 S 0.0 0.0 0:00.41 watchdog/0
12 root rt 0 0 0 0 S 0.0 0.0 0:00.30 watchdog/1
13 root rt 0 0 0 0 S 0.0 0.0 0:00.61 migration/1
14 root 20 0 0 0 0 S 0.0 0.0 0:03.16 ksoftirqd/1
16 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:+
18 root rt 0 0 0 0 S 0.0 0.0 0:00.30 watchdog/2
19 root rt 0 0 0 0 S 0.0 0.0 0:00.60 migration/2
20 root 20 0 0 0 0 S 0.0 0.0 0:02.91 ksoftirqd/2
22 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/2:+
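No matter how I changed the parameters, nothing took effect. The command below finally showed why: the configuration files were fighting each other.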
~# egrep -RIn '^[[:space:]]*(ServerLimit|MaxRequestWorkers|MaxClients|KeepAlive|KeepAliveTimeout|MaxKeepAliveRequests|Timeout)[[:space:]]+' \
> /etc/apache2/conf/httpd.conf /etc/apache2/conf.d /etc/apache2/conf.modules.d 2>/dev/null
/etc/apache2/conf/httpd.conf:70:ServerLimit 256
/etc/apache2/conf/httpd.conf:71:MaxRequestWorkers 150
/etc/apache2/conf/httpd.conf:73:KeepAlive On
/etc/apache2/conf/httpd.conf:74:KeepAliveTimeout 5
/etc/apache2/conf/httpd.conf:75:MaxKeepAliveRequests 100
/etc/apache2/conf/httpd.conf:76:Timeout 300
/etc/apache2/conf.d/includes/pre_main_global.conf:2: ServerLimit 256
/etc/apache2/conf.d/includes/pre_main_global.conf:3: MaxRequestWorkers 256
/etc/apache2/conf.d/includes/pre_main_global.conf:12:KeepAlive On
/etc/apache2/conf.d/includes/pre_main_global.conf:13:KeepAliveTimeout 1
/etc/apache2/conf.d/includes/pre_main_global.conf:14:Timeout 60
In the end I did the following:
- ListenBacklog from 1024 → 4096
- KeepAliveTimeout from 5 → 1
- Timeout from 300 → 60
- MaxRequestWorkers from 150 → 256
- Cleaned up the conflicting configuration
That solved the problem.
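One detail worth checking when raising ListenBacklog: on Linux the backlog an application passes to listen() is silently capped at net.core.somaxconn, so a larger ListenBacklog only helps if that sysctl is at least as large. The sketch below illustrates the arithmetic; `effective_backlog` is a hypothetical helper, and the somaxconn values in the example calls are assumptions, not readings from this server.

```shell
#!/bin/sh
# effective_backlog (hypothetical helper): the kernel caps the backlog
# requested via listen() at net.core.somaxconn, so the effective limit
# is the smaller of the two numbers.
effective_backlog() {
    requested=$1
    somaxconn=$2
    if [ "$requested" -gt "$somaxconn" ]; then
        echo "$somaxconn"
    else
        echo "$requested"
    fi
}

# Example values only: 4096 is the new ListenBacklog from the list above;
# the somaxconn values are assumed for illustration.
effective_backlog 4096 1024   # prints 1024: the sysctl is the bottleneck
effective_backlog 4096 4096   # prints 4096: the full backlog is usable

# On a live server, read the real limit instead:
#   effective_backlog 4096 "$(sysctl -n net.core.somaxconn)"
```

After a restart, the Send-Q column of ss -lnt for the 443 listener shows the limit that actually took effect.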
~# httpd -v
Server version: Apache/2.4.66 (cPanel)
Server built: Dec 8 2025 14:42:54
~# cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)
~# php -v
PHP 7.2.34 (cli) (built: Oct 29 2025 00:17:30) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.2.0, Copyright (c) 1998-2018 Zend Technologies
~# rpm -qa | egrep 'ea-apache24-mod_mpm_(prefork|event|worker)'
ea-apache24-mod_mpm_prefork-2.4.66-2.3.1.cpanel.x86_64
~# apachectl -M 2>&1 | egrep -i 'mpm|php|proxy_fcgi'
mpm_prefork_module (shared)
proxy_fcgi_module (shared)
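The operating system is very old, while the Apache version is very new. Apache runs with PHP-FPM but still uses MPM prefork. I do not know why connections suddenly piled up on port 443.
Switching from MPM prefork to MPM event is the common recommendation these days, and it may be the long-term fix.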