OOM and open-file-limit problems when the Supervisor daemon manages a Jar (also relevant when managing PHP or Python programs)
Installing Supervisor on CentOS:
https://blog.csdn.net/fenglailea/article/details/77146248
Installing Supervisor on Ubuntu:
apt-get install supervisor
Enter Y when prompted to confirm.
Check that the directory
/etc/supervisor/conf.d
exists, then run
supervisorctl status
to confirm that Supervisor is working.
Problem 1: OOM
While Supervisor was managing the jar, the following OOM error appeared:
java.lang.OutOfMemoryError: unable to create new native thread
At that time neither the server's CPU nor its memory was exhausted.
The Supervisor program configuration was:
[program:iot08-api-java]
LimitNOFILE=40960
LimitNPROC=40960
command=/usr/java/jdk1.8.0_281/bin/java -Xms3g -Xmx3g -XX:NewSize=3584m -XX:PermSize=64m -XX:SurvivorRatio=1 -XX:+UseParallelGC -XX:-UseAdaptiveSizePolicy -jar api-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod
directory=/jy/iot08-api
user=root
autostart=true
autorestart=true
priority=200
stdout_logfile=/jy/iot08-api/logs/supervisor.log
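The JVM heap size is set in the Supervisor command:
-Xms3g -Xmx3g
However, when the error occurred,
top
showed that the process had not reached its maximum memory limit, so an insufficient JVM memory setting can be ruled out.
The error message suggests a different cause: too many threads, so that the JVM could not create a new native thread.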
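To test the thread-count hypothesis, it helps to count the threads of the running JVM. The following is a minimal sketch, assuming Linux with /proc mounted; the function name thread_count is made up for illustration.

```shell
#!/bin/sh
# Print the number of threads of a process by counting its entries
# in /proc/<pid>/task. thread_count is a hypothetical helper name.
thread_count() {
    pid="$1"
    if [ ! -d "/proc/$pid/task" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    ls "/proc/$pid/task" | wc -l | tr -d ' '
}

# Example: thread count of the current shell.
thread_count "$$"
```

For the JVM, pass its PID instead, for example the one reported by pgrep -f api-0.0.1-SNAPSHOT.jar. The command ps -o nlwp= -p <pid> should report the same number.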
Reference:
https://blog.csdn.net/Variazioni/article/details/104060854
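After some research, the likely cause is that a service enabled at boot through systemctl is subject to a systemd limit on its number of tasks (processes and threads).
Check the task limit of the Supervisor service:
cd /sys/fs/cgroup/pids/system.slice
then run
cat supervisor.service/pids.max
and check whether the value is too low.
If it is, edit
/etc/systemd/system.conf
and raise the default value set by
DefaultTasksMax=4656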
https://blog.csdn.net/weixin_39606911/article/details/110815801
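The limit in pids.max only matters if the service is close to it, so comparing it with pids.current in the same directory is a useful check. The sketch below assumes the unit is named supervisor.service as in the path above; tasks_usage is a hypothetical helper name. The second path in the loop is where the same files live on hosts that use the unified cgroup v2 hierarchy.

```shell
#!/bin/sh
# Print "current/max" task counts for one cgroup directory.
# tasks_usage is a hypothetical helper; pass it the cgroup directory of the service.
tasks_usage() {
    dir="$1"
    if [ ! -r "$dir/pids.max" ] || [ ! -r "$dir/pids.current" ]; then
        echo "pids controller files not found in $dir" >&2
        return 1
    fi
    echo "$(cat "$dir/pids.current")/$(cat "$dir/pids.max")"
}

# cgroup v1 path from the text, then the cgroup v2 location of the same unit.
for d in /sys/fs/cgroup/pids/system.slice/supervisor.service \
         /sys/fs/cgroup/system.slice/supervisor.service; do
    if [ -d "$d" ]; then
        tasks_usage "$d" || true
    fi
done
```

If the first number is close to the second, thread creation in the managed JVM can fail even though CPU and memory are free.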
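Reloading the systemd configuration
A reboot would apply the new configuration, but servers usually cannot be rebooted casually. systemd provides commands that reload its configuration without a reboot; after reloading, just restart the service.
Command to re-execute the systemd manager:
systemctl daemon-reexec
If you only modified a single service file, reloading the unit files is enough,
e.g.:
vi /usr/lib/systemd/system/supervisord.service
systemctl daemon-reload
systemctl restart supervisord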
Problem 2: file handles (Too many open files)
Part of the error log:
2021-08-29 21:16:00.152 ERROR 5116 --- [http-nio-8096-exec-192] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.web.multipart.MultipartException: Failed to parse multipart servlet request; nested exception is java.io.IOException: org.apache.tomcat.util.http.fileupload.impl.IOFileUploadException: Processing of multipart/form-data request failed. /tmp/tomcat.3014993044008228489.8096/work/Tomcat/localhost/ROOT/upload_e697e5f0_82e8_4e09_82c9_6722d7ea847a_00004483.tmp (Too many open files)] with root cause
java.io.FileNotFoundException: /tmp/tomcat.3014993044008228489.8096/work/Tomcat/localhost/ROOT/upload_e697e5f0_82e8_4e09_82c9_6722d7ea847a_00004483.tmp (Too many open files)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
at org.apache.tomcat.util.http.fileupload.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:151)
at org.apache.tomcat.util.http.fileupload.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:200)
at org.apache.tomcat.util.http.fileupload.ThresholdingOutputStream.write(ThresholdingOutputStream.java:126)
at org.apache.tomcat.util.http.fileupload.util.Streams.copy(Streams.java:105)
at org.apache.tomcat.util.http.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:291)
at org.apache.catalina.connector.Request.parseParts(Request.java:2869)
at org.apache.catalina.connector.Request.getParts(Request.java:2771)
at org.apache.catalina.connector.RequestFacade.getParts(RequestFacade.java:1098)
at org.springframework.web.multipart.support.StandardMultipartHttpServletRequest.parseRequest(StandardMultipartHttpServletRequest.java:95)
at org.springframework.web.multipart.support.StandardMultipartHttpServletRequest.<init>(StandardMultipartHttpServletRequest.java:88)
at org.springframework.web.multipart.support.StandardServletMultipartResolver.resolveMultipart(StandardServletMultipartResolver.java:87)
at org.springframework.web.servlet.DispatcherServlet.checkMultipart(DispatcherServlet.java:1178)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1012)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:880)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1590)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
2021-08-29 21:16:01.368 ERROR 5116 --- [http-nio-8096-Acceptor] org.apache.tomcat.util.net.Acceptor : Socket accept failed
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:469)
at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:71)
at org.apache.tomcat.util.net.Acceptor.run(Acceptor.java:106)
at java.lang.Thread.run(Thread.java:748)
I then checked the system's file handle limit:
root@iZwz998ewjqvn993qhie7hZ:~# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15604
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 1000000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1000000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The limit is already 1,000,000, which should be more than enough. A search turned up:
https://baijiahao.baidu.com/s?id=1619745245952371100&wfr=spider&for=pc
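which explains:
On CentOS, when using the system-provided supervisor, the supervisord service is started by systemd. Programs managed by supervisor inherit the limits that systemd applies to that service, so to change them you need to edit the limits in the .service unit file: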
vim /usr/lib/systemd/system/supervisord.service
It contains these two parameters:
LimitNOFILE=40960
LimitNPROC=40960
Querying systemd's default limits then showed that the open-file limit was too low:
root@iZwz998ewjqvn993qhie7hZ:~# systemctl show -p DefaultLimitNOFILE
DefaultLimitNOFILE=4096
root@iZwz998ewjqvn993qhie7hZ:~# systemctl show -p DefaultLimit
DefaultLimitAS DefaultLimitLOCKSSoft DefaultLimitRSS
DefaultLimitASSoft DefaultLimitMEMLOCK DefaultLimitRSSSoft
DefaultLimitCORE DefaultLimitMEMLOCKSoft DefaultLimitRTPRIO
DefaultLimitCORESoft DefaultLimitMSGQUEUE DefaultLimitRTPRIOSoft
DefaultLimitCPU DefaultLimitMSGQUEUESoft DefaultLimitRTTIME
DefaultLimitCPUSoft DefaultLimitNICE DefaultLimitRTTIMESoft
DefaultLimitDATA DefaultLimitNICESoft DefaultLimitSIGPENDING
DefaultLimitDATASoft DefaultLimitNOFILE DefaultLimitSIGPENDINGSoft
DefaultLimitFSIZE DefaultLimitNOFILESoft DefaultLimitSTACK
DefaultLimitFSIZESoft DefaultLimitNPROC DefaultLimitSTACKSoft
DefaultLimitLOCKS DefaultLimitNPROCSoft
root@iZwz998ewjqvn993qhie7hZ:~# systemctl show -p DefaultLimitNPROC
DefaultLimitNPROC=15604
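The ulimit -a output above describes the login shell, not the managed process. The limits a running process actually received can be read from /proc/<pid>/limits. A minimal sketch, assuming Linux; proc_limit is a hypothetical helper name:

```shell
#!/bin/sh
# Print the soft limit of one resource for a given PID, read from /proc/<pid>/limits.
# proc_limit is a hypothetical helper name.
proc_limit() {
    pid="$1"
    name="$2"
    awk -v name="$name" '
        index($0, name) == 1 {
            # Drop the resource name; the first remaining column is the soft limit.
            rest = substr($0, length(name) + 1)
            split(rest, cols, " ")
            print cols[1]
        }' "/proc/$pid/limits"
}

# Example: limits of the current shell.
proc_limit "$$" "Max open files"
proc_limit "$$" "Max processes"
```

Passing the PID of the Java process instead of $$ shows whether it runs with the shell's 1000000 or with a much lower value inherited from the service.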
So I changed it manually.
You can add the following to supervisord.service:
LimitNOFILE=1000000
LimitNPROC=1000000
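so that every process managed by Supervisor inherits these limits by default.
Alternatively, the same two lines can be placed in the Supervisor configuration of the specific program (note that LimitNOFILE and LimitNPROC are systemd directives rather than documented [program:x] options; Supervisor's own settings for these limits are minfds and minprocs in the [supervisord] section, so the change to supervisord.service is the more reliable one):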
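Editing the unit file under /usr/lib/systemd/system works, but a package upgrade may overwrite it. One hedged alternative is a systemd drop-in file, which systemd merges with the packaged unit. The sketch below only writes the file into a directory passed as an argument; write_limits_dropin and the file name limits.conf are made up for illustration, and the unit name must match the system (supervisord.service in the CentOS example, supervisor.service in the Ubuntu cgroup path earlier).

```shell
#!/bin/sh
# Write a drop-in that raises the two limits named in the text.
# write_limits_dropin is a hypothetical helper; $1 is the drop-in directory,
# e.g. /etc/systemd/system/supervisord.service.d
write_limits_dropin() {
    dir="$1"
    mkdir -p "$dir" || return 1
    cat > "$dir/limits.conf" <<'EOF'
[Service]
LimitNOFILE=1000000
LimitNPROC=1000000
EOF
}

# Demonstration against a scratch directory, so nothing on the system changes.
demo_dir="$(mktemp -d)/supervisord.service.d"
write_limits_dropin "$demo_dir"
cat "$demo_dir/limits.conf"
```

On a real host the directory would be /etc/systemd/system/supervisord.service.d, followed by systemctl daemon-reload and a restart of the service. Running systemctl show supervisord -p LimitNOFILE afterwards should report the new value.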
[program:iot08-api-java]
LimitNOFILE=1000000
LimitNPROC=1000000
command=/usr/java/jdk1.8.0_281/bin/java -Xms3g -Xmx3g -XX:NewSize=3584m -XX:PermSize=64m -XX:SurvivorRatio=1 -XX:+UseParallelGC -XX:-UseAdaptiveSizePolicy -jar api-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod
directory=/jy/iot08-api
user=root
autostart=true
autorestart=true
priority=200
stdout_logfile=/jy/iot08-api/logs/supervisor.log
Reloading the Supervisor configuration
After changing a Supervisor configuration file, apply the change with:
supervisorctl update
Then restart the modified program:
supervisorctl restart iot08-api-java
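After the restart it is worth confirming that the process has headroom again. A minimal sketch, assuming Linux; fd_count is a hypothetical helper that counts the entries in /proc/<pid>/fd:

```shell
#!/bin/sh
# Count the file descriptors a process currently has open.
# fd_count is a hypothetical helper name; inspecting another user's
# process requires root.
fd_count() {
    pid="$1"
    ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Example: open descriptors of the current shell.
fd_count "$$"
```

For the managed program, the PID can be taken from supervisorctl pid iot08-api-java, and the count compared with the "Max open files" row of /proc/<pid>/limits.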