OOM and open-file-limit problems when running a Jar under the Supervisor daemon (also applicable when supervising PHP or Python programs)
   
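Installing Supervisor on CentOS:
https://blog.csdn.net/fenglailea/article/details/77146248
Installing Supervisor on Ubuntu:

     apt-get install supervisor

Enter Y when prompted to confirm, then check that the directory

     /etc/supervisor/conf.d

exists, and run

     supervisorctl status

to verify that Supervisor is working.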
   
    
    
    
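     Problem 1: OOM

While running the jar under Supervisor, the application failed with an OOM error:
java.lang.OutOfMemoryError: unable to create new native thread
At the time, neither the server's CPU nor its memory was exhausted.
The Supervisor program configuration was: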
[program:iot08-api-java]
LimitNOFILE=40960
LimitNPROC=40960
command=/usr/java/jdk1.8.0_281/bin/java -Xms3g -Xmx3g -XX:NewSize=3584m -XX:PermSize=64m -XX:SurvivorRatio=1 -XX:+UseParallelGC -XX:-UseAdaptiveSizePolicy -jar api-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod
directory=/jy/iot08-api
user=root
autostart=true
autorestart=true
priority=200
stdout_logfile=/jy/iot08-api/logs/supervisor.log
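The JVM heap size is set on the Supervisor command line:

     -Xms3g -Xmx3g

However, when the error occurred,

     top

showed that the process had not reached its memory limit, which rules out an undersized JVM heap setting as the cause.

This variant of OutOfMemoryError is raised when the operating system refuses to create another thread, so the error message suggests that the thread count had hit a limit and no new native thread could be created.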
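To test the "too many threads" hypothesis, the thread count of the Java process can be read from the `Threads:` field of `/proc/<pid>/status`. The following is a minimal sketch, not part of the original troubleshooting; the function names are hypothetical and the sample text is made up for illustration.

```python
from pathlib import Path


def parse_thread_count(status_text: str) -> int:
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid: int) -> int:
    """Read the live thread count of a process (Linux only)."""
    return parse_thread_count(Path(f"/proc/{pid}/status").read_text())


if __name__ == "__main__":
    # Made-up sample, not output captured from the server in this post.
    sample = "Name:\tjava\nState:\tS (sleeping)\nThreads:\t412\n"
    print(parse_thread_count(sample))
```

On the server, `thread_count(<pid of the java process>)` gives the number to compare against the task limit of the service.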
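Reference:
https://blog.csdn.net/Variazioni/article/details/104060854
After some research, the likely cause is this: supervisord is started at boot as a systemd service, and systemd limits the number of tasks (processes and threads) that each service may create.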
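Check the task limit that applies to Supervisor:

     cd /sys/fs/cgroup/pids/system.slice

then run

     cat supervisor.service/pids.max

and see whether the value is too low. (This path is for cgroup v1; on a cgroup v2 system the file is under /sys/fs/cgroup/system.slice/ instead.)
If it is too low, edit

     /etc/systemd/system.conf

and change the setting

     DefaultTasksMax=4656

to a larger value.

https://blog.csdn.net/weixin_39606911/article/details/110815801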
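The check above can also be scripted. The sketch below reads `pids.max` together with `pids.current` (the number of tasks the cgroup currently holds) and reports the remaining headroom. The function names are hypothetical, and the directory passed to `read_task_usage` is whatever cgroup directory exists on the machine in question.

```python
from pathlib import Path
from typing import Optional, Tuple


def parse_pids_max(text: str) -> Optional[int]:
    """pids.max holds an integer, or the word 'max' when there is no limit."""
    value = text.strip()
    return None if value == "max" else int(value)


def headroom(current: int, limit: Optional[int]) -> Optional[int]:
    """Tasks that can still be created; None means unlimited."""
    return None if limit is None else limit - current


def read_task_usage(cgroup_dir: str) -> Tuple[int, Optional[int]]:
    """Return (current tasks, task limit) for one cgroup directory."""
    base = Path(cgroup_dir)
    current = int((base / "pids.current").read_text().strip())
    limit = parse_pids_max((base / "pids.max").read_text())
    return current, limit


if __name__ == "__main__":
    # Illustrative numbers only: 4600 tasks in use against a limit of 4656.
    print(headroom(4600, parse_pids_max("4656\n")))
```

For example, `read_task_usage("/sys/fs/cgroup/pids/system.slice/supervisor.service")` would return the pair for the Supervisor service on a cgroup v1 system. A small headroom while the application is under load points at the task limit rather than at the JVM heap.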
    
    
    
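     Reloading the systemd configuration

A reboot would apply the new configuration, but servers usually cannot be rebooted at will. systemd provides commands that reload its configuration without a reboot; after reloading, restart the affected service.
To reload the systemd manager configuration:
systemctl daemon-reexec
If you only changed a single service file, reloading the unit files is enough,
    e.g.:

     vi /usr/lib/systemd/system/supervisord.service

systemctl daemon-reload
systemctl restart supervisord
(On Ubuntu the unit is named supervisor rather than supervisord.)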
    
    
    
     Problem 2: file handles, "Too many open files"

Part of the error log:
2021-08-29 21:16:00.152 ERROR 5116 --- [http-nio-8096-exec-192] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.web.multipart.MultipartException: Failed to parse multipart servlet request; nested exception is java.io.IOException: org.apache.tomcat.util.http.fileupload.impl.IOFileUploadException: Processing of multipart/form-data request failed. /tmp/tomcat.3014993044008228489.8096/work/Tomcat/localhost/ROOT/upload_e697e5f0_82e8_4e09_82c9_6722d7ea847a_00004483.tmp (Too many open files)] with root cause
java.io.FileNotFoundException: /tmp/tomcat.3014993044008228489.8096/work/Tomcat/localhost/ROOT/upload_e697e5f0_82e8_4e09_82c9_6722d7ea847a_00004483.tmp (Too many open files)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
        at org.apache.tomcat.util.http.fileupload.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:151)
        at org.apache.tomcat.util.http.fileupload.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:200)
        at org.apache.tomcat.util.http.fileupload.ThresholdingOutputStream.write(ThresholdingOutputStream.java:126)
        at org.apache.tomcat.util.http.fileupload.util.Streams.copy(Streams.java:105)
        at org.apache.tomcat.util.http.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:291)
        at org.apache.catalina.connector.Request.parseParts(Request.java:2869)
        at org.apache.catalina.connector.Request.getParts(Request.java:2771)
        at org.apache.catalina.connector.RequestFacade.getParts(RequestFacade.java:1098)
        at org.springframework.web.multipart.support.StandardMultipartHttpServletRequest.parseRequest(StandardMultipartHttpServletRequest.java:95)
        at org.springframework.web.multipart.support.StandardMultipartHttpServletRequest.<init>(StandardMultipartHttpServletRequest.java:88)
        at org.springframework.web.multipart.support.StandardServletMultipartResolver.resolveMultipart(StandardServletMultipartResolver.java:87)
        at org.springframework.web.servlet.DispatcherServlet.checkMultipart(DispatcherServlet.java:1178)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1012)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:880)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1590)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)
        
2021-08-29 21:16:01.368 ERROR 5116 --- [http-nio-8096-Acceptor] org.apache.tomcat.util.net.Acceptor      : Socket accept failed
java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:469)
        at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:71)
        at org.apache.tomcat.util.net.Acceptor.run(Acceptor.java:106)
        at java.lang.Thread.run(Thread.java:748)
I then checked the system's file handle limits:
root@iZwz998ewjqvn993qhie7hZ:~# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15604
max locked memory       (kbytes, -l) 16384
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1000000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1000000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
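To see how close a process actually is to its open-file limit, it can help to count the descriptors it holds: each entry in `/proc/<pid>/fd` is one open descriptor. This is a minimal sketch with hypothetical function names; the `proc_root` parameter is only there so the function can be pointed at a different directory, and reading another user's process usually requires root.

```python
import os


def count_open_fds(pid: int, proc_root: str = "/proc") -> int:
    """Count the entries in <proc_root>/<pid>/fd (Linux only)."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def fd_usage(open_count: int, soft_limit: int) -> float:
    """Fraction of the soft open-file limit that is currently in use."""
    if soft_limit <= 0:
        raise ValueError("soft_limit must be positive")
    return open_count / soft_limit


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the server in this post.
    print(f"{fd_usage(4000, 4096):.1%} of the limit in use")
```

A count that climbs toward the limit and never drops would also suggest a descriptor leak in the application, which raising the limit only postpones.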
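The limit is already 1,000,000, which cannot be too low. Note, however, that ulimit -a reports the limits of the current login shell, not those of a service started by systemd. Searching further turned up:
https://baijiahao.baidu.com/s?id=1619745245952371100&wfr=spider&for=pc
which explains:
On CentOS, with the distribution's own supervisor package, supervisord is started as a systemd service. Programs managed by supervisor inherit the limits systemd applies to that service; to change them, edit the limits in the .service unit file:
vim /usr/lib/systemd/system/supervisord.service
which contains these two parameters:
LimitNOFILE=40960
LimitNPROC=40960
Querying systemd's defaults then showed that the open-file limit was too low: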
root@iZwz998ewjqvn993qhie7hZ:~# systemctl show -p DefaultLimitNOFILE
DefaultLimitNOFILE=4096
root@iZwz998ewjqvn993qhie7hZ:~# systemctl show -p DefaultLimit
DefaultLimitAS              DefaultLimitLOCKSSoft       DefaultLimitRSS
DefaultLimitASSoft          DefaultLimitMEMLOCK         DefaultLimitRSSSoft
DefaultLimitCORE            DefaultLimitMEMLOCKSoft     DefaultLimitRTPRIO
DefaultLimitCORESoft        DefaultLimitMSGQUEUE        DefaultLimitRTPRIOSoft
DefaultLimitCPU             DefaultLimitMSGQUEUESoft    DefaultLimitRTTIME
DefaultLimitCPUSoft         DefaultLimitNICE            DefaultLimitRTTIMESoft
DefaultLimitDATA            DefaultLimitNICESoft        DefaultLimitSIGPENDING
DefaultLimitDATASoft        DefaultLimitNOFILE          DefaultLimitSIGPENDINGSoft
DefaultLimitFSIZE           DefaultLimitNOFILESoft      DefaultLimitSTACK
DefaultLimitFSIZESoft       DefaultLimitNPROC           DefaultLimitSTACKSoft
DefaultLimitLOCKS           DefaultLimitNPROCSoft       
root@iZwz998ewjqvn993qhie7hZ:~# systemctl show -p DefaultLimitNPROC
DefaultLimitNPROC=15604
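The systemd defaults are what a service receives unless its unit file overrides them. To confirm what a running process actually received, its effective limits can be read from `/proc/<pid>/limits`. The parser below is a sketch with hypothetical function names; the sample text imitates the layout of that file and reuses the default values shown above for illustration.

```python
from typing import Optional, Tuple


def _to_int(value: str) -> Optional[int]:
    """Convert a limit field to int; 'unlimited' becomes None."""
    return None if value == "unlimited" else int(value)


def parse_limit(limits_text: str, name: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (soft, hard) for one row of /proc/<pid>/limits, e.g. 'Max open files'."""
    for line in limits_text.splitlines():
        if line.startswith(name):
            fields = line[len(name):].split()
            return _to_int(fields[0]), _to_int(fields[1])
    raise KeyError(name)


if __name__ == "__main__":
    # Sample in the layout of /proc/<pid>/limits; values are illustrative.
    sample = (
        "Limit                     Soft Limit           Hard Limit           Units\n"
        "Max cpu time              unlimited            unlimited            seconds\n"
        "Max processes             15604                15604                processes\n"
        "Max open files            4096                 4096                 files\n"
    )
    print(parse_limit(sample, "Max open files"))
    print(parse_limit(sample, "Max processes"))
```

On the server the text would come from `open(f"/proc/{pid}/limits").read()` for the pid of the Java process, both before and after changing the configuration.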
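So I raised the limits manually.
In supervisord.service, add the following to the [Service] section:
LimitNOFILE=1000000
LimitNPROC=1000000
so that every program managed by supervisord inherits these limits by default. Run systemctl daemon-reload and restart the supervisord service afterwards for the change to take effect.
The same two lines were also added to the program's own section of the supervisord configuration, shown next. Note that LimitNOFILE and LimitNPROC are systemd directives and are not among the options Supervisor documents for [program:x] sections, so the change in supervisord.service is most likely the one that matters; Supervisor's own settings for these limits are minfds and minprocs in the [supervisord] section.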
[program:iot08-api-java]
LimitNOFILE=1000000
LimitNPROC=1000000
command=/usr/java/jdk1.8.0_281/bin/java -Xms3g -Xmx3g -XX:NewSize=3584m -XX:PermSize=64m -XX:SurvivorRatio=1 -XX:+UseParallelGC -XX:-UseAdaptiveSizePolicy -jar api-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod
directory=/jy/iot08-api
user=root
autostart=true
autorestart=true
priority=200
stdout_logfile=/jy/iot08-api/logs/supervisor.log
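Because systemd directives such as `LimitNOFILE` are not Supervisor program options, a quick check for keys that do not belong in a `[program:x]` section can catch this kind of mix-up. The sketch below uses Python's `configparser`; the function name is hypothetical, and `KNOWN_PROGRAM_OPTIONS` is an assumed, possibly incomplete list of option names taken from Supervisor's documentation, so extend it before relying on the result.

```python
import configparser
from typing import Dict, List

# Assumption: option names documented for [program:x] sections; may be incomplete.
KNOWN_PROGRAM_OPTIONS = {
    "command", "process_name", "numprocs", "numprocs_start", "directory",
    "umask", "priority", "autostart", "autorestart", "startsecs",
    "startretries", "exitcodes", "stopsignal", "stopwaitsecs", "stopasgroup",
    "killasgroup", "user", "redirect_stderr", "environment", "serverurl",
    "stdout_logfile", "stdout_logfile_maxbytes", "stdout_logfile_backups",
    "stdout_capture_maxbytes", "stdout_events_enabled", "stdout_syslog",
    "stderr_logfile", "stderr_logfile_maxbytes", "stderr_logfile_backups",
    "stderr_capture_maxbytes", "stderr_events_enabled", "stderr_syslog",
}


def unknown_program_options(config_text: str) -> Dict[str, List[str]]:
    """Map each [program:x] section to the keys not in KNOWN_PROGRAM_OPTIONS."""
    parser = configparser.RawConfigParser(strict=False)
    parser.optionxform = str  # keep the original key case, e.g. LimitNOFILE
    parser.read_string(config_text)
    result: Dict[str, List[str]] = {}
    for section in parser.sections():
        if section.startswith("program:"):
            extra = [k for k in parser.options(section)
                     if k not in KNOWN_PROGRAM_OPTIONS]
            if extra:
                result[section] = extra
    return result


if __name__ == "__main__":
    sample = (
        "[program:iot08-api-java]\n"
        "LimitNOFILE=1000000\n"
        "LimitNPROC=1000000\n"
        "directory=/jy/iot08-api\n"
        "user=root\n"
    )
    print(unknown_program_options(sample))
```

Any key the function reports is worth checking against the Supervisor documentation before assuming it has an effect.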
    
    
    
     Reloading the supervisord configuration

After changing the supervisord configuration files, apply the changes with

     supervisorctl update

then restart the modified program:

     supervisorctl restart iot08-api-java
    
 
