RabbitMQ: high memory usage in binary / other_system

Post author:xfxia
Post published: September 21, 2023
Post category: Other

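Recently the MQ application on one of our servers showed fairly high memory usage, yet the management console showed very few messages backlogged in memory.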

Checking with rabbitmqctl showed that binary data and other data (the binary and other_system entries) took up the vast majority of the memory:

{memory,
 [{total,124441400},
  {connection_readers,5548680},
  {connection_writers,605560},
  {connection_channels,2798608},
  {connection_other,7775480},
  {queue_procs,23561696},
  {queue_slave_procs,0},
  {plugins,489128},
  {other_proc,14147408},
  {mnesia,2040288},
  {mgmt_db,9741144},
  {msg_index,1946696},
  {other_ets,2006528},
  {binary,420491264},
  {code,20160575},
  {atom,711569},
  {other_system,323416776}]},
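As a rough aid for reading this breakdown, the following Python sketch parses the {name,bytes} pairs and ranks them by size. The function names parse_memory and rank are made up for illustration, and the regex only assumes the flat {atom,integer} layout shown above. Because binary alone is larger than the reported total in this output, the script ranks absolute sizes instead of computing percentages of total.

```python
import re

# Memory section exactly as pasted in the post; whitespace does not matter.
STATUS = """
{memory,
 [{total,124441400},
  {connection_readers,5548680},
  {connection_writers,605560},
  {connection_channels,2798608},
  {connection_other,7775480},
  {queue_procs,23561696},
  {queue_slave_procs,0},
  {plugins,489128},
  {other_proc,14147408},
  {mnesia,2040288},
  {mgmt_db,9741144},
  {msg_index,1946696},
  {other_ets,2006528},
  {binary,420491264},
  {code,20160575},
  {atom,711569},
  {other_system,323416776}]},
"""


def parse_memory(text):
    """Return {category: bytes} for every {atom,integer} pair in the text."""
    pairs = re.findall(r"\{(\w+),\s*(\d+)\s*\}", text)
    return {name: int(size) for name, size in pairs}


def rank(breakdown):
    """Return (category, bytes) tuples, largest first, skipping 'total'."""
    items = [(k, v) for k, v in breakdown.items() if k != "total"]
    return sorted(items, key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for name, size in rank(parse_memory(STATUS)):
        print(f"{name:20s} {size / 2**20:8.1f} MiB")
```

Run against the output above, binary and other_system come out on top, well ahead of queue_procs.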

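binary (Memory used by shared binary data in the Erlang VM. In-memory message bodies show up here.) represents the messages held in memory plus some other metadata referenced by queues or exchanges, yet none of this is visible in the management console.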

other_system:

Other memory used by Erlang. One contributor to this value is the number of available file descriptors.

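The MQ logs show that after high server load caused slow responses or an OOM kill, the process itself was still there, but heartbeats from some clients were no longer received, with entries like these: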

Missed heartbeats from client, timeout: 10s

Missed heartbeats from client, timeout: 10s

Missed heartbeats from client, timeout: 10s

Missed heartbeats from client, timeout: 10s

Missed heartbeats from client, timeout: 10s
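To gauge how often this happens, the matching log lines can be counted. A minimal Python sketch, assuming each log line contains the message text exactly as quoted above; count_missed_heartbeats is a hypothetical helper, and real log lines would also carry timestamps and connection details that the excerpt omits.

```python
import re
from collections import Counter

# Matches the message quoted in the post and captures the timeout in seconds.
PATTERN = re.compile(r"Missed heartbeats from client, timeout: (\d+)s")

# Five lines copied from the excerpt, plus one placeholder non-matching line.
SAMPLE = ["Missed heartbeats from client, timeout: 10s"] * 5 + [
    "unrelated line (placeholder)",
]


def count_missed_heartbeats(lines):
    """Count missed-heartbeat messages, grouped by timeout in seconds."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    for timeout, n in count_missed_heartbeats(SAMPLE).items():
        print(f"timeout {timeout}s: {n} missed-heartbeat events")
```

To run it on a real log, pass an open file object instead of SAMPLE; the function only iterates over lines.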

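I checked with the application's original developer: after a failure, some clients keep trying to reconnect to the MQ server. The connections are established each time, but the heartbeats fail shortly afterwards.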

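In principle, once MQ forcibly closes a client connection, the associated socket and memory should be released along with it. During the incident, however, memory usage stayed high and had still not dropped back to a normal level one or two hours later.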

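After the two Java applications were restarted, MQ memory usage dropped immediately, even though the number of connections and consumers was the same as before.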

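This looks similar to an anomaly we saw earlier with MySQL: when a feature off the server's most common path is used extremely frequently, the server may end up leaking memory somewhere.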

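We had already seen market-data initialization and trade reports get lost two or three times, so this incident in the test environment let us roughly identify the pattern behind the problem. Next I will look more closely at what data actually sits in RabbitMQ's binary and other_system; pinning that down precisely will require analyzing a memory dump of the Erlang VM.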
