A Brief Analysis of gRPC Client Connection Management
Background
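The client SDK uses gRPC as its communication protocol and periodically (roughly every 120 s) sends a pingServer request to the server.
The server listens on port 80, e.g. xxx:80.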
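Problem
I noticed that the client keeps disconnecting from the server and reconnecting to it.
Checking with netstat -antp:
As the screenshot shows, the connection to the server address marked in red is in TIME_WAIT, and further down there is a new connection to the server in ESTABLISHED.
TIME_WAIT is entered by the side that closes first, so it shows that the client actively closed the connection.
This clashed with what I thought I knew: gRPC is supposed to use long-lived connections, so why is the connection torn down every time? Doesn't that turn it into a short-lived connection?
And since the client is the one closing it, is something wrong on the client side?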
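With these questions in mind, I captured packets on the client.
The client always receives a packet of length 17, then sends a FIN and goes through the TCP connection teardown.
Opening the tcpdump capture in Wireshark showed that this length-17 packet is a GOAWAY frame.
As shown in the figure:
This is the "graceful" shutdown mechanism defined by HTTP/2.
The HTTP/2 specification describes the GOAWAY frame.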
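The length of 17 is consistent with a bare GOAWAY frame: an HTTP/2 frame header is 9 bytes, and the fixed part of a GOAWAY payload is 8 bytes (last stream id plus error code). Below is a minimal sketch of decoding such a frame, assuming the 17 bytes are the whole TCP payload and hold exactly one frame. The names GoAway and ParseGoAway are made up for illustration and are not part of gRPC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fields of an HTTP/2 GOAWAY frame (RFC 7540, section 6.8).
struct GoAway {
  std::uint32_t last_stream_id;  // highest stream id the sender may have processed
  std::uint32_t error_code;      // 0 means NO_ERROR, i.e. a graceful shutdown
  std::size_t debug_len;         // number of bytes of optional debug data
};

// Reads a big-endian (network order) 32-bit integer from p[0..3].
static std::uint32_t ReadU32(const std::uint8_t* p) {
  assert(p != nullptr);
  return (static_cast<std::uint32_t>(p[0]) << 24) |
         (static_cast<std::uint32_t>(p[1]) << 16) |
         (static_cast<std::uint32_t>(p[2]) << 8) |
         static_cast<std::uint32_t>(p[3]);
}

// Parses a buffer holding exactly one frame. Returns false unless the buffer
// is a well-formed GOAWAY frame sent on stream 0.
bool ParseGoAway(const std::vector<std::uint8_t>& buf, GoAway* out) {
  const std::size_t kHeaderLen = 9;      // 3 length + 1 type + 1 flags + 4 stream id
  const std::size_t kFixedPayload = 8;   // 4 last stream id + 4 error code
  const std::uint8_t kTypeGoAway = 0x7;

  if (out == nullptr || buf.size() < kHeaderLen) return false;
  const std::size_t payload_len = (static_cast<std::size_t>(buf[0]) << 16) |
                                  (static_cast<std::size_t>(buf[1]) << 8) |
                                  static_cast<std::size_t>(buf[2]);
  if (buf[3] != kTypeGoAway) return false;
  if (payload_len < kFixedPayload) return false;
  if (buf.size() != kHeaderLen + payload_len) return false;
  // GOAWAY applies to the whole connection, so the stream id must be 0.
  if ((ReadU32(&buf[5]) & 0x7fffffffu) != 0) return false;

  out->last_stream_id = ReadU32(&buf[9]) & 0x7fffffffu;  // top bit is reserved
  out->error_code = ReadU32(&buf[13]);
  out->debug_len = payload_len - kFixedPayload;
  return true;
}
```

Feeding it a 17-byte buffer whose header declares an 8-byte payload of type 0x7 yields a GoAway with debug_len equal to 0. That is the smallest GOAWAY frame possible, which fits the length seen in the capture.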
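From what I already knew about gRPC, the gRPC client resolves the domain name and then maintains a load balancer (lb) over the resulting addresses.
So this is presumably gRPC's handling of idle connections: pingServer is sent every 120 s, but gRPC considers the connection idle in between,
so the client is told to close the idle connection?
To test this idea, I modified the gRPC demo. Our client uses the asynchronous API of the C++ gRPC library,
so, based on gRPC's async demo, I wrote a simple async_client that calls the server.
Code: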
#include <chrono>
#include <iostream>
#include <memory>
#include <string>
#include <thread>

#include <grpc/support/log.h>
#include <grpcpp/grpcpp.h>

#include "gateway.grpc.pb.h"

using grpc::Channel;
using grpc::ClientAsyncResponseReader;
using grpc::ClientContext;
using grpc::CompletionQueue;
using grpc::Status;
using yournamespace::PingReq;
using yournamespace::PingResp;
using yournamespace::srv;

class GatewayClient {
 public:
  explicit GatewayClient(std::shared_ptr<Channel> channel)
      : stub_(srv::NewStub(channel)) {}

  // Assembles the client's payload and sends it to the server.
  void PingServer() {
    // Data we are sending to the server.
    PingReq request;
    request.set_peerid("1111111111111113");
    request.set_clientinfo("");
    request.set_capability(1);
    request.add_iplist(4197554190);
    request.set_tcpport(8080);
    request.set_udpport(8080);
    request.set_upnpip(4197554190);
    request.set_upnpport(8080);
    request.set_connectnum(10000);
    request.set_downloadingspeed(100);
    request.set_uploadingspeed(10);
    request.set_maxdownloadspeed(0);
    request.set_maxuploadspeed(0);

    // Call object to store rpc data. It is deleted in AsyncCompleteRpc().
    AsyncClientCall* call = new AsyncClientCall;

    // stub_->AsyncPing() creates the RPC and starts it right away, so no
    // separate StartCall() is needed (that is only required after the
    // PrepareAsync* variant). Because we are using the asynchronous API, we
    // need to hold on to the "call" instance in order to get updates on the
    // ongoing RPC.
    call->response_reader = stub_->AsyncPing(&call->context, request, &cq_);

    // Request that, upon completion of the RPC, "reply" be updated with the
    // server's response; "status" with the indication of whether the operation
    // was successful. Tag the request with the memory address of the call
    // object.
    call->response_reader->Finish(&call->reply, &call->status,
                                  static_cast<void*>(call));
  }

  // Loop while listening for completed responses.
  // Prints out the response from the server.
  void AsyncCompleteRpc() {
    void* got_tag;
    bool ok = false;

    // Block until the next result is available in the completion queue "cq_".
    while (cq_.Next(&got_tag, &ok)) {
      // The tag in this example is the memory location of the call object.
      AsyncClientCall* call = static_cast<AsyncClientCall*>(got_tag);

      // Verify that the request was completed successfully. Note that "ok"
      // corresponds solely to the request for updates introduced by Finish().
      GPR_ASSERT(ok);

      if (call->status.ok()) {
        std::cout << "xNetClient received: " << call->reply.code()
                  << " task:" << call->reply.tasks_size()
                  << " pinginterval:" << call->reply.pinginterval()
                  << std::endl;
      } else {
        std::cout << "RPC failed: status = " << call->status.error_code()
                  << " (" << call->status.error_message() << ")" << std::endl;
      }

      // Once we're complete, deallocate the call object.
      delete call;
    }
  }

 private:
  // struct for keeping state and data information
  struct AsyncClientCall {
    // Container for the data we expect from the server.
    PingResp reply;

    // Context for the client. It could be used to convey extra information to
    // the server and/or tweak certain RPC behaviors.
    ClientContext context;

    // Storage for the status of the RPC upon completion.
    Status status;

    std::unique_ptr<ClientAsyncResponseReader<PingResp>> response_reader;
  };

  // Out of the passed in Channel comes the stub, stored here, our view of the
  // server's exposed services.
  std::unique_ptr<srv::Stub> stub_;

  // The producer-consumer queue we use to communicate asynchronously with the
  // gRPC runtime.
  CompletionQueue cq_;
};

int main(int argc, char** argv) {
  if (argc < 2) {
    std::cout << "usage: " << argv[0] << " domain:port" << std::endl;
    std::cout << "eg: " << argv[0] << " gw.xnet.xcloud.sandai.net:80"
              << std::endl;
    return 1;
  }

  // Instantiate the client. It requires a channel, out of which the actual RPCs
  // are created. This channel models a connection to an endpoint (here the
  // domain:port given on the command line). We indicate that the channel isn't
  // authenticated (use of InsecureChannelCredentials()).
  GatewayClient xNetClient(
      grpc::CreateChannel(argv[1], grpc::InsecureChannelCredentials()));

  // Spawn reader thread that loops indefinitely.
  std::thread thread_ =
      std::thread(&GatewayClient::AsyncCompleteRpc, &xNetClient);

  // Send one ping every 120 s, the same interval the real client uses.
  for (int i = 0; i < 1000; i++) {
    xNetClient.PingServer();  // The actual RPC call!
    std::this_thread::sleep_for(std::chrono::seconds(120));
  }

  std::cout << "Press control-c to quit" << std::endl << std::endl;
  thread_.join();  // blocks forever
  return 0;
}
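The next step is simple: run it.
Watching with netstat -natp, the problem reproduces: async_client also disconnects and reconnects.
Further experiments showed that with the send interval set to 10 s the connection stays up, while with anything above 10 s it is almost always dropped.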
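The behaviour seen in this experiment can be written down as a toy model. This is only a sketch of what was observed, not of gRPC's internals: it assumes a fixed idle cutoff (10 s here), that the connection is dropped as soon as the gap between two requests exceeds it, and that the client reconnects on the next request. CountConnects is a hypothetical helper name.

```cpp
#include <cassert>

// Counts how many TCP connections a client would open when it sends
// `requests` requests spaced `interval_s` seconds apart, under the assumption
// that a connection idle for MORE than `idle_timeout_s` seconds is closed
// (server sends GOAWAY, client closes) and reopened on the next request.
int CountConnects(int interval_s, int requests, int idle_timeout_s) {
  assert(interval_s >= 0 && requests >= 0 && idle_timeout_s >= 0);
  int connects = 0;
  bool connected = false;
  long last_activity_s = 0;

  for (int i = 0; i < requests; ++i) {
    const long now_s = static_cast<long>(i) * interval_s;

    // The idle gap exceeded the cutoff: the old connection is gone by now.
    if (connected && now_s - last_activity_s > idle_timeout_s) {
      connected = false;
    }
    // First request, or the previous connection was dropped: connect again.
    if (!connected) {
      ++connects;
      connected = true;
    }
    last_activity_s = now_s;
  }
  return connects;
}
```

With a 10 s cutoff, CountConnects(120, 5, 10) gives 5 (every ping needs a fresh connection, as seen in netstat), while CountConnects(10, 5, 10) gives 1 (the connection survives).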
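Summary
To sum up:
This is how gRPC manages the connection, as observed here: by default, once no data has been sent for more than 10 s, gRPC treats the connection as idle. The server sends the client a GOAWAY frame. After receiving it, the client actively closes the connection. The next time it needs to send something, it establishes a new connection.
I don't yet know whether there is a configuration option for changing this value. I'm still not very familiar with gRPC's internals and will look into it later.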
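One place to start looking is the keepalive channel arguments of the C++ gRPC library. The fragment below is an untested sketch: the argument macros, grpc::ChannelArguments and grpc::CreateCustomChannel exist in gRPC, but the values are arbitrary, MakeKeepaliveChannel is a hypothetical helper, and I have not verified that keepalive pings prevent the 10 s cutoff seen above. A server that does not allow such frequent pings may itself respond with a GOAWAY.

```cpp
#include <memory>
#include <string>

#include <grpcpp/grpcpp.h>

// Hypothetical helper: builds a channel that sends HTTP/2 keepalive pings.
// All values are assumptions chosen for illustration only.
std::shared_ptr<grpc::Channel> MakeKeepaliveChannel(const std::string& target) {
  grpc::ChannelArguments args;
  // Send a keepalive ping after this many milliseconds without activity.
  args.SetInt(GRPC_ARG_KEEPALIVE_TIME_MS, 5000);
  // Consider the connection dead if the ping is not acknowledged in time.
  args.SetInt(GRPC_ARG_KEEPALIVE_TIMEOUT_MS, 2000);
  // Allow pings even when no RPC is in flight (the case between two pings).
  args.SetInt(GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS, 1);
  return grpc::CreateCustomChannel(target, grpc::InsecureChannelCredentials(),
                                   args);
}
```

In async_client this would replace the grpc::CreateChannel(...) call in main. On the server side, gRPC has a related argument, GRPC_ARG_MAX_CONNECTION_IDLE_MS, which controls when an idle connection is sent a GOAWAY. Whether the server in this setup uses it is unknown.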