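Spring Cloud Sleuth provides a distributed tracing solution for Spring Cloud.
Sleuth concepts
- span: the basic unit of work, uniquely identified by a span ID. A span also carries a description, timestamped events, the parent span ID, and so on. The first span of a trace is called the "root span", and its span ID equals the trace ID.
- trace: a tree of spans that share the same root span, uniquely identified by a trace ID. Every span in the trace carries that trace ID.
- annotation: records that an event occurred. The core annotations mark the start and the end of a request:
– CS (Client Sent): the client sends the request.
– SR (Server Received): the server receives the request. SR minus CS gives the network latency.
– SS (Server Sent): the server sends the response back to the client. SS minus SR gives the time the server spent processing the request.
– CR (Client Received): the client receives the response. CR minus CS gives the total time from sending the request to receiving the response.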
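The arithmetic on the four core annotations can be sketched in a few lines of Java. This is only an illustration: SpanTimings is a hypothetical helper class, not part of Sleuth, and the timestamps are made up. It also assumes the client and server clocks are synchronized, since SR and CS are recorded on different machines.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical holder for the four core annotation timestamps (not a Sleuth class). */
final class SpanTimings {
    private final Instant cs; // Client Sent
    private final Instant sr; // Server Received
    private final Instant ss; // Server Sent
    private final Instant cr; // Client Received

    SpanTimings(Instant cs, Instant sr, Instant ss, Instant cr) {
        this.cs = cs;
        this.sr = sr;
        this.ss = ss;
        this.cr = cr;
    }

    /** SR - CS: network latency of the request (assumes synchronized clocks). */
    Duration networkLatency() {
        return Duration.between(cs, sr);
    }

    /** SS - SR: time the server spent processing the request. */
    Duration serverProcessing() {
        return Duration.between(sr, ss);
    }

    /** CR - CS: total time seen by the client. */
    Duration totalRoundTrip() {
        return Duration.between(cs, cr);
    }

    public static void main(String[] args) {
        // Made-up timestamps, for illustration only.
        Instant cs = Instant.parse("2021-01-27T07:10:26.400Z");
        SpanTimings t = new SpanTimings(cs, cs.plusMillis(5), cs.plusMillis(45), cs.plusMillis(52));
        System.out.println("network latency ms:   " + t.networkLatency().toMillis());
        System.out.println("server processing ms: " + t.serverProcessing().toMillis());
        System.out.println("total round trip ms:  " + t.totalRoundTrip().toMillis());
    }
}
```

With these sample values the three durations are 5 ms, 40 ms, and 52 ms.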
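Adding Sleuth to the projects
The examples continue to use the nacos-provider and nacos-client projects from the earlier "nacos service registration and discovery" article.
pom file
To add Sleuth tracing to an existing project, only the Sleuth dependency needs to be added. No code changes are required.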
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
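Log comparison
Restart the nacos-client and nacos-provider projects and compare the nacos-client startup log before and after the change. Only a line that differs is shown here; the lines that are identical are omitted.
Before the change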
2021-01-27 14:51:33.185 INFO 15740 --- [ main] c.a.n.c.c.impl.LocalConfigInfoProcessor : LOCAL_SNAPSHOT_PATH:C:\Users\PC\nacos\config
After the change
2021-01-27 14:52:48.607 INFO [nacos-client,,,] 11544 --- [ main] c.a.n.c.c.impl.LocalConfigInfoProcessor : LOCAL_SNAPSHOT_PATH:C:\Users\PC\nacos\config
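Comparing the two lines: before the change the log is an ordinary INFO line; after the change, [nacos-client,,,] appears right after INFO. The three commas inside the brackets separate four fields.
The fields are [application name, trace ID, span ID, export flag]. The trace ID and span ID are the concepts described above. The export flag indicates whether the trace data is collected and managed by a database or a similar tool such as Zipkin, Elasticsearch, or a message queue: it is false when no such collector is connected and true when one is.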
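To make the four fields concrete, here is a minimal sketch that pulls them out of a log line with a regular expression. SleuthLogPrefix is a hypothetical helper written for this article, not a Sleuth API, and it assumes the bracketed prefix is the first bracket group in the line that contains exactly three commas.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the [application,traceId,spanId,exportFlag] log prefix. */
final class SleuthLogPrefix {
    // Four comma-separated fields inside one pair of brackets; each field may be empty.
    private static final Pattern PREFIX =
            Pattern.compile("\\[([^,\\]]*),([^,\\]]*),([^,\\]]*),([^,\\]]*)\\]");

    final String application;
    final String traceId;
    final String spanId;
    final boolean exported;

    private SleuthLogPrefix(String application, String traceId, String spanId, boolean exported) {
        this.application = application;
        this.traceId = traceId;
        this.spanId = spanId;
        this.exported = exported;
    }

    /** Returns the parsed prefix, or empty if the line has no such prefix. */
    static Optional<SleuthLogPrefix> parse(String logLine) {
        Matcher m = PREFIX.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        // An empty fourth field (as in the startup log) parses as false.
        return Optional.of(new SleuthLogPrefix(
                m.group(1), m.group(2), m.group(3), Boolean.parseBoolean(m.group(4))));
    }

    public static void main(String[] args) {
        String line = "2021-01-27 15:10:26.412 INFO "
                + "[nacos-client,df147d319856fd0a,f01cc58f75f68f9c,false] 10296 --- [nio-8081-exec-1] ...";
        SleuthLogPrefix p = parse(line).orElseThrow(IllegalStateException::new);
        System.out.println(p.application + " trace=" + p.traceId
                + " span=" + p.spanId + " exported=" + p.exported);
    }
}
```

For a startup line such as [nacos-client,,,], the same parser returns the application name with empty trace and span IDs, which matches the fact that no request is being traced at that point.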
Once both projects are running, request
http://localhost:8081/hi?name=zhangqi
in a browser.
The nacos-client project then prints the following log:
2021-01-27 15:10:26.412 INFO [nacos-client,df147d319856fd0a,f01cc58f75f68f9c,false] 10296 --- [nio-8081-exec-1] c.netflix.config.ChainedDynamicProperty : Flipping property: nacos-provider.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647
2021-01-27 15:10:26.437 INFO [nacos-client,df147d319856fd0a,f01cc58f75f68f9c,false] 10296 --- [nio-8081-exec-1] c.netflix.loadbalancer.BaseLoadBalancer : Client: nacos-provider instantiated a LoadBalancer: DynamicServerListLoadBalancer:{NFLoadBalancer:name=nacos-provider,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:null
2021-01-27 15:10:26.445 INFO [nacos-client,df147d319856fd0a,f01cc58f75f68f9c,false] 10296 --- [nio-8081-exec-1] c.n.l.DynamicServerListLoadBalancer : Using serverListUpdater PollingServerListUpdater
2021-01-27 15:10:26.482 INFO [nacos-client,df147d319856fd0a,f01cc58f75f68f9c,false] 10296 --- [nio-8081-exec-1] com.alibaba.nacos.client.naming : new ips(1) service: DEFAULT_GROUP@@nacos-provider -> [{"clusterName":"DEFAULT","enabled":true,"ephemeral":true,"healthy":true,"instanceHeartBeatInterval":5000,"instanceHeartBeatTimeOut":15000,"instanceId":"192.168.187.1#8080#DEFAULT#DEFAULT_GROUP@@nacos-provider","ip":"192.168.187.1","ipDeleteTimeout":30000,"metadata":{"preserved.register.source":"SPRING_CLOUD"},"port":8080,"serviceName":"DEFAULT_GROUP@@nacos-provider","weight":1.0}]
2021-01-27 15:10:26.491 INFO [nacos-client,df147d319856fd0a,f01cc58f75f68f9c,false] 10296 --- [nio-8081-exec-1] com.alibaba.nacos.client.naming : current ips:(1) service: DEFAULT_GROUP@@nacos-provider -> [{"clusterName":"DEFAULT","enabled":true,"ephemeral":true,"healthy":true,"instanceHeartBeatInterval":5000,"instanceHeartBeatTimeOut":15000,"instanceId":"192.168.187.1#8080#DEFAULT#DEFAULT_GROUP@@nacos-provider","ip":"192.168.187.1","ipDeleteTimeout":30000,"metadata":{"preserved.register.source":"SPRING_CLOUD"},"port":8080,"serviceName":"DEFAULT_GROUP@@nacos-provider","weight":1.0}]
2021-01-27 15:10:26.510 INFO [nacos-client,df147d319856fd0a,f01cc58f75f68f9c,false] 10296 --- [nio-8081-exec-1] c.netflix.config.ChainedDynamicProperty : Flipping property: nacos-provider.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647
2021-01-27 15:10:26.517 INFO [nacos-client,df147d319856fd0a,f01cc58f75f68f9c,false] 10296 --- [nio-8081-exec-1] c.n.l.DynamicServerListLoadBalancer : DynamicServerListLoadBalancer for client nacos-provider initialized: DynamicServerListLoadBalancer:{NFLoadBalancer:name=nacos-provider,current list of Servers=[192.168.187.1:8080],Load balancer stats=Zone stats: {unknown=[Zone:unknown; Instance count:1; Active connections count: 0; Circuit breaker tripped count: 0; Active connections per server: 0.0;]
},Server stats: [[Server:192.168.187.1:8080; Zone:UNKNOWN; Total Requests:0; Successive connection failure:0; Total blackout seconds:0; Last connection made:Thu Jan 01 08:00:00 CST 1970; First connection made: Thu Jan 01 08:00:00 CST 1970; Active Connections:0; total failure count in last (1000) msecs:0; average resp time:0.0; 90 percentile resp time:0.0; 95 percentile resp time:0.0; min resp time:0.0; max resp time:0.0; stddev resp time:0.0]
]}ServerList:com.alibaba.cloud.nacos.ribbon.NacosServerList@1a92bfd5
2021-01-27 15:10:27.459 INFO [nacos-client,,,] 10296 --- [erListUpdater-0] c.netflix.config.ChainedDynamicProperty : Flipping property: nacos-provider.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647
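The log shows that for the nacos-client project the trace ID is df147d319856fd0a, the span ID is f01cc58f75f68f9c, and the export flag is false.
Adding Zipkin
Zipkin is an open-source distributed tracing system. It is based on Google's Dapper paper and was developed and contributed by Twitter. Its main job is to aggregate real-time monitoring data from heterogeneous systems. More information is available on the official website:
https://zipkin.io/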
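Downloading Zipkin
Download the zipkin-server service. The download list contains several zipkin-server versions; open version 2.12.9 and download the executable exec.jar.
After downloading, open a command prompt in the directory that contains zipkin-server-2.12.9-exec.jar and start Zipkin with java -jar zipkin-server-2.12.9-exec.jar
Once Zipkin is running, open
http://localhost:9411/zipkin/
in a browser to view the Zipkin UI.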
Setting up the services
Add the Zipkin dependency to each project's pom file
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>
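Add the Zipkin server settings to each project's configuration (a sampler probability of 1 means every request is sampled)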
spring.zipkin.base-url=http://localhost:9411
spring.sleuth.sampler.probability=1
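Start the nacos-client and nacos-provider projects, then request
http://localhost:8081/hi?name=zhangqi
in a browser.
In the Zipkin UI, click the resulting trace to see how long each hop of the request took.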
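The same trace data can also be read without the UI. The sketch below is an assumption-laden illustration rather than part of the original setup: it assumes the Zipkin server started above exposes its v2 HTTP API at /api/v2/traces with serviceName and limit query parameters, and ZipkinTraceQuery is a hypothetical helper name. It builds the query URL for nacos-client and, if a server is running, can fetch the raw JSON.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that queries a Zipkin server for recent traces. */
final class ZipkinTraceQuery {

    /** Builds the trace query URL (assumes Zipkin's v2 API path and parameter names). */
    static String tracesUrl(String baseUrl, String serviceName, int limit) {
        try {
            String encoded = URLEncoder.encode(serviceName, "UTF-8");
            return baseUrl + "/api/v2/traces?serviceName=" + encoded + "&limit=" + limit;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Fetches the response body as a string; requires a running Zipkin server. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/json");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line);
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String url = tracesUrl("http://localhost:9411", "nacos-client", 10);
        System.out.println(url);
        // Uncomment when Zipkin is running locally:
        // System.out.println(fetch(url));
    }
}
```

The base URL matches the spring.zipkin.base-url value configured above, and the service name is the application name that appears as the first field of the log prefix.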
PS: software versions
nacos: 1.4.1
Spring Boot: 2.2.5.RELEASE
Spring Cloud: Hoxton.SR3
zipkin: 2.12.9