uptime
Print the current time, how long the system has been running, the number of users currently logged on, and the system load averages for the past 1, 5, and 15 minutes.
Usage:
uptime [options]
Options:
-p, --pretty   show uptime in pretty format
-h, --help     display this help and exit
-s, --since    system up since
-V, --version  output version information and exit
$ uptime
09:34:30 up 12 min, 0 users, load average: 6.78, 6.98, 4.92
$ uptime -s
2021-01-28 09:22:10
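The "up 12 min" portion can also be derived from /proc/uptime, whose first field is the number of seconds since boot. A minimal sketch of the formatting uptime(1) roughly applies (the sample line below is hypothetical, chosen to match the 12-minute output above):

```python
def format_uptime(seconds):
    """Render an uptime in seconds roughly the way uptime(1) does."""
    minutes = int(seconds) // 60
    days, rem = divmod(minutes, 60 * 24)
    hours, mins = divmod(rem, 60)
    parts = []
    if days:
        parts.append(f"{days} day{'s' if days != 1 else ''}")
    if hours:
        parts.append(f"{hours}:{mins:02d}")
    else:
        parts.append(f"{mins} min")
    return ", ".join(parts)

# First field of /proc/uptime is seconds since boot (sample value):
line = "735.12 2811.43"
print(format_uptime(float(line.split()[0])))  # 12 min
```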
/proc/loadavg
The first three columns measure CPU and I/O utilization over the last one-, five-, and fifteen-minute periods. The fourth column shows the number of currently runnable kernel scheduling entities and the total number of such entities. The last column displays the PID of the most recently created process.
$ cat /proc/loadavg
6.64 7.00 5.63 4/3687 8040
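These five fields split cleanly on whitespace, with the fourth field further split on "/". A minimal parsing sketch using the sample line above:

```python
def parse_loadavg(line):
    """Split a /proc/loadavg line into its five fields."""
    load1, load5, load15, entities, last_pid = line.split()
    running, total = entities.split("/")
    return {
        "load1": float(load1),
        "load5": float(load5),
        "load15": float(load15),
        "running": int(running),   # currently runnable entities
        "total": int(total),       # total scheduling entities
        "last_pid": int(last_pid), # most recently created PID
    }

sample = "6.64 7.00 5.63 4/3687 8040"
info = parse_loadavg(sample)
print(info["load1"], info["running"], info["total"])  # 6.64 4 3687
```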
/proc/stat
# cat /proc/stat
cpu 164819 22146 193253 814793 4009 35944 8266 0 0 0
cpu0 45538 3739 68998 161187 1012 23293 5696 0 0 0
cpu1 47436 3906 63997 180322 1758 9408 1408 0 0 0
cpu2 39190 6546 34574 227399 658 1821 718 0 0 0
cpu3 32655 7955 25684 245885 581 1422 444 0 0 0
intr 15149646 0 0 0 7548845 0 87104 566796 0 4 0 0 949 27659 0 0 0 0 0 0 0 0 0 0 0 0 1 85968 200 107290 2725 23561 0 14192 132 311930 0 0 0 0 80545 65 0 0 0 0 2854 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 33332 0 239935 0 0 0 0 10 0 0 0 0 29 0 3315 0 0 0 6 0 0 0 0 0 0 0 0 93 0 0 0 1 1 0 2 2 0 0 215440 527879 1 0 0 0 0 0 1327 62431 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 326 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 52 0 0 0 0 0 0 0 0 0 27651 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 160908 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6779 0 0
ctxt 25319106
btime 1611796930
processes 12949
procs_running 1
procs_blocked 0
softirq 3007713 523014 741584 32312 104244 57538 0 237268 716914 0 594839
The very first line, "cpu", aggregates the numbers in all of the other "cpuN" lines.
These numbers identify the amount of time the CPU has spent performing different kinds of work. Time units are in USER_HZ or Jiffies (typically hundredths of a second).
The meanings of the columns are as follows, from left to right:
- 1st column: user = normal processes executing in user mode
- 2nd column: nice = niced processes executing in user mode
- 3rd column: system = processes executing in kernel mode
- 4th column: idle = twiddling thumbs
- 5th column: iowait = waiting for I/O to complete
- 6th column: irq = servicing interrupts
- 7th column: softirq = servicing softirqs
The “intr” line gives counts of interrupts serviced since boot time, for each of the possible system interrupts. The first column is the total of all interrupts serviced; each subsequent column is the total for that particular interrupt.
The “ctxt” line gives the total number of context switches across all CPUs.
The “btime” line gives the time at which the system booted, in seconds since the Unix epoch (January 1, 1970).
The “processes” line gives the number of processes and threads created, which includes (but is not limited to) those created by calls to the fork() and clone() system calls.
The “procs_running” line gives the number of processes currently running on CPUs (Linux 2.5.45 onwards).
The “procs_blocked” line gives the number of processes currently blocked, waiting for I/O to complete (Linux 2.5.45 onwards).
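Tools like top and vmstat derive CPU utilization by sampling the aggregate "cpu" line twice and comparing the deltas: idle time is idle + iowait, everything else counts as busy. A sketch of that calculation; the first sample is taken from the output above, and the second sample is made up for illustration:

```python
def cpu_percent(prev, curr):
    """Busy CPU percentage between two /proc/stat 'cpu' samples.

    Columns: user nice system idle iowait irq softirq steal guest guest_nice.
    Idle time is idle + iowait; everything else counts as busy.
    """
    prev_idle = prev[3] + prev[4]
    curr_idle = curr[3] + curr[4]
    delta_total = sum(curr) - sum(prev)
    delta_idle = curr_idle - prev_idle
    return 100.0 * (delta_total - delta_idle) / delta_total

# First sample from the 'cpu' line above; second sample is hypothetical.
prev = [164819, 22146, 193253, 814793, 4009, 35944, 8266, 0, 0, 0]
curr = [165019, 22146, 193353, 814893, 4009, 35944, 8266, 0, 0, 0]
print(round(cpu_percent(prev, curr), 1))  # 75.0
```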
/proc/cpuinfo
View basic information about the CPU.
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4600M CPU @ 2.90GHz
stepping : 3
microcode : 0x28
cpu MHz : 2617.182
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
bogomips : 5786.54
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
......
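Each logical processor gets one key/value block, separated by blank lines, so the file parses easily; hyper-threading shows up as `siblings` greater than `cpu cores` (4 vs. 2 in the output above). A parsing sketch against an abbreviated, hypothetical sample:

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo text into one dict per logical processor."""
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        cpus.append(fields)
    return cpus

# Abbreviated sample; the real file has many more fields per block.
sample = """\
processor\t: 0
siblings\t: 4
cpu cores\t: 2

processor\t: 1
siblings\t: 4
cpu cores\t: 2
"""
cpus = parse_cpuinfo(sample)
print(len(cpus), cpus[0]["cpu cores"])  # 2 2
```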
vmstat
vmstat is a common Linux/Unix monitoring tool that reports, for a given sampling interval, CPU utilization, memory usage, virtual memory (swap) activity, and block I/O.
M01_AE:/ $ vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
7 0 0 2927748 69044 1958516 0 0 91 36 1 2561 12 16 72 0
0 0 0 2922104 69056 1964904 0 0 0 70 1 7317 11 14 75 0
0 0 0 2917764 69072 1970940 0 0 0 294 1 7874 13 17 69 1
`vmstat 2` samples data every 2 seconds and keeps sampling until the program is terminated; three samples were collected here.
The meaning of each column is as follows:
r
The run queue, i.e. the number of processes actually assigned to a CPU. The server I tested is currently fairly idle, with nothing much running. When this value exceeds the number of CPUs, a CPU bottleneck appears. It is related to top's load average: generally a load above 3 is fairly high, above 5 is high, and above 10 is abnormal and the server is in a dangerous state. top's load is roughly the run queue per second. A consistently large run queue means the CPU is very busy and usually results in high CPU utilization.
b
The number of blocked processes, i.e. processes waiting (typically on I/O); this needs little explanation.
swpd
The amount of virtual (swap) memory in use. If it is greater than 0, the machine is short of physical memory; unless a program is leaking memory, you should either add RAM or move memory-hungry tasks to another machine.
free
The amount of free physical memory. My machine has 8 GB of memory in total, with 3415 MB free.
buff
The buffer cache, which Linux/Unix uses to cache filesystem metadata such as directory contents and permissions; it occupies roughly 300 MB on my machine.
cache
The page cache, used to cache the contents of files we open; it occupies roughly 300 MB on my machine. (This is one of Linux/Unix's clever tricks: part of otherwise-idle physical memory is used to cache files and directories to improve program performance. When programs need memory, buffer/cache is reclaimed quickly.)
si
The amount of memory swapped in from disk per second. If this value is greater than 0, physical memory is insufficient or something is leaking memory; find the memory-hungry process and deal with it. My machine has plenty of memory and everything is normal.
so
The amount of memory swapped out to disk per second. If this value is greater than 0, same as above.
bi
The number of blocks received from block devices per second ("block devices" here means all disks and other block devices on the system; the default block size is 1024 bytes). There is hardly any I/O on my machine, so it stays at 0, but on a machine copying a large amount of data (2-3 TB) I have seen it reach 140000/s, roughly 140 MB of disk throughput per second.
bo
The number of blocks sent to block devices per second; for example, writing a file to disk makes bo greater than 0. bi and bo should generally stay close to 0; otherwise I/O is too frequent and needs tuning.
in
The number of CPU interrupts per second, including timer interrupts.
cs
The number of context switches per second. Calling system functions, switching threads, and switching processes all cause context switches. The smaller this value, the better; if it is too large, consider reducing the number of threads or processes. For example, when load-testing web servers such as apache and nginx at thousands or even tens of thousands of concurrent connections, you can start from the peak process/thread count and keep tuning it down under load until cs reaches a relatively small value; that process/thread count is then a reasonable setting. The same applies to system calls: every call to a system function enters kernel space and causes a context switch, which is expensive, so frequent system calls should also be avoided. Too many context switches mean the CPU spends most of its time switching rather than doing real work, so the CPU is not being used effectively.
us
User CPU time. On a server doing heavy encryption/decryption, I have seen us approach 100 with the run queue r reaching 80 (the machine was under load test and performing poorly).
sy
System CPU time. If it is too high, the system is spending a long time in system calls, for example due to frequent I/O.
id
Idle CPU time. In general, id + us + sy + wa = 100. I usually read id as the idle CPU percentage, us as the user CPU percentage, and sy as the system CPU percentage.
wa
CPU time spent waiting for I/O.
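For scripting, a vmstat data line can simply be zipped against the column names above. A minimal sketch using the first data line from the output shown earlier (note that us + sy + id + wa sums to 100):

```python
# Column names in the order vmstat prints them.
FIELDS = ["r", "b", "swpd", "free", "buff", "cache",
          "si", "so", "bi", "bo", "in", "cs",
          "us", "sy", "id", "wa"]

def parse_vmstat_line(line):
    """Map one vmstat data line onto its column names."""
    return dict(zip(FIELDS, (int(v) for v in line.split())))

row = parse_vmstat_line("7 0 0 2927748 69044 1958516 0 0 91 36 1 2561 12 16 72 0")
print(row["r"], row["free"], row["cs"],
      row["us"] + row["sy"] + row["id"] + row["wa"])  # 7 2927748 2561 100
```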
/sys/devices/system/cpu
This node exposes CPU scheduling-related information, including the status of each core, governor settings, and dynamic frequency (DVFS) settings.
M01_AE:/sys/devices/system/cpu $ ls -al
drwxr-xr-x 9 root root 0 1970-01-01 08:00 .
drwxr-xr-x 7 root root 0 1970-01-01 08:00 ..
-r--r--r-- 1 root root 4096 2021-02-20 17:01 core_ctl_isolated
drwxr-xr-x 7 root root 0 2018-10-18 20:00 cpu0
drwxr-xr-x 6 root root 0 2018-10-18 20:00 cpu1
drwxr-xr-x 6 root root 0 2018-10-18 20:00 cpu2
drwxr-xr-x 6 root root 0 2018-10-18 20:00 cpu3
drwxr-xr-x 4 root root 0 2018-10-18 20:00 cpufreq
drwxr-xr-x 2 root root 0 2018-10-18 20:00 cpuidle
-r--r--r-- 1 root root 4096 2021-02-20 17:01 isolated
-r--r--r-- 1 root root 4096 2021-02-20 17:01 kernel_max
-r--r--r-- 1 root root 4096 2021-02-20 17:01 modalias
-r--r--r-- 1 root root 4096 2021-02-20 17:01 offline
-r--r--r-- 1 root root 4096 1970-01-01 08:00 online
-r--r--r-- 1 root root 4096 2018-10-18 20:01 possible
drwxr-xr-x 2 root root 0 2018-10-18 20:00 power
-r--r--r-- 1 root root 4096 2018-10-18 20:01 present
-rw-r--r-- 1 root root 4096 2018-10-18 20:00 uevent
M01_AE:/sys/devices/system/cpu/cpu0/cpufreq $ ls -al
total 0
drwxr-xr-x 4 root root 0 2018-10-18 20:00 .
drwxr-xr-x 4 root root 0 2018-10-18 20:00 ..
-r--r--r-- 1 root root 4096 2021-02-20 17:01 affected_cpus
-rw-rw-rw- 1 root root 4096 2018-10-18 20:00 cpuinfo_cur_freq
-r--r--r-- 1 root root 4096 2021-02-20 14:45 cpuinfo_max_freq
-r--r--r-- 1 root root 4096 2021-02-20 14:45 cpuinfo_min_freq
-r--r--r-- 1 root root 4096 2021-02-20 17:01 cpuinfo_transition_latency
drwxr-xr-x 2 root root 0 2018-10-18 20:00 interactive
-r--r--r-- 1 root root 4096 2021-02-20 17:01 related_cpus
-r--r--r-- 1 root root 4096 2018-10-18 20:00 scaling_available_frequencies
-r--r--r-- 1 root root 4096 2021-02-20 17:01 scaling_available_governors
-r--r--r-- 1 root root 4096 2018-10-18 20:01 scaling_cur_freq
-r--r--r-- 1 root root 4096 2021-02-20 17:01 scaling_driver
-rw-r--r-- 1 root root 4096 2018-10-18 20:00 scaling_governor
-rw-rw---- 1 system system 4096 2018-10-18 20:00 scaling_max_freq
-rw-rw-r-- 1 system system 4096 2018-10-18 20:00 scaling_min_freq
-rw-r--r-- 1 root root 4096 2021-02-20 17:01 scaling_setspeed
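The cpufreq files report frequencies in kHz. A sketch that reads `scaling_cur_freq` for a given CPU when the sysfs node exists, falling back to a made-up sample value on machines without cpufreq (the path layout follows the listing above):

```python
def read_cur_freq_khz(cpu=0, fallback_khz=2617182):
    """Return the current frequency of a CPU in kHz.

    Reads the cpufreq sysfs node when available; fallback_khz is a
    hypothetical sample value for machines without cpufreq support.
    """
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_cur_freq"
    try:
        with open(path) as f:
            return int(f.read().strip())
    except OSError:
        return fallback_khz

khz = read_cur_freq_khz()
print(f"cpu0: {khz / 1000:.0f} MHz")
```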