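4 Linux Commands to View Page Fault Statistics
How do I view minor and major page fault statistics on a Linux operating system?
You can use page fault statistics to improve the performance of a Linux server.
Optimize your daemons and programs so that they cause fewer page faults.
Once the number of page faults drops, the daemon performs better, and so does the Linux system as a whole.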
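Linux (and most Unix-like) systems map virtual memory onto physical address space. The Linux kernel manages this mapping on demand, a technique known as demand paging. A page fault occurs when a process accesses a page that is mapped in its virtual address space but not loaded in physical memory. In most cases a page fault is not an error. Page faults are the mechanism that lets operating systems with virtual memory, such as Linux and Unix, give programs more memory than is physically installed. Virtual memory is a memory management technique, used by Linux and many other modern operating systems, that combines active RAM with inactive memory on a disk drive (hard disk/SSD) to present a large range of contiguous addresses. There are two types of page faults: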
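- A major fault occurs when disk access is required. For example, when you start an application such as Firefox, the Linux kernel looks for the required pages in physical memory. If the data is not there and must be read from disk, Linux issues a major page fault.
- A minor fault occurs when the page can be supplied without disk access, for example when the kernel allocates a new page or maps a page that is already in memory.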
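You can use standard Linux commands such as ps, top, time, and sar to view page faults for all processes or for a specific process.
Example: ps command
To view the page faults for PID 1 with the ps command, run: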
ps -o min_flt,maj_flt 1
Sample output:
MINFL MAJFL
 3104    36
Where:
- min_flt: the number of minor page faults.
- maj_flt: the number of major page faults.
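The same two counters can also be read straight from procfs. The sketch below assumes the /proc/PID/stat layout described in the proc(5) manual page, where minflt is field 10 and majflt is field 12. The function name stat_faults is made up for this illustration. Because the command name in field 2 may itself contain spaces or parentheses, the sketch first strips everything up to the last ")" and then counts fields from there.

```shell
#!/bin/sh
# Print "minflt majflt" from a /proc/PID/stat line.
# Reads the file named by $1, or standard input when no argument is given.
# Assumption: field layout as documented in proc(5).
stat_faults() {
    # Remove "pid (comm) " so that spaces inside comm cannot shift the fields.
    # After stripping, state is field 1, minflt is field 8, majflt is field 10.
    cat "${1:--}" | sed 's/^.*) //' | awk '{ print $8, $10 }'
}

# Example on a Linux host:
#   stat_faults /proc/1/stat
```

The numbers should match what ps -o min_flt,maj_flt reports for the same PID, apart from faults that happen between the two reads.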
You may also want other details about PID 1, such as its user, group, command, and arguments. Run:
# ps -o min_flt,maj_flt,cmd,args,uid,gid 1
Sample output:
MINFL MAJFL CMD        COMMAND    UID GID
 3104    36 /sbin/init /sbin/init   0   0
To view every process on the system:
# ps -eo min_flt,maj_flt,cmd,args,uid,gid | less
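Paging through every process is slow when you only want the worst offenders. As a rough sketch, the helper below sorts ps-style rows by their third column and keeps the top few. The function name rank_by_major_faults and the column order (PID, MINFL, MAJFL, COMMAND) are assumptions made for this example, not something ps provides.

```shell
#!/bin/sh
# Read rows of "PID MINFL MAJFL COMMAND" from standard input (header on the
# first line) and print the N rows with the highest MAJFL value.
rank_by_major_faults() {
    n="${1:-10}"
    # Skip the header, sort numerically in descending order on column 3,
    # then keep the first N rows.
    tail -n +2 | sort -k3,3nr | head -n "$n"
}

# Example on a Linux host with procps:
#   ps -eo pid,min_flt,maj_flt,comm | rank_by_major_faults 5
```

Swap -k3,3nr for -k2,2nr to rank by minor faults instead.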
Example: top command
Type the following top command (you can also use atop and htop):
# top
Or start top with a delay interval, in seconds, between updates:
# top -d 1
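Press "F" to open the sort menu, then press "u" to sort by faults. These keys apply to older procps versions of top; newer procps-ng releases use different field letters, so check the top man page on your system.
Finally, press [Enter].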
Example: sar command
The sar program reports system activity statistics, including paging activity.
Run the following commands:
# sar -B
# sar -B 1 10
Sample output:
Linux 2.6.32-279.el6.x86_64 (server1.theitroad.local)  Monday 05 November 2012  _x86_64_  (8 CPU)

12:46:48 CST  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
12:46:49 CST      0.00    460.61     68.69      0.00    452.53      0.00      0.00      0.00      0.00
12:46:50 CST      0.00    276.00    170.00      0.00    642.00      0.00      0.00      0.00      0.00
12:46:51 CST      0.00    460.00     47.00      0.00    550.00      0.00      0.00      0.00      0.00
12:46:52 CST      0.00    228.00     49.00      0.00    705.00      0.00      0.00      0.00      0.00
12:46:53 CST      0.00    320.00    146.00      0.00    420.00      0.00      0.00      0.00      0.00
12:46:54 CST      0.00    164.00     69.00      0.00    479.00      0.00      0.00      0.00      0.00
12:46:55 CST      0.00    501.01   1144.44      0.00    991.92      0.00      0.00      0.00      0.00
12:46:56 CST      0.00    220.00     65.00      0.00    503.00      0.00      0.00      0.00      0.00
12:46:57 CST      0.00    280.00    156.00      0.00    514.00      0.00      0.00      0.00      0.00
12:46:58 CST      0.00    160.00    941.00      0.00    949.00      0.00      0.00      0.00      0.00
Average:          0.00    306.61    284.97      0.00    620.44      0.00      0.00      0.00      0.00
From the sar man page:
-B     Report paging statistics. Some of the metrics below are available only with post 2.5 kernels. The following values are displayed:

       pgpgin/s
              Total number of kilobytes the system paged in from disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes).

       pgpgout/s
              Total number of kilobytes the system paged out to disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes).

       fault/s
              Number of page faults (major + minor) made by the system per second. This is not a count of page faults that generate I/O, because some page faults can be resolved without I/O.

       majflt/s
              Number of major faults the system has made per second, those which have required loading a memory page from disk.

       pgfree/s
              Number of pages placed on the free list by the system per second.

       pgscank/s
              Number of pages scanned by the kswapd daemon per second.

       pgscand/s
              Number of pages scanned directly per second.

       pgsteal/s
              Number of pages the system has reclaimed from cache (pagecache and swapcache) per second to satisfy its memory demands.

       %vmeff
              Calculated as pgsteal / pgscan, this is a metric of the efficiency of page reclaim. If it is near 100% then almost every page coming off the tail of the inactive list is being reaped. If it gets too low (e.g. less than 30%) then the virtual memory is having some difficulty. This field is displayed as zero if no pages have been scanned during the interval of time.
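If sar is not installed, a comparable per-second figure can be approximated by sampling kernel counters by hand. The sketch below assumes the kernel exposes cumulative pgfault and pgmajfault counters in /proc/vmstat, which is the case on the systems I have seen. The function names vmstat_counter, rate_per_sec, and sample_fault_rate are made up for this example.

```shell
#!/bin/sh
# Print the value of one named counter from vmstat-formatted text
# ("name value" per line). Reads the file in $2, or stdin when omitted.
vmstat_counter() {
    awk -v key="$1" '$1 == key { print $2 }' "${2:--}"
}

# Per-second rate between two cumulative readings taken $3 seconds apart.
rate_per_sec() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.2f\n", (b - a) / s }'
}

# Sample one counter from /proc/vmstat twice and print its rate.
# Assumption: /proc/vmstat exists and contains the requested counter.
sample_fault_rate() {
    key="$1"
    secs="${2:-1}"
    before=$(vmstat_counter "$key" /proc/vmstat)
    sleep "$secs"
    after=$(vmstat_counter "$key" /proc/vmstat)
    rate_per_sec "$before" "$after" "$secs"
}

# Example on a Linux host:
#   sample_fault_rate pgfault 1      # roughly comparable to fault/s
#   sample_fault_rate pgmajfault 1   # roughly comparable to majflt/s
```

A single one-second sample is noisy, so repeat it a few times before drawing conclusions, much as sar -B 1 10 does.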
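Example: time command
Use the "/usr/bin/time" command (not the shell's built-in time) to run a program and summarize its system resource usage, including page faults.
First, find the path to the real time command: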
# type -a time
Sample output:
time is a shell keyword
time is /usr/bin/time
Now run the following command to see the page faults caused by the ls command:
$ /usr/bin/time -v ls /etc/resolv.conf
Sample output:
/etc/resolv.conf
        Command being timed: "ls /etc/resolv.conf"
        User time (seconds): 0.00
        System time (seconds): 0.00
        Percent of CPU this job got: 0%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 3456
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 280
        Voluntary context switches: 1
        Involuntary context switches: 3
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
In this example, I ran the xclock program twice (note the output):
$ /usr/bin/time -v xclock
        Major (requiring I/O) page faults: 4
        Minor (reclaiming a frame) page faults: 1083
$ /usr/bin/time -v xclock
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 1087
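The first time xclock started, it caused major faults (4 in the output above).
The second time, the Linux kernel did not issue any major faults, because xclock's pages were already in memory.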
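To compare runs like these without reading the whole report, the two fault lines can be pulled out with awk. This is only a sketch: it assumes the GNU time -v line labels shown above, and the function name parse_time_faults is made up for the example.

```shell
#!/bin/sh
# Read GNU time -v output on standard input and print "major minor".
# Assumption: the labels match the sample output shown in this article.
parse_time_faults() {
    awk -F': ' '
        /Major .*page faults/ { major = $2 }
        /Minor .*page faults/ { minor = $2 }
        END { print major, minor }
    '
}

# Example: time writes its report to stderr, so send stderr into the pipe
# and discard the program's own stdout.
#   /usr/bin/time -v ls /etc/resolv.conf 2>&1 >/dev/null | parse_time_faults
```

If the timed program writes to stderr itself, GNU time's -o option can send the report to a file instead, which you can then feed to the function.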
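If you find a large number of page faults for a particular process, try the following suggestions to improve the situation:
- Optimize the server process.
- Reduce the memory the process uses by tuning parameters in its configuration file (such as php.ini, httpd.conf, or lighttpd.conf).
- Add more RAM to the system.
- Use a better page replacement algorithm, which can reduce the number of page faults.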