4 Linux Commands to View Page Fault Statistics

Date: 2020-01-09 10:41:22  Source: igfitidea

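How do you view minor and major page fault statistics on a Linux operating system?
You can use page fault statistics to improve the performance of a Linux server.
Tune your daemons and programs to reduce the number of page faults they cause.
As the number of page faults drops, the daemon's performance improves, and so does that of the whole Linux system.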

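Linux (and most Unix-like) systems map a virtual address space onto physical memory. The Linux kernel manages this mapping on demand, a technique known as demand paging. A page fault occurs when a process accesses a page that is mapped in its virtual address space but not loaded in physical memory. In most cases a page fault is not an error. Page faults are used to increase the amount of memory available to programs on Linux and Unix-like operating systems that use virtual memory. Virtual memory is simply a memory management technique, used by Linux and many other modern operating systems, that combines active RAM with inactive memory on disk (hard disk or SSD) to form a large range of contiguous addresses.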

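  • A "major fault" occurs when disk access is required. For example, when you start an application such as Firefox, the Linux kernel looks for the required pages in physical memory. If the data is not there, Linux raises a major page fault and reads it from disk.
  • A "minor fault" occurs when the page can be provided without disk access, for example through page allocation.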

You can use standard Linux commands (such as ps, top, time, and sar) to view page faults for all processes or for a specific process.

Example: ps command

To view the page faults for PID 1 with the ps command, run:

ps -o min_flt,maj_flt 1

Sample output:

MINFL  MAJFL
  3104     36

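Where:

  • min_flt: the number of minor page faults.
  • maj_flt: the number of major page faults.

You may want to see other details for PID 1, such as the user, group, command, and its arguments. Run: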
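
The counters that ps prints are also available directly under /proc. As a minimal sketch, assuming a Linux /proc filesystem laid out as described in proc(5) (where minflt and majflt are fields 10 and 12 of /proc/[pid]/stat), the helper below reads them without ps. The function name page_faults is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "minflt majflt" for a PID by reading /proc/PID/stat.
# Assumes the proc(5) layout: field 2 is "(comm)", field 10 is minflt,
# field 12 is majflt.
page_faults() {
    pid="${1:-$$}"
    # comm may contain spaces or parentheses, so drop everything up to the
    # last ") " first. The remaining fields then start at field 3 (state),
    # which puts minflt at position 8 and majflt at position 10.
    sed 's/^.*) //' "/proc/$pid/stat" | awk '{ print $8, $10 }'
}

page_faults 1        # same counters as: ps -o min_flt,maj_flt 1
```

The values are cumulative since the process started, so they should match the ps output taken at the same moment.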

# ps -o min_flt,maj_flt,cmd,args,uid,gid 1

Sample output:

MINFL  MAJFL CMD                         COMMAND                       UID   GID
  3104     36 /sbin/init                  /sbin/init                      0     0

To view every process on the system:

# ps -eo min_flt,maj_flt,cmd,args,uid,gid | less
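
To find the processes with the most major faults without paging through the whole list, the output can be sorted. This is only a sketch, assuming the procps ps used above (an empty header, as in -o pid=, suppresses the header line) together with the standard sort and head utilities. The function name top_faulters is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the N processes with the highest major fault counts.
# Output columns: PID, minor faults, major faults, command name.
top_faulters() {
    ps -e -o pid= -o min_flt= -o maj_flt= -o comm= |
        sort -k3,3nr |          # numeric sort on column 3 (maj_flt), descending
        head -n "${1:-10}"      # keep the first N lines (default 10)
}

top_faulters 10
```

Change the sort key to -k2,2nr to rank by minor faults instead.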

Example: top command

Type the following top command (you can also use atop or htop):

# top

Or start top with a delay interval (in seconds):

# top -d 1

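Press "F" to open the sort menu, then press "u" to sort by page faults.
Finally, press [Enter].
On newer procps-ng builds of top the keys may differ: "F" opens the field management screen, where a fault column such as nMaj or nMin can be selected as the sort field with "s".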

Example: sar command

The sar program reports system activity statistics, including paging activity.
Run the following commands:

# sar -B
# sar -B 1 10

Sample output:

Linux 2.6.32-279.el6.x86_64 (server1.theitroad.local) 	Monday 05 November 2012 	_x86_64_	(8 CPU)
 
12:46:48  CST  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
12:46:49  CST      0.00    460.61     68.69      0.00    452.53      0.00      0.00      0.00      0.00
12:46:50  CST      0.00    276.00    170.00      0.00    642.00      0.00      0.00      0.00      0.00
12:46:51  CST      0.00    460.00     47.00      0.00    550.00      0.00      0.00      0.00      0.00
12:46:52  CST      0.00    228.00     49.00      0.00    705.00      0.00      0.00      0.00      0.00
12:46:53  CST      0.00    320.00    146.00      0.00    420.00      0.00      0.00      0.00      0.00
12:46:54  CST      0.00    164.00     69.00      0.00    479.00      0.00      0.00      0.00      0.00
12:46:55  CST      0.00    501.01   1144.44      0.00    991.92      0.00      0.00      0.00      0.00
12:46:56  CST      0.00    220.00     65.00      0.00    503.00      0.00      0.00      0.00      0.00
12:46:57  CST      0.00    280.00    156.00      0.00    514.00      0.00      0.00      0.00      0.00
12:46:58  CST      0.00    160.00    941.00      0.00    949.00      0.00      0.00      0.00      0.00
Average:         0.00    306.61    284.97      0.00    620.44      0.00      0.00      0.00      0.00

From the sar man page:

-B     Report  paging  statistics.  Some  of  the metrics below are available only with post 2.5 kernels. The following values are
              displayed:
 
              pgpgin/s
                     Total number of kilobytes the system paged in from disk per second.  Note: With old kernels (2.2.x) this value is a
                     number of blocks per second (and not kilobytes).
 
              pgpgout/s
                     Total number of kilobytes the system paged out to disk per second.  Note: With old kernels (2.2.x) this value is a number
                     of blocks per second (and not kilobytes).
 
              fault/s
                     Number of page faults (major + minor) made by the system per second.  This is not a count of page  faults  that  generate
                     I/O, because some page faults can be resolved without I/O.
 
              majflt/s
                     Number of major faults the system has made per second, those which have required loading a memory page from disk.
 
              pgfree/s
                     Number of pages placed on the free list by the system per second.
 
              pgscank/s
                     Number of pages scanned by the kswapd daemon per second.
 
              pgscand/s
                     Number of pages scanned directly per second.
 
              pgsteal/s
                     Number of pages the system has reclaimed from cache (pagecache and swapcache) per second to satisfy its memory demands.
 
              %vmeff
                     Calculated  as pgsteal / pgscan, this is a metric of the efficiency of page reclaim. If it is near 100% then almost every
                     page coming off the tail of the inactive list is being reaped. If it gets too low (e.g. less than 30%) then  the  virtual
                     memory  is  having some difficulty.  This field is displayed as zero if no pages have been scanned during the interval of
                     time.
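
The fault/s and majflt/s columns are per-second rates over cumulative counters. As a rough sketch, assuming a kernel that exposes the pgfault and pgmajfault counters in /proc/vmstat, similar rates can be approximated by sampling the counters twice and dividing the difference by the interval. The function names vmstat_counter and fault_rates are hypothetical, and the result uses integer division, so it will not match sar to the decimal.

```shell
#!/bin/sh
# Hypothetical helper: print the cumulative value of one /proc/vmstat counter.
vmstat_counter() {
    awk -v key="$1" '$1 == key { print $2 }' /proc/vmstat
}

# Sample pgfault and pgmajfault twice, INTERVAL seconds apart, and print the
# average number of faults per second over that interval.
fault_rates() {
    interval="${1:-1}"
    f1=$(vmstat_counter pgfault);  m1=$(vmstat_counter pgmajfault)
    sleep "$interval"
    f2=$(vmstat_counter pgfault);  m2=$(vmstat_counter pgmajfault)
    # Missing counters fall back to 0 so the arithmetic stays valid.
    echo "fault/s=$(( (${f2:-0} - ${f1:-0}) / interval )) majflt/s=$(( (${m2:-0} - ${m1:-0}) / interval ))"
}

fault_rates 1
```

As in the sar output, fault/s here counts major and minor faults together, while majflt/s counts only the faults that needed disk I/O.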

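Example: time command

Use the "/usr/bin/time" command (not the shell's built-in time keyword) to run a program and summarize its system resource usage, including page faults.
First, find the path to the real time command: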

# type -a time

Sample output:

time is a shell keyword
time is /usr/bin/time

Now, run the following command to see the page faults caused by the ls command:

$ /usr/bin/time -v ls /etc/resolv.conf

Sample output:

/etc/resolv.conf
	Command being timed: "ls /etc/resolv.conf"
	User time (seconds): 0.00
	System time (seconds): 0.00
	Percent of CPU this job got: 0%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 3456
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 280
	Voluntary context switches: 1
	Involuntary context switches: 3
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
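
When only the fault counts matter, the full -v report is more than needed. A minimal sketch, assuming GNU time is installed at /usr/bin/time: its -f option takes a format string in which %F is the number of major page faults and %R the number of minor page faults. The function name faults_of is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command under GNU time and report only its
# page fault counts. GNU time writes its report to stderr, so stderr is
# redirected to the caller's stdout and the command's own stdout is discarded.
# Anything the command itself prints to stderr will appear as well.
faults_of() {
    /usr/bin/time -f 'major=%F minor=%R' "$@" 2>&1 >/dev/null
}

faults_of ls /etc/resolv.conf
```

The output is a single line of the form major=N minor=N, which is easier to compare across repeated runs than the verbose report.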

In this example, I ran the xclock program twice (note the output):

$ /usr/bin/time -v xclock 

	Major (requiring I/O) page faults: 4
	Minor (reclaiming a frame) page faults: 1083

$ /usr/bin/time -v xclock 
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 1087

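The first time xclock started, there were major faults (4 in the output above).
The second time, however, the Linux kernel did not raise any major faults, because the pages xclock needs were already in memory.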

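If you see a large number of page faults for a particular process, try the following suggestions to improve the situation:

  • Optimize the server process.
  • Reduce the memory used by processes by tuning parameters in configuration files (such as php.ini, httpd.conf, or lighttpd.conf).
  • Add more RAM to the system.
  • Use a better page replacement algorithm, which can reduce the number of page faults.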