Saturday, 6 October 2018

How do you free the Linux cache?

echo 2 > /proc/sys/vm/drop_caches
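Possible fixes: move to a machine with more memory, or gradually tune system parameters such as /proc/sys/vm/pagecache_limit* (where the kernel provides them) and the file-system writeback settings under /proc/sys/vm/dirty_*. I don't have much experience with these, so it is a matter of feeling my way.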
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free reclaimable slab objects (includes dentries and inodes):
echo 2 > /proc/sys/vm/drop_caches
To free slab objects and pagecache:
echo 3 > /proc/sys/vm/drop_caches
The default value is 0. Writing 1 drops the page cache, writing 2 drops reclaimable slab objects, and writing 3 drops both.
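As a minimal sketch of the sequence above, the helper below validates the mode, runs `sync` first so that dirty pages are written back and become droppable, and then writes the value. The name `drop_caches_safely` is hypothetical, not a standard command, and the optional second argument is an assumption added only so the function can be tried against an ordinary file instead of the real /proc entry.

```shell
#!/bin/bash
# Sketch: flush dirty pages, then write a drop_caches mode.
# drop_caches_safely is a hypothetical helper name, not a standard tool.
drop_caches_safely() {
    local mode=$1
    # The second argument exists only for dry runs against a normal file.
    local target=${2:-/proc/sys/vm/drop_caches}
    case "$mode" in
        1|2|3) ;;
        *) echo "usage: drop_caches_safely 1|2|3 [target]" >&2; return 2 ;;
    esac
    sync                        # write back dirty pages so more cache can be dropped
    echo "$mode" > "$target"    # needs root when target is the real /proc file
}
```

Run as root, `drop_caches_safely 3` frees both the page cache and reclaimable slab objects. Rejecting anything other than 1, 2 or 3 is a design choice of this sketch, so that a typo never reaches the kernel.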
----------
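A frequently asked Linux question: why does my Linux system show so little available memory when it is hardly running any programs?
(http://www.linuxatemyram.com)
Linux manages memory differently from Windows: it uses as much otherwise idle memory as it can to cache data and speed up reads and writes. This is usually called cache memory.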
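To see the difference between "free" and "available" on a given machine, the numbers can be read straight from /proc/meminfo. The following is a rough sketch that assumes a kernel exposing the MemAvailable field (3.14 or later); `mem_summary` is a hypothetical helper name, and its optional argument is only there so a saved copy of meminfo can be inspected.

```shell
#!/bin/bash
# Sketch: print how much memory is free, cached, and realistically available.
# mem_summary is a hypothetical helper; the optional argument is a meminfo-style file.
mem_summary() {
    local file=${1:-/proc/meminfo}
    # Values in /proc/meminfo are in KiB; exact field matches avoid SwapCached.
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemFree:"      { free = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "Cached:"       { cached = $2 }
        END { printf "total=%d MiB free=%d MiB cached=%d MiB available=%d MiB\n", total / 1024, free / 1024, cached / 1024, avail / 1024 }
    ' "$file"
}
```

On a typical busy server, `mem_summary` tends to report a small "free" value next to a much larger "available" one, because most of the cached memory can be reclaimed when programs need it.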
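Older articles all say that heavy cache usage on Linux is nothing to worry about, because Linux uses as much memory as it can for caching. Reclaiming the cache costs resources too, though. A good article on this is Poor Zorro's "Linux内存中的Cache真的能被回收么?" (Can the cache in Linux memory really be reclaimed?).
In most cases a high cache figure is not a problem, but we would still like to know exactly which program is driving it up, and that turns out not to be easy.
For efficiency and better resource utilisation, kernel modules allocate their objects through the slab allocator. Slab is memory used to cache kernel data structures, and it too often takes up a lot of memory. The slabtop tool makes it easy to inspect the kernel slab caches; it presents the contents of /proc/slabinfo in a more readable form.
slabtop -s c lists the objects held in a machine's slab caches, sorted by cache size.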
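Where slabtop is not at hand, a similar ranking can be approximated from /proc/slabinfo directly. This is only a sketch: `top_slabs` is a hypothetical helper, it assumes the version 2.x slabinfo layout (name, active_objs, num_objs, objsize, ...), and it estimates each cache as num_objs times objsize, which ignores per-slab overhead and so will not match slabtop exactly.

```shell
#!/bin/bash
# Sketch: rank slab caches by approximate size in KiB, largest first.
# top_slabs is a hypothetical helper; reading the real /proc/slabinfo usually needs root.
top_slabs() {
    local file=${1:-/proc/slabinfo} count=${2:-10}
    # Skip the two header lines, then print "name size_in_KiB" per cache.
    awk 'NR > 2 { printf "%s %d\n", $1, $3 * $4 / 1024 }' "$file" |
        sort -k2,2nr | head -n "$count"
}
```

For example, `top_slabs /proc/slabinfo 5` prints the five largest caches; on file servers the dentry and inode caches are often near the top.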
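slabtop shows the slab part of the cache, but it still does not show which programs are using the cache.
linux-ftools can show how much of a given file is in the cache; fincore is one of its tools.
fincore works by comparing the inode data of the specified file with the kernel's page cache table: if the page cache table holds information for that inode, it works out how much of the inode's data is cached. (In practice the tool maps the file and asks the kernel, through mincore(2), which of its pages are resident.) Because the kernel's page cache table stores only references to data blocks, that is, the file's inode information, and not file names, no tool can report the cache usage of every file in a single run.
So linux-fincore can only be given file names and tell you whether each file is cached and, if so, how much. The problem is that you cannot just guess which files might be cached.
Shanker offers an approach: find the processes that use the most physical memory, find the files those processes have open, and then use fincore to check how much of each file is cached.
In most cases this approach finds the programs and processes that account for most of the cache.
His script is as follows: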

#!/bin/bash
# Author: Shanker
# Time: 2016/06/08
# Requires linux-fincore (from linux-ftools).
FINCORE=/usr/local/bin/linux-fincore
if [ ! -x "$FINCORE" ]
then
    echo "You haven't installed linux-fincore yet" >&2
    exit 1
fi
# Work in a private temporary directory instead of fixed names under /tmp.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT
# PIDs of the 10 processes with the largest resident set size (RSS).
ps -e -o pid=,rss= | sort -k2,2nr | head -n 10 | awk '{print $1}' > "$tmpdir/cache.pids"
# To scan every process instead, use:
# ps -e -o pid= > "$tmpdir/cache.pids"
# Collect the names of the files those processes have open.
# -Fn asks lsof for machine-readable output, so names with spaces survive.
while read -r pid
do
    lsof -p "$pid" -Fn 2>/dev/null | sed -n 's/^n//p'
done < "$tmpdir/cache.pids" | sort -u > "$tmpdir/cache.files"
# Keep only regular files; sockets, pipes and devices are skipped.
while IFS= read -r file
do
    if [ -f "$file" ]
    then
        printf '%s\0' "$file"
    fi
done < "$tmpdir/cache.files" > "$tmpdir/cache.fincore"
# Report page cache usage for the collected files (-s prints a summary).
xargs -0 -r "$FINCORE" -s < "$tmpdir/cache.fincore"
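If lsof is not installed, the "which files does this process have open" step can be approximated by reading /proc/PID/fd. The sketch below is an alternative to the lsof call, not part of Shanker's script; `open_regular_files` is a hypothetical helper name, and it only sees files held open through a file descriptor, so files that are merely memory-mapped are missed.

```shell
#!/bin/bash
# Sketch: list the regular files a process has open by reading /proc/PID/fd.
# open_regular_files is a hypothetical helper; other users' processes need root.
open_regular_files() {
    local pid=$1 fd target
    for fd in /proc/"$pid"/fd/*; do
        # A descriptor may be closed while we iterate, so ignore failures.
        target=$(readlink "$fd" 2>/dev/null) || continue
        # Sockets, pipes and deleted files do not resolve to a regular file.
        [ -f "$target" ] && printf '%s\n' "$target"
    done | sort -u
}
```

Calling `open_regular_files 1234` prints one path per line for a hypothetical PID 1234, and that list can then be handed to a per-file cache tool in the same way as the lsof output.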

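Unfortunately, linux-ftools no longer appears to be maintained, and I could not get it to compile on my server, so I had to find another way.
Later I found pcstat, a tool written in Go that does the same job as linux-ftools.
I then modified Shanker's script to use pcstat, and it finds the cache usage nicely.
The modified script is as follows: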

#!/bin/bash
# Requires pcstat; adjust PCSTAT to wherever the binary is installed.
PCSTAT=/data0/brokerproxy/pcstat
if [ ! -x "$PCSTAT" ]
then
    echo "You haven't installed pcstat yet" >&2
    echo "run \"go get github.com/tobert/pcstat\" to install" >&2
    exit 1
fi
# Work in a private temporary directory instead of fixed names under /tmp.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT
# PIDs of the 10 processes with the largest resident set size (RSS).
ps -e -o pid=,rss= | sort -k2,2nr | head -n 10 | awk '{print $1}' > "$tmpdir/cache.pids"
# To scan every process instead, use:
# ps -e -o pid= > "$tmpdir/cache.pids"
# Collect the names of the files those processes have open.
# -Fn asks lsof for machine-readable output, so names with spaces survive.
while read -r pid
do
    lsof -p "$pid" -Fn 2>/dev/null | sed -n 's/^n//p'
done < "$tmpdir/cache.pids" | sort -u > "$tmpdir/cache.files"
# Keep only regular files; sockets, pipes and devices are skipped.
while IFS= read -r file
do
    if [ -f "$file" ]
    then
        printf '%s\0' "$file"
    fi
done < "$tmpdir/cache.files" > "$tmpdir/cache.pcstat"
# Report page cache usage for the collected files.
xargs -0 -r "$PCSTAT" < "$tmpdir/cache.pcstat"

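The output shows that uuid.log takes up a fairly large share of the cache. My program keeps this file open and writes log entries to it continuously, so Linux has presumably cached it.
References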
  1. https://code.google.com/p/linux-ftools/
  2. https://github.com/tobert/pcstat
  3. http://shanker.blog.51cto.com/1189689/1787378
  4. http://www.linuxatemyram.com
  5. http://colobu.com/2015/10/31/How-to-Clear-RAM-Memory-Cache-Buffer-and-Swap-Space-on-Linux/
  6. http://liwei.life/2016/04/26/linux内存中的cache真的能被回收么?

Appendix

A question that is unrelated to the main topic of this note, but well worth digging into:

drop_caches

Writing to this will cause the kernel to drop clean caches, as well as
reclaimable slab objects like dentries and inodes.  Once dropped, their
memory becomes free.

To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free reclaimable slab objects (includes dentries and inodes):
echo 2 > /proc/sys/vm/drop_caches
To free slab objects and pagecache:
echo 3 > /proc/sys/vm/drop_caches

This is a non-destructive operation and will not free any dirty objects.
To increase the number of objects freed by this operation, the user may run
`sync' prior to writing to /proc/sys/vm/drop_caches.  This will minimize the
number of dirty objects on the system and create more candidates to be
dropped.

This file is not a means to control the growth of the various kernel caches
(inodes, dentries, pagecache, etc...)  These objects are automatically
reclaimed by the kernel when memory is needed elsewhere on the system.

Use of this file can cause performance problems.  Since it discards cached
objects, it may cost a significant amount of I/O and CPU to recreate the
dropped objects, especially if they were under heavy use.  Because of this,
use outside of a testing or debugging environment is not recommended.

You may see informational messages in your kernel log when this file is
used:

cat (1234): drop_caches: 3

These are informational only.  They do not mean that anything is wrong
with your system.  To disable them, echo 4 (bit 2) into drop_caches.
