
Wednesday, 4 November 2015

Keeping a program running on a Linux VPS with monit


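Monit is a tool for monitoring and managing processes, files, directories, and devices on a system. When a server process you have specified goes down or stops responding, monit kills and restarts it, and sends a notification by email. Monit also includes an embedded HTTP(S) web interface, so you can view the servers Monit is monitoring from a browser.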

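  Official Monit website: http://www.mmonit.com

  The following walks through installing and configuring monit on CentOS 5.3.
First install flex, bison, and byacc. On CentOS:
yum install flex bison byacc
(On Debian or Ubuntu: apt-get install flex bison byacc)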

Download: http://mmonit.com/monit/dist/monit-5.1.1.tar.gz
shell> tar xzvf monit-5.1.1.tar.gz
shell> cd monit-5.*
shell> ./configure --without-ssl
shell> make
shell> make install
shell> cp monitrc /etc/
  Add the following line to /etc/inittab so that init supervises the monit process and respawns it if it exits:

shell> vi /etc/inittab
mo:2345:respawn:/usr/local/bin/monit -Ic /etc/monitrc
  Installation is now complete. Simple, isn't it?

  Next comes the configuration.

  The /etc/monitrc file:

  ######################################################

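set daemon  120        # interval between monit check cycles, in seconds
set logfile syslog facility log_daemon  # log through syslog
set logfile /var/logs/monit.log # or log to a file instead; keep only one of the two logfile lines
set idfile /var/.monit.id # where monit stores its unique instance ID (this is not the PID file)
set mailserver  192.168.0.21               # primary mailserver: IP of the mail server
set mail-format { from: monit@test.com }  # sender address for alert mail
set alert phoneNumber@139.com                     # alert recipient; a 139.com mailbox is used because 139 notifies you when mail arrives
set httpd port 2812 and      # port the monit web interface listens on
use address 192.168.0.21   # IP address of the monit server, for convenient HTTP access
allow admin:pass      # user name and password for the web interface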
  ####################################################

  ## Services

  ####################################################

# Monitor disk usage on the server
check device system with path /dev/mapper/VolGroup00-LogVol00
if space usage > 85% for 5 times within 15 cycles then alert
if space usage > 95% then stop
if inode usage > 85% then alert
if inode usage > 95% then stop
  ####################################################
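To see the figure that the space usage rules are compared against, the same percentage can be read with df. A minimal sketch, assuming a POSIX-style df; the helper name space_usage_percent is made up for illustration, and / is only an example path:

```shell
#!/bin/sh
# Print the used-space percentage for the filesystem holding the given path.
# space_usage_percent is a hypothetical helper, not part of monit.
space_usage_percent() {
    # -P forces one line per filesystem; column 5 is the capacity column.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

usage=$(space_usage_percent /)
echo "space usage on /: ${usage}%"

# Same threshold as the alert rule in the monit check (85%).
if [ "$usage" -gt 85 ]; then
    echo "above the 85% alert threshold"
fi
```

Running df with -i instead shows the inode counterpart of these numbers.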

#sshd   Monitor the sshd process
check process sshd with pidfile /var/run/sshd.pid
start program "/etc/init.d/sshd start"
stop  program "/etc/init.d/sshd stop"
if failed host 127.0.0.1 port 22  then restart
if 5 restarts within 5 cycles then timeout
  ######################################################

##cron       Monitor the cron daemon (crond)
check process cron with pidfile /var/run/crond.pid
group system
start program = "/etc/init.d/crond start"
stop program = "/etc/init.d/crond stop"
if 5 restarts within 5 cycles then timeout
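depends on cron_rc   # cron_rc must be defined as its own check elsewhere in the control file; it is not shown in this listing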
  ######################################################

  ######################################################

#scripts   Monitor the nginx log-rotation script
check file cut_nginx_log.sh with path /scripts/cut_nginx_log.sh
group scripts
if failed checksum then unmonitor
if failed permission 755 then unmonitor
if failed uid root then unmonitor
if failed gid root then unmonitor
  ######################################################

  ######################################################

##systemfile  Monitor the passwd and group files
check file passwd with path /etc/passwd
group system
if failed checksum then unmonitor
if failed permission 644 then unmonitor
if failed uid root then unmonitor
if failed gid root then unmonitor
check file group with path /etc/group
group system
if failed checksum then unmonitor
if failed permission 644 then unmonitor
if failed uid root then unmonitor
if failed gid root then unmonitor
  ######################################################
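The attributes these file checks watch can also be inspected by hand. A minimal sketch, assuming GNU coreutils (stat -c and md5sum); it works on a temporary file rather than the real /etc/passwd, and file_attrs is a hypothetical helper name:

```shell
#!/bin/sh
# Print "<octal mode> <owner> <group>" for a file: the values that the
# permission, uid and gid rules are compared against. Hypothetical helper.
file_attrs() {
    stat -c '%a %U %G' "$1"
}

tmp=$(mktemp)
printf 'example\n' > "$tmp"
chmod 644 "$tmp"

file_attrs "$tmp"

# A content checksum; any change to the file changes this value.
md5sum "$tmp" | awk '{ print $1 }'

rm -f "$tmp"
```

Replacing "$tmp" with /etc/passwd or /etc/group shows what a healthy system reports for the files in the checks above.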

  ######################################################

# Monitor ports 25, 110, and 20000 on the local machine
check host localhost with address 127.0.0.1
if failed port 25  with timeout 15 seconds then exec "/usr/bin/qmailctl restart"
if failed port 110 protocol pop with timeout 15 seconds then exec "/usr/bin/vpopmailctl restart"
# monit does not run exec commands through a shell, so the redirection and & need an explicit /bin/sh -c wrapper
if failed port 20000  with timeout 5 seconds then exec "/bin/sh -c '/root/nysocks-old/nysocks_v1.2.10_linux_x64 server -p 20000 -k my-password -m fast > /dev/null &'"
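
When a port rule fires unexpectedly, it helps to probe the port by hand. A minimal sketch, assuming bash and the coreutils timeout command are installed; port_open is a hypothetical helper that relies on bash's /dev/tcp redirection, and it only tests that a TCP connection can be opened, not the protocol spoken on top of it:

```shell
#!/bin/sh
# Return 0 if a TCP connection to host:port can be opened within the timeout.
# Arguments: host, port, timeout in seconds (default 5).
port_open() {
    host=$1
    port=$2
    secs=${3:-5}
    # Inside bash -c, $0 is the host and $1 is the port.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

for p in 25 110 20000; do
    if port_open 127.0.0.1 "$p" 5; then
        echo "port $p: open"
    else
        echo "port $p: closed or not answering"
    fi
done
```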


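That is the basic configuration. From here, adjust /etc/monitrc to match your own setup, then log in through the browser to confirm that monit is working. After that you can sit back and have a cup of tea.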
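
Before starting monit with an edited control file, the file can be checked for mistakes. A minimal sketch: monit -t is monit's syntax-check option, and monit rejects a control file that other users can access, so the file mode is tightened first. The temporary path and the one-line configuration are placeholders for illustration only, and the syntax check is skipped when monit is not installed:

```shell
#!/bin/sh
# Work on a throwaway file; on a real system this would be /etc/monitrc.
ctl=$(mktemp)
printf 'set daemon 120\n' > "$ctl"

# monit requires the control file to be accessible by its owner only.
chmod 600 "$ctl"
stat -c '%a' "$ctl"

# -t parses the control file and reports syntax errors without starting monit.
if command -v monit >/dev/null 2>&1; then
    monit -t -c "$ctl" || echo "syntax check reported a problem"
fi
```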
---------------------------------------------------------------
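Configuring monit on Linux to automatically monitor a running program

The goal is to keep a program running: when the program dies, it is restarted automatically.
The configuration is as follows: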
check process pusher-admin with pidfile /var/run/admin.pid
    start program = "/etc/init.d/pusherd start"
    stop program = "/etc/init.d/pusherd stop"
    if 5 restarts within 5 cycles then timeout
    group pusherd
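
The check above depends on /var/run/admin.pid holding the PID of a live process. A minimal sketch of how a start script could write such a file and how liveness can be verified by hand; the sleep command stands in for the real daemon, the temporary pidfile replaces /var/run/admin.pid, and pid_alive is a hypothetical helper:

```shell
#!/bin/sh
# Return 0 if the pidfile exists and names a running process.
pid_alive() {
    [ -f "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}

pidfile=$(mktemp)

# Stand-in for the daemon: start it in the background and record its PID.
sleep 30 &
echo $! > "$pidfile"

pid_alive "$pidfile" && echo "running"

# Once the process is gone the same test fails, which is the situation
# in which monit would run the start program again.
kill "$(cat "$pidfile")"
wait "$(cat "$pidfile")" 2>/dev/null || true
pid_alive "$pidfile" || echo "not running"

rm -f "$pidfile"
```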

To check that monitoring works, run the following command:
[root@pusher-ECM-1-226 pusher]# monit status
The Monit daemon 5.2.5 uptime: 1m 

System '127.0.0.1'
  status                            running
  monitoring status                 monitored
  load average                      [0.00] [0.02] [0.05]
  cpu                               0.0%us 0.0%sy 0.0%wa
  memory usage                      1731472 kB [10.6%]
  swap usage                        0 kB [0.0%]
  data collected                    Mon Mar 24 16:44:25 2014

Process 'pusher-admin'
  status                            running
  monitoring status                 monitored
  pid                               9245
  parent pid                        1
  uptime                            5m 
  children                          6
  memory kilobytes                  978964
  memory kilobytes total            983880
  memory percent                    6.0%
  memory percent total              6.0%
  cpu percent                       0.0%
  cpu percent total                 0.0%
  data collected                    Mon Mar 24 16:44:25 2014

As you can see, monit is now monitoring the pusher-admin application.

Note: if you add
mode manual
when configuring pusherd.monit, the /etc/init.d/pusherd script itself has to tell monit about the pusherd application: it enables monitoring on start and disables it on stop. Sample code:
On start:
        if [ -f /etc/init.d/monit ] ; then
            for APPLICATION_TYPE in ${APPLICATION_TYPES}; do
                /usr/bin/monit monitor "${APPLICATION_PRODUCT}-${APPLICATION_TYPE}"
            done
        fi
        ;;
On stop:
        if [ -f /etc/init.d/monit ] ; then
            for APPLICATION_TYPE in ${ALL_TYPES[@]}; do
                /usr/bin/monit unmonitor "${APPLICATION_PRODUCT}-${APPLICATION_TYPE}"
            done
        fi


Another note: if monit status shows a service as not monitored, delete the /var/monit/state file and then restart monit.
------------

You can write simple config files which tell monit to watch e.g. a TCP port, a PID file, etc.
monit will run a command you specify when the process it is monitoring is unavailable/using too much memory/is pegging the CPU for too long/etc. It will also pop out an email alert telling you what happened and whether it could do anything about it.
We use it to keep a load of our websites running while giving us early warning when something's going wrong.
from https://mmonit.com/monit/
(http://stackoverflow.com/questions/298760/how-to-make-sure-an-application-keeps-running-on-linux)
-----------

Related post: http://briteming.blogspot.in/2014/04/deploying-nodejs-with-upstart-and-monit.html