Monitoring with monit

What is monit?

monit is a utility for managing and monitoring processes, files, directories and devices on a Unix system. It’s a daemon that wakes up every once in a while, goes through a list of things it is configured to check, and if something is not as it should be, takes a corrective action.

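Examples of a check: whether a process is running, how much memory it uses, how much CPU it consumes.

Examples of corrective action: restarting the service, sending an email alert.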

Why monit?

monit is included in the RubyWorks stack primarily to watch Mongrel processes for memory leaks, endless loops, excessive CPU usage and other misbehaviors. It can also be used to monitor other services and system parameters.

Monit was chosen because it’s free, easy on system resources, has a human-readable configuration language, comes with good documentation, and is generally a nice, lightweight monitoring package.

A major limitation of monit is that it only works on one server. If you are part of a large IT organization that has standardized on some other, more powerful monitoring system, you should look at the monit configuration that comes with the RubyWorks stack, and set up your own system to perform the same checks and corrective actions.

Configuration

The monit configuration lives in the file /etc/rails/monit.conf. The configuration language is easy to understand, so just read the configuration file, and if there is any setting that you don’t understand, look it up in the monit manual.

Essentially, the configuration consists of some “set …” statements that configure various settings (e.g., how often to perform the checks, where to write a log, etc.) and then a list of “check …” statements that go something like this:

    check process mongrel_3002
      with pidfile "/var/service/mongrel_3002/supervise/pid" 
      start program = "/usr/bin/sv up mongrel_3002" 
      stop program = "/usr/bin/sv down mongrel_3002" 
      if totalmem is greater than 110.0 MB for 4 cycles then restart
      if cpu is greater than 50% for 2 cycles then alert
      if cpu is greater than 80% for 3 cycles then restart
      if loadavg(5min) greater than 10 for 8 cycles then restart

The first four lines of the above statement tell monit that there is a service called “mongrel_3002”, which is identified by the process ID stored in the file /var/service/mongrel_3002/supervise/pid. It can be started by executing /usr/bin/sv up mongrel_3002 and stopped by executing /usr/bin/sv down mongrel_3002.

The remaining lines of the above statement all look like “if [condition] then [corrective action]”.
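To see how the “set …” and “check …” statements fit together, the sketch below writes a cut-down configuration to a temporary file and, if monit is installed, asks it to syntax-check the file with -t. The two set values (a 15-second cycle and a log in /var/log/monit.log) are taken from the text of this page rather than copied from the shipped monit.conf, so treat them as assumptions; the check block is a shortened copy of the example above.

```shell
#!/bin/sh
# Write a small stand-alone monit configuration to a temporary file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
set daemon 15
set logfile /var/log/monit.log

check process mongrel_3002
  with pidfile "/var/service/mongrel_3002/supervise/pid"
  start program = "/usr/bin/sv up mongrel_3002"
  stop program = "/usr/bin/sv down mongrel_3002"
  if totalmem is greater than 110.0 MB for 4 cycles then restart
  if cpu is greater than 50% for 2 cycles then alert
EOF

# monit insists on restrictive permissions for its control file.
chmod 600 "$conf"

# -t runs a syntax check only; -c names the control file to read.
if command -v monit >/dev/null 2>&1; then
  monit -t -c "$conf"
else
  echo "monit not found; example configuration written to $conf"
fi
```

The same -t check is a cheap safeguard after any manual edit of /etc/rails/monit.conf, before reloading monit.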

Monit sleeps for some time (15 seconds, as configured), finds the current mongrel_3002 process and performs some basic process checks (that it is a running process, not a zombie, hasn’t changed its process ID since the last check, and so on). Then monit performs the additional checks specified in the configuration (memory, CPU utilization, etc.). Sleep, rinse, repeat.

One such iteration is called a “cycle”. The statement “if cpu is greater than 50% for 2 cycles then alert” means “send an email alert if on two consecutive checks this Mongrel process was using more than 50% of a CPU”.
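The “for N cycles” rule can be illustrated with a few lines of shell. This is only a model of the counting described above, not monit’s own implementation; the function name first_trigger and the CPU readings are made up for the example.

```shell
#!/bin/sh
# first_trigger THRESHOLD CYCLES SAMPLE...
# Prints the 1-based number of the first check at which the samples have
# been above THRESHOLD for CYCLES consecutive checks, or 0 if that never
# happens.
first_trigger() {
  threshold=$1
  needed=$2
  shift 2
  streak=0
  index=0
  for sample in "$@"; do
    index=$((index + 1))
    if [ "$sample" -gt "$threshold" ]; then
      streak=$((streak + 1))
    else
      streak=0   # a single reading below the limit resets the count
    fi
    if [ "$streak" -ge "$needed" ]; then
      echo "$index"
      return 0
    fi
  done
  echo 0
}

# CPU readings (percent) from six consecutive cycles.
# "if cpu is greater than 50% for 2 cycles" first holds at check 5.
first_trigger 50 2 40 60 45 55 70 30
```

The reading of 60 at the second check does not raise an alert on its own, because the next reading drops back under the limit; only the pair 55, 70 satisfies the condition.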

Web console

Monit has a very simple web console, running on http://localhost:2812. By default it is configured to respond only to local requests (coming from the localhost network interface) and to require basic HTTP authentication with the credentials admin:MonitPa$$w0rd.

The “Installation continued” page has some instructions on how to make the console available across the network and more secure.

According to the documentation, the web console can be turned off completely (by removing the entire set httpd… section from monit.conf), but monit’s command line interface (described below) doesn’t work without it.
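A quick way to confirm that the console is up is to request it with curl from the server itself. This is a sketch that assumes curl is installed and that the default address and credentials quoted above are still in place. Note the single quotes around the password: inside double quotes the shell would replace $$ with its own process ID.

```shell
#!/bin/sh
user='admin'
password='MonitPa$$w0rd'   # single quotes keep $$ literal
url='http://localhost:2812/'

# -s: no progress meter, -f: treat HTTP errors (such as 401) as failure,
# -u: send HTTP basic authentication, --max-time: give up after 5 seconds.
if curl -s -f --max-time 5 -u "$user:$password" "$url" >/dev/null; then
  echo "monit console is reachable"
else
  echo "monit console is not reachable (not running, or wrong credentials)"
fi
```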

Email alerts

Monit can (and should) be configured to send email alerts when certain checks fail. There is a commented-out section in /etc/rails/monit.conf that starts with a set mailserver ... statement and contains a very basic alerts setup. Simply uncommenting that section may be a good start. Don’t forget to edit the SMTP server and the recipient email address! You will want to customize your alerts eventually, and there are many ways to do it, all described in the manual.
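The sketch below locates the commented-out section and prints a minimal alert setup to compare it with. The fragment is an assumption about what a basic setup looks like, not a copy of the shipped section, and smtp.example.com and admin@example.com are placeholders to be replaced with real values.

```shell
#!/bin/sh
conf=/etc/rails/monit.conf

# Show where the mail section starts (with line numbers), if the file exists.
if [ -r "$conf" ]; then
  grep -n 'set mailserver' "$conf"
fi

# A minimal alert setup. Both values are placeholders.
alert_fragment=$(cat <<'EOF'
set mailserver smtp.example.com   # your SMTP server
set alert admin@example.com       # recipient of all alerts
EOF
)
printf '%s\n' "$alert_fragment"
```

A global set alert line sends every alert to one address, which is usually enough until the alerts need tuning per service.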

Command line interface

As mentioned elsewhere in this documentation, the only right way to selectively start or stop a service or services in the RubyWorks stack (such as HAProxy or one of the Mongrels) is via the monit command line interface.

Examples:

    monit start haproxy
    monit restart all
    monit -g mongrels stop all

Other useful command-line actions are:

    monit status # Print full status information for each service.
    monit validate # Perform all service checks
    monit reload # Reload monit.conf

Read man monit for more information on the command line options.

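Hint: The -c option tells monit which configuration file to use. If you symlink /etc/rails/monit.conf as /etc/monitrc, monit finds the configuration on its own and you can omit the -c option.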
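Creating the symlink is a one-line job, but a slightly more careful sketch is shown here so that an existing /etc/monitrc is never overwritten. The helper name link_monitrc is made up for this example, and on a real system the call needs root privileges.

```shell
#!/bin/sh
# link_monitrc CONF LINK
# Create LINK as a symlink to CONF, refusing to overwrite anything.
link_monitrc() {
  if [ ! -e "$1" ]; then
    echo "configuration file $1 not found" >&2
    return 1
  fi
  if [ -e "$2" ] || [ -L "$2" ]; then
    echo "$2 already exists, leaving it alone" >&2
    return 1
  fi
  ln -s "$1" "$2"
}

link_monitrc /etc/rails/monit.conf /etc/monitrc

# Before the symlink:  monit -c /etc/rails/monit.conf status
# After the symlink:   monit status
```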

Notes on management and troubleshooting

Monit is installed as a runit service and is started by the /var/service/monit/run script. The monit log file is /var/log/monit.log.

To turn on verbose logging, open /var/service/monit/run in a text editor, add the -v option to the monit parameters, then restart the monit service:

    sv restart monit

External links