Process management with runit

What is runit?

runit is a program that manages processes: it starts them, stops them, and keeps them running. It was designed as a replacement for the standard Linux init, although the RubyWorks stack uses it as an add-on instead.

Init (or whatever the native mechanism for launching daemon processes is on a particular platform) launches runit during the boot sequence, and runit then takes care of starting all the other processes that make up the RubyWorks stack.

Why runit?

runit is a uniform way to control services. It works the same way on all flavors of Linux, and it doesn’t require the services themselves to be able to run in detached (daemon) mode, an area in which Mongrel is notoriously weak.

It also shortens boot time quite noticeably, especially on minimalist Linux configurations, because it starts all services simultaneously, whereas init starts them one at a time.


As mentioned above, at boot time init has to start only one process, /usr/sbin/runsvdir. On RedHat and CentOS, the RubyWorks installation adds this line to /etc/inittab:


On Debian and Ubuntu, runsvdir is started by /etc/event.d/runit.

runsvdir then looks at the directory /var/service/. Every subdirectory there is a service, in other words, a process that runit has to start and manage.

For every service subdirectory in /var/service/, runsvdir spawns a service controller process (/usr/sbin/runsv), which executes the run script in the service’s subdirectory. For example, HAProxy is launched by running the /var/service/haproxy/run script.

Open /var/service/haproxy/run in a text editor, and you will see that it starts HAProxy in foreground mode using exec, a shell builtin that replaces the shell process with the program given as its argument. The end result is an haproxy process that is a child of a runsv process, which in turn is a child of the runsvdir process:

    init (process #1)
        runsvdir (runit's main process)
            runsv haproxy (HAProxy service controller)
                haproxy (the service itself)
            runsv mongrel_3002 (service controller for Mongrel on port 3002)
                rails_mongrel -p 3002 (Mongrel service)
            ... (and so on)
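
The effect of exec can be observed without runit installed. The sketch below is only an illustration: show_pids is a hypothetical helper name, and the run script quoted in the comment is an assumption about a typical HAProxy setup (config path and the -db foreground flag), not a copy of the RubyWorks script.

```shell
#!/bin/sh
# A run script following the exec pattern might look like this
# (assumed path and flags, shown for illustration only):
#
#   #!/bin/sh
#   exec haproxy -f /etc/haproxy/haproxy.cfg -db
#
# show_pids prints the PID of a shell, then execs a second program from
# that shell and prints the second program's PID. Because exec replaces
# the process instead of forking a child, both lines show the same PID.
show_pids() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}
```

Calling show_pids prints the same number twice, which is why runsv ends up as the direct parent of haproxy rather than of an intermediate shell.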

Observe that actual services always have a parent controller process (runsv). If a service process (haproxy in the example above) ever terminates, its controller receives a SIGCHLD signal and promptly responds by spawning another haproxy. This is how the haproxy service is kept alive.
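
The keep-alive behavior can be approximated in a few lines of shell. This is only a sketch of the idea, not how runsv is implemented: runsv is a separate program that reacts to the signal, while the hypothetical supervise_sketch function below simply blocks in wait. It also stops after a fixed number of starts so that the demonstration terminates, whereas runsv keeps restarting the service.

```shell
#!/bin/sh
# Run a command, and start it again each time it exits, up to a limit.
# Usage: supervise_sketch MAX_STARTS COMMAND [ARGS...]
supervise_sketch() {
    max_starts=$1
    shift
    starts=0
    while [ "$starts" -lt "$max_starts" ]; do
        starts=$((starts + 1))
        "$@" &            # start the service as a child process
        child=$!
        wait "$child"     # block until the child terminates
        echo "child $child exited with status $?, start #$starts done"
    done
}
```

For example, supervise_sketch 3 sh -c 'exit 1' starts a command that fails immediately and reports three consecutive starts.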

Another thing worth mentioning is that every service has a supervise directory. It holds a number of files used by runit itself, but also some that exist for the benefit of external programs and scripts. For example, the process id (PID) of the currently running haproxy process can be obtained by reading /var/service/haproxy/supervise/pid, and the status of the haproxy service (such as “run” or “down”) is in /var/service/haproxy/supervise/stat.
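
A script can read these two files directly. The following is a minimal sketch; service_state is a hypothetical helper name, and it assumes that the pid file may be empty or missing while the service is not running.

```shell
#!/bin/sh
# Print "<status> <pid>" for one service directory by reading the files
# kept under its supervise/ subdirectory.
# Usage: service_state /var/service/haproxy
service_state() {
    dir=$1
    stat_file="$dir/supervise/stat"
    pid_file="$dir/supervise/pid"
    if [ ! -r "$stat_file" ]; then
        echo "unknown"
        return 1
    fi
    state=$(cat "$stat_file")
    pid=$(cat "$pid_file" 2>/dev/null)
    # Assumption: no PID is recorded while the service is not running.
    echo "$state ${pid:-none}"
}
```

Reading the files requires sufficient permissions on the supervise directory, so on a real installation the function would typically be run as root.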

Managing services

Runit has a command-line interface to control services. It is called sv, and can be used like this:

    sv up /var/service/haproxy   #bring up haproxy service by running /var/service/haproxy/run
    sv start haproxy             #same as 'sv up haproxy', but then wait for 7 seconds and fail if haproxy isn't running
    sv kill haproxy              #terminate HAProxy harshly, by sending it SIGKILL

Read man sv for many more options.

One thing you should keep in mind is that, in the RubyWorks stack, running sv stop haproxy from the command line doesn’t really achieve the desired effect, because services are constantly monitored by a program called monit. As soon as monit notices that HAProxy is down, it asks runit to restart it.

Monit itself is the only RubyWorks service that should be started and stopped by sv.

Reading recent error messages through ps

runsvdir has a useful feature called ‘ps logging’. If any process under its control writes anything to its stderr, the output appears in runsvdir’s ps entry. So, to get a quick glance at what’s currently going on in the stack, run a command like ps -ef | grep runsvdir, and you may see something like this:

    [root@vmtest haproxy]# ps -ef | grep runsvdir
    root      3153     1  0 02:09 ?        00:00:01 runsvdir -P /var/service log: aproxy' process is not running?
    'haproxy' trying to restart?monit: pidfile `/var/service/haproxy/supervise/pid' does not contain a valid pidn
    umber?monit: pidfile `/var/service/haproxy/supervise/pid' does not contain a valid pidnumber?'haproxy' start:
    /usr/bin/sv?monit: pidfile `/var/service/haproxy/supervise/pid' does not contain a valid pidnumber?'haproxy'
    process is running with pid 4081?...

The ps log has a fixed length, and runsvdir adds a dot to the end of it every 15 minutes. If nothing interesting (i.e., worth reporting to stderr) happens in any service for a long time, eventually you see just a long line of dots.
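
To look at the log text alone, the ps line can be filtered. This is a minimal sketch, with runsvdir_log as a hypothetical helper name. It assumes the log follows the literal text "log: " as in the sample above, and it cuts at the last occurrence of that text on the line.

```shell
#!/bin/sh
# Read ps output on stdin and print only the text after "log: " on
# runsvdir lines, with the trailing idle dots removed.
# The [r] in the pattern keeps the filter from matching its own
# command line when it shows up in the ps listing.
# Usage: ps -ef | runsvdir_log
runsvdir_log() {
    sed -n '/[r]unsvdir/ s/^.*log: //p' | sed 's/\.*$//'
}
```

If the service has been quiet long enough for the log to contain only dots, the function prints an empty line.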

External links