NAME

OpenMosix::HA -- High Availability (HA) layer for an openMosix cluster

SYNOPSIS

  use OpenMosix::HA;

  my $ha = new OpenMosix::HA;

  # start the monitor daemon 
  $ha->monitor;

DESCRIPTION

This module provides the basic functionality needed to manage resource startup and restart across a cluster of openMosix machines.

This gives you a high-availability cluster with low hardware overhead. In contrast to traditional HA clusters, we use the openMosix cluster membership facility, rather than hardware serial cables or extra ethernet ports, to provide heartbeat and to detect network partitions.

All you need to do is build a relatively conventional openMosix cluster, install this module on each node, and configure it to start and manage your HA processes. You do not need the relatively high-end server machines which traditional HA requires. There is no need for chained SCSI buses (though you can use them) -- you can instead share disks among many nodes via any number of other current technologies, including SAN, NAS, GFS, or Firewire (IEEE-1394).

BACKGROUND

Normally, a process-migration-based cluster computing technology (such as openMosix) is orthogonal to the intent of high availability. When openMosix nodes die, any processes migrated to those nodes will also die, regardless of where they were spawned. The higher the node count, the more frequently these failures are likely to occur.

But if processes are started via OpenMosix::HA, any processes and resource groups which fail due to node failure can be configured to restart automatically on other nodes. OpenMosix::HA detects the failure, selects a new node from those currently available, and deconflicts the selection so that no two nodes restart the same process or resource group.

While similar to the normal inittab format, the configuration file for OpenMosix::HA includes an extra "resource group" column -- this is what enables you to group processes, disk mounts, virtual IP addresses, and related resources into resource groups.

Any given node only needs to be able to support a subset of all resource groups. OpenMosix::HA provides an extra "test" runmode (beyond init's normal 'wait', 'once', and 'respawn'), enabling the module to automatically test a node for fitness before considering starting a resource group there.
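
As a purely hypothetical illustration -- the group name, tags, runlevels, and commands below are invented, and the authoritative field layout is defined by Cluster::Init's cltab format -- entries for one resource group might look something like this, with the leading field naming the resource group:

  # hypothetical cltab fragment; fields assumed to be
  # group:tag:runlevel:mode:command -- see Cluster::Init for the real syntax
  mygrp:chk:test:wait:/usr/local/bin/mygrp-fitness-check
  mygrp:mnt:run:wait:/bin/mount /dev/sdb1 /export/mygrp
  mygrp:ip:run:wait:/sbin/ip addr add 10.1.2.3/24 dev eth0
  mygrp:app:run:respawn:/usr/local/bin/mygrp-daemon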

There is no "head" or "supervisor" node in an OpenMosix::HA cluster -- there is no single point of failure. Each node makes its own observations and decisions about the start or restart of processes and resource groups.

IO Fencing (also STOMITH or STONITH, the art of making sure that a partially-dead node doesn't continue to access shared resources) can be handled as it is in conventional HA clusters, by a combination of exclusive device logins when using Firewire, distributed locks when using GFS or other SAN technologies, and brute-force methods such as X10 or network-controlled power strips. OpenMosix::HA provides a callback hook which can be used to trigger the latter.
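
As a sketch only, the brute-force variant of such a callback might shell out to whatever controls the power strip. The callback below is hypothetical: the power-control command and node-to-outlet mapping are site-specific, and how the callback is registered with OpenMosix::HA is not documented here (see the XXX note under new() below).

  # hypothetical STOMITH hook (Perl); nothing here is part of the
  # OpenMosix::HA API -- the command name and arguments are invented
  my $stomith = sub {
      my $node = shift;    # openMosix node number to be powered off
      warn "STOMITH: powering off node $node\n";
      system("/usr/local/bin/powerstrip-off", "node$node") == 0
          or warn "powerstrip-off failed for node $node: $?\n";
  };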

METHODS

new(%parms)

Loads Cluster::Init, but does not start any resource groups.

Accepts an optional parameter hash which you can use to override module defaults. Defaults are set for a typical openMosix cluster installation. Parameters you can override include the following (see the example after this list):

mfsbase

MFS mount point. Defaults to /mfs.

mynode

Mosix node number of local machine. You should only override this for testing purposes.

varpath

The local path under / where the module should look for the hactl and cltab files, and where it should put clstat and clinit.s. This is also the subpath where it looks for the same files on other nodes, under each node's MFS tree at mfsbase/NODE (with the defaults, /mfs/5/var/mosix-ha for node 5). Defaults to var/mosix-ha.

timeout

The maximum age (in seconds) of any node's clstat file, after which the module considers that node to be stale, and calls for a STOMITH. Defaults to 60 seconds.

XXX STOMITH callback.
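
For example, a node could override some of these defaults as follows; the values shown are illustrative, and on a stock openMosix installation the defaults usually suffice:

  use OpenMosix::HA;

  # all parameters are optional -- these merely override the defaults
  my $ha = OpenMosix::HA->new(
      mfsbase => "/mfs",          # MFS mount point
      mynode  => 4,               # node number; override for testing only
      varpath => "var/mosix-ha",  # where hactl, cltab, clstat, clinit.s live
      timeout => 120,             # seconds before a node's clstat is stale
  );

  $ha->monitor;                   # start the monitor daemon
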
monitor()

Starts the monitor daemon. The monitor ensures the resource groups in cltab are each running somewhere in the cluster, at the runlevels specified in hactl. Any resource groups found not running are candidates for a restart on the local node.

Before restarting a resource group, the local monitor announces its intention in the local clstat file and observes clstat on the other nodes. If the monitor on any other node also intends to start the same resource group, the local monitor detects this and cancels its own restart. Checks and restarts are staggered by random delays on each node to prevent oscillation.

XXX document run levels: plan test run stop

INSTALLATION

FILES

XXX list files and their purposes; refer to Cluster::Init default filenames

SUPPORT

XXX describe commercial support available for both openMosix and OpenMosix::HA

AUTHOR

        Steve Traugott
        CPAN ID: STEVEGT
        stevegt@TerraLuna.Org
        http://www.stevegt.com

COPYRIGHT

Copyright (c) 2003 Steve Traugott. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

SEE ALSO

IS::Init, openMosix.Org, qlusters.com