
Re: Options for synchronising filesystems

Isaac Levy wrote:
Hi Brian, All,

This email has one theme: GEOM! :)

On Sep 24, 2005, at 10:10 AM, Brian Candler wrote:


I was wondering if anyone would care to share their experiences in
synchronising filesystems across a number of nodes in a cluster. I can think
of a number of options, but before changing what I'm doing at the moment I'd
like to see if anyone has good experiences with any of the others.

The application: a clustered webserver. The users' CGIs run in a chroot
environment, and these clearly need to be identical (otherwise a CGI running
on one box would behave differently when running on a different box).
Ultimately I'd like to synchronise the host OS on each server too.

Note that this is a single-master, multiple-slave type of filesystem
synchronisation I'm interested in.

I just wanted to throw out some quick thoughts on a totally different approach which nobody has really explored in this thread: solutions built on production-level software. (Sorry if I'm repeating things or giving out info y'all already know. :)

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/geom-intro.html

The core Disk IO framework for FreeBSD, as of 5.x, led by PHK:

This framework itself is not as useful to you as the utilities which make use of it:

Geom Gate:

Network device-level client/server disk mapping tool.
(VERY IMPORTANT COMPONENT: it's reportedly faster and more stable than NFS has ever been, so people have immediately and happily deployed it in production systems!)

Gvinum and Gmirror:


(Sidenote: even Greg Lehey (original author of Vinum) has stated that it's better to use GEOM-based tools than Vinum for the foreseeable future.)

In a nutshell, to address your needs, let me toss out the following example setup:

I know of one web-shop in Canada, which is running 2 machines for every virtual cluster, in the following configuration:

2 servers,
4 SATA drives per box,
quad copper/ethernet gigabit nic on each box

each drive is mirrored using gmirror, over each of the gigabit ethernet NICs
each box is running Vinum RAID5 across the 4 mirrored drives
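
To give a rough idea of what mirroring a local disk against a remote one over Geom Gate involves, here is a minimal Python sketch that only assembles the command lines; it does not run anything. The IP addresses, device names (/dev/ad4, /dev/ad6) and the mirror name gm0 are made-up placeholders, and this is not the actual configuration of the shop described above. Check ggated(8), ggatec(8) and gmirror(8) before trying anything like it.

```python
# Sketch only: builds (does not execute) the commands for mirroring a local
# disk with a remote disk exported through Geom Gate.
# All addresses, devices and names below are hypothetical placeholders.

def export_line(client_ip, device):
    """Line for /etc/gg.exports on the box exporting its disk via ggated."""
    return f"{client_ip} RW {device}"


def mirror_commands(server_ip, remote_device, local_device, mirror_name="gm0"):
    """Commands for the box that imports the remote disk and builds the mirror.

    The ggate unit is pinned to 0 with -u so the imported disk appears as
    /dev/ggate0.
    """
    return [
        ["kldload", "geom_gate"],
        ["kldload", "geom_mirror"],
        # Attach the remote disk as a local GEOM provider.
        ["ggatec", "create", "-o", "rw", "-u", "0", server_ip, remote_device],
        # Mirror the local disk with the imported one; "prefer" reads from
        # the highest-priority component instead of balancing reads.
        ["gmirror", "label", "-v", "-b", "prefer",
         mirror_name, local_device, "/dev/ggate0"],
    ]


if __name__ == "__main__":
    print(export_line("10.0.0.1", "/dev/ad6"))
    for cmd in mirror_commands("10.0.0.2", "/dev/ad6", "/dev/ad4"):
        print(" ".join(cmd))
```

Whether "prefer" or another balance algorithm suits a network-backed mirror is a judgment call; it is used here only as an example.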

The drives are then sliced appropriately, and server resources are distributed across the boxes, with various slices mounted on each box. The folks I speak of simply have a suite of failover shell scripts prepared, in the event of a machine experiencing total hardware failure.

Pretty tough stuff, very high-performance, and CHEAP.

With that, I'm working towards similar setups, oriented around redundant jailed systems, with the eventual goal of tying CARP (from pf) into the mix to make for nearly instantaneous jailed failover redundancy (but it's going to be some time before I have what I want worked out for production on my own).

Regardless, it's worth tapping into the GEOM dialogues, as there are many new ways of working with disks coming into existence, and the GEOM framework itself provides an EXTREMELY solid base to bring 'exotic' disk configurations up to production level quickly. (Also noteworthy: there are a couple of encrypted disk systems based on GEOM emerging now too...)

I think the original poster (and I at least) knew about this already, but what I still fail to see is how you can get several machines using the same data at the same time, and still do updates to that data. The only way I know of is to use a syncing tool (like rsync) or a shared filesystem (like NFS, CXFS, Polyserve FS, OpenGFS, etc.), none of which run on FreeBSD.
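
For the syncing-tool route, a single-master push to several slaves might look roughly like the Python sketch below. The host names and the chroot path are invented for illustration, and it assumes rsync is installed and the master can reach each slave over ssh; it is one possible arrangement, not a recommendation.

```python
# Sketch only: one rsync push per slave from a single master.
# Host names and paths are hypothetical placeholders.
import subprocess


def rsync_push_commands(source_dir, slaves, dest_dir):
    """Return one rsync argv list per slave host."""
    # A trailing slash makes rsync copy the directory's contents
    # rather than the directory itself.
    src = source_dir.rstrip("/") + "/"
    return [
        ["rsync", "-a", "--delete", src, f"{host}:{dest_dir}"]
        for host in slaves
    ]


def push(source_dir, slaves, dest_dir, dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for cmd in rsync_push_commands(source_dir, slaves, dest_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    push("/usr/chroot", ["web2", "web3"], "/usr/chroot")
```

Note that --delete removes files on the slaves that no longer exist on the master, which is what keeps the copies identical but is also easy to get wrong.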

What I read from above is a redundant server setup, not a high-performance setup (meaning multiple machines serving the same data to many clients). If I'm missing something, please fill me in.


Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.