OT: How to cluster two computers

Mon Oct 4 16:56:13 PDT 2004

Confusious (Enrique Arredondo) say:
> > What kind of cluster do you want?  Distributed computing cluster?  Web
> > server cluster?
> >
> > What exactly is your goal?
> 
> Let's start with Distributed computing cluster, I guess once you have this 
> one running you can add the web server cluster later right ?

We'll hold in abeyance the fact that they're really two (almost) entirely
orthogonal types of clustering. :)

The key question remains the same:  What is your exact goal?

People don't generally set up clusters for the mere joy of it.  Well, I
tell a lie, I know some people that have.  But by and large, there has to
be a purpose behind it.

So, assuming you want a distributed computing cluster, you need something
to compute in a distributed fashion.  You also need something that requires
the kind of horsepower that a cluster would provide.

Very -large- distributed projects in the past are exemplified by both
DESCHALL and SETI@home.  That's specialised client-server software
that divides a task into chunks and hands them out to the machines in
the cluster, each machine receiving a new chunk as it finishes its prior
segment of work.
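
To make the chunk-handing idea concrete, here is a minimal sketch in Python.
It is not how DESCHALL or SETI@home were actually written.  Threads stand in
for the machines, a queue stands in for the server, and the "task" is a toy
brute-force search.  Every name in it (split_into_chunks, search_chunk,
run_cluster, the keyspace size) is made up for illustration.

```python
import queue
import threading


def split_into_chunks(start, stop, chunk_size):
    """Divide the range [start, stop) into (lo, hi) work units."""
    return [(lo, min(lo + chunk_size, stop))
            for lo in range(start, stop, chunk_size)]


def search_chunk(lo, hi, target):
    """Toy stand-in for the real computation: find n with n * n == target."""
    for candidate in range(lo, hi):
        if candidate * candidate == target:
            return candidate
    return None


def worker(work, results, target):
    """Simulated node: fetch a chunk whenever free, stop when none remain."""
    while True:
        try:
            lo, hi = work.get_nowait()
        except queue.Empty:
            return
        results.put(search_chunk(lo, hi, target))


def run_cluster(target, keyspace=100_000, chunk_size=10_000, nodes=3):
    """Hand chunks out to `nodes` simulated machines and collect the hits."""
    work = queue.Queue()
    for chunk in split_into_chunks(0, keyspace, chunk_size):
        work.put(chunk)
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(work, results, target))
               for _ in range(nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    hits = []
    while not results.empty():
        hit = results.get()
        if hit is not None:
            hits.append(hit)
    return hits
```

Note that a faster node simply comes back for more chunks sooner, so nobody
has to work out in advance how to balance the load.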

On a smaller scale, you can do a number of projects that exceed the scope
of one system.  Ignore the fact that "Linux" is strewn throughout this
document, and peruse the brief contents of:
     http://www.beowulf.org/overview/index.html

Generally speaking, unless you're doing some really heavy statistical data
analysis, or data mining of web sites for storage into fP, I don't see any
particular case where an HPC cluster would net enough gain to be worth
coding the software.

Now, High Availability clustering is readily achievable relatively
inexpensively, although the logistics behind keeping multiple instances of
fP's data synchronised -in near realtime- are a bit dodgy without something
similar to the pseudo-RSS transaction scheme I proposed about two months
ago.  One could use a single storage platform, but then one also contends
with a single fault point--the data storage system.  If that becomes
unavailable for any reason, the cluster is no longer highly available--it
is, in fact, unavailable until the single fault is resolved.  An HA cluster
with non-realtime data (say, nightly or similarly periodic
synchronisations, where a single server at a time is taken 'down' from its
public data-serving role for synchronisation, to avoid data corruption) is
far more readily achievable.
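
The one-server-at-a-time synchronisation can be sketched roughly as follows,
again in Python and purely as an illustration.  The node dictionaries, the
"serving" flag and the rolling_sync name are assumptions of mine, and the
dict copy is only a placeholder for whatever really moves the data files.

```python
def rolling_sync(nodes, master_data):
    """Synchronise each node in turn while the others keep serving.

    `nodes` is a list of dicts with "name", "serving" and "data" keys
    (a hypothetical layout).  Returns a log of (synced node, nodes that
    stayed in service during that sync).
    """
    log = []
    for node in nodes:
        node["serving"] = False               # pull this node from public service
        in_service = [n["name"] for n in nodes if n["serving"]]
        if not in_service:
            node["serving"] = True            # put it back before bailing out
            raise RuntimeError("refusing to sync: no other node is serving")
        node["data"] = dict(master_data)      # placeholder for the real copy step
        node["serving"] = True                # return the node to service
        log.append((node["name"], in_service))
    return log
```

The check in the middle is the whole point: if pulling a node would leave
nothing in service, the sync is refused rather than taking the site down.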

Before we get into a huge discussion about the pros, cons and wherefores,
let us first discuss the whys.

I reiterate:  What -precisely- do you want/need to achieve, and why do you
perceive a cluster as the way to go about achieving it?

mark->
-- 
Bring the web-enabling power of OneGate to -your- filePro applications today!

Try the live filePro-based, OneGate-enabled demo at the following URL:
               http://www2.onnik.com/~fairlite/flfssindex.html