Critical uptime question (Was "Looking for some upgrade advice")

Mon May 22 07:45:58 PDT 2006

From: filepro-list-bounces at lists.celestial.com
[mailto:filepro-list-bounces at lists.celestial.com] On Behalf Of Boaz
Bezborodko
Sent: Sunday, May 21, 2006 10:17 PM
To: filepro-list at lists.celestial.com; John Esak
Subject: Critical uptime question (Was "Looking for some upgrade advice")

John Esak wrote: 

Date: Fri, 19 May 2006 07:37:16 -0400
From: "John Esak" <john at valar.com>
Subject: OT: RE: Looking for some upgrade advice
To: "Fplist (E-mail)" <filepro-list at seaslug.org>
Cc: Rick Walsh <rick at nexusplastics.com>
Message-ID: <JIECJPPMJGMIINMGGNGAAEHJPBAA.john at valar.com>
Content-Type: text/plain; charset="us-ascii"

I suspect the only reason I haven't seen comparable uptimes on my Linux
systems is because the kernel updates require a reboot.  I talked directly
to the 2nd in charge of the kernel, as well as some of the other kernel
devs, and the consensus was that if I wanted a hot-swappable kernel, I
could go and write the hot-swap code myself.  They didn't consider it a
priority, or even desirable.

As you know, the *last* thing in the world I want to do is start a Linux
thread here. :-)

BUT... this is something I hadn't considered in our upcoming major move to
SuSE Linux. We have a situation where the main *nix server (currently SCO
OpenServer 5.6) can NOT go down at all. Literally, it is used to produce
various things, mostly bar code labels, 365/24/7... with absolutely NO down
time at all except for two week-long vacations during the year and some
other extremely special circumstances... I would hardly call these
"planned maintenance"... more like get in whatever we can because the system
went down for some unforeseen reason! :-)  Very occasionally, and I mean
very occasionally, we can stop the constant transactional postings (and
label printing) for a few minutes... really, just a few. Otherwise, it
becomes much like the "I Love Lucy" chocolate factory conveyor belt scene.

What, seriously, are we going to do in this situation? I was kind of hoping
we could find a *stable* Linux... meaning a kernel that does not need that
much or *any* patching. Are you talking about real security problems, or
feature upgrades? We simply cannot bring the machine down for either
reason... at least not on *any* kind of ongoing basis.... How in the world
does *anyone* cope with such a situation?

Yes, yes, I'm constantly considering and devising possible methods to
de-reference our main databases and CPUs from this immediate hardware
interface... but to date, I have not come up with anything that would work
well enough to meet the need. Our systems are currently up-to-the-minute and
pretty much *have* to stay that way.

Suggestions?

John Esak

John, 

I was thinking about this over the weekend.  It seems to me that you could
give yourself a whole lot of flexibility if you could somehow duplicate the
database you're working with.  I think that I could do this if the database
were not stored on the same machine that executes the filePro code.

Here is how I see it working:
Run two different servers, each with its own copy of the database files.  One
is accessed directly by the users, while the other is kept up to date with
all of the first one's transactions.  Whenever the first server processes a
transaction, it writes a record of that transaction as a separate file for
the second server to read.  The second server runs a process that looks for
these transaction files and applies them to its own copy of the database.
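
As a rough illustration of the transaction-file idea above, here is a minimal
sketch in Python (not filePro code).  Everything in it is an assumption made
for the example: the function names, the spool directory, the JSON file
format, and the use of a plain dictionary to stand in for the second copy of
the database.

```python
import json
import os
import tempfile
from pathlib import Path


def record_transaction(spool_dir, seq, payload):
    """Primary side: write one transaction as its own file.

    The file is written under a temporary name and then renamed, so the
    reader on the second server never picks up a half-written file.
    """
    spool = Path(spool_dir)
    spool.mkdir(parents=True, exist_ok=True)
    final = spool / f"{seq:012d}.json"  # zero-padded so names sort in order
    fd, tmp = tempfile.mkstemp(dir=str(spool), suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(payload, fh)
    os.replace(tmp, final)
    return final


def apply_pending(spool_dir, database):
    """Secondary side: apply spooled transactions in sequence order.

    `database` is a dict standing in for the second copy of the data.
    Each file is deleted once its transaction has been applied.
    Returns the number of transactions applied.
    """
    applied = 0
    for path in sorted(Path(spool_dir).glob("*.json")):
        with open(path) as fh:
            txn = json.load(fh)
        database[txn["key"]] = txn["value"]
        path.unlink()
        applied += 1
    return applied
```

In this sketch the first server would call record_transaction() after each
posting, and the second server would call apply_pending() on a timer.  The
sequence number in the file name is what keeps the updates in order.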

You could set up a controlled switch from the server running the first
database to the one running the second database.  At the end of each
transaction executed on the data of the primary database, you can have code
that checks a status flag describing the condition of that database's
server.  You can program the process to force the user to exit the
application if it sees a flag that tells it to switch to the secondary
database files.  Once it exits, you change the client's configuration to work
from the secondary database upon re-execution.  All transactions will
eventually move to the second database server, until every process has
transferred, leaving the first server free for any changes or updates.  In
the meantime, the secondary database has started acting as the primary
database and is building up a list of transactions that the original
database will have to apply, once it is started up again, to bring it to the
same condition as the new primary.  (Or you might be able to just copy
files.)
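
The flag check described above could look something like the following
Python sketch.  It is only an illustration under assumed names: the flag is
taken to be a file whose existence means "switch", and run_session, the
server names, and the process callback are all hypothetical.

```python
from pathlib import Path


def run_session(transactions, flag_file, primary, secondary, process):
    """Process transactions against the primary until the switch flag appears.

    The flag is checked only after a transaction has finished, so no
    transaction is interrupted midway.  Returns a tuple of
    (server to use on the next run, transactions not yet processed).
    """
    pending = list(transactions)
    while pending:
        process(primary, pending.pop(0))
        if Path(flag_file).exists():
            # Exit cleanly; the caller restarts the client against the
            # secondary server.
            return secondary, pending
    return primary, pending
```

The caller would use the returned server name to rewrite the client's
configuration before starting the application again.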

I don't know much about Linux, but I could see how an application working on
the same computer as the data would have a much harder time detecting and
adjusting for a switch to run off a different server.  But if all you're
doing is going to a virtual drive similar to how I do it now (by running the
processes on separate Windows machines while they look to mapped network
drives for the data), it is much easier to have a script that will exit out
of the process, change the mapped network drive to point to the new server,
and then re-execute.

It seems like this will work well enough, but not knowing the actual
application I don't know if this is a good solution for you.

Boaz 

There are some Windows server based programs that will allow you to mirror 2
servers.  While not a true cluster, these programs will keep 2 servers in
sync and allow for a quick switch in the event that the primary server
fails.

Richard Kreiss
GCC Consulting
