Critical uptime question (Was "Looking for some upgrade advice")

Mon May 29 20:00:33 PDT 2006

>Message: 1
>Date: Sun, 28 May 2006 08:51:54 -0400
>From: "John Esak" <john at valar.com>
>Subject: RE: Critical uptime question (Was "Looking for some upgrade
>	advice")
>To: "Fplist (E-mail)" <filepro-list at seaslug.org>
>Message-ID: <JIECJPPMJGMIINMGGNGAKEPNOIAA.john at valar.com>
>Content-Type: text/plain; charset="iso-8859-1"
>
>
>  -----Original Message-----
>  From: Boaz Bezborodko [mailto:boaz at mirrotek.com]
>  Sent: Sunday, May 21, 2006 10:17 PM
>  To: filepro-list at lists.celestial.com; John Esak
>  Subject: Critical uptime question (Was "Looking for some upgrade advice")
>
>
>
>
>  John Esak wrote:
>Date: Fri, 19 May 2006 07:37:16 -0400
>From: "John Esak" <john at valar.com>
>Subject: OT: RE: Looking for some upgrade advice
>To: "Fplist (E-mail)" <filepro-list at seaslug.org>
>Cc: Rick Walsh <rick at nexusplastics.com>
>Message-ID: <JIECJPPMJGMIINMGGNGAAEHJPBAA.john at valar.com>
>Content-Type: text/plain;	charset="us-ascii"
>
>  I suspect the only reason I haven't seen comparable uptimes on my linux
>systems is because the kernel updates require a reboot.  I talked directly
>to the 2nd in charge of the kernel, as well as some of the other kernel
>devs, and the consensus was that if I wanted a hot-swappable kernel, I
>could go and write the hot-swap code myself.  They didn't consider it a
>priority, or even desirable.
>
>As you know, the *last* thing in the world I want to do is start a Linux
>thread here. :-)
>
>BUT... this is something I hadn't considered in our upcoming major move to
>SuSe Linux. We have a situation where the main *nix server (currently SCO
>OpenServer 5.6) can NOT go down at all. Literally, it is used to produce
>various things, mostly bar code labels 365/24/7... with absolutely NO down
>time at all except for two week-long vacations during the year and some
>other extremely special circumstances... hardly would I call these
>"planned maintenance"... more like get in whatever we can because the system
>went down for some unforeseen reason! :-)  Very occasionally, and I mean
>very occasionally, we can stop the constant transactional postings (and
>label printing) for a few minutes... really, just a few. Otherwise, it
>becomes much like the "I Love Lucy" chocolate factory conveyor belt scene.
>
>What, seriously, are we going to do in this situation? I was kind of hoping
>we could find a *stable* Linux... meaning a kernel that does not need that
>much or *any* patching. Are you talking about real security problems, or
>feature upgrades? We simply cannot bring the machine down for either
>reason... at least not on *any* kind of ongoing basis.... how in the world
>does *anyone* cope with such a situation?
>
>Yes, yes, I'm constantly considering and devising possible methods to
>de-reference our main databases and CPU's from this immediate hardware
>interface... but to date, I have not come up with anything that would work
>well enough to meet the need. Our systems are currently up-to-the-minute and
>pretty much *have* to stay that way.
>
>Suggestions?
>
>John Esak
>
>  John,
>
>  I was thinking about this over the weekend.  It seems to me that you could
>give yourself a whole lot of flexibility if you could somehow duplicate the
>database you're working with.  I think that I could do this if the database
>was not stored on the same machine as that which is executing the filePro
>code.
>
>  Here is how I see it working:
>  Run two different servers each with its own copy of the database files.
>One is the one that is directly accessed by the users while the other gets
>updated with all of its transactions.  Whenever it gets a transaction it
>generates a record of the transaction as a separate file for the second
>database to read.  The second database would have a process that will look
>for these transactions and update the files on its database.
>
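To make the one-file-per-transaction idea above concrete, here is a minimal sketch in Python rather than filePro. It is an illustration under stated assumptions, not anyone's actual implementation: the spool directory, the `.txn` naming, the JSON record layout, and the function names `record_transaction` and `apply_pending` are all hypothetical, and the "database" is just a dictionary.

```python
import json
import os
import tempfile
from pathlib import Path


def record_transaction(spool: Path, seq: int, txn: dict) -> Path:
    """Primary side: write one transaction as its own file.

    The record is written to a temporary file and then renamed, so the
    reader on the secondary never sees a half-written transaction.
    """
    spool.mkdir(parents=True, exist_ok=True)
    final = spool / f"{seq:012d}.txn"  # zero-padded so names sort by sequence
    fd, tmp = tempfile.mkstemp(dir=spool, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(txn, f)
    os.replace(tmp, final)
    return final


def apply_pending(spool: Path, replica: dict) -> int:
    """Secondary side: apply spooled transactions in sequence order.

    Each file is deleted once applied. Returns the number applied.
    """
    applied = 0
    for path in sorted(spool.glob("*.txn")):
        txn = json.loads(path.read_text())
        replica[txn["key"]] = txn["value"]  # hypothetical record layout
        path.unlink()
        applied += 1
    return applied
```

The secondary would call `apply_pending` on a timer. Note that this only covers single-record updates; it says nothing about transactions that touch many files at once.
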
>  You could set up a controlled switch from the server running the first
>database to the one running the second database.  At the end of each
>transaction executed on the data of the primary database you can have code
>that will check some status flag as to the condition of the server of that
>database.  You can program the process to force the user to exit out of the
>application if it sees a flag that tells it to switch to the secondary
>database files.  Once it exits you change the clients' configuration to work
>from the secondary database upon re-execution.  All transactions will
>eventually move on to the second database server until all processes have
>transferred, leaving the first free for any changes or updates.  In the
>meantime that secondary database has now started acting as the primary
>database and is building up a list of transactions that the original
>database will have to apply to bring it to the same condition as the second
>once it is started up again.  (Or you might be able to just copy files.)
>
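The flag check described above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the flag file, its "switch" contents, the directory names, and the functions `active_data_dir` and `run_session` are hypothetical, and the real check would have to live in the filePro processing itself.

```python
from pathlib import Path


def active_data_dir(flag_file: Path, primary: str, secondary: str) -> str:
    """Return the data location a client should use when it (re)starts."""
    if flag_file.exists() and flag_file.read_text().strip() == "switch":
        return secondary
    return primary


def run_session(transactions, flag_file, primary, secondary, process):
    """Run transactions against the data location chosen at start-up.

    The flag is re-checked only after a transaction has completed, so the
    session always stops on a transaction boundary. Returns whatever was
    left unprocessed, to be handled after the client restarts.
    """
    data_dir = active_data_dir(flag_file, primary, secondary)
    remaining = list(transactions)
    while remaining:
        process(data_dir, remaining.pop(0))
        if active_data_dir(flag_file, primary, secondary) != data_dir:
            break  # switch requested: exit so the client can be re-pointed
    return remaining
```

A wrapper script would call `run_session` again with the leftover work; on that second run `active_data_dir` returns the secondary location.
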
>  I don't know much about Linux, but I could see how an application working
>on the same computer as the data would have a much harder time detecting and
>adjusting for a switch to run off a different server.  But if all you're
>doing is going to a virtual drive similar to how I do it now--by running the
>processes on separate windows machines while they look to mapped network
>drives for the data--it is a much easier process to have a script that will
>exit out of the process, change the mapped network drive to point to the new
>server, and then re execute.
>
>  It seems like this will work well enough, but not knowing the actual
>application I don't know if this is a good solution for you.
>
>  Boaz
>
>
>  I certainly appreciate the suggestion. However....  Well, I don't exactly
>know how to state the however's... except that unfortunately, I think you
>have underestimated the difficulty of the procedures you are suggesting.  If
>filePro were a true RDBMS system with all that this entails  (a la
>Codd/Date) then yes, this would be possible.  With filePro being what it
>is... a flat file system using code written in this case by me or Rick to
>make it function... there is I'm afraid to say, no viable way to emulate the
>massive and deeply entwined   transactional nature of such databases.
>
>  I don't mean to belittle the programming genius of either myself or my
>colleague, Rick.  :-)  At least 200 people on this list will certify that I
>feel I am the best filePro programmer in the world... :-)  and have stated
>so many times.  :-)  Is that enough smileys to let everyone know that I am
>fully aware that once you reach a certain level of ability, the most you can
>be is occasionally first among equals... and there are a hell of a lot of
>"equal" filePro programmers out there to me... possibly even yourself... but
>of course I don't know you or your prowess so I'll just give you the benefit
>of the doubt and guess that you are thoroughly capable of writing whatever
>you want in filePro as the rest of us.  HOWEVER, and finally it comes...
>there is simply no way and no one I know, including even Ken (either Brody
>or White) who could possibly do what you suggest.  The reasons are many, but
>mostly they lie in the misnomer of filePro itself and sometimes just
>datasets themselves and Oracle all being called a "database".  Truly, if I
>could cull out the particular "database" as you are referring to it and
>manipulate a transaction log and transaction data file in the right ways, I
>could move between two application servers... maybe, possibly.... perhaps...
>The reality, however, (there's that word again) is that I have more than 700
>active data files, thousands of processing procedures, and hundreds of
>system/shell scripts to do the work in any particular sub-application of
>what goes on at Nexus. If the module at hand interfaces with our gut-level
>accounting system (also completely filePro based), then many dozens of any
>of these several hundreds/thousands of items may have to interact... and in
>a very tightly knit way.  The task of building in (from within filePro
>itself) some registering, transactional  process is not only daunting, but,
>I think, impossible. I am certain, after many years of contemplating such
>things... and even implementing many smaller subsets of this idea for myself
>and others... that on a large full-scale basis, the only way to do it with
>full fail-safe operation is to have the process and control already built-in
>to the RDBMS itself... as it, of course, is in those systems. In other
>words, I wouldn't ever try "duplicating" this with filePro. I am one of the
>longest-standing, loudest proponents of filePro and all that it can do,
>which is nothing short of staggering... but I am also a realist and know it
>is not internally capable of working in this way.
>
>  The most I can hope for  is a methodology wherein all of our machinery
>that is "tied" to filePro can be simultaneously paused for a specified
>length of time during which a controlled transfer to a duplicate system can
>be managed.  Along with this nearly impossible task, a controlled transfer
>of all user processes to the duplicate system must also be attained... and
>this is even harder to conceive.  It can be planned and done (mostly in a
>manual way) making sure IP's and all related matters are dealt with
>properly... but it has never been something I am capable of handling
>"automatically".
>
>
>  Short of absolutely pausing "production" for some length of time, there is
>no cut-over or fail-over procedure which can be accomplished as our system
>now sits.  We can immediately fail-over from communications lines failures
>like T1's going down between plants... but if the "main" server itself
>fails, we always have a 15 minute to 30 minute period of real havoc and
>repair that has to go on before things are back to normal on either "main"
>or  "duplicate" server.  I hate the situation, but as I said, haven't been
>able to come up with a solution yet.
>
>  We have looked into some of the older "mirroring server" software
>solutions, but they are extremely expensive and never seem to guarantee that
>what we are  looking for ... is what they provide.  If you, or anyone, know
>such a system, speak up.  The old days of 1776 and others like that are
>gone.  At least, our searches for such a solution do not ever include SCO
>Open Server.  Perhaps, we will have more luck with Linux.  (Although, I
>don't have high hopes for an affordable solution.)
>
>  In the very first place, no one has considered the "licensing" for a
>completely hot-swappable server?  In terms of filePro, I think it was
>brought up during the beta testing for 5.6, but I don't remember if there
>was a precise plan regarding this.  Imagine that I did have a pair of
>mirrored servers, would I need to get a license for both  machines?  I
>wouldn't think so, but it is part of the whole cost thing to some degree.
>
>  Sorry to be so wordy in all this... and yes, I would be thoroughly happy
>if someone were to say, why haven't you checked out "thus-and-such" it does
>exactly what you need... but I don't think it is going to happen.
>
>  And to get back to your suggestion for just one second... I probably could
>have stated all this in the few words....  "Good idea, but it is just too
>hard for me to conceive of doing from within filePro."  :-)  I've been
>waiting for a large download to finish... and had some time on my hands...
>:-)
>
>
>  John
>  
>
John,

I realize that the amount of work necessary is related to how large the
database is.  Given what you're telling me, it does seem that, while it
may be possible, the work would amount almost to rewriting every
operation from scratch.

Boaz