differences between rsync and pseudo-RSS

Fairlight fairlite at fairlite.com
Wed Jul 21 15:04:02 PDT 2004


Yo, homey, in case you don' be listenin', Brian K. White done said:
> 
> Yes rsync of fp data should only be done while fp users on both systems are
> locked out, cron reports suspended, etc...

Agreed.

> I rsync live data to backup boxes while running live, but that is a
> conscious decision weighing the advantages & disadvantages.

Ouch.  May it never bite you.  *crosses fingers*

> The worst-case scenario is both not very bad and extremely unlikely.
> rsync happens in the middle of a year-end process.
> live box dies later
> 
> So it just means you restore yesterday's data from one of the 2 nightly
> backups (compressed tar of fp data, tape of everything),
> re-enter a day's work, and re-run the process.
> 
> far more likely is:
> rsync happens while a few different orders are being entered
> live box dies later.
> 
> In that case if they had to switch to running on the backup box they just do
> a rebuild-all-indexes initially and delete/re-enter the couple buggy orders
> when they discover them.

A lot of cases--and those are the ones you -thought- of.  It's always the
ones we never think of that bite us.  :-/

For my money, the chance that something can go wrong is sufficient cause to
avoid inviting it.  I don't think I'd be that cavalier about it,
myself.  Something may be unlikely to happen, but I like knowing that
something is as bulletproof as I can make it.  It could very well save me
(or someone else) time and money down the road.

I'm anal-retentive enough that I'll re-release (silently if
I know nobody has obtained the version yet) for a spelling typo in the
--changelog feature of a program, even if I know nobody but me looks at
it.  I simply won't consider knowingly setting something up that has the
potential for actual procedural functionality failure.  I'm not criticising
you, I'm saying that for -me- it's not a matter of weighing benefits and
risks, it's a matter of any significant risk being unacceptable.  I'm the
kind of person that needs to know the carpet is not only tacked down, but
that it's been sealed around with liquid nails, can withstand a category
two hurricane without coming up at even one corner, and can generally
survive majorly unlikely disasters, including acts of God or the general
ignorance of end-users.  I "just need to be -absolutely- sure" of things,
for my own peace of mind.

But that's me.

> But this is for backup and having some kind of safety in the event the main
> server dies where it's more important that the data be as complete and as
> current as possible and it's ok if a few transactions are incomplete or
> inconsistent. I agree rsync is no good at all for keeping multiple live
> servers in sync unless you could take both servers off-line for the duration
> and exit any currently running fp process. For that you need to encapsulate
> the updates into some form of transaction and have fp perform the updates
> exactly as if a user were to do the same work. Which is exactly what you are
> on to.

Right.  And the question is, is it worth pursuing, now that I have a
fairly fleshed out spec?  Given that I'd have to code 26 conditions (no
big deal) for the index handling, I -believe- I can make a general parser
work correctly, assuming what I remember about using DIM is correct.  I'm
pretty sure it is, as someone just showed it to me a few weeks back.  So
in theory, I could make one parser handle multiple tables at once from
one feed, and all should go as planned.  I wonder if you can do "CALL
(variable)" and have it work correctly.  If you could, then you could
actually have an optional <post_processing> tag if you needed one, which
would be a really nice extension to the scheme.

Given a general parser/storage handler, the rest is documentation of how it
works procedurally, and possibly a generic exporter.
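
As a rough illustration of the shape of the thing, here is a minimal
sketch in Python rather than fP.  Only the <post_processing> tag comes
from the spec idea above; the <feed>, <record>, and <field> elements,
the "table", "id", and "name" attributes, and the "reindex" handler are
all assumptions made up for the example.  The dictionary lookup stands
in for the "CALL (variable)" question: the feed names a step, and the
parser looks it up instead of hard-coding it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical feed layout; only <post_processing> is from the spec idea.
SAMPLE_FEED = """\
<feed>
  <record table="customers" id="1">
    <field name="name">Acme</field>
  </record>
  <record table="orders" id="2">
    <field name="total">19.95</field>
    <post_processing>reindex</post_processing>
  </record>
  <record table="orders" id="3">
    <field name="total">5.00</field>
  </record>
</feed>
"""


def reindex(table, row):
    """Placeholder post-processing step; just reports what it would do."""
    return f"reindex {table}"


# Name -> callable lookup, the analogue of CALL (variable).
POST_PROCESSORS = {"reindex": reindex}


def parse_feed(xml_text):
    """Return ({table: [row, ...]}, [post-processing results])."""
    tables = defaultdict(list)
    actions = []
    root = ET.fromstring(xml_text)
    for record in root.iter("record"):
        table = record.get("table")
        row = {f.get("name"): (f.text or "") for f in record.findall("field")}
        row["_id"] = record.get("id")
        tables[table].append(row)
        # The tag is optional; records without it are simply stored.
        hook = record.findtext("post_processing")
        if hook is not None:
            handler = POST_PROCESSORS.get(hook.strip())
            if handler is None:
                raise ValueError(f"unknown post_processing step: {hook!r}")
            actions.append(handler(table, row))
    return dict(tables), actions
```

Refusing unknown step names, rather than calling whatever the feed
asks for, is a deliberate choice in this sketch: the feed comes from
another box, so the receiving end should decide what is callable.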

> I've toyed with the idea for a couple of years, ever since I made my first
> cgi that exports data on demand. I could see it would be not much harder to
> allow it to import & update and not much harder to allow it to have generic
> access to all the files, and not much more to have it use the "guts" of
> various input process tables and reports to duplicate actual user work and
> not much more to have the update routines (both normal interactive and cgi)
> include a gosub to send update info to other boxes. By "not much more" I
> guess I mean it doesn't seem like an exotic puzzle to solve, just a wamba
> big boatload of more or less straightforward tedious work.

Basically.  I have most of the XML parsing code already.  It would need to
be modified to handle this DTD though.  I never wrote a generic one.  I may
yet do that, but I dunno.  Depends how long it takes fP-Tech to release
their native stuff, and if someone taps me with a need for a generic one
first.  I'd want to do it properly, which would involve validating the
document against both a DTD -and- a schema (the latter being the side I
never bothered getting to know, for several reasons) to ensure structural
integrity.  Not a "light" project, IMHO.  Then again, neither is this.  :)

For me, the "fun" (read: tedious) parts of this specific RSS-like project
would be all the fP code related to making sure things were stored
correctly and that I didn't miss anything.  I know it's a RAD environment,
but just due to my mindset, I code about 3-5 times slower in fP than I do
in perl.  It's probably slightly faster than C for me right about now.
Both take a serious sustained effort at this point, just because I'm geared
to other things.  That's not a slam on the product or the language.  You
just get mentally tuned to something, and switching to something else
entirely that you're no longer used to working in constantly feels like
going from surfing on water to surfing on maple syrup.  Not that I
surf--not even the web...I -browse-.  :)

I'm still trying to work out a unique ID scheme without resorting to
externals, in a non-CGI context (in a CGI context, the unique ID would
come from an external that was the parent rather than a child).  Random numbers
seem like a -really- bad idea.  I need to get granular to the point where
I have a unique identifier -inside- a single second, on the same machine,
since each -record- in the generated feed needs one.  Parent PID is out,
then.  That leaves me going out to USER to grab something from perl or
elsewhere for each record, which adds overhead I'd rather not have, unless
someone knows of a really solid uniqueness generation scheme for fP.  I'd
rather not have to use externals.  I almost need something like a sha1 hash
over the host, time, date, and all record data--or at least the unique parts of
it.  I'm -not- about to rewrite sha1 in fP. :) But you get the idea of what
I need to come up with.  Anything guaranteed to always be unique.  Okay, I
suppose worst-case, I use a control file on the -sending- end to track what
I've sent, and each feed record is assigned a sequential ID as well, so
you check host, date, time, and id.  Crude, but it could work, off the top
of my head.

But before I get to that point, I need to know if it's worth even bothering
with the project in general.  Based on the lukewarm response so far, I'm
guessing not, even though I think it's a great idea and could really be
handy in any number of situations.  It's not worth doing if nobody wants
to use it though.  I don't need it personally, and none of my clients have
asked for such a beast (yet).  So unless there's some interest, I guess
I should file the spec away somewhere and sit on it until/unless needed,
then let someone float the cost of development if someone ever ends up
wanting/needing it.  "Neat" isn't anywhere near enough to justify the time
and effort expenditure in this case.  I'd need a reasonable expectation of
tangible returns to tackle the full implementation of something like this,
no matter how useful/neat/elegant it is to my mind.  If I want a 'hobby'
project, I'd rather do another GUI program...my first one was an enjoyable
enough experience, surprisingly.

mark->
-- 
Fairlight->   ||| "Ziggy played for time... / jiving | Fairlight Consulting
  __/\__      ||| us that we were voodoo, / the kids |
 <__<>__>     ||| were just crass, / he was the      | http://www.fairlite.com
    \/        ||| nazz... / He took it all too far-- | info at fairlite.com
              ||| but boy could he play guitar!" --  |
              ||| Bowie                              |

