differences between rsync and pseudo-RSS
Fairlight
fairlite at fairlite.com
Wed Jul 21 22:27:33 PDT 2004
The honourable and venerable Brian K. White spoke thus:
> Fairlight wrote:
>
> > tables at once from one feed, and all should go as planned. I wonder
> > if you can do "CALL (variable)" and have it work correctly.
>
> Yes. I have lines like that in practically every table.
Ahh, cool. Thanks!
> > I'm still trying to work out a unique ID scheme without resorting to
> > externals, in a non-CGI context (which would have the unique ID coming
> > from an external that was the parent rather than a child).
>
> Probably have to use a typical fp control-file mechanism.
> you have a file with a single record (or single record per
> company/usr/qual/tty/client-ip/etc...) and you do a protected lookup to
> that field, read it, add one to it, write it, close it
>
> In this case its main drawback doesn't apply: you don't care whether
> numbers get used perfectly consecutively with no missing/unused numbers.
> If a transaction is started and aborted, you don't have to worry about
> accounting for the control number it procured.
>
> If performance is a concern (dealing with a file instead of calling
> some purely software function), remember that this file's activities
> will probably be largely in software anyway, thanks to the OS caching
> disk activity and the inode being continually, repeatedly accessed. And if
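The control-file counter Brian describes above (protected lookup, read, add one, write, close) isn't filePro-specific; here's a minimal Python sketch of the same pattern, using an advisory file lock in place of filePro's protected lookup. The path and function name are purely illustrative, not part of any real tool.

```python
import fcntl
import os

def next_id(path):
    """Return the next unique ID from a single-record control file.

    Pattern: take an exclusive lock (the "protected lookup"), read the
    current number, add one, write it back, close (which releases the lock).
    """
    # Open read/write, creating an empty counter file if it doesn't exist yet.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Block until we own the record, like a protected lookup would.
        fcntl.flock(fd, fcntl.LOCK_EX)
        raw = os.read(fd, 32)
        current = int(raw) if raw.strip() else 0
        new = current + 1
        # Rewind and overwrite the single record with the new value.
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, str(new).encode())
        return new
    finally:
        # Closing the descriptor drops the lock.
        os.close(fd)
```

As noted in the quote, aborted transactions simply leave a gap in the sequence, which is fine when you only need uniqueness, not perfect consecutiveness.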
Well, as I see it, you could toss the tracking records on the receiving end
say, 72hrs after the last successful retrieval/update. One would probably
want to keep the sending records for a longer period, just in case the
polling machines go down. One might work out a feedback mechanism where,
if it sees that all the sources it expected actually succeeded, it could
wipe them after 72hrs or something. I mean, what if one machine is out of
action for a week or two? Well, in that case, you plug it in and run a job
that does a special export feed like emergency.xml, have it span dates, and
then poll manually for that to get back up to speed. Three days seems
reasonable to catch most problems. Maybe seven. The idea is to keep it
light and fast on its feet though.
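A sketch of that retention idea in Python, treating each tracking record as a file and using its mtime as "last successful retrieval/update". The 72-hour window comes from the discussion above; the directory layout and names are invented for illustration.

```python
import os
import time

# Three days, per the discussion; bump to 7 * 24 * 3600 for a week.
RETENTION_SECONDS = 72 * 3600

def purge_old_tracking(dir_path, now=None):
    """Delete tracking-record files older than the retention window.

    Returns the names of the records that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for name in os.listdir(dir_path):
        full = os.path.join(dir_path, name)
        if os.path.isfile(full) and now - os.path.getmtime(full) > RETENTION_SECONDS:
            os.remove(full)
            removed.append(name)
    return removed
```

The feedback-mechanism variant would simply gate the `os.remove` on confirmation that every expected source succeeded.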
> I think it's a great idea and the kind of far-reaching basic utility (the
> adjective, not the noun) that can't even be summarized at this point.
> But you can be sure that once it existed, it would turn out to be the
> answer, or a key ingredient, in all manner of valuable projects, most of
> which can't even be imagined yet and will boggle even you, the original
> developer, when explained -- much as I'm sure members of fp staff have
> been boggled from time to time by the perversions people have put fp
> through to get it to do things they want that fp didn't think to provide.
Yes, I've thought that with a number of things and had less than stellar
success. The jury is still out on some of them though. :)
> idea 1: the easier of the two to implement and understand by the client:
> use it as a backup mechanism. It would be superior to rsync in that
> data would never be corrupt, not even my "acceptable most recent record
> or two," and the backup would be 100% up to the minute. Also, issues
> of differing filepro user ids, permissions/ownership, and differing
> filepro/filesystem installation layout are no issue. The backup box could
> even be a different OS; it just needs a working installation of fp, a
> web server, and a wget/curl type utility (the noun, not the adjective :)
> whereas the rsync method works best if the source and target machines are
> as similar as possible.
Ooh...that is a plus.
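The wget/curl-style polling in idea 1 is just an HTTP GET saved to disk for the local import step. A hypothetical Python equivalent (the URL and filenames are made up; any fetch utility would do):

```python
import urllib.request

def fetch_feed(url, dest):
    """Poll the exported feed from the source box and save it locally.

    e.g. fetch_feed("http://sourcebox/export/feed.xml", "incoming/feed.xml")
    The backup box only needs this, a web server on the source, and fp --
    no rsync-style matching of uids, permissions, or install layout.
    """
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        out.write(resp.read())
```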
> idea 2: offer to actually do the work of insinuating ffrtt (fairlite
> filepro rss transaction transport, yeah, gotta come up with something a
> little less verbal trainwreck...) into their app. paid work of course
*laugh* I know. I'm horrible at naming things. Hmm. SyncLight?
Fairlight SyncWhole? *chuckle*
> and it would be a large, long-term project (which would keep you busy
> for a good while). The payoff would be that the app would end up being
> scalable almost without limit by just adding more servers. For every
> file that stores data that grows (vs. files that have more or less
> "static" data), you'd have to add a little more info to every lookup
> that shows which server to find the info on, similar to qualifiers. The
> idea is not to have the machines all be duplicates, but to have machine
> A be able to fetch info from machine B (and know that it was there,
> etc.). This would allow basically all aspects of the app to scale
> (cpu/disk space/disk io), and users could work on all boxes, allowing
> more than any one box could manage.
Whoa. So basically the fP equivalent of InnoDB's disk-spanning tables,
but you get to map whole machines? Boy, I'd never have thought of -that-
one. That's...an interesting logistical nightmare you've conjured up. I'm
not sure I'll sleep well now. "This exercise exceeds the scope of this
document..." :)
In all seriousness, it -does- sound cool. It sounds like a migraine
waiting to happen, but it would be cool to pull it off.
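The "which server to find the info on" field in idea 2 amounts to a routing lookup carried alongside each slice of growing data, analogous to qualifiers. A toy sketch of that mapping (the file names, shard keys, and host names are all invented):

```python
# Hypothetical routing table: (file, shard key) -> host that owns that
# slice of the data. In idea 2 this extra info rides along with every
# lookup, the way a qualifier does, so machine A knows to ask machine B.
LOCATION = {
    ("invoices", "2004-07"): "serverB",
    ("invoices", "2004-08"): "serverC",
}

def locate(file_name, shard_key, default="serverA"):
    """Return the host that owns this slice; fall back to the home box."""
    return LOCATION.get((file_name, shard_key), default)
```

The fragility noted below follows directly from this: a lookup routed to a down box fails, so every host in the association becomes part of the app's critical path.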
> hmm, problem with #2 though: it makes the collective system more delicate
> in that for the whole to be valid, not only must the usual single server
> be up & running and all data & hardware be un-corrupt, but all servers
> in the association as well. A down box would be like a chunk of a hard
> drive being missing and a bunch of user accounts disappearing. You'd probably have to
No doubt. I'm not entirely sure a distributed model of -that- kind is
technically sound, given those kinds of ramifications. It's definitely a
cool and different take, but the technical logistics make it seem unwieldy.
> *sigh* I guess I _don't_ see this as being "just a boatload of
> straightforward work"
Not given -that- example. :) The thing is, it has multiple applications,
as I said last night and you just pointed out more to me. I think the
basic toolset of one or two modules, and a good set of documentation would
be a good place to start. Keeping a "cluster" (non-topographical) in
sync is one thing, and the first thing that comes to mind for me. Doing
a distributed "cluster" is much more involved. But given the tools and
conceptual framework, others could extend it to use as they see fit, as
well--kind of like people have done with fP itself.
> maybe a 3rd, short route to commercial viability, and medium-easy
> to implement, is to use it as a mechanism for satellite offices to
> update a central box. The satellites could be mostly autonomous,
> stand-alone sites, and only some data is sent in to the home office &
> vice versa. They would generate their own job numbers and maintain their
> own customer/vendor/inventory databases, etc. That wouldn't require the
> sort of gene-therapy #2 turns out to be, and it would be more useful and
> interesting than just another form of backup.
Yeah, #2 is kind of like the master ring. It's cool, but it'll mess with
your head if you go near it. :)
#1 and #3 sound the closest to the scenarios I was thinking of.
> This kind of idea-tossing is perfect voice-chat fodder....
Yup. I can't tonight though, for various reasons.
Thanks for the feedback so far!
mark->
--
Bring the web-enabling power of OneGate to -your- filePro applications today!
Try the live filePro-based, OneGate-enabled demo at the following URL:
http://www2.onnik.com/~fairlite/flfssindex.html