differences between rsync and pseudo-RSS
Brian K. White
brian at aljex.com
Wed Jul 21 21:34:30 PDT 2004
Fairlight wrote:
> Right. And the question is, is it worth pursuing, now that I have a
> fairly fleshed out spec? Given that I'd have to code 26 conditions
> (no big deal) for the index handling, I -believe- I can make a
> general parser work correctly, assuming what I remember about using
> DIM is correct. I'm pretty sure it is, as someone just showed it to
> me a few weeks back. So in theory, I could make one parser handle
> multiple tables at once from
> one feed, and all should go as planned. I wonder if you can do "CALL
> (variable)" and have it work correctly.
Yes. I have lines like that in practically every table.
> If you could, then you could
> actually have an optional <post_processing> tag if you needed one,
> which would be a really nice extension to the scheme.
>
> Given a general parser/storage handler, the rest is documentation of
> how it works procedurally, and possibly a generic exporter.
>
>> I've toyed with the idea ever since I made my first cgi that
>> exports data on demand, a couple years ago. I could see it would not
>> be much harder to allow it to import & update, not much harder to
>> allow it generic access to all the files, not much more to have it
>> use the "guts" of various input process tables and reports to
>> duplicate actual user work, and not much more to have the update
>> routines (both normal interactive and cgi) include a gosub to send
>> update info to other boxes. By "not much more" I guess I mean it
>> doesn't seem like an exotic puzzle to solve, just a wamba big
>> boatload of more or less straightforward tedious work.
>
> Basically. I have most of the XML parsing code already. It would
> need to be modified to handle this DTD though. I never wrote a
> generic one. I may yet do that, but I dunno. Depends how long it
> takes fP-Tech to release their native stuff, and if someone taps me
> with a need for a generic one first. I'd want to do it properly,
> which would involve full DTD -and- schema (the latter being the side
> I never bothered getting to know, for several reasons) validation
> against the document to ensure structural integrity. Not a "light"
> project, IMHO. Then again, neither is this. :)
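(A generic validating front end is less code than it sounds if a library
does the heavy lifting. A rough sketch in python with lxml, against
made-up file names since the real DTD/schema would come from the spec,
just to show the shape of it:

    from lxml import etree

    # made-up file names; the real DTD would come from the spec
    dtd = etree.DTD(open("feed.dtd"))
    doc = etree.parse("feed.xml")

    if not dtd.validate(doc):
        # refuse to import anything structurally broken
        for err in dtd.error_log.filter_from_errors():
            print(err)

Schema validation would be a second pass of the same shape.)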
>
> For me, the "fun" (read: tedious) parts of this specific RSS-like
> project would be all the fP code related to making sure things were
> stored correctly and that I didn't miss anything. I know it's a RAD
> environment, but just due to my mindset, I code about 3-5 times
> slower in fP than I do in perl. It's probably slightly faster than C
> for me right about now. Both take a serious sustained effort at this
> point, just because I'm geared to other things. That's not a slam on
> the product or the language. You just get mentally tuned to
> something, and switching to something else entirely that you're no
> longer used to working in constantly feels like going from surfing on
> water to surfing on maple syrup. Not that I surf--not even the
> web...I -browse-. :)
>
> I'm still trying to work out a unique ID scheme without resorting to
> externals, in a non-CGI context (which would have the unique ID coming
> from an external that was the parent rather than a child).
Probably have to use a typical fp control-file mechanism: you have a
file with a single record (or a single record per
company/user/qualifier/tty/client-ip/etc.), and you do a protected
lookup to that field, read it, add one to it, write it, and close it.
In this case the mechanism's main drawback doesn't apply: you don't
care whether the numbers come out perfectly consecutive with no
missing/unused numbers, so if a transaction is started and aborted, you
don't have to worry about accounting for the control number it
procured.
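The whole dance, sketched in python rather than real fp processing
(the path is made up, and the control file has to be seeded once by
hand), is just lock, read, bump, write, close:

    import fcntl

    CTL = "/appl/fp/ctl/seq"     # made-up path; seed the file once with "0"

    def next_seq():
        # open the one-record control file and lock it: the "protected
        # lookup". nobody else can bump the number while we hold the lock.
        with open(CTL, "r+") as f:
            fcntl.flock(f, fcntl.LOCK_EX)
            n = int(f.read().strip() or "0") + 1
            f.seek(0)
            f.write("%d\n" % n)
            f.truncate()
            return n             # lock is dropped when the file closes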
If performance is a concern (dealing with a file instead of calling
some purely in-software function), remember that this file's activity
will probably happen largely in memory anyway, thanks to the OS caching
disk activity and the inode being continually accessed.
And if that isn't good enough, you could create a small ramdisk, copy
this file onto it, and symlink it back into the normal spot, or mount
the ramdisk right over the real file, since an fp file just happens to
be a directory and a mount point does not have to be empty. Setting the
thing up, copying in the files, and fixing perms could all be scripted
and automated at boot easily, and even automatically re-created on the
fly by a little test in the cgi. Now accessing the control file is pure
ram/cpu instead of disk, and the mechanism is pure filepro, not even
using filepro i/o commands, just plain fp lookup commands.
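The boot-time setup could look something like this (python standing in
for a boot script; the paths are made up, the tmpfs mount is the linux
spelling of a ramdisk, and you'd restore your site's real
ownership/perms at the end):

    import os, shutil, subprocess

    FPFILE = "/appl/filepro/ctl"     # made up: the control file's directory
    SAVE = "/var/tmp/ctl.save"

    # an fp file is really a directory, so a ramdisk can be mounted
    # right on top of it and the contents copied back in; the mount
    # point does not have to be empty.
    shutil.copytree(FPFILE, SAVE, dirs_exist_ok=True)
    subprocess.run(["mount", "-t", "tmpfs", "-o", "size=1m",
                    "tmpfs", FPFILE], check=True)
    for name in os.listdir(SAVE):
        shutil.copy2(os.path.join(SAVE, name), os.path.join(FPFILE, name))
    # restore your site's real ownership/modes here (os.chown / os.chmod)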
> Random
> numbers seem like a -really- bad idea.
Completely insufficient.
> I need to get granular to the
> point where I have a unique identifier -inside- a single second, on
> the same machine, since each -record- in the generated feed needs
> one. Parent PID is out, then. That leaves me going out to USER to
> grab something from perl or elsewhere for each record, which adds
> overhead I'd rather not have, unless someone knows of a really solid
> uniqueness generation scheme for fP. I'd rather not have to use
> externals. I almost need like a sha1 hash on the entire host, time,
> date, and all record data--or at least unique parts of it. I'm -not-
> about to rewrite sha1 in fP. :) But you get the idea of what I need
> to come up with. Anything guaranteed to always be unique. Okay, I
> suppose worst-case, I use a control file on the -sending- end to
> track what I've sent, and each feed record is assigned a sequential
> ID as well, so you check host, date, time, and id. Crude, but it
> could work, off the top of my head.
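That worst-case scheme would actually hold up fine. Sketched in python,
with an invented field layout:

    import hashlib, socket, time

    def record_id(seq, record_data=""):
        # host + date/time + per-feed sequence number is unique on its
        # own; the sha1 just folds it (plus record data, if you like)
        # into one fixed-width token.
        raw = "|".join([socket.gethostname(),
                        time.strftime("%Y%m%d.%H%M%S"),
                        str(seq),
                        record_data])
        return hashlib.sha1(raw.encode()).hexdigest()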
>
> But before I get to that point, I need to know if it's worth even
> bothering with the project in general. Based on the lukewarm
> response so far, I'm guessing not, even though I think it's a great
> idea and could really be handy in any number of situations. It's not
> worth doing if nobody wants to use it though. I don't need it
> personally, and none of my clients have asked for such a beast (yet).
> So unless there's some interest, I guess
> I should file the spec away somewhere and sit on it until/unless
> needed, then let someone float the cost of development if someone
> ever ends up wanting/needing it. "Neat" isn't anywhere near enough
> to justify the time and effort expenditure in this case. I'd need a
> reasonable expectation of tangible returns to tackle the full
> implementation of something like this, no matter how
> useful/neat/elegant it is to my mind. If I want a 'hobby' project,
> I'd rather do another GUI program...my first one was an enjoyable
> enough experience, surprisingly.
I think it's a great idea and the kind of far-reaching basic utility
(the adjective, not the noun) that can't even be summarized at this
point. But you can be sure that, once it existed, it would turn out to
be the answer, or a key ingredient, in all manner of valuable projects,
most of which can't even be imagined yet and which will boggle you, the
original developer, even when explained, much as I'm sure members of fp
staff have been boggled from time to time by the perversions people
have put fp through to get it to do things they wanted that fp didn't
think to provide.
The quickest ways to commercial viability I can think of right now are
both kind of plebeian. Necessarily so, since the people you need to
sell it to will not be able to get behind anything wackier at first.
Idea 1, the easier of the two to implement and for the client to
understand: use it as a backup mechanism. It would be superior to rsync
in that the data would never be corrupt, not even my "acceptable most
recent record or two", and the backup would be 100% up to the minute.
Differing filepro user ids, permissions/ownership, and differing
filepro/filesystem installation layouts are also no issue. The backup
box could even be a different OS; it just needs a working installation
of fp, a web server, and a wget/curl type utility (the noun, not the
adjective :), whereas the rsync method works best if the source and
target machines are as similar as possible.
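The pull side of idea 1 is almost nothing, something like this (python
sketch; the URL and cgi parameters are invented, since the interface
isn't nailed down):

    import urllib.request

    # invented endpoint: the backup box pulls the feed the way wget/curl
    # would, then hands the xml to the local import parser.
    FEED = "http://main-office/cgi-bin/ffrtt-export?file=orders"

    with urllib.request.urlopen(FEED) as resp:
        xml = resp.read()
    # ...parse xml and apply the updates to the local fp files...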
Idea 2: offer to actually do the work of insinuating ffrtt (fairlite
filepro rss transaction transport; yeah, gotta come up with something a
little less of a verbal trainwreck...) into their app. Paid work, of
course, and it would be a large long-term project (which would keep you
busy for a good while). The payoff would be that the app would end up
being scalable almost without limit just by adding more servers. For
every file that stores data that grows (vs. files that hold more or
less "static" data), you'd have to add a little more info to every
lookup showing which server to find the info on, similar to qualifiers.
The idea is not to have the machines all be duplicates, but to have
machine A be able to fetch info from machine B (and know that it is
there, etc.). This would allow basically all aspects of the app to
scale (cpu / disk space / disk io); users could work on all boxes,
allowing more load than any one box could manage.
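The lookup tagging might look like this (python sketch with an invented
key format; the point is just that the key itself names the box that
owns the record, much like a qualifier):

    # invented layout: each key for a "growing" file carries a server
    # tag, so machine A knows machine B owns the record and where to
    # fetch it from.
    SERVERS = {"A": "http://box-a/cgi-bin/ffrtt-export",
               "B": "http://box-b/cgi-bin/ffrtt-export"}

    def locate(key):
        server, recno = key.split(":", 1)   # e.g. "B:0004711"
        return SERVERS[server], recno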
Hmm, one problem with #2 though: it makes the collective system more
delicate, in that for the whole to be valid, not only must the usual
single server be up & running with all data & hardware un-corrupt, but
so must every server in the association. A down box would be like a
chunk of a hard drive going missing and a bunch of user accounts
disappearing. You'd probably have to have the app refuse to run unless
all boxes were up. Maybe some functions could be allowed to proceed and
sort of pile up as pending work: people could enter orders, but not run
financial reports, and probably _not_ write checks, since you wouldn't
be able to protect against overdraft. Hmm, actually, if you want to
work from the position of simply assuming that all servers are in sync
about universal data like that (current bank balances, inventory
counts, etc.), then I guess you could actually allow some financial
activity, just not any kind of reports or end-of-period process.
*sigh* I guess I _don't_ see this as being "just a boatload of
straightforward work"
Maybe a 3rd short route to commercial viability, and medium-easy to
implement, is to use it as a mechanism for satellite offices to update
a central box. The satellites could be mostly autonomous stand-alone
sites, with only some data sent in to the home office & vice versa.
They would generate their own job numbers and maintain their own
customer/vendor/inventory databases, etc. That wouldn't require the
sort of gene therapy #2 turns out to be, and it would be more useful
and interesting than just another form of backup.
This kind of idea-tossing is perfect voice-chat fodder....
Brian K. White -- brian at aljex.com -- http://www.aljex.com/bkw/
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro BBx Linux SCO Prosper/FACTS AutoCAD #callahans Satriani