variable lookups possible?
Bill Vermillion
fp at wjv.com
Wed Jul 21 17:48:47 PDT 2004
On or about Wed, Jul 21 14:47 , while attempting a Zarathustra
emulation Fairlight thus spake:
> On Wed, Jul 21, 2004 at 02:32:28PM -0400, Walter Vaughan may or may not have
> proven themselves an utter git by pronouncing:
> > Fairlight wrote:
> > > I now believe that I can make the pseudo-RSS parsing module a reality
> > How does this add any functionality that rsync doesn't already offer?
> > Interesting minds want to know.
> Glad you asked. I just went through the man page for rsync
> again to make sure (rsync 2.5.6, protocol 26).
> The docs even claim it's basically just a fancy
> replacement for rcp that can use multiple transport layers.
Did you miss the part where rsync is much faster when the far
file exists? It then transmits only the diffs of the files and
edits the far file using those diffs. It can also use
compression to make the transfer time shorter.
That surely sounds like far more than a 'fancy replacement for
rcp'.
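As a rough illustration of the "only transmit the diffs" idea, here is
a minimal Python sketch. It is not rsync's actual algorithm (rsync uses
rolling checksums so it can match blocks at any offset); this version
compares fixed-position blocks only, and the names block_hashes,
make_delta and apply_delta are made up for the example.

```python
import hashlib

BLOCK = 4  # tiny block size, chosen only to keep the example readable


def block_hashes(data, block=BLOCK):
    """Hash each fixed-size block of the far (old) copy."""
    return [hashlib.sha256(data[i:i + block]).hexdigest()
            for i in range(0, len(data), block)]


def make_delta(new, old_hashes, block=BLOCK):
    """Return (block_index, bytes) for every block of `new` that the
    far side does not already hold at the same position."""
    delta = []
    for n, i in enumerate(range(0, len(new), block)):
        chunk = new[i:i + block]
        if n >= len(old_hashes) or hashlib.sha256(chunk).hexdigest() != old_hashes[n]:
            delta.append((n, chunk))
    return delta


def apply_delta(old, delta, new_len, block=BLOCK):
    """Rebuild the new file on the far side from the old copy plus the delta."""
    buf = bytearray(old[:new_len].ljust(new_len, b"\0"))
    for n, chunk in delta:
        buf[n * block:n * block + len(chunk)] = chunk
    return bytes(buf)


old = b"aaaabbbbcccc"
new = b"aaaaXXXXcccc"
delta = make_delta(new, block_hashes(old))   # only the changed block is sent
rebuilt = apply_delta(old, delta, len(new))
```

Only the one changed block crosses the wire here, which is why repeat
runs against an existing far file can be so much cheaper than a full copy.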
I've been using rsync for backups to a spare machine from multiple
servers. The first time you run rsync on a large directory tree
it can take a long time, but from then on the runs can be very
short.
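A backup run of that kind might be driven from a small Python wrapper
like the sketch below. The paths and the host name "spare" are
placeholders, not the actual setup; -a (archive), -z (compress) and
--delete are standard rsync options, but which ones suit a given
backup is a judgment call.

```python
import subprocess


def build_rsync_cmd(src, dest, compress=True, delete=False):
    """Build an rsync command line as an argument list."""
    cmd = ["rsync", "-a"]          # -a: recurse and preserve perms, times, links
    if compress:
        cmd.append("-z")           # -z: compress data in transit
    if delete:
        cmd.append("--delete")     # remove far files that are gone locally
    cmd += [src, dest]
    return cmd


def run_backup(src, dest):
    """Run rsync and return its exit status (0 means success)."""
    return subprocess.run(build_rsync_cmd(src, dest)).returncode


# Hypothetical usage; "spare" stands in for the backup machine:
# run_backup("/home/", "spare:/backup/home/")
```

The first run copies everything; later runs with the same arguments
only move what changed.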
Current version of rsync is 2.6.2_1
I also had a client move an entire remote web-site across the net
using rsync. He called and said he had only limited room to make
tar files where he was hosted, and wanted to know if I knew of a
better way. I talked him through rsync, and then he just turned it
loose.
> The problem with this is that it has no way to let fP know that
> it may be working on the file (ie., it may be only partially
> copied/diff'd at any given point during its execution).
Valid point.
> In short, you could end up working with a corrupt table if you
> hit something at the wrong time. The docs for rsync make zero
> mention of any sort of file locking options or functionality.
> Every search for 'lock' only referred to blocking/non-blocking
> I/O, or the block size. In short, I think it's unsafe for
> use during fP's continued operation, and you couldn't really
> rely on this without making fP inaccessible via some downtime
> with httpd, or via code that checks for a lockfile that's
> created/removed by a wrapper around rsync, etc.
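The lockfile wrapper described above could look roughly like this
Python sketch. The lock path and the function names run_with_lockfile
and is_locked are hypothetical, and the checking side would have to
live in whatever code fronts fP; nothing here is an rsync or fP
feature.

```python
import os
import subprocess


def run_with_lockfile(cmd, lock_path):
    """Create lock_path, run cmd, then remove lock_path.

    O_EXCL makes the creation atomic, so a second wrapper started while
    the first is still running fails with FileExistsError instead of
    running rsync twice.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        with os.fdopen(fd, "w") as lock:
            lock.write(str(os.getpid()))   # record who holds the lock
        return subprocess.run(cmd).returncode
    finally:
        os.remove(lock_path)


def is_locked(lock_path):
    """What the application side would check before touching the tables."""
    return os.path.exists(lock_path)


# Hypothetical usage:
# run_with_lockfile(["rsync", "-az", "/appl/", "spare:/appl/"], "/tmp/rsync.lock")
```

This only keeps new accesses out while rsync runs. A reader that
checked just before the lockfile appeared can still be mid-access when
the copy starts, so it narrows the window rather than closing it.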
That sounds like the 'snapshots' in FreeBSD 5 would be ideal.
It looks like late summer is when the current 5.x will become
the STABLE branch - suitable for production.
Snapshots don't take too long. Then you can back up the snapshot
while the system continues. Once you make a snapshot, you can
copy it, make a tape backup, dump it, etc. Snapshots also mean
that you can be up and running after an inadvertent reboot without
waiting a long time for fsck to finish - probably over an hour or
so on a multi-terabyte file-system. Charts on Kirk McKusick's site
(linked below) show it takes 3.5 seconds to snapshot a 7.7GB file
system on an idle system and 12.1 seconds on an active file system.
If you look at the paper that Marshall Kirk McKusick presented
you'll find some interesting concepts complete with diagrams of
how he handles the inodes.
Instead of posting the long Usenix link, just go to
http://www.mckusick.com/softdep/index.html and then click on the
link in the last line about background-fsck. That will take
you to the Usenix site. There you can read the file in HTML or
download it in PDF or PostScript.
As drives get larger the need for this becomes stronger.
...
> Yes, you could rsync more than nightly, or even a few times a
> day. But as you do, your chances of colliding would naturally
> go up. :(
Night-time rsync usually isn't a problem, but as you point out it
can be during the day. The snapshot can be run often. That means
if someone accidentally deletes a file they can retrieve it from
the earlier snapshot. Since a snapshot copies only blocks
that are actively used, it doesn't take up that much space.
Really interesting concept. And a good read.
--
Bill Vermillion - bv @ wjv . com