OT: redhat

Bill Vermillion fp at wjv.com
Sun Nov 7 09:05:27 PST 2004


On Fri, Nov 05 18:35, men gasped, women fainted, and small children
were reduced to tears as Bill Campbell confessed to all:

> On Fri, Nov 05, 2004, Fairlight wrote:
> >This public service announcement was brought to you by Bill Vermillion:

> >> I wish more people would take the time to understand file systems
> >> and how they work.   As above - I still see many advocating 
> >> one huge file system - and one of their reasons is so they don't
> >> run out of space in any one file system.  I think they must
> >> be MS converts.

> >Not necessarily. I've seen systems where people allocated the
> >defaults from the vendor and ended up running low on /usr,
> >etc. Having /usr separate was also a pain depending on what
> >was and wasn't dynamically linked against what at boot. :(

> When I installed my FreeBSD system here, I looked at the
> default sizes that came up with an automatic allocation, and
> doubled them all.

The newest FreeBSDs do have larger default sizes.  The 40MB for
/var got a bit cramped.  I just routinely allocate 125MB for /
and 125MB for /var and have no problems at all.  And I tend to use
more space in /var, as I rotate the large logs daily and keep
several days' worth in order to look for problems when someone says
"about 5 days ago .."  [I wish people would report problems when
they first find them instead of trying for days to get something
to work.]

> Every couple of years it seems that I have to double the amount
> of space I allocate for the ``/'' file system on Linux systems.

I've seen that too.  Using 2GB+ for / is not unusual in Linux, but
using more than 100MB for / in FreeBSD is, though 5.x has much
larger defaults.  Now that 5.3 is officially released and 6.0 is
about two steps past the bleeding edge, we'll start seeing more
people migrating to 5.x.

I suspect that is going to catch unawares a few people who have
applications that depend on file system structures and expect such
things as the long-standing 128-byte inode size.  And with the
large inodes we have a creation time for the first time.
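
As a minimal sketch of how that looks in practice - assuming a
FreeBSD 5.x box, where stat(2) exposes the new field as
st_birthtime:

    /* Print all four timestamps for a file.  Note that ctime is
       the CHANGE time (inode metadata changes), NOT the creation
       time; the creation time is the new st_birthtime field. */
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <stdio.h>
    #include <time.h>

    int main(int argc, char **argv)
    {
        struct stat sb;

        if (argc < 2 || stat(argv[1], &sb) == -1) {
            perror("stat");
            return 1;
        }
        printf("atime (access):   %s", ctime(&sb.st_atime));
        printf("mtime (modify):   %s", ctime(&sb.st_mtime));
        printf("ctime (change):   %s", ctime(&sb.st_ctime));
        printf("birth (creation): %s", ctime(&sb.st_birthtime));
        return 0;
    }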

SCO mis-documented ctime as inode creation time instead of CHANGE
time, and that misunderstanding has propagated widely.  So now
there is access time, modification time, change time, and creation
time.  The larger inodes also mean that much larger files can be
accessed using only the block pointers in the inode.  In the
default SysV structure, files over about 10K needed an indirect
block [assuming 2 physical blocks for one logical block.  Larger
logical blocks meant larger files could be read directly out of
the base inode.]
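
To put hypothetical numbers on that 10K figure - the classic SysV
inode carries 13 block addresses, 10 of them direct, so at a 1K
logical block (two 512-byte physical blocks) the direct pointers
reach 10K:

    /* Back-of-the-envelope illustration of the SysV direct-pointer
       limit; the sizes here are the assumptions stated above. */
    #include <stdio.h>

    int main(void)
    {
        long phys_block         = 512; /* physical disk block      */
        long blocks_per_logical = 2;   /* 2 physical per logical   */
        long logical_block = phys_block * blocks_per_logical;
        int  direct_ptrs   = 10;       /* direct addresses in inode */

        /* 10 * 1K = 10K reachable without an indirect block */
        printf("direct reach: %ldK\n",
               direct_ptrs * logical_block / 1024);

        /* Doubling the logical block size doubles the direct
           reach, which is why larger logical blocks let larger
           files be read straight out of the base inode. */
        return 0;
    }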

> ...

> >With journalling to avoid fsck's, and good backup policies, is
> >it even as much of an issue these days, that a large single /
> >really -needs- to be avoided?

> As a rule, if a file system gets nuked, it's ``/'', and I
> really like to have all my critical data in another file
> system. I've also found that when a journaling file system goes
> bad, it goes *REALLY* bad.

I've seen the same on corrupt /'s.  Thankfully the worst was on a
system that had nightly verified BackupEdge backups.

At a time like that, having a separate / also makes system recovery
much faster, as you can remake / and reload just that.  In my
largest system [before they got big enough to bring in their own
full-time admin] a full system restore would take about 2.5 hours.
So anytime I made any system changes I'd unmount everything else
and make a backup of / only.

That backup took 8 minutes, and booting from a RecoverEdge disk,
remaking /, and then reinstalling / took about 15 to 20 minutes.
When you have lots of users on line, losing 2 hours of production
time - particularly where counter people write invoices manually
when the system is down - is a major loss of $$.

Just that alone is, IMO, one very good reason to have a separate
/ that holds only the OS files.  It also means that if there is a
problem with another file system you can remake and reload /, and
then have fast on-system tools to help you recover anything else
if needed.

I >>HATE<< it when backups are stale and you have to mount
/ in read-only mode and then slowly recover things a bit at a time.
Worst was years ago when the lost+found was not large enough to
save things, and answering Y to remove them would have left a
large data loss.  So it was a cycle of mount read-only, save files
externally, and do it over and over.

That was the worst - it took about 2 days.  But better than losing
everything and all of their records.

An aside on those who back up to another HD and do not make
backups that can be taken offsite.

I got an email one day with a half-dozen images: smoke and flames
rolling out of the building.  NOTHING was salvageable.  And his
only backup of the Linux filePro data had been made from an MS
machine that truncated all names that did not fit in the 8.3
name-space.

I quoted a lower-than-normal hourly rate and figured it would be
at least $5000 to get it back to the state it was in the month
before the fire.  In the end I think he wrote it all off and
focused on building his other businesses.

Having a BE/LT daily backup to tape, with at least one a week
going off site, would have meant he'd have been operational in
less than a day.

Bill
-- 
Bill Vermillion - bv @ wjv . com

