OT: SCO 6.0 - MAJOR BUG!!!!!

Fairlight fairlite at fairlite.com
Mon Oct 24 10:39:18 PDT 2005


In the relative spatial/temporal region of
Mon, Oct 24, 2005 at 12:18:15PM -0400, Kenneth Brody achieved the spontaneous
generation of the following:
> 
> It looks like you copied the same executable back onto itself.  (It
> should have failed to copy back to centericq-4.20.0 with "text file
> busy" if someone was running it.)

It doesn't fail.  And you're correct, I was copying back the same binary.

> What happens if someone is running testcicq and then you do:
> 
>     echo hello >testcicq

Apparently, testing with a copy of the same binary is a Bad Idea.  It works
fine.  Doing what you suggest, or copying a different version onto the one
being run results in a bus error, although no core dump.

Okay, Sun didn't fix it either.  I retract my previous harsh commentary
about SCO's bug, posthaste.  Bad testing methodology on my part.  (I still
say it should act as we all know it should, on both platforms--"text file
busy" and a failed copy.)

Out of curiosity, what made you suspect that using the same binary was what
made it work?  It might use the same starting inode, but the blocks could
potentially all be different, couldn't they?  That's why I didn't actually
think it made a difference so long as you used cp to replace it.

Okay, I decided to take this testing further, based on the above thinking.

With centericq-4.20.0 running, I did:

[FLAdmin] [~/storage/bin] [1:09pm]: rm centericq-4.20.0
[FLAdmin] [~/storage/bin] [1:12pm]: echo "hello" > centericq-4.20.0
[FLAdmin] [~/storage/bin] [1:13pm]: cp centericq-4.12.0 centericq-4.20.0

No bus errors, and the program chugs on.  An unlink first lets it keep
going without issue.  Does that work on SCO?

So the difference is that if you unlink it, it does what?  Drags the
-entire- thing to memory or swap so that it can keep running, but if
you overwrite the file at the same inode without an unlink then it gets
corrupted memory?
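
The unlink-versus-overwrite difference can be reproduced with an ordinary
open file standing in for the running binary.  This is only a minimal
Python sketch, assuming a POSIX filesystem.  The function names are made up
for illustration, and an open file descriptor is merely an analogy for a
running executable, not a model of how either kernel pages program text:

```python
import os
import tempfile


def replace_in_place(path, data):
    """Overwrite path without unlinking: same inode, new contents."""
    with open(path, "wb") as f:  # truncates the existing inode
        f.write(data)


def replace_with_unlink(path, data):
    """Unlink path first, then create a fresh file under the same name."""
    os.unlink(path)
    with open(path, "wb") as f:
        f.write(data)


def demo():
    """Return {strategy: (inode_unchanged, bytes_seen_by_old_holder)}."""
    results = {}
    strategies = (("in_place", replace_in_place),
                  ("unlink", replace_with_unlink))
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "prog")
        for name, replace in strategies:
            with open(path, "wb") as f:
                f.write(b"old contents")
            holder = open(path, "rb")  # stands in for the running process
            before = os.stat(path).st_ino
            replace(path, b"new")
            after = os.stat(path).st_ino
            results[name] = (before == after, holder.read())
            holder.close()
            os.unlink(path)
    return results


if __name__ == "__main__":
    for name, (same_inode, seen) in demo().items():
        print(name, "same inode:", same_inode, "holder sees:", seen)
```

With the unlink, the name points at a new inode and the old one stays alive
until its last user lets go, so the holder still sees the old bytes.  With
the in-place overwrite, the holder's inode itself is truncated and
rewritten underneath it.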

Okay, I might buy that.  But since that works, I looked at 'man cp':

     -f    Unlink.  If a file descriptor for a  destination  file
           cannot  be obtained, attempt to unlink the destination
           and proceed.

So, I tried (with a 4.12.0 version of cicq in the centericq-4.20.0
position):

cp -f testcicq centericq-4.20.0

Supposedly this should act the same as unlinking first, but it caused a bus
error, same as without -f.  This leads me to believe that the vague "If a
file descriptor for a destination file cannot be obtained" clause (they
don't say -where- the fd should be found, is what I find vague) must be
kicking into effect, and no unlink was performed.
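
One reading of that man page wording is "try to open the destination for
writing, and only unlink it if the open fails."  The Python sketch below
follows that reading.  It is an assumption about the intended logic, not
the actual cp source, and cp_f is a hypothetical name:

```python
import os


def cp_f(src, dst):
    """Copy src over dst; return True only if dst had to be unlinked."""
    with open(src, "rb") as fin:
        data = fin.read()
    unlinked = False
    try:
        out = open(dst, "wb")  # succeeds whenever dst is writable
    except OSError:
        os.unlink(dst)         # fallback: remove the name, then recreate it
        out = open(dst, "wb")
        unlinked = True
    with out:
        out.write(data)
    return unlinked
```

Under this reading, a writable destination is opened successfully, so the
unlink branch is never reached and the existing inode is overwritten in
place, which would match the bus error seen with -f.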

Either way, it's just as whacked.  Neither platform's model is right.  The
only difference is that now I have to take back my comments about SCO,
since Sun apparently never fixed it either.  I guess I get to eat those
harsh words about SCO.  "Hmmm...tastes like crow." *chagrined and slightly
pained look*

Can't win 'em all, I guess.  I'd still like to know how you knew my test
was flawed.  That's gotta be some major experience talking, to your credit,
since I've never run into it in 16 years, and I'd also not have intuitively
thought of it (obviously).  Educate me, please.  It's annoying being wrong
if you can't at least learn from it.

I'd also love to know what OSR6 does if you issue a 'rm' on the file first.
If it acts identically, it would actually go a long way towards
corroborating the claim about the bug's origin.

mark->
-- 
There is no "I" in TEAM.
This would be the primary reason I've chosen not to join one.


More information about the Filepro-list mailing list