Disk caching and data reading (was Re: create())

Fairlight fairlite at fairlite.com
Mon Sep 9 14:39:12 PDT 2013


On Mon, Sep 09, 2013 at 10:22:04AM -0400, Kenneth Brody thus spoke:
> On 9/6/2013 7:57 PM, Fairlight wrote:
> >
> >I always looked at it as the cache being FIFO, but it sounds like, from
> >what you're saying, it basically presents a FIFO-like appearance, even when
> >it's not purely FIFO.  Either way, I agree with you that it should not be
> >problematic.
> 
> Whether or not a write cache is physically written to the HD in FIFO
> order, a cache is not FIFO, nor any other order.  (If you
> need a name, it would be "random access".)  The cache is simply "RAM
> is faster than HD, so let's keep things in RAM when possible".  For
> writing, it means that the program doesn't have to wait for the data
> to be physically written to the HD.  (And the O/S will eventually
> get around to writing it to disk.)  For reads, it means that, if the
> data is already in the cache, it is available immediately.  (And
> read-ahead cache can mean that it's already in RAM, even if it was
> never accessed until now.)  Otherwise, it must wait until it gets
> around to reading it from the HD.

We're talking about two different views of caching.  I was referring to the
cache buffer that is used between the time a write is requested, and the
time it's actually committed to disk.  I've always considered that FIFO.
You're referring to the general VFS cache that maintains a cache of
recently accessed disk data.

Even when they're part of the same caching mechanism, that data has to be
flushed to disk -sometime- or it would never get there.
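
As a rough illustration of that gap between "write requested" and "write
committed", here is a minimal sketch, assuming a POSIX-like system and
Python's standard os module.  The function name write_durably and the file
name example.dat are made up for the example.  os.write() returns once the
kernel has accepted the data; os.fsync() is what asks the kernel to push it
out to the device.

```python
import os
import tempfile


def write_durably(path, data):
    """Write `data` to `path`, then ask the kernel to flush it to the device.

    Returns the number of bytes written.  (Hypothetical helper, not taken
    from any particular program.)
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write() may accept fewer bytes than requested, so loop.
        while written < len(data):
            written += os.write(fd, data[written:])
        # Until this call returns, the data may exist only in the cache.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        count = write_durably(os.path.join(tmp, "example.dat"), b"hello\n")
        print(count, "bytes written and flushed")
```

Without the fsync() call the program would simply carry on, and the O/S
would write the data out whenever it got around to it.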

> The point being, if you write new data, and it's still in the cache
> and not on the physical device, if something else tries to read that
> data, it will fetch the new data from the cache, rather than attempt
> to re-read the physical device and get the old data.

Yeah, that makes sense.  One assumes (I'm not a low-level driver developer)
that it just flags writes as committed, even if it keeps the data in the
VFS cache.  It has to know whether or not it actually wrote the data...
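
The behaviour described in the quoted paragraph can be sketched as follows,
assuming a local POSIX filesystem (network filesystems may behave
differently).  The function name and the sample byte strings are
hypothetical.  One descriptor writes new data without any fsync(), and a
second, independently opened descriptor reads it back.

```python
import os
import tempfile


def read_after_unsynced_write(path, old, new):
    """Overwrite `old` with `new` without fsync, then read via another fd."""
    # Put the "old" contents in place and flush them first.
    with open(path, "wb") as f:
        f.write(old)
        f.flush()
        os.fsync(f.fileno())

    writer = os.open(path, os.O_WRONLY)
    reader = os.open(path, os.O_RDONLY)
    try:
        # No fsync here: the new bytes may still be sitting only in cache.
        os.write(writer, new)
        # The reader is handed the new bytes anyway.
        return os.read(reader, len(new))
    finally:
        os.close(writer)
        os.close(reader)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.dat")
        print(read_after_unsynced_write(path, b"old data", b"new data"))
```

The reader never has to care whether the write has been committed yet; that
bookkeeping stays inside the kernel.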

> >Open/close on files in heavily populated directories should not be slow.
> 
> If a directory has 2000 files in it, especially if the directory
> itself is fragmented, then (depending on filesystem type) it may
> require numerous disk reads before the directory entry is found.
> (On some filesystems, if the file is the 2000th entry in the
> directory, the O/S needs to read all 2000 entries and do a string
> compare on each of them before it will find the correct entry.)

I've seen this on a network storage device before, but there the number of
files in the directory was wholly irrelevant.  Technically, open() doesn't
rely on readdir(), and thus shouldn't be affected.
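
For anyone who wants to measure this rather than argue it, here is a small
sketch, assuming Python on a POSIX-like system.  All names in it (populate,
the f00000-style file names, the count of 2000) are invented for the test.
It times an open() of one known name against a full enumeration of the same
directory.  How the kernel locates the name internally depends on the
filesystem, so the numbers will vary from system to system.

```python
import os
import tempfile
import time


def populate(directory, count):
    """Create `count` empty files named f00000, f00001, ..."""
    for i in range(count):
        with open(os.path.join(directory, f"f{i:05d}"), "wb"):
            pass


def time_open_by_name(path):
    """Time one open()/close() of a known path; no listing is requested."""
    start = time.perf_counter()
    fd = os.open(path, os.O_RDONLY)
    os.close(fd)
    return time.perf_counter() - start


def time_full_scan(directory):
    """Time enumerating every entry, as readdir()-style code would."""
    start = time.perf_counter()
    with os.scandir(directory) as entries:
        count = sum(1 for _ in entries)
    return count, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        populate(tmp, 2000)
        opened = time_open_by_name(os.path.join(tmp, "f01999"))
        count, scanned = time_full_scan(tmp)
        print(f"open by name: {opened:.6f}s")
        print(f"scan of {count} entries: {scanned:.6f}s")
```

Bear in mind that files created a moment ago will almost certainly still be
cached, so this measures the warm case only.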

> >The only operation I know of, at least on *nix systems, which is adversely
> >affected by heavily (>500 files) populated directories is readdir().
> >That's due to double inode redirection being triggered over about that
> >point.
> 
> The file must first be found (probably using the kernel's readdir
> equivalent) before it can be opened.

In practice, the time has not been a noticeable consideration on any sane
filesystem.  Note the operative word "sane".  NTFS need not apply.
Actually, NTFS isn't -that- bad, aside from its lack of
auto-defragmentation.  It's pretty hard to permanently break NTFS.

> However, once opened, the number of files in the directory is
> irrelevant to any I/O on that file.

Also, the first open() may be slow, in the cases you cite, but disk caching
would drastically speed up subsequent open() calls.  I can think of one
exception, and that's this network storage appliance system my ISP uses.
Any time I write to a specific directory hierarchy, even in a -very- tiny
directory with fewer than 5 files in it, opening a file with vim takes
forever and a day.  But in my home directory, I can do an `ls` on a
directory with 667 files; it takes upwards of 10 seconds the first time,
but is immediate on subsequent runs for as long as it stays cached.
But in this other filesystem, it's just horrid...I can open a file, edit
it, save it, and it takes just as long to re-open if I immediately hit
up-arrow, return.  Drives me insane.  That's Solaris with NFS to an
unspecified RAID-based NAS.
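
The cold-versus-warm difference is easy to check with something like the
following sketch, assuming Python's standard library.  The function name
time_repeated_listing is made up.  It does roughly what `ls -l` does (list
the names, then stat each one) several times in a row and records how long
each pass took.

```python
import os
import time


def time_repeated_listing(directory, repeats=3):
    """Return the elapsed seconds for each of `repeats` list-and-stat passes."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for name in os.listdir(directory):
            try:
                os.lstat(os.path.join(directory, name))
            except FileNotFoundError:
                # The entry vanished between listdir() and lstat(); skip it.
                pass
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    for n, seconds in enumerate(time_repeated_listing(os.curdir), start=1):
        print(f"pass {n}: {seconds:.6f}s")
```

On a filesystem where caching works, the first pass over a directory that
hasn't been touched lately should be the slow one.  If every pass takes
about the same long time, the cache isn't being used for that mount.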

> Many (most?) AV programs have the *option* to do such things
> automatically.  Whether that option is on by default, I don't know.

I'd drop any vendor that made it so.

mark->
-- 
Audio panton, cogito singularis.

