Rapid once deployed too

Thu May 4 09:31:05 PDT 2006

Four score and seven years--eh, screw that!
At about Thu, May 04, 2006 at 11:38:13AM -0400,
Howie blabbed on about:
> No offense taken Mark.  I'm sure you know a lot more about this stuff than I 
> do.

Some.  Not all, as you'll see in a sec.  Glad you took no offense though.

> last pid:  3580;  load averages:  0.24,  0.36,  0.27                   11:
> 1415 processes:1387 sleeping, 7 running, 19 zombie, 2 onproc

There you go.  Seven processes actually running out of 1415 total.  With
only four of them being clerk or report, that leaves the rest of the 228
clerk and report processes not really counting towards any sort of
performance benchmark, because they're not actually doing anything
significant.  19 processes aren't even "there" at all; they've already
exited and are just waiting for their parent to collect their exit status.

> CPU states: 17.1% idle, 26.5% user, 39.8% system, 16.5% wait,  0.0% sxbrk

This is less useful to me without a per-cpu breakdown.  However, the idle
and the wait combined is 33.6%.  That means the system is 1/3 unloaded,
more or less.  The overall picture is that more CPU time is going into
running the system than into what's actually being run -on- the system, by
roughly a 3:2 ratio.  ("System" here is time spent in kernel mode, handling
system calls, I/O and the like; "user" is time spent in the processes' own
code.  It has nothing to do with which user owns a process or who its
parent is.)
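
For anyone who wants to check the arithmetic, here is a minimal Python
sketch that pulls the percentages out of the quoted "CPU states" line and
works out the idle-plus-wait figure and the system-to-user ratio.  The
variable names are made up for illustration; the only input is the line
from the top output above.

```python
import re

# CPU states line copied from the quoted top output.
line = "CPU states: 17.1% idle, 26.5% user, 39.8% system, 16.5% wait,  0.0% sxbrk"

# Map each state name to its percentage, e.g. {"idle": 17.1, ...}.
states = {name: float(pct) for pct, name in re.findall(r"([\d.]+)%\s+(\w+)", line)}

# Share of CPU time not spent executing anything (idle or waiting).
unloaded = states["idle"] + states["wait"]

# Kernel-mode time relative to user-mode time.
ratio = states["system"] / states["user"]

print(f"idle + wait: {unloaded:.1f}%")
print(f"system:user = {ratio:.2f}:1")
```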

>   PID USERNAME PRI NICE   SIZE   RES  STATE   TIME  COMMAND
>  3567 stertris  23    4  1116K  1116K onpr    0:00  rclerk
>   171 root      23    4  1184K  1184K onpr    0:00  top
>  3315 astmanny  23    4  2488K  2488K run     0:00  dclerk
>  3508 root      23    4  1532K  1532K run     0:00  fct_vtpd
> 11233 root      23    4  1532K  1532K run     0:00  fct_vtpd
> 28338 mayhjw    23    4  1316K  1316K run     0:00  rclerk
>  1943 headbran  21    4  2856K  2856K run     0:05  dclerk
>  3570 dcikevw   11    4  3912K  3912K run     0:00  pcl6
>  3545 racmata   -4   14  1184K  1184K run     0:04  rreport-copy

Those were the only relevant processes actually in play when you did this.
All the others were in sleep mode, as I predicted.

I am unsure of the exact difference between "on processor" and "runnable
(run queue)" as far as process state goes.  My -assumption- (and I hate to
make one because you know what happens then!) is that there are two active
processes actually taking a slice of each cpu at any one instant, while
there are another seven in the run queue, waiting for their cpu slices
because someone may actually be doing something like typing, or because
the process may be waiting to read/write to disk [the latter would not
necessarily show as a disk wait state, as you might surmise glancing at
the 'ps' docs].  If anyone would like to educate me, please tell me what
the exact difference between onproc and run queue is, and whether I'm
right.  Someone here brighter than me must know for sure.  Ken?  John?
JPR?  Bob?  Jay?  Anyone?  :)

However, of those 228 clerk and report sessions (assuming nothing much
changed between the first email and second--let's just call it 200 or
so), exactly four of them currently counted towards anything that affects
performance, and I'd be willing to bet that the 2 in run rather than on
processor are just getting keyboard input.

Look also at the CPU time taken by each process.  Two of those, dclerk and
rreport-copy, have accumulated more than one second of CPU time.  The rest
have not.  In other words, people may be "using" the system, but they're
not USING the system in any way that really taxes the CPUs.  It's not a
matter of fast performance, it's a matter of nothing really going on, for
the most part.  It's not "fast", it's idle by and large--at least as far
as the system is concerned.  If you were running 20 reports and could show
me really low load averages, I might be impressed.  :)  But that won't
happen; the combination is technically impossible to achieve.
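
The same tally can be done mechanically.  This is a rough Python sketch,
assuming the nine process lines quoted above are pasted in as-is; it counts
processes per state, picks out the clerk and report processes in the run
state, and lists whatever has accumulated more than one second of CPU time.
The names (listing, rows, busy and so on) are just illustrative.

```python
from collections import Counter

# Process lines copied from the quoted top output.
listing = """\
 3567 stertris  23    4  1116K  1116K onpr    0:00  rclerk
  171 root      23    4  1184K  1184K onpr    0:00  top
 3315 astmanny  23    4  2488K  2488K run     0:00  dclerk
 3508 root      23    4  1532K  1532K run     0:00  fct_vtpd
11233 root      23    4  1532K  1532K run     0:00  fct_vtpd
28338 mayhjw    23    4  1316K  1316K run     0:00  rclerk
 1943 headbran  21    4  2856K  2856K run     0:05  dclerk
 3570 dcikevw   11    4  3912K  3912K run     0:00  pcl6
 3545 racmata   -4   14  1184K  1184K run     0:04  rreport-copy
"""

rows = []
for raw in listing.splitlines():
    # Columns: PID USERNAME PRI NICE SIZE RES STATE TIME COMMAND
    pid, user, pri, nice, size, res, state, cpu, command = raw.split()
    minutes, seconds = cpu.split(":")
    rows.append((command, state, int(minutes) * 60 + int(seconds)))

# How many processes are in each state.
by_state = Counter(state for _, state, _ in rows)

# Clerk and report processes sitting in the run state.
fp_running = [cmd for cmd, state, _ in rows
              if state == "run" and ("clerk" in cmd or "report" in cmd)]

# Processes with more than one second of accumulated CPU time.
busy = [(cmd, secs) for cmd, _, secs in rows if secs > 1]

print(dict(by_state))
print(fp_running)
print(busy)
```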

> Please tell me what that tells you.

Pretty much what I suspected to begin with; most of the processes on that
system are idle or next to it, and the metric is invalid for denoting how
well fP scales, performs, etc.  Incidentally, I actually double-checked the
definition of "nascent" because I had an odd feeling about it after the
fact, and I did -not- mean to use that word in my last post.  Bad memory.
In layman's terms, what I -meant- was "idle", basically.  That's what I get
for trying to vary my wording a bit without checking the dictionary.
Sorry about that.  I'm still trying to remember which word I -intended- to
use, but can't.  Kinda getting sleepy.  :)

Start doing intensive things like index rebuilding or a lot of report
generation (even heavy use of lookups with a lot of DROP processing,
for instance) and you'll see an entirely different picture take shape
regarding fP's system usage.  Doesn't look like there's much -real- action
going on though.  The inflated numbers look great due to high user count,
but there's no real heavy activity behind it when you look closely.

Again, I can't say fP doesn't scale or perform well.  I'm not saying
it, and I can't say it, because I don't have hard evidence to that effect.
However, given the evidence presented, I can't say it does, either.  Hard
to tell -how- it scales when most instances are actually idle.  That's
like saying your disk subsystem is "really lightning fast" because you
can read one 100MB file in no time flat.  Sure, and what happens when you
actually try reading 5, 10, 15, 50 of those at a time?  An IDE system
will start choking to death.  A SCSI system will fare better due to the
availability of disconnect mode.  But anything needs -use- in order to
benchmark it.  That's what things like iozone are for when it comes to
disk performance.  In this case, you'd have to have 200+ users doing
something like generating reports, doing a lot of lookups, rebuilding
indexes, etc., to even start substantiating a performance claim.  Sheer
number of instances is not enough.  You need qualitative evidence in
addition to quantitative to make any sort of performance claim that's worth
making.
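
To make the disk analogy concrete, here is a very rough Python sketch of
the shape such a test takes: read the same set of files with 1, 4 and 8
concurrent readers and time each pass.  It is not a substitute for iozone,
and with files this small the OS cache will do most of the work, so the
numbers themselves mean little.  The function names (read_all, timed_reads,
make_files) and the file sizes are assumptions made up for the example.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # read in 1 MiB chunks


def read_all(path):
    """Read a file to the end and return the number of bytes read."""
    total = 0
    with open(path, "rb") as fh:
        while True:
            block = fh.read(CHUNK)
            if not block:
                break
            total += len(block)
    return total


def timed_reads(paths, workers):
    """Read every path using `workers` threads; return (bytes, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(read_all, paths))
    return total, time.perf_counter() - start


def make_files(directory, count, size):
    """Create `count` files of `size` random bytes; return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"sample_{i}.bin")
        with open(path, "wb") as fh:
            fh.write(os.urandom(size))
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = make_files(tmp, count=8, size=2 * CHUNK)
        for workers in (1, 4, 8):
            nbytes, secs = timed_reads(files, workers)
            print(f"{workers} concurrent readers: {nbytes} bytes in {secs:.3f}s")
```

A real test would use files much larger than RAM (or drop the cache
between passes) and would have the readers doing the kind of work the
users actually do, which is the whole point above.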

mark->