Is 4000 users too high a number for filePro

Brian K. White brian at aljex.com
Thu Jun 9 12:56:33 PDT 2005


----- Original Message ----- 
From: "Fairlight" <fairlite at fairlite.com>
To: "filePro mailing List" <filepro-list at lists.celestial.com>
Sent: Thursday, June 09, 2005 2:54 PM
Subject: Re: Is 4000 users too high a number for filePro


> The honourable and venerable Lerebours, Jose spoke thus:
>> I can control one but not the others.  I do worry about
>> filePro standing up to the task of 4000 concurrent users.
>
> Concurrent?  Are you dead serious?  Oy.  Okay, let's examine it based on
> that criterion.
>
> Unless you're putting them on something stellar, I wouldn't recommend
> that.  Depending on how many people are doing what, nobody may be able to
> do anything, and the keystrokes may echo back 15 minutes later--each.
>
> Even -on- something stellar, I would probably rather work out a clustering
> scenario and spread the resources, plus have the room for fail-over
> redundancy if one fails.
>
>> I like to see the quote - It seems to me that it would
>> be wiser to buy fpTech all together.   :)
>
> Given that I've heard an upgrade (UPGRADE!) from SCO5 4.8 -> FBSD 5.0 for
> a 128-user -runtime- was $7K, I think you're talking astronomical costs
> for 4000 users.
>
> Keep in mind that I seriously doubt they have a version that can handle
> it.  I'm guessing the most they can handle is 256.  This is not backed by
> any direct evidence.  It's supposition based on what I've seen of how they
> designed files, and knowing a little about how they do the tracking
> (shared memory, for *nix).  The current power-of-two licensing kind of
> bears this out--I think they must be storing the count in a single byte,
> which would make the maximum 256 in current versions.
>
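
(Just to make the single-byte argument concrete -- my own illustration, not
anything confirmed about how fP Tech actually stores it:)

    # An unsigned byte holds 2**8 = 256 distinct values (0 through 255),
    # which is where a 256-user ceiling would come from if the licensed
    # count really does live in one byte.
    print(2 ** 8)   # -> 256
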
> Now when 5.6 comes out (assuming it comes out), they're supposed to have
> all new licensing, and this may or may not be a factor.
>
> As some have pointed out regularly, fP was designed in like '78 or '79,
> long before a lot of the technology existed.  It's held up fairly well in
> many areas, but I don't think it's -that- scalable.  I'd say that the
> memory use of all those instances alone would be enough to severely burden
> even a beefy system.  Looking at the linux binaries (which are dynamically
> linked), you're talking (let's round) 500K each for rreport alone.  Now
> multiply by 4000.
>
> 500000 bytes * 4000
> 2,000,000,000
>
> Two gigs of RAM -just- for fP's binary, assuming nobody is running any
> other binaries (It's actually about 550K for rclerk) [these are linux
> 5.0.13 numbers, btw].  This also assumes there's no other overhead--that
> was the binary size.  Virtual memory use came out (just getting rreport
> running to the point of the box that chooses the table) at 2.4MB.
>
> 2400000 bytes * 4000
> 9,600,000,000
>
> That's the actual VSZ (virtual memory size) of the program as it's being
> used, rather than the raw binary size.  This is the number that counts.
> Okay, 9 gigs.  That's doable, expensively.  It also doesn't count loading
> anything into memory, but that's probably fairly negligible if there are
> no memory leaks in fP.
>
> HOWEVER.  :)  That's not the only thing running.  Let's assume they're
> coming in via ssh.  The virtual memory size for a -fresh- sshd connection
> is 5.4MB.  Multiply -that- by 4000.
>
> 5400000 bytes * 4000
> 21,600,000,000
>
> If I'm reading that correctly, 21 gigs just for all 4000 logging into the
> system via ssh.  filePro is just icing on a VERY large cake after this.
>
> It's at a point like this that you start to realise what you're thinking
> about may not be entirely tenable or economically feasible.  You're
> talking about (give or take) 30 gigs of RAM minimum without paging to
> disk like mad.
>
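
Pulling that arithmetic together in one place -- a quick Python sketch,
using the same rough per-process figures measured above (real totals would
differ somewhat, since shared pages get counted once per process this way):

    # Back-of-envelope memory estimate for 4000 concurrent users,
    # using the per-process VSZ figures quoted above.
    USERS = 4000
    RREPORT_VSZ = 2_400_000   # bytes: one rreport instance at the file box
    SSHD_VSZ = 5_400_000      # bytes: one fresh sshd connection

    fp_total = USERS * RREPORT_VSZ     # ~9.6 GB
    sshd_total = USERS * SSHD_VSZ      # ~21.6 GB
    total = fp_total + sshd_total      # ~31.2 GB, i.e. "give or take 30 gigs"

    for label, value in (("rreport", fp_total), ("sshd", sshd_total),
                         ("total", total)):
        print(f"{label:8s} {value / 1e9:5.1f} GB")
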
> By contrast, a decentralised system with one daemon taking up a limited
> amount of resources on a server, connected to by clients that spread the
> client cost in resources around (think SQL here, where a socket connection
> costs you hardly anything in terms of memory) makes far more sense.  I'm
> not actually sure what the concurrent socket connection limit is going to
> be.  It used to be 256 back in the day.  Then I know it was raised to
> 1024.  Thing is, in a transactional environment, you'll have so much
> churn and so brief a connection time that it's not actually something
> you're likely to hit.  It might even be higher now, and if it's not by
> default, at least under linux (and probably even FBSD), it's probably
> tunable to some degree--maybe to 2048.  Suddenly you're looking at
> something that a server or two could handle, though, assuming the client
> systems are all capable of running a client.
>
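
On the connection-limit question: the exact numbers are OS- and era-specific,
but as a sketch, this is how you could check (and raise, up to the hard
limit) the per-process file descriptor ceiling on a current *nix, which is
what ultimately caps concurrent sockets for one daemon process:

    # Inspect and raise the per-process open-file limit.
    import resource

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)

    # The soft limit can be raised up to the hard limit without privilege;
    # raising the hard limit itself generally takes root (or a sysctl /
    # login-class change, depending on the OS).
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    print("new soft limit:", resource.getrlimit(resource.RLIMIT_NOFILE)[0])
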
> fP was designed in an age when 4000 users concurrently wasn't really even
> something most places were considering.  The math I'm looking at, even
> cursorily, doesn't seem to give much hope for running it in this kind of
> upscaled environment.  Nothing against fP--it's all in the numbers.  The
> OS doesn't hold up very well either, unless you have a real beast of a
> machine running, to be perfectly fair.
>
> (Even without running the numbers, I could have told you that you don't
> want 4000 users on a system concurrently.  A wise question might be, "Is
> the OS even capable of making that many pty's, even if it can handle that
> many sockets to the same service?")
>
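
On the pty question -- a rough Linux-specific check, assuming a reasonably
modern kernel where the Unix98 pty counters are exposed under /proc (they
may not be on older kernels or other OSes):

    # Compare the kernel's pty ceiling against the number already allocated.
    def read_int(path):
        with open(path) as f:
            return int(f.read().strip())

    try:
        pty_max = read_int("/proc/sys/kernel/pty/max")
        pty_now = read_int("/proc/sys/kernel/pty/nr")
        print(f"ptys allocated: {pty_now} of {pty_max}")
        if pty_max < 4000:
            print("kernel.pty.max would have to be raised before 4000 logins")
    except FileNotFoundError:
        print("no /proc/sys/kernel/pty here -- check pty limits another way")
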
> This discounts anything else the server might be doing (and it will be
> doing something), as I didn't account for anything from init on down--only
> the sshd's and the figuring for rreport.
>
> The short answer is, "No, don't even think about doing it that way."
>
> This assumes you honestly meant 4000 concurrently.  If you didn't, at
> least you have some numbers with which to calculate what you'd need.
>
> 4000 users on a system isn't just too high for fP, even assuming you could
> get such a license.  It's too high for most available systems in general,
> OS-wise.
>
> A decentralised RDBMS solution is your best bet, methinks.

There are a lot of things that simply don't work at such high numbers,
regardless of how well the machine seems to handle the workload at
almost-that-high numbers.  No matter how big and strong you are, you still
can't carry one drop more than 5 gallons of water in a 5 gallon bucket.
Better yet, make it acid instead of water, since it's probably _bad_ to ever
spill even a drop; you really need to not only stay within the various
limits, but stay far enough away from them that you never spike up to them.

This guy makes a few instructive points:
http://www.kegel.com/c10k.html
It's mostly about threading models, but there are some enlightening general
concepts pointed out along the way.  For example: "Too much thread-local
memory is preallocated by some operating systems; if each thread gets 1MB,
and total VM space is 2GB, that creates an upper limit of 2000 threads."
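
Spelling his example out (his round numbers, nothing new of mine):

    # ~1 MB preallocated per thread against ~2 GB of usable address space
    # caps the thread count at about 2000, no matter how idle each thread is.
    vm_space = 2 * 1000**3     # ~2 GB of virtual address space
    per_thread = 1 * 1000**2   # ~1 MB preallocated per thread
    print(vm_space // per_thread)   # -> 2000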

He's talking about trying to support 10,000 users, but he's only talking
about doing a tiny, tiny bit of work for each one: just serving up a couple
of kbytes of text in the form of a web page request.  I won't pretend to be
sure, since I'm not Ken, but as efficient as filePro is, I believe that's
still miles away from running an instance of clerk or report in the context
of a typical application.

Brian K. White  --  brian at aljex.com  --  http://www.aljex.com/bkw/
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro BBx  Linux SCO  Prosper/FACTS AutoCAD  #callahans Satriani


