SBC woes

Fairlight fairlite at fairlite.com
Thu Jul 27 17:17:36 PDT 2006


The honourable and venerable Bob Rasmussen spoke thus:
> It seems I remember some comments here about problems with SBC DSL links.
> Other than grumbling, I'm looking for solutions.
> 
> I have a customer who gets intermittent lags and/or drops over his SBC DSL
> connections, using various protocols such as SSH and VPN.
> 
> Any help would be appreciated.

This could be a fun bug hunt.  :)

Fairly standard checklist:

* Are -all- devices -except- the DSL modem filtered?  This includes devices
  on jacks elsewhere in the location that are on the same line.

* Does the phone line run near any rheostats?  If so, that can be
  problematic.

* Does the phone line run near any fluorescent light ballasts?  If so, it
  needs to run -perpendicular- to them, not parallel.

* Is there any noise or other oddity (sounding like you're on a
  low-signal cell phone, for instance) with the voice segment?

* Does the phone line run within 2 feet of a UPS or other device that would
  generate substantial EM interference?  If so, relocate them apart.

* Is there more than 3K feet of phone line between the demarcation point
  and the DSL modem?

As for the drops, you don't mention whether it's SSH/VPN drops, or whether
he's actually losing sync.  If he's actually losing sync, that's known as
intermittent sync, and SBC -should- take it seriously and help him diagnose
it.  If he -isn't- losing sync, that's more complex.

What kinds of lags?  Using wget to fetch something via ftp (always use ftp,
http is not as solid a benchmark) from -just- the other side of the DSL
gateway, I can demonstrate situations where I have issues.  Keep in mind
that I'm talking to a server that's -just- on the other side of the router
that handles my DSL.  You don't want something 10+ hops out.  2, tops, in
this case--the DSL router and the core router that gets me to the ftp
server.  That's a pretty ideal test.  Example:

[cobalt] [~] [5:09pm]: testget
--19:40:51--
ftp://xxxxxxx:*password*@ftp.xxxxx.com/yyyyyyy/zzz/mozilla-win32-1.6-installer.exe
           => `mozilla-win32-1.6-installer.exe'
Resolving ftp.xxxxx.com... nnn.nnn.nn.nn
Connecting to ftp.xxxxx.com[nnn.nnn.nn.nn]:21... connected.
Logging in as xxxxxxx ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /zzzzz/yyyyyyy/aaaaaaa/bbb ... done.
==> PASV ... done.    ==> RETR mozilla-win32-1.6-installer.exe ... done.
Length: 12,269,264 (unauthoritative)

    0K ................ ................ ................  3%  138.41 KB/s
  384K ................ ................ ................  6%  142.06 KB/s
  768K ................ ................ ................  9%  142.06 KB/s
 1152K ................ ................ ................ 12%  142.16 KB/s
 1536K ................ ................ ................ 16%  141.96 KB/s
[SNIP FOR BREVITY]
10752K ................ ................ ................ 92%  142.05 KB/s
11136K ................ ................ ................ 96%  142.06 KB/s
11520K ................ ................ ................ 99%  142.05 KB/s
11904K .........                                         100%  141.20 KB/s

19:42:16 (141.94 KB/s) - `mozilla-win32-1.6-installer.exe' saved [12269264]

0.23user 0.98system 1:25.09elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (267major+34minor)pagefaults 0swaps


I'm wrapping that in `time`, hence the final display.  Take the 142KB/sec
there as the baseline (it's actually about 10KB/sec lower than max).  Now,
occasionally, due to bad storms, weather changes (hot/cold snaps will do
this), or even just strong EM fields from other sources, I will start
seeing some "burstiness" where I'll get 142KB/sec for 5 lines, but then get
87KB/sec for a line.  If you watch it in mid-line (it draws these dots as
it goes, similar to the old ftp's hash readouts), it's not a total
throughput drop, it's actually a burst of latency at one point, and it
resumes on the rest of the line at normal speed.  The whole line isn't
slow, it's one instant of lag that can drag that line's rate anywhere from
10KB/sec to 70KB/sec lower.  More if there is heavy electrical storm
activity, to the point you could be running at 20KB/sec or even lose sync
entirely.  (DSL is -really- touchy to EM.)  It can bounce up and down,
acting fine and then giving a burst of latency where it stops dead in its
tracks for anywhere from a split second to several seconds, depending on
severity.  It might be mild and do it once, or it might do it sporadically
but fairly often.
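
If you'd rather not eyeball a long transcript, one rough way to spot the
bursty lines is to pull the per-line rates out of the saved wget output and
flag anything well below the median.  This is only a sketch in Python, not
anything wget itself provides: the regex assumes the dot-style progress
lines shown in the transcript above, and the names parse_rates and
flag_dips plus the 0.9 tolerance are my own picks for illustration.

```python
import re
from statistics import median

# Matches dot-style progress lines such as:
#   "  384K ................ ................  6%  142.06 KB/s"
# Assumption: the rate is always reported in KB/s, as in the transcript.
LINE_RE = re.compile(r"^\s*(\d+)K\s+[.\s]+?(\d+)%\s+([\d.]+)\s*KB/s")


def parse_rates(log_text):
    """Return a list of (offset_in_K, rate_in_KB_per_sec) tuples."""
    rates = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rates.append((int(match.group(1)), float(match.group(3))))
    return rates


def flag_dips(rates, tolerance=0.9):
    """Return the lines whose rate is below tolerance * the median rate.

    The 0.9 default is an arbitrary starting point, not a standard.
    """
    if not rates:
        return []
    typical = median(rate for _, rate in rates)
    return [(offset, rate) for offset, rate in rates
            if rate < typical * tolerance]
```

Feed parse_rates the text of a saved run and hand the result to flag_dips.
A clean run like the one above should come back empty; a run with an
87KB/sec line in the middle of the 142s should return just that line.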

My suggestion is to get a setup, even if it's http (but preferably ftp if
you at all can), similar to how I'm set up, and try getting a file just
from the other side.  See what the throughput acts like over about 12MB.
Anything over 10MB should give you an idea, really.  You just don't want to
do a really short file, as you may not see issues if it's really sporadic.
When it happens to me, it can go 65% of the way through 12MB and look
solid, and then I'll get either a mild or severe latency spike.

Presumably, it's interference on the line.  Our phone lines in this area
are...less than stellar.

But you're looking to see what the performance looks like at a more
granular level, assuming you have sync.  If it hangs for 20 seconds or so,
you're likely losing sync--check the modem's sync light.
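
To put numbers on those stalls instead of guessing, you can time the gaps
between chunks of a download.  Again, a sketch only: the 20-second cutoff
is just the rule of thumb above, the 1-second minimum gap is an arbitrary
choice of mine, and the function names are made up for illustration.  The
modem's sync light is still the real test.

```python
import time


def find_stalls(read_chunk, min_gap=1.0, clock=time.monotonic):
    """Call read_chunk() until it returns b'' and record long gaps.

    Returns a list of (bytes_received_before_gap, gap_in_seconds) for
    every wait of at least min_gap seconds between two chunks.
    """
    stalls = []
    received = 0
    last = clock()
    while True:
        chunk = read_chunk()
        now = clock()
        if not chunk:
            break
        gap = now - last
        if gap >= min_gap:
            stalls.append((received, gap))
        received += len(chunk)
        last = now
    return stalls


def classify(gap, sync_loss_secs=20.0):
    """Rough label for a gap, using the ~20 second rule of thumb."""
    if gap >= sync_loss_secs:
        return "possible sync loss"
    return "latency burst"
```

With Python's urllib.request, something along the lines of
find_stalls(lambda: resp.read(16384)) on an opened response should do,
where the URL is whatever test file sits just past your own DSL gateway.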

Now for me (and I -don't- lose sync), what usually would fix this, assuming
it can be fixed (i.e., no storms, etc.) is a manual resync of the DSL
modem.  As I understand it, there are channels on different frequencies.
As interference hits the line in whatever fashion, certain channels can get
flagged as unusable.  This flag will persist until a connection has been
dropped for 30-45 seconds.  So one thing to try is powering off the modem,
waiting 45 seconds, and powering it back on to force a hard resync, which
should clear all the flags on the DSLAM.  Sometimes this doesn't actually
work for the first 1-5 tries.  I've gone up to 10 times before
getting it ironed out to where it's smooth as glass and will stay that way
for weeks, barring storms.  It's a fickle beast of a technology, really.
Like SCSI, there may be specs, but in the end it's more magic/artform than
science in some cases.  It's pretty much subject to many of the same EM
problems, bad shielding, etc.

When I do those resyncs, it might come in at a slightly lower speed, or it
may actually come in at the same or a higher speed.  Either way, it usually
can be smooth as glass.

This assumes it's -not- intermittent sync, which really would not be
rectified by this methodology.  I'm assuming you very well may mean SSH
drops, not the sync drops.  SSH is particularly picky about routes
vanishing, and is the only thing I can think of that, with a static IP#,
will actually disconnect if that route goes away for even a few seconds
during attempted transmission.  At least that's been the case with PuTTY to
OpenSSH.  Everything else that's stateful seems to stay connected.  Even
ftp sessions.  I'm thinking one or the other was coded to play ball like
that.

If it can't be ironed out with manual resyncs, or if he's actually got
intermittent sync, it's really something he's going to have to deal with
SBC over.  They can at least check up to the demarc point and see if it's
anything up to and through the NID.  If they give that a clean bill of
health, it very well could be internal wiring, bad filters, dying filters,
etc.

Actually, that's a valid question:  Is this a new installation, or one that
was previously working fine?  If filters are involved, know that those
-can- go bad, both on their own, or if they take a jolt during a storm or
the like.  Might try replacing any filters in play and see if that doesn't
clear up the problem.

You're looking internally for anything that generates EM fields that would
affect the line, or anything that may be putting impedance on the line,
rolling off the most vulnerable high frequencies.  Sometimes even filtered
devices can cause this--newer phones with electronics in them and their own
power sources are a good example.  A standard request by diagnostic techs
will be to unplug everything -except- the DSL modem and see if the problem
persists.  This would include the phone that may be plugged into the
filter in the back of the modem.  If nothing can be found as the culprit
internally, SBC should get involved in checking up to the demarc, possibly
internally if it's feasible.

Another potential solution is to actually get a splitter at the NID, break
off the DSL circuit and run a direct line to where the modem resides, and
route the phone through the regular equipment that's been in place.  These
can generally only be purchased online and aren't cheap.  I do know that
someone told me the splitters are actually just industrial grade filters,
and you can break it out this way with a regular filter if you know what
you're doing.  This break-out is something SBC could do for him--for a fee,
of course, no doubt.  It's one fairly common way of resolving internal
wiring issues.

Another possible way to test sync issues (if he's losing sync entirely and
fairly regularly) is, if he actually has access to his NID (I don't, living
in an apartment), he could take the modem to the NID and plug it directly
into the port there and see if it drops sync or not.  If it drops sync
-there-, it's at least partly an external problem SBC must fix, although
internal problems may remain.

One thing to note...  I don't know how true it is with SBC, if at all, but
with BellSouth, they don't even have a complaint ticket type for "poor
speeds".  There simply isn't one.  Their apparent philosophy is that if
you've got intermittent sync, they'll look into it.  If you maintain solid
sync, it's at least 128kbit (the fallback minimum for maintaining sync),
and you might be passing -nothing- as far as traffic, but if you have sync
they don't care what your throughput speed is, even if it's zero.  SBC may
or may not differ, but if the problems can't be solved, they may just let
him punt if he doesn't like it.  I've heard of that happening.

The description of the problem was a bit vague as to what exactly is the
condition of the line--sync'd or not, so I covered most of the standard
stuff one would look at in general for both conditions.  Sorry for the
length, trying to help.

Luck!

mark->

