base64 / mime prc table

Tue Nov 9 17:44:47 PST 2004

On Tue, Nov 09, 2004 at 07:20:04PM -0500, Brian K. White, the prominent pundit,
witicized:
> Mark had said he thought I had written a prc table to do base64, and I almost 

Maybe we were talking at cross-purposes at the time.  It happens.

> is working great and it has one advantage in that it's a single black box 
> that both fp processes and cgi & cron scripts use to code/decode stuff for 
> each other and I can make any kind of change I want in that script and not 
> break anything else as long as the script still decodes what it encodes)

Of course, it's a single fault point if you do break it, which then breaks
all things dependent on it.  :)

> http://www.aljex.com/bkw/filepro/#urlenc

:"$&+,/;=?@ <>#%{}|^~[]`'"{chr("92"){chr("34") co urlenc_ic:urlenc_oc = "%"
{ base(asc(urlenc_ic),"10","16"):

I have a few problems with that as written.  

You don't want to encode the @ character at all--that's a valid part of the
user:pass@host syntax in URLs, and should be left unmolested.  The browsers
generally take care of themselves, but if you insist on replacing it, you
should make sure you've done a state check for passing the first / that
indicates the beginning of $PATH_INFO and/or $QUERY_STRING, or that you've
already passed one @ prior to that (in which case, someone feeding it a bad
URL gets what they deserve, which is a failed connection).
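To make the state check concrete, here's a rough sketch in Python rather
than filePro, purely for illustration.  The function name is made up, and
it assumes the authority section ends at the first / after the scheme; it
is not a full URL parser.

```python
def encode_at_signs(url):
    """Keep the first @ before the path; percent-encode every other @.

    Hypothetical helper: assumes the authority section ends at the
    first "/" that follows the optional "scheme://" prefix.
    """
    scheme_end = url.find("://")
    start = scheme_end + 3 if scheme_end != -1 else 0
    slash = url.find("/", start)
    authority_end = slash if slash != -1 else len(url)

    out = []
    seen_at = False
    for i, ch in enumerate(url):
        if ch != "@":
            out.append(ch)
        elif i < authority_end and not seen_at:
            seen_at = True          # the user:pass@host separator
            out.append(ch)
        else:
            out.append("%40")       # past the first / or a second @
    return "".join(out)


print(encode_at_signs("http://user:pass@host/a@b"))
```

The same two pieces of state (have we passed the first /, have we already
seen an @) are all the filePro version would need to carry.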

Likewise, you're also nuking & and ? characters that are a valid part
of URLs of the GET style.  You need to track having passed the appropriate
place in the URL for ?, similar to the @ example.  The & however, should
-never- be encoded unless it comes -before- the end of the hostname
section, and even then it's illegal and should be removed.  There's just
no reason to encode one.  I think you may have been lumping in HTML entity
encoding and URL encoding in the same cargo container when you wrote this.
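One way to keep the delimiters intact is to split the URL into its parts
first and only encode inside each part.  A minimal sketch in Python (not
filePro), using the standard urllib.parse module; the function name and
the choice of which characters to leave alone in each part are my own
assumptions for illustration, and it does nothing about input that is
already partly encoded.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode the path and query without touching ?, & or @."""
    parts = urlsplit(url)
    # "/" separates path segments, so it stays literal in the path.
    path = quote(parts.path, safe="/")
    # "&" and "=" delimit GET parameters, so they stay literal too.
    query = quote(parts.query, safe="&=")
    # netloc (user:pass@host) is passed through untouched.
    return urlunsplit((parts.scheme, parts.netloc, path, query,
                       parts.fragment))


print(encode_url("http://user:pass@host/my file?a=1&b=two words"))
```

The ? never gets seen by the encoder at all, because the split removes it
and the rejoin puts it back.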

The other problem is that it's not robust enough to check for pre-encoded
characters.  You could get %7E in a URL and need to filter the rest, but
you've not checked to make sure that your % is not followed by a valid hex
code, therefore you'd be re-encoding it as %257E, which would -not- give
you what you wanted, as URLs are only decoded one layer deep when urldecoded
by any piece of software I've seen.  Some people will give URLs that are
partially encoded--you'll have an encoded tilde as above, but other things
need changing.  You should really check for a valid hex value read-ahead
after any %, just to make sure you don't do something like the above.
It would extend the logic by a few lines, but not bloat it hugely or
unmanageably.
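
The read-ahead might look something like this.  It's a sketch in Python
rather than filePro, the names are hypothetical, and the SAFE set is just
an assumption about which characters to pass through unchanged.

```python
import string

HEX = set(string.hexdigits)
# Assumed pass-through set: letters, digits, and a few URL-safe marks.
SAFE = set(string.ascii_letters + string.digits + "-._~/")


def encode_preserving_escapes(text):
    """Percent-encode text, but leave existing %XX escapes alone."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        ahead = text[i + 1:i + 3]
        if ch == "%" and len(ahead) == 2 and all(c in HEX for c in ahead):
            out.append(ch + ahead)   # already encoded: copy through
            i += 3
        elif ch in SAFE:
            out.append(ch)
            i += 1
        else:
            # Encode each byte of the character as %XX.
            out.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
            i += 1
    return "".join(out)


print(encode_preserving_escapes("/%7Efoo/a b"))
```

A bare % that is not followed by two hex digits still gets encoded as %25,
so only the already-valid escapes are skipped.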

No offense intended.  It's elegant in its simplicity, but if someone fed it
half-encoded URLs, it could lead to error as written.  It could easily be
made more robust though, with those few changes.

mark->
-- 
Bring the web-enabling power of OneGate to -your- filePro applications today!

Try the live filePro-based, OneGate-enabled demo at the following URL:
               http://www2.onnik.com/~fairlite/flfssindex.html