email edit
Brian K. White
brian at aljex.com
Wed Nov 1 09:50:31 PST 2006
----- Original Message -----
From: "Jay R. Ashworth" <jra at baylink.com>
To: <filepro-list at lists.celestial.com>
Sent: Wednesday, November 01, 2006 11:42 AM
Subject: Re: email edit
> On Wed, Nov 01, 2006 at 09:59:48AM -0500, Nancy Palmquist wrote:
>> The discussion makes it clear why it is not possible to make a distinct
>> edit. Some tuning would have to happen and it would not always work,
>> but like a date edit you could design one for the standard and one for
>> the 2 dot emails and test the email against both, if it passes either it
>> is a good email address.
>>
>> I think the solution is some edits for the various formats and then some
>> processing to see if it complies with any edit.
>>
>> Your scope of customers might determine if one would work or you need
>> more depth for global email options.
>
> The other thing that kills people is Germany.
>
> They tend to have insanely long domain names over there; I usually use
> 50 chars for an email field, and *that's* been too short, at least
> once.
>
> Anyone who's *really* serious about this probably ought to pull about
> 50K email addresses off Usenet over a month, and test their edit
> against them. That's a sufficiently large corpus, I think, to catch
> all the corner cases.
Indeed, almost anything can be a valid email.
You don't need even one @
I wouldn't be surprised if you can have more than one @
You don't need even one .
You can have any number of .'s
You can have one @ and no dots
Then there's: network!hosta!hostb!hostc!user
and: john%node.bitnet at cunyvm.cuny.edu
Those are all odd cases, but perfectly correct if that's the way your
business happens to be set up.
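To make the point concrete, here is a minimal Python sketch (not filePro) of a typical "strict" edit and how it fares against the forms above. The pattern, the names NAIVE_EDIT and naive_ok, and the "postmaster@localhost" sample are assumptions for illustration, and " at " is assumed to be the list archive's stand-in for "@".

```python
import re

# Hypothetical "strict" edit: exactly one @, and at least one dot in the domain.
NAIVE_EDIT = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")

# Forms from the post, with " at " restored to "@" (an assumption),
# plus one made-up sample for the "one @ and no dots" case.
ODD_BUT_PLAUSIBLE = [
    "network!hosta!hostb!hostc!user",    # bang path: no @ and no dot
    "john%node.bitnet@cunyvm.cuny.edu",  # percent sign in the local part
    "postmaster@localhost",              # one @, no dots (hypothetical sample)
]

def naive_ok(address):
    """Return True if the address passes the naive edit."""
    return NAIVE_EDIT.match(address) is not None

# Every odd-but-plausible form ends up in the rejected list.
rejected = [a for a in ODD_BUT_PLAUSIBLE if not naive_ok(a)]
```

An edit that looks perfectly reasonable throws out all three, which is the problem with treating any single pattern as the definition of a good address.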
I think there are only a few bad characters you could really test for and
safely say it's bad... except then there are UTF-8 and UTF-16 multibyte
encodings coming our way pretty soon, and any "bad" character might just be
part of a multibyte Japanese address or something.
The only correct way to validate an email address is to actually validate
it.
That requires the machine running fp to have a working MTA.
Or you could use one of the many 3rd party apps out there that claim to do
just this.
In the end, you _still_ can't really know, because you have no control over
the recipient mail servers.
Some send back a failure code or bounce the email back; many just silently
accept and discard anything destined for an invalid address. They do this
deliberately to fight spam. If they responded nicely with "that's a bad
address", it would make it really easy for spam scripts to pelt them with
random values and distill themselves a nice list of known valid addresses.
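For what "actually validate it" could look like, here is a rough Python sketch using the standard smtplib module. It assumes you already know which mail host to ask (mx_host is a parameter, since finding it takes a DNS lookup not shown here), the sender address is a made-up placeholder, and the names probe and interpret are hypothetical. Many servers refuse or ignore this kind of probing, so treat it as an illustration of the limits described above rather than a reliable check.

```python
import smtplib

def interpret(code):
    """Map an SMTP RCPT reply code to a verdict; a 2xx reply proves very little."""
    if 200 <= code < 300:
        return "accepted (may still be discarded silently)"
    if 500 <= code < 600:
        return "rejected"
    return "undetermined"

def probe(address, mx_host, sender="postmaster@example.invalid", timeout=10):
    """Ask mx_host whether it would accept mail for address, without sending a message."""
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
        smtp.helo()
        smtp.mail(sender)
        code, _message = smtp.rcpt(address)
    return interpret(code)
```

Even the best case here is only "accepted", never "known good", because a server that silently discards mail answers exactly the same way as one that delivers it.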
I'm just using processing in a gosub that does only really crude testing for
a few common problems I happen to see in my data.
Like:
user at host.com;otheruser at otherhost.net
user at host.com # Jim's dad
user at host.com Bill
In these cases I can and do clip out the valid email and keep it.
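The clipping idea can be sketched in Python like this (not the actual filePro gosub). The pattern, the names FIRST_ADDRESS and clip_email, and the choice to keep the first address when there are two are all assumptions; " at " is again taken to stand for "@".

```python
import re

# Hypothetical pattern: the first run of characters around a single "@",
# stopping at whitespace or at a ";", ",", or "#" separator.
FIRST_ADDRESS = re.compile(r"[^\s;,#@]+@[^\s;,#@]+")

def clip_email(raw):
    """Return the first address-like token in raw, or raw unchanged if there is none."""
    match = FIRST_ADDRESS.search(raw)
    return match.group(0) if match else raw
```

With the three samples above, clip_email returns "user@host.com" each time. A value with no "@" at all, such as a bang path, comes back untouched, which errs on the side of leaving the data alone.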
For certain special-case values I see, I just test for the exact value
(well, case-insensitively), such as the two below.
Obviously, the larger this list of explicit values gets, the slower the
process gets.
none
n/a
Really, "none" is a completely valid email address, and by rights I have no
right to delete it.
Sure, it's "don't be stupid" common sense, right up until some account named
"N One" comes along...
I caved in and added a test for at least one @ too, but I don't like it.
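Those two crude tests, the exact-value list and the "at least one @" check, might look like this as a Python sketch. The names PLACEHOLDERS and crude_check and the three verdict strings are hypothetical; the post only names "none" and "n/a" as list entries.

```python
# Hypothetical list; the post names only these two values.
PLACEHOLDERS = {"none", "n/a"}

def is_placeholder(value):
    """Exact match against the explicit list, ignoring case and outer spaces."""
    return value.strip().casefold() in PLACEHOLDERS

def has_at_sign(value):
    """The reluctant test: at least one "@" somewhere in the value."""
    return "@" in value

def crude_check(value):
    """Return 'placeholder', 'no-at', or 'keep' for a raw field value."""
    if is_placeholder(value):
        return "placeholder"
    if not has_at_sign(value):
        return "no-at"
    return "keep"
```

Because the comparison is exact, "N One" is not mistaken for "none"; it is the @ test that flags it. In Python a set lookup does not slow down as the list grows, though that says nothing about how the same list would perform in filePro processing.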
Google "validate email" and witness the whole industry devoted to this.
Thus it's not going to be handled properly by anything as simple as an edit.
And I would rather not touch the data than touch it improperly.
Brian K. White -- brian at aljex.com -- http://www.aljex.com/bkw/
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro BBx Linux SCO FreeBSD #callahans Satriani Filk!