Import in the face of non-printable chars.
Brian K. White
brian at aljex.com
Sat Feb 2 19:46:59 PST 2013
On 1/12/2013 5:23 PM, Jean-Pierre A. Radley wrote:
> Bill Campbell propounded (on Fri, Jan 11, 2013 at 04:25:20PM -0800):
> | On Fri, Jan 11, 2013, Jean-Pierre A. Radley wrote:
> | >Jean-Pierre A. Radley propounded (on Fri, Jan 11, 2013 at 06:38:27PM -0500):
> | >| While importing from a .csv file which was generated from an
> | >| Excel spreadsheet, there is a field from which I want to retain
> | >| only alphanumeric characters, discarding punctuation like dashes,
> | >| parentheses, commas, even spaces. I have an edit which does this
> | >| handily.
> | >|
> | >| But I came a cropper today when some input fields contained a high-ascii
> | >| character (happened to represent the degree symbol), and my edit gave me
> | >| a blank result. Even when I bypassed the edit and tried to import the
> | >| .csv field as-is, the filePro field came up blank.
> | >|
> | >| How to eliminate or ignore a character outside the 0-127 range?
> | >
> | >Ah, I had defined the filePro field as ALLUP. If I change that to *,
> | >then I can import fine with no edit, but that doesn't satisfy the need
> | >to delete all but alphanumeric characters.
> |
> | I don't know about Excel offhand, but using "Save As" with the
> | OpenOffice.org/LibreOffice spreadsheets allows one to specify the
> | character set which defaults to UTF-8. I change that to the
> | appropriate Latin/USASCII setting as the first step.
>
> I was about to use a filter script to ditch the non-alphanumeric
> characters when I got your reply. I use OpenOffice too, and changing
> the character set now emits '?' in place of high-ASCII characters.
>
> Thanks.
>
In a few similar situations, where no edit does what I want and I'd
rather have the logic written in readable code than in bizarcane edit
syntax, I wrote a little gosub that loops through every character in a
variable and tests whether it falls inside a given range, or inside one
of three ranges: digits, lowercase, uppercase. It's brute force for the
computer, but it's simple code and simple to use. It takes advantage of
the fact that the alphanumeric chars all sit in a few consecutive
ranges. Usually I just put the modified result right back in the same
variable, so the usage is just:
n=inp["1"] ;gosub clean ;1=n
I do try to use such tight loops as little as possible, of course.
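
For anyone who wants to see the shape of that loop, here is a minimal
sketch in Python rather than filePro, since the original gosub isn't
shown here. The function name clean() mirrors the label above; the
sample value and the exact range bounds (ASCII 48-57, 65-90, 97-122) are
my assumptions for illustration, not the actual routine.

```python
def clean(value: str) -> str:
    """Keep only ASCII digits and letters; drop everything else.

    Illustrative only: a Python stand-in for the filePro gosub
    described above, not the original code.
    """
    kept = []
    for ch in value:
        code = ord(ch)
        if (48 <= code <= 57          # '0'-'9'
                or 65 <= code <= 90   # 'A'-'Z'
                or 97 <= code <= 122):  # 'a'-'z'
            kept.append(ch)
    return "".join(kept)


# Hypothetical input with punctuation, spaces, and a degree symbol.
n = "(555) 123-4567, 72\u00b0F"
n = clean(n)   # result goes right back into the same variable
print(n)       # prints 555123456772F
```

Because the test is a plain range check on the character code, anything
above 127 (such as the degree symbol) is dropped along with the
punctuation, with no special case needed.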
--
bkw