Import in the face of non-printable chars.
Bill Campbell
bill at celestial.com
Fri Jan 11 16:25:20 PST 2013
On Fri, Jan 11, 2013, Jean-Pierre A. Radley wrote:
>Jean-Pierre A. Radley propounded (on Fri, Jan 11, 2013 at 06:38:27PM -0500):
>| While importing from a .csv file which was generated from an
>| Excel spreadsheet, there is a field from which I want to retain
>| only alphanumeric characters, discarding punctuation like dashes,
>| parentheses, commas, even spaces. I have an edit which does this
>| handily.
>|
>| But I came a cropper today when some input fields contained a high-ascii
>| character (happend to represent the degree symbol), and my edit gave me
>| a blank result. Even when I bypassed the edit and tried to import the
>| .csv field as-is, the filePro field came up blank.
>|
>| How to eliminate or ignore a character outside the 0-128 range?
>
>Ah, I had defined the filePro field as ALLUP. If I change that to *,
>then I can import fine with no edit, but that doesn't satisfy the need
>to delete all but alphanumeric characters.
I don't know about Excel offhand, but using "Save As" with the
OpenOffice.org/LibreOffice spreadsheets allows one to specify the
character set which defaults to UTF-8. I change that to the
appropriate Latin/USASCII setting as the first step.
A simple python/perl/sed script should then be able to replace or
delete non alphanumeric characters. I think the python/perl
expression would be to replace all special characters with a '.'
would be "s/\W/./g"
Bill
--
INTERNET: bill at celestial.com Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/ PO Box 820; 6641 E. Mercer Way
Voice: (206) 236-1676 Mercer Island, WA 98040-0820
Fax: (206) 232-9186 Skype: jwccsllc (206) 855-5792
If the personal freedoms guaranteed by the Constitution inhibit the
government's ability to govern the people, we should look to limit those
guarantees. -- President Bill Clinton, August 12, 1993
More information about the Filepro-list
mailing list