Import in the face of non-printable chars.

Bill Campbell bill at celestial.com
Fri Jan 11 16:25:20 PST 2013


On Fri, Jan 11, 2013, Jean-Pierre A. Radley wrote:
>Jean-Pierre A. Radley propounded (on Fri, Jan 11, 2013 at 06:38:27PM -0500):
>| While importing from a .csv file which was generated from an
>| Excel spreadsheet, there is a field from which I want to retain
>| only alphanumeric characters, discarding punctuation like dashes,
>| parentheses, commas, even spaces.  I have an edit which does this
>| handily.
>| 
>| But I came a cropper today when some input fields contained a high-ascii
>| character (happend to represent the degree symbol), and my edit gave me
>| a blank result.  Even when I bypassed the edit and tried to import the
>| .csv field as-is, the filePro field came up blank.
>| 
>| How to eliminate or ignore a character outside the 0-128 range?
>
>Ah, I had defined the filePro field as ALLUP.  If I change that to *,
>then I can import fine with no edit, but that doesn't satisfy the need
>to delete all but alphanumeric characters. 

I don't know about Excel offhand, but using "Save As" with the
OpenOffice.org/LibreOffice spreadsheets allows one to specify the
character set which defaults to UTF-8.  I change that to the
appropriate Latin/USASCII setting as the first step.

A simple python/perl/sed script should then be able to replace or
delete non alphanumeric characters.  I think the python/perl
expression would be to replace all special characters with a '.'
would be "s/\W/./g"

Bill
-- 
INTERNET:   bill at celestial.com  Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
Voice:          (206) 236-1676  Mercer Island, WA 98040-0820
Fax:            (206) 232-9186  Skype: jwccsllc (206) 855-5792

If the personal freedoms guaranteed by the Constitution inhibit the
government's ability to govern the people, we should look to limit those
guarantees.  -- President Bill Clinton, August 12, 1993


More information about the Filepro-list mailing list