IMPORT/CLOSE issues?

Fairlight fairlite at fairlite.com
Mon Apr 10 19:34:12 PDT 2006


Four score and seven years--eh, screw that!
At about Mon, Apr 10, 2006 at 09:28:49PM -0400,
Kenneth Brody blabbed on about:
> Quoting Fairlight (Mon, 10 Apr 2006 20:58:32 -0400):
> 
> > Found a weird problem with a client's code today.  They're going through
> > multiple records with selection sets and importing data into records in
> > another table.  The filename is in common here, but it's kind of
> > irrelevant.
> [...]
> > getline::import ascii ref=(FILENAME) r=\n f=\n:
> > :not ref:return:
> > ::aa(400,*)=ref(1):
> > ::return:
> 
> You do realize that once the import hits EOF, "aa" will continue to
> hold the value of the last line imported?  Also, your main loop did
> not check for EOF.  (At least not in the code listed.)  Could this
> be the cause of your infinite loop?

I realised it when presented with the facts, and confirmed my suspicions
when I saw the code as it stood and fixed the results.  I figured it was
hanging onto the last line or returning a blank one.


> It does.  Each time you hit the IMPORT line, the next "record" (if
> any) is imported.  Just as each time you hit an EXPORT, a new export
> record is started.

If that is true, why does a change in FILENAME (the variable pointing
to the data source) not switch you to "record" 1 of the new file?  I'm
answering inline here and have only just read your paragraph below, so I
now know the answer, which was what I suspected (or I wouldn't have
suggested using it, thus solving the issue).

However, it's NOT intuitive.

> Once open, IMPORT/EXPORT will continue using the filename specified
> at the time it was opened.  In order to force a new filename, you
> need to close it.

I figured that out.  Again, not intuitive.  That's like saying that a
LOOKUP should just keep pointing to the last record unless closed.  Now I'm
fairly sure that's not the case, given the plethora of people who say CLOSE
isn't even needed anywhere, and it all happens automagically.

Both IMPORT and LOOKUP are "kind of" made to act similarly.  This is one
major area in which they differ, unless I'm wrong about LOOKUP.
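
For anyone more at home in C, here is a rough model of the behaviour Ken
describes: the filename is only looked at when the import is not already
open, and each call fetches the next "record".  This is a sketch of the
described semantics only, not filePro's implementation; the names
import_next and import_close are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of an import handle; NOT how filePro does it. */
struct import {
    FILE *fp;   /* NULL until the first read opens the file */
};

/* Read the next line ("record") into buf.  The filename is consulted only
 * when the handle is not yet open, mimicking the behaviour described above.
 * Returns 1 on success, 0 at EOF or if the file cannot be opened. */
static int import_next(struct import *imp, const char *filename,
                       char *buf, size_t buflen)
{
    assert(imp != NULL && buf != NULL && buflen > 0);
    if (imp->fp == NULL) {
        imp->fp = fopen(filename, "r");
        if (imp->fp == NULL)
            return 0;
    }
    if (fgets(buf, (int)buflen, imp->fp) == NULL)
        return 0;                    /* EOF: buf keeps the previous record */
    buf[strcspn(buf, "\n")] = '\0';  /* strip the record separator */
    return 1;
}

/* Explicit close: the next import_next() call honours a new filename. */
static void import_close(struct import *imp)
{
    assert(imp != NULL);
    if (imp->fp != NULL) {
        fclose(imp->fp);
        imp->fp = NULL;
    }
}
```

In this model, calling import_next() with a different filename while the
handle is still open keeps reading the old file; only import_close()
followed by another import_next() switches sources.  At EOF the buffer
still holds the last record read, which matches the "aa" behaviour Ken
pointed out.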

> By "behavior", I assume you mean my statement above about needing
> to close the import/export in order to change filenames?  Yes, this
> has always been the behavior, since variable filenames were first
> allowed.
> 
> If you meant something else, please elaborate.

That's what I meant.

> > Prior to seeing this code, I believed you had to step
> > through a file with different subscripts to get each successive line.
> 
> Subscripts are for each field within the record, not for different
> records within the file.

Not a big fan of import.  I used it a couple of times from '93-'95.  Not
much.  Despite the definition of the separators, I've never really
considered a flatfile as "fields" and "records".  Obviously it's being used
to correlate a file's contents into such, but I'm not really used to
thinking about file contents that way.  That's one reason I never used it
more than about 5-10 times, if that.

> It's not "line 1", it's "field 1".  Each time you execute the import,
> the next record (in this case, a line in a text file) is read, giving
> you a new "field 1" (and "field 2", and "field 3", and so on).

Ahhhh.  Well, I'd have expected it to never reach past field 1, then.
Either way, it looked like it should be broken unless IMPORT was just
ignored.

> > Intentional or bug?
> 
> Intentional.  How else could it work?

Mind you, it's been ages since I used IMPORT myself, but I thought it
worked more like LOOKUP; you invoke it once to open the file, then get the
contents.  Now that I'm seeing the field/record description again, I'm
seeing your point, albeit slightly hazily.  Just not used to thinking about
it in these terms.  Like I said, had I written it from scratch, it would
have been entirely open/readline/close.  That's just what I'm comfortable
with.

> > My own $0.02:  This is totally unconventional behaviour as regards file
> > I/O in pretty much any language,
> 
> Huh?  What other method of sequential I/O exists in these other
> languages of which you speak?

C for one, strangely enough.


#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>


int main(void) {
        char buffer[21];        /* 20 bytes of data plus the terminating NUL */
        ssize_t got;
        int ffd;

        ffd=open("file1",O_RDONLY);
        if (ffd < 0) { perror("file1"); exit(1); }
        got=read(ffd,buffer,20);
        buffer[got > 0 ? got : 0] = '\0';       /* terminate after the bytes read */
        printf("%s\n",buffer);
        /* NOTE THE ---LACK--- OF A close() CALL */
        ffd=open("file2",O_RDONLY);
        if (ffd < 0) { perror("file2"); exit(1); }
        got=read(ffd,buffer,20);
        buffer[got > 0 ? got : 0] = '\0';
        printf("%s\n",buffer);
        exit(0);
}


This would actually give you different data in the buffer on the second
run-through.  When you invoke IMPORT with a different file source, I
would heartily and fully expect that it would at least pick up on using a
different data source and reassign the internal file descriptor and any
relevant pointers.
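
One caveat on why the program above picks up the second file: each open()
call returns a new descriptor, and assigning it to ffd just discards the
old value.  The first descriptor is not redirected; it stays open until
the process exits.  A minimal sketch that checks this, where the helper
name first_fd_still_open is made up for illustration:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open two files back to back, reusing one variable as the program above
 * does, and report whether the first descriptor is still valid.
 * Returns 1 if it is, 0 if not, -1 if either open() fails. */
static int first_fd_still_open(const char *path1, const char *path2)
{
    int ffd, first, still_open;

    assert(path1 != NULL && path2 != NULL);
    ffd = open(path1, O_RDONLY);
    if (ffd < 0)
        return -1;
    first = ffd;                    /* remember the value about to be lost */

    ffd = open(path2, O_RDONLY);    /* no close() in between */
    if (ffd < 0) {
        close(first);
        return -1;
    }

    /* fcntl(F_GETFD) fails with -1 on a descriptor that is not open */
    still_open = (ffd != first) && (fcntl(first, F_GETFD) != -1);

    close(first);
    close(ffd);
    return still_open;
}
```

So the C version works because every open() starts fresh on whatever path
it is given, at the cost of leaking the earlier descriptor when close() is
skipped.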

> > gives you no sane way to detect EOF
> 
> You use "NOT importname", exactly as you have done.

Wasn't me.  I said it wasn't my code.  I think, upon reflection, that
the problem with the "not importname" in this case (just from memory) was
that it did a simple RETURN, which fed back into the same loop instead of
going on to save the record (if appropriate) and end.  I think the original
code for that actually should be modified, and I'll take it up with the
developer and suggest it, although given the data definition constraints it
is likely not an active issue for worry in this particular case.

> > unless there's something in the file you know will be the last item you
> > need to look for (there thankfully is in this case), and is entirely
> > contrary to what one would expect from the code as written.
> 
> Given that your code uses "not ref" to detect EOF, I don't understand
> your problem here.

Like I said, not me.  And the problem is that it RETURNed...except the
RETURN feeds straight back into the same loop that did the GOSUB in the
first place, and that loop tries to read the next field again, thus
creating an infinite loop whether or not EOF was hit.  Second unaddressed
fault in the original code, and I'll bring it to the developer's attention
now that I recognise it.
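
The shape of the fix, sketched in C rather than filePro since I haven't
got the client's code in front of me: the read routine has to report EOF,
and the caller has to leave the loop when it does instead of calling the
routine again.  get_line and process_file are hypothetical names standing
in for the "getline" subroutine and the loop that calls it.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the "getline" subroutine: returns 1 if a line was read
 * into buf, 0 at EOF.  The return value is the signal the caller needs. */
static int get_line(FILE *fp, char *buf, size_t buflen)
{
    assert(fp != NULL && buf != NULL && buflen > 0);
    return fgets(buf, (int)buflen, fp) != NULL;
}

/* Caller loop: exits on EOF rather than asking for another line.
 * Returns the number of lines read, or -1 if the file cannot be opened.
 * (Lines longer than the buffer are counted in pieces in this sketch.) */
static long process_file(const char *filename)
{
    char buf[512];
    long count = 0;
    FILE *fp = fopen(filename, "r");

    if (fp == NULL)
        return -1;
    while (get_line(fp, buf, sizeof buf))
        count++;        /* real code would store buf into a record here */
    fclose(fp);         /* closing lets the next call use a new filename */
    return count;
}
```

The point is only that the EOF test and the loop exit live in the same
place; a bare return from the read routine that the caller ignores gives
you the endless loop described above.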

> Other than the "you need to close it in order to switch to a different
> filename", what part is "unanticipated"?

That's not enough?

> > The fix is logical given the behaviour exhibited--but the
> > behaviour exhibited is something I'd never have thought would work.
> 
> Why not?

See the C example above.  Even without close(), C manages to do the right
thing (frankly, I'm surprised--I'd have definitely expected perl to do it,
but I decided to test with straight C and it works without incident, so I
can really claim that it's "not just a perl thing").  Extrapolate to most
languages developed from C.  I'm at a loss to think of one that won't
point the fd at the right file and start from SEEK_SET.

Bottom line:

You call something that does an implicit open(), you expect it to do so on
subsequent iterations and pick up the new file location.  That's intuitive.
Ignoring the new location, even without an explicit close(), is not
intuitive.  IMHO.

Could be just me...but that's my take.

mark->

