IMPORT/CLOSE issues?

Kenneth Brody kenbrody at bestweb.net
Tue Apr 11 08:50:17 PDT 2006


Quoting Fairlight (Mon, 10 Apr 2006 22:34:12 -0400):

> Four score and seven years--eh, screw that!
> At about Mon, Apr 10, 2006 at 09:28:49PM -0400,
> Kenneth Brody blabbed on about:
> > Quoting Fairlight (Mon, 10 Apr 2006 20:58:32 -0400):
> >
> > > Found a weird problem with a client's code today.  They're going
> > > through multiple records with selection sets and importing data
> > > into records in another table.  The filename is in common here,
> > > but it's kind of irrelevant.
> > [...]
> > > getline::import ascii ref=(FILENAME) r=\n f=\n:
> > > :not ref:return:
> > > ::aa(400,*)=ref(1):
> > > ::return:
> >
> > You do realize that once the import hits EOF, "aa" will continue to
> > hold the value of the last line imported?  Also, your main loop did
> > not check for EOF.  (At least not in the code listed.)  Could this
> > be the cause of your infinite loop?
>
> I realised it when presented with the facts, and confirmed my suspicions
> when I saw the code as it stood and fixed the results.  I figured it was
> hanging onto the last line or returning a blank one.

Well, it's not the IMPORT that's doing it... it's the "return" on EOF
that skips the assignment, leaving field aa unchanged.

[...]
> > Once open, IMPORT/EXPORT will continue using the filename specified
> > at the time it was opened.  In order to force a new filename, you
> > need to close it.
>
> I figured that out.  Again, not intuitive.  That's like saying that a
> LOOKUP should just keep pointing to the last record unless closed.  Now
> I'm fairly sure that's not the case, given the plethora of people who
> say CLOSE isn't even needed anywhere, and it all happens automagically.
>
> Both IMPORT and LOOKUP are "kind of" made to act similarly.  This is one
> major area in which they differ, unless I'm wrong about LOOKUP.

No, you're not wrong.  LOOKUP and IMPORT/EXPORT do differ in this respect.

On the other hand, each LOOKUP is a distinct operation, whereas IMPORT
and EXPORT are not.  Each subsequent IMPORT/EXPORT depends on the
actions of the previous iteration, just as a GETNEXT does.

[...]
> > > Prior to seeing this code, I believed you had to step
> > > through a file with different subscripts to get each successive
> > > line.
> >
> > Subscripts are for each field within the record, not for different
> > records within the file.
>
> Not a big fan of import.  I used it a couple times from '93-'95.  Not
> much. I never really (despite the definition of the separators) actually
> consider a flatfile as "fields" and "records".  Obviously it's being
> used to correlate a file's contents into such, but I'm not really used
> to thinking about file contents that way.  One reason I never really
> used it more than about 5-10 times, if that.

Well, you can read the entire "record" as one "field", as you have done,
and parse it manually, but many files really are logically multiple
"fields" per "record".  (Consider parsing CSV files, for example.)

Think of "f=" being similar to $IFS, and the subscripts being similar to
$1, $2, $3, and so on.
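
To make the $IFS analogy concrete, here is a rough Python sketch of the
record/field idea.  It is not filePro code and not how filePro is
implemented; split_records() and field() are names made up for this
illustration.  The "r" separator cuts the text into records, the "f"
separator cuts each record into fields, and fields are numbered from 1
like the import subscripts.

```python
def split_records(text, r="\n", f=","):
    """Split text into records on r, then each record into fields on f.

    Hypothetical helper for illustration only; it models the idea behind
    the r= and f= separators, not filePro's actual parser.
    """
    # Drop one trailing record separator so it does not yield an empty record.
    if text.endswith(r):
        text = text[: -len(r)]
    if not text:
        return []
    return [record.split(f) for record in text.split(r)]


def field(record, n):
    """Return field n of a record, counting from 1 like ref(1), ref(2), ..."""
    return record[n - 1]


csv_like = split_records("a,b,c\nd,e,f\n", r="\n", f=",")
one_field_per_line = split_records("a,b,c\nd,e,f\n", r="\n", f="\n")
```

With f="," each line yields three fields.  With f="\n" and r="\n", as in
the code quoted earlier, each record can only ever contain one field, so
field 1 is the whole line.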

> > It's not "line 1", it's "field 1".  Each time you execute the import,
> > the next record (in this case, a line in a text file) is read, giving
> > you a new "field 1" (and "field 2", and "field 3", and so on).
>
> Ahhhh.  Well, I'd have expected it to never reach past field 1, then.

Well, given that you have "f=\n" as well as "r=\n", no record can have
more than one field.  (Which, in your case, is what you want, as you
said that you wanted to parse it manually.)

[...]
> > > My own $0.02:  This is totally unconventional behaviour as regards
> > > file I/O in pretty much any language,
> >
> > Huh?  What other method of sequential I/O exists in these other
> > languages of which you speak?
>
> C for one, strangely enough.
[... Snip C example with open() and read() ...]
>
> This would actually give you different data in the buffer on the second
> run-through.  When you invoke IMPORT with a different file source, I
> would heartily and fully expect that it would at least pick up on using
> a different data source and reassign the internal file descriptor and
> any relevant pointers.

Well, you have to remember that you don't explicitly open an import/export
file.  Rather, there is an implicit open when you execute it and the file
isn't already open.  Think of IMPORT as fgets(), and not fopen().  Each
time you execute the fgets(), you will get the next line in the file.
Only if the file isn't already open will fopen() be used.
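
Here is a small Python model of that behaviour, purely as a sketch.  The
class name ImplicitOpenReader, its methods, and the temporary files are
all invented for the example; the only thing taken from the discussion
is the rule "open implicitly on the first read, then keep using that
file until it is closed".

```python
import os
import tempfile


class ImplicitOpenReader:
    """Hypothetical model: the open happens only if nothing is open yet."""

    def __init__(self):
        self._fh = None

    def read_line(self, filename):
        # Implicit open: the filename is only consulted when no file is open.
        if self._fh is None:
            self._fh = open(filename)
        line = self._fh.readline()
        # readline() returns "" at EOF; report that as None.
        return line.rstrip("\n") if line else None

    def close(self):
        if self._fh is not None:
            self._fh.close()
            self._fh = None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        foo = os.path.join(tmp, "foo")
        bar = os.path.join(tmp, "bar")
        with open(foo, "w") as fh:
            fh.write("foo1\nfoo2\n")
        with open(bar, "w") as fh:
            fh.write("bar1\nbar2\n")

        reader = ImplicitOpenReader()
        seen = [reader.read_line(foo)]
        # Different filename, but the first file is still open, so it is ignored.
        seen.append(reader.read_line(bar))
        # Only after an explicit close does the new filename take effect.
        reader.close()
        seen.append(reader.read_line(bar))
        reader.close()
        return seen
```

Calling demo() returns ["foo1", "foo2", "bar1"]: the second read still
comes from the first file, and the new name is only picked up after the
close.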

> > > gives you no sane way to detect EOF
> >
> > You use "NOT importname", exactly as you have done.
>
> Wasn't me.  I said it wasn't my code.  I think, upon reflection, that
> the problem with the "not importname" in this case (just from memory)
> was that it did a simple RETURN, which fed back into the same loop

Correct.  If EOF was hit, the subroutine left field aa untouched, and as
the main code didn't check for EOF, it would simply "see" the same line
being returned over and over.  Imagine the same code with a GETNEXT rather
than IMPORT, and you'll see the problem.
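
The same failure can be reproduced outside filePro.  The Python sketch
below is only an analogy: make_getline() and buggy_main_loop() are
made-up names, the "aa" entry stands in for field aa, and the iteration
cap is there solely so the demonstration terminates where the real loop
would not.

```python
def make_getline(lines):
    """Build a 'getline' subroutine that mimics the one in the quoted code."""
    source = iter(lines)
    state = {"aa": ""}

    def getline():
        try:
            line = next(source)
        except StopIteration:
            # EOF: return early, like ":not ref:return:", leaving aa untouched.
            return
        state["aa"] = line

    return state, getline


def buggy_main_loop(lines, max_iterations=6):
    """Main loop that never checks for EOF (capped so the demo ends)."""
    state, getline = make_getline(lines)
    seen = []
    for _ in range(max_iterations):
        getline()
        seen.append(state["aa"])
    return seen


result = buggy_main_loop(["one", "two"])
```

Here result is ["one", "two", "two", "two", "two", "two"]: once the input
runs out, the caller keeps seeing the last line.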

[...]
> > > unless there's something in the file you know will be the last item
> > > you need to look for (there thankfully is in this case), and is
> > > entirely contrary to what one would expect from the code as written.
> >
> > Given that your code uses "not ref" to detect EOF, I don't understand
> > your problem here.
>
> Like I said, not me.

Well, now that I know it's someone else's code:

    s/your code/the code/

:-)

> And the problem is that it RETURNed...except the RETURN feeds straight
> back into the same loop that did the GOSUB in the first place--that tries to
> read the next field again, thus creating an infinite loop no matter
> whether EOF was hit or not.

Yes, the main loop needs to check for EOF as well, or the subroutine
needs to set some flag (perhaps aa="*EOF*") that the main loop will
check for instead.  (Imagine a C function which would read() a piece
of the file, but that function is called in a loop which never checks
for EOF.)
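
As a sketch of the flag approach, again in Python rather than filePro:
the subroutine stores a sentinel on EOF and the main loop tests for it.
The names make_getline_with_flag(), main_loop() and EOF_MARKER are
invented for this example; "*EOF*" is just the value suggested above.

```python
EOF_MARKER = "*EOF*"


def make_getline_with_flag(lines):
    """Build a 'getline' subroutine that flags EOF instead of staying silent."""
    source = iter(lines)
    state = {"aa": ""}

    def getline():
        try:
            state["aa"] = next(source)
        except StopIteration:
            # EOF: overwrite aa with the sentinel so the caller can tell.
            state["aa"] = EOF_MARKER

    return state, getline


def main_loop(lines):
    """Main loop that stops as soon as the subroutine reports EOF."""
    state, getline = make_getline_with_flag(lines)
    seen = []
    while True:
        getline()
        if state["aa"] == EOF_MARKER:
            break
        seen.append(state["aa"])
    return seen
```

One caveat with any in-band sentinel: a data line that happens to equal
the marker would end the loop early, so a separate flag is safer if the
input can contain arbitrary text.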

[...]
> > Other than the "you need to close it in order to switch to a different
> > filename", what part is "unanticipated"?
>
> That's not enough?

Well, if you didn't know that part, I suppose it's "enough".  :-)

But, now that you know, you know.

> > > The fix is logical given the behaviour exhibited--but the
> > > behaviour exhibited is something I'd never have thought would work.
> >
> > Why not?
>
> See the C example above.  Even without close(), C manages to do the
> right thing (frankly, I'm surprised--I'd have definitely expected perl
> to do it, but I decided to test with straight C and it works without
> incident, so I can really claim that it's "not just a perl thing").
> Extrapolate to pretty much most languages developed from C.  I'm at a
> loss to think of one that won't redirect the fd to the right file and
> start from SEEK_SET.

Well, like I said, you're thinking of IMPORT as open(), rather than as
read()/fgets()/etc.

> Bottom line:
>
> You call something that does an implicit open(), you expect it to do so
> on subsequent iterations and pick up the new file location.  That's
> intuitive. Ignoring the new location, even without an explicit close(),
> is not intuitive.  IMHO.

Well, just remember that the implicit open() only occurs if the file isn't
already open.

    FILENAME=/tmp/foo
    while read something
    do
        echo "$something"    # stand-in for "stuff"
        FILENAME=/tmp/bar    # no effect: the loop's input is already open
    done <"$FILENAME"

> Could be just me...but that's my take.

Yup, it's just you.  :-)

--
KenBrody at BestWeb dot net        spamtrap: <g8ymh8uf001 at sneakemail.com>
http://www.hvcomputer.com
http://www.fileProPlus.com

