Best way to import XML Files

John Hemmer hemmerjohn at hotmail.com
Tue Jun 15 09:41:20 PDT 2004


----- Original Message -----
From: "George Simon" <flowersoft at compuserve.com>
To: "John Esak" <john at valar.com>
Cc: "filePro mailing list" <filepro-list at seaslug.org>
Sent: Tuesday, June 15, 2004 8:51 AM
Subject: Re: Best way to import XML Files


> Hi John, this is a Windows application so the copy *.xml big.xml will merge
> all the xml files into one big one (big.xml).
> My concern was how to parse this large file when there are no record
> delimiters (each file would be one record).
> Field delimiters I'm not too concerned about because I can search for the
> XML tags.
> I could try to import the entire file into one record and then search for
> the opening and ending tags.
> Maybe Nancy has some suggestions since she does this all the time.
>
>
> ----- Original Message -----
> From: "John Esak" <john at valar.com>
> To: "George Simon" <flowersoft at compuserve.com>
> Cc: "filePro mailing list" <filepro-list at seaslug.org>
> Sent: Tuesday, June 15, 2004 12:06 AM
> Subject: RE: Best way to import XML Files
>
>
> >
> > George,
> > the command you have below will not work the way you think... you would use
> > such a command to copy all the *.xml files to a directory called "big.xml".
> > What you are probably trying to do is con"cat"enate all the files into one
> > big file... I would do something like:
> >
> > for name in *.xml
> > do
> > [ "$name" = big.xml ] || cat "$name" >>big.xml
> > done
> >
> > (You might want to make sure big.xml is null before each run of this script.
> > Or put that into the head of the script. It's all short enough to do from
> > the command line, though.)
> >
> > John
> >
> > Visit The FP Room www.tinyurl.com/yuag7 24/7
> >
> >
> >   -----Original Message-----
> >   From: filepro-list-bounces at lists.celestial.com
> > [mailto:filepro-list-bounces at lists.celestial.com]On Behalf Of George Simon
> >   Sent: Monday, June 14, 2004 11:27 PM
> >   To: Filepro 2 List
> >   Subject: Best way to import XML Files
> >
> >
> >   I have to import and process several XML files at a time.
> >   What I'm planning to do now, since I don't know the names of the files, is:
> >   copy *.xml big.xml
> >   and parse the big.xml file.
> >   Since there are no single-character field and record delimiters, what is
> >   the best way to import this big.xml file?
> >
> >   Thanks
>

George,

Here is the processing code I use to input large XML files of
unknown size ...

  1  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
disxml . If: ' INPUT BUFFER

       Then:   U(30000)="";
 18  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If: '    DEFINE INPUT FILE NAME

       Then:      rf="/home/filepro/xml/dst.xml"   ' INPUT FILE NAME

 19  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then:      rh(8,.0,g)=OPEN(rf,"rt")   ' OPEN INPUT FILE

 20  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:      rh le "0"                  ' NO INPUT FILE - THEN EXIT

       Then:      show "@no xml file for"<rf<"aborting";  exit

 21  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then:      rs(9,.0,g)=FILESIZE(rh)    ' GET SIZE OF INPUT FILE

 22  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then:     ' INITIALIZE READ FILE VARIABLES

 23  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:     ' CURRENT READ BYTE POINTER

       Then:       rp(9,.0,g)="0"
 24  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:     ' REMAINING READ BYTES

       Then:       rr(9,.0,g)=rs

 25  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:     ' READ BUFFER SIZE

       Then:       rb(9,.0,g)=MIN("30000",rr)

 26  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:                     ' INPUT FIRST BUFFER OF DATA

       Then:       GOSUB READBUF

Then every time you need to get a byte from the buffer, do the following test:

      . If: iu GT lu        ' READ BUFFER EMPTY

       Then: GOSUB READBUF   '      YES - GET NEXT BUFFER FULL

where iu points to the next byte and lu is the number of bytes in the buffer.
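
For anyone reading along who doesn't use filePro, the refill test can be
sketched in Python. This is only an illustration of the same idea, not a
translation of the processing table: the class name ChunkedByteReader and
its methods are made up for the example, while iu, lu and the 30000-byte
buffer size are borrowed from the listing. Note that Python indexes from 0,
so the test becomes iu >= lu rather than iu GT lu.

```python
import io


class ChunkedByteReader:
    """Hands out one byte at a time, refilling a fixed-size buffer as needed."""

    def __init__(self, stream, bufsize=30000):
        self.stream = stream      # binary file-like object (plays the role of rh)
        self.bufsize = bufsize    # maximum bytes per read (rb in the listing)
        self.buf = b""            # current buffer contents (U in the listing)
        self.iu = 0               # index of the next unread byte (0-based here)
        self.lu = 0               # number of bytes currently in the buffer

    def _readbuf(self):
        """Refill the buffer; return False once the input is exhausted."""
        self.buf = self.stream.read(self.bufsize)
        self.iu = 0
        self.lu = len(self.buf)
        return self.lu > 0

    def next_byte(self):
        """Return the next byte as an int, or None at end of input."""
        # Same test as in the listing: buffer used up, so fetch the next one.
        if self.iu >= self.lu and not self._readbuf():
            return None
        b = self.buf[self.iu]
        self.iu += 1
        return b


if __name__ == "__main__":
    # Tiny buffer so the refill path is exercised several times.
    reader = ChunkedByteReader(io.BytesIO(b"<a>hi</a>"), bufsize=4)
    out = bytearray()
    while (b := reader.next_byte()) is not None:
        out.append(b)
    print(out.decode())
```

The caller never has to know where one buffer ends and the next begins; it
just keeps asking for the next byte until it gets None.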


179  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:            .
       Then: '            READ A BUFFER OF INPUT DATA

180  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
READBUF. If: rr le "0"       ' END OF INPUT FILE REACHED

       Then: GOTO CLOSEND        ' YES, CLOSE ALL FILES AND EXIT(END) PROCESS
181  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then: rb=MIN(rb,rr)

182  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then: aa=seek(rh,rp)        ' ...... read seek instruction

183  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If: '       READ INPUT BUFFER AND ADVANCE POINTERS

       Then: lu(8,.0,g) = READ(rh,U,rb); rp=rp+rb; rr=rr-rb; iu(8,.0,g)="1"
       . If: lu GT "0"

       Then: return

187  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then: show "@NO MORE READ BYTES - EXIT - THIS SHOULD NOT OCCUR"

188  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then: er(8,.0)=close(rh)

189  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then: EXIT

190  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
CLOSEND. If:

       Then: SHOW "@END OF INPUT FILE REACHED disRec#"<SN<"- ENDING PROCESS"
191  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then: CLOSE

192  -------   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
       . If:

       Then: END

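
On the original question of finding record boundaries in one big merged
file: a rough Python sketch of reading in fixed-size chunks and cutting a
record each time the closing tag of the root element turns up. The function
name iter_xml_records and the </order> tag are hypothetical; the thread never
says what the root element is called. It also assumes the root element is
never nested inside itself, and it does no real XML parsing.

```python
import io


def iter_xml_records(stream, end_tag, bufsize=30000):
    """Yield each record (bytes up to and including end_tag) from a stream."""
    pending = b""                          # bytes read but not yet emitted
    while True:
        chunk = stream.read(bufsize)
        if not chunk:
            break                          # end of input
        pending += chunk
        while True:
            pos = pending.find(end_tag)
            if pos < 0:
                break                      # no complete closing tag yet; read more
            cut = pos + len(end_tag)
            yield pending[:cut].strip()    # one whole record, whitespace trimmed
            pending = pending[cut:]
    if pending.strip():
        yield pending.strip()              # leftover data with no closing tag


if __name__ == "__main__":
    # Two hypothetical documents merged into one stream.
    data = b"<order><id>1</id></order>\r\n<order><id>2</id></order>\r\n"
    for rec in iter_xml_records(io.BytesIO(data), b"</order>", bufsize=7):
        print(rec.decode())
```

Because unread bytes are carried over in pending, a closing tag that straddles
two buffers is still found once the second buffer arrives.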

Hope this Helps.

John


