Faster way then append from SDF

Faster way then append from SDF

Postby Marc Vanzegbroeck » Thu Feb 14, 2019 8:02 pm

Hi,

In an application, I read a text-file with more then 1300000 lines, in a database with one field.
After reading it, I go thrue the file, and depending of the information on each line, I extract information then I need into another table.
It's working fine, but the process take a while.

I found out, that only read the file with 'APPE FROM .... SDF take more than 30sec.
I the a faster way to read the information, and check the content of each line?

Thanks
Regards,
Marc

FWH32+xHarbour | FWH64+Harbour | BCC | DBF | ADO+MySQL | ADO+MariaDB | ADO+SQLite
Marc Vanzegbroeck
 
Posts: 1157
Joined: Mon Oct 17, 2005 5:41 am
Location: Belgium

Re: Faster way then append from SDF

Postby hmpaquito » Fri Feb 15, 2019 8:07 am

Hi,

Don´t write, only read.

Use FileEval() function from https://github.com/ibarrar/clipper/blob/master/CLIPPER5/SOURCE/SAMPLE/FILEIO.PRG

Regards
hmpaquito
 
Posts: 1482
Joined: Thu Oct 30, 2008 2:37 pm

Re: Faster way then append from SDF

Postby Marc Vanzegbroeck » Fri Feb 15, 2019 11:14 am

Thank you, I wiil try it
Regards,
Marc

FWH32+xHarbour | FWH64+Harbour | BCC | DBF | ADO+MySQL | ADO+MariaDB | ADO+SQLite
Marc Vanzegbroeck
 
Posts: 1157
Joined: Mon Oct 17, 2005 5:41 am
Location: Belgium

Re: Faster way then append from SDF

Postby Jack » Fri Feb 15, 2019 4:45 pm

Hi,
Did you try this ?

oTxt:=TTxtFile():New("yourtxtfile.txt")
wnbl:=oTxt:recCount() && number of line
if wnbl>0
for wbcl = 1 to wnbl
aadd(vtxt,oTxt:ReadLine())
oTxt:skip()
next
endif
oTxt:end()
*

Try it .
Philippe from Belgium
Jack
 
Posts: 280
Joined: Wed Jul 11, 2007 11:06 am

Re: Faster way then append from SDF

Postby Marc Vanzegbroeck » Fri Feb 15, 2019 5:45 pm

Philippe,

Thank you for the suggestion, but unfortunately, it's slower.
It took 64sec to read it. The 'append from' was 35sec.


Jack wrote:Hi,
Did you try this ?

oTxt:=TTxtFile():New("yourtxtfile.txt")
wnbl:=oTxt:recCount() && number of line
if wnbl>0
for wbcl = 1 to wnbl
aadd(vtxt,oTxt:ReadLine())
oTxt:skip()
next
endif
oTxt:end()
*

Try it .
Philippe from Belgium
Regards,
Marc

FWH32+xHarbour | FWH64+Harbour | BCC | DBF | ADO+MySQL | ADO+MariaDB | ADO+SQLite
Marc Vanzegbroeck
 
Posts: 1157
Joined: Mon Oct 17, 2005 5:41 am
Location: Belgium

Re: Faster way then append from SDF

Postby nageswaragunupudi » Fri Feb 15, 2019 6:22 pm

Will you please try this also?
Code: Select all  Expand view
#include "fivewin.ch"

#define BUFSIZE   64000
#define EOLCHR    CRLF
#define EOLLEN    2

static cBuf    := ""
static nStart  := 1
static lEof    := .f.

//----------------------------------------------------------------------------//

function Main()

   local cFile, hFile, cLine
   local alines   := {}  // for testing
   local nLine    := 0   // for testing
   local nSecs

   cFile    := "c:\fwh\source\classes\xbrowse.prg"  // your file here
   hFile    := FOpen( cFile, 64 )

   ? "Start"

   nSecs    := SECONDS()
   do while ( cLine := ReadLine( hFile ) ) != nil
      //
      // do your work here with cLine
      //
      nLine++                            // for testing
      AAdd( aLines, { nLine, cLine } )   // for testing
   enddo
   FClose( hFile )

   ? "Seconds:", SECONDS() - nSecs

   XBROWSER aLines SHOW RECID // Testing
   ? "Done"

return nil

//----------------------------------------------------------------------------//

static function ReadLine( hFile )

   local cLine, nRead, nEolAt

#ifdef __XHARBOUR__
   nEolAt      :=    At( EOLCHR, cBuf, nStart )
#else
   nEolAt      := HB_At( EOLCHR, cBuf, nStart )
#endif
   if nEolAt == 0
      if !lEof
         cBuf     := SubStr( cBuf, nStart ) + ReadBuf( hFile, @lEof )
         nStart   := 1

#ifdef __XHARBOUR__
         nEolAt   :=    At( EOLCHR, cBuf, nStart )
#else
         nEolAt   := HB_At( EOLCHR, cBuf, nStart )
#endif
      endif
   endif
   if nEolAt > 0
      cLine    := SubStr( cBuf, nStart, nEolAt - nStart )
      nStart   := nEolAt + EOLLEN
   else
      if nStart <= Len( cBuf )
         cLine    := SubStr( cBuf, nStart )
      endif
      cBuf  := ""
   endif

return cLine

//----------------------------------------------------------------------------//

static function ReadBuf( hFile, lEof )

   local cBuf  := SPACE( BUFSIZE )
   local nRead

   nRead    := FRead( hFile, @cBuf, BUFSIZE )
   if nRead < BUFSIZE
      lEof  := .t.
      cBuf  := Left( cBuf, nRead )
   endif

return cBuf

//----------------------------------------------------------------------------//
 
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10295
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Re: Faster way then append from SDF

Postby James Bott » Fri Feb 15, 2019 6:28 pm

Maybe you could create a conditional index containing only the records you need. Then you don't even have to move the records to another file. Time required to get the records you want, zero.

Just thinking outside the box.
FWH 18.05/xHarbour 1.2.3/BCC7/Windows 10
User avatar
James Bott
 
Posts: 4840
Joined: Fri Nov 18, 2005 4:52 pm
Location: San Diego, California, USA

Re: Faster way then append from SDF

Postby Marc Vanzegbroeck » Fri Feb 15, 2019 10:55 pm

Roa,

Your example is very fast :lol: Only 2sec :shock:

I tested the loop without the line
Code: Select all  Expand view
AAdd( aLines, { nLine, cLine } )

With this line it take 138sec :shock: :shock: :shock: .

So the aadd() take a lot of time.
That's no problem, because I can direcly do the content check, and only at the information I need.


James,
The problem was not going through the database, but reading the text-file. Otherwhise the condition index was a good idea.
Regards,
Marc

FWH32+xHarbour | FWH64+Harbour | BCC | DBF | ADO+MySQL | ADO+MariaDB | ADO+SQLite
Marc Vanzegbroeck
 
Posts: 1157
Joined: Mon Oct 17, 2005 5:41 am
Location: Belgium

Re: Faster way then append from SDF

Postby Rick Lipkin » Fri Feb 15, 2019 11:26 pm

Marc

Here is a snipit from a Find and replace file utility I wrote that opens any text file you like .... hope you get the gist on how to use ..

Rick Lipkin

Code: Select all  Expand view

Do While .not. Eof()
   
   cFILE := ALLTRIM( A->FileName )

   If ( nHANDLE := Fopen( cFILE )) > 0
   Else
      SAYING := "ERROR Reading file "+cFILE
      YESNO  := { "Skip", "Abort" }

      nOK := Alert(SAYING,YESNO, )

      DO CASE
      CASE nOK = 1      // skip
           SELECT 1
           DELETE
           SKIP
           oLbx:ReFresh()
           lAborted := .f.
           Exit
      CASE nOK = 2      // abort
           Select 1
           Zap
           oLbx:ReFresh()
           Return(.f.)
      OTHERWISE
           Select 1
           Zap
           oLbx:ReFresh()
           Return(.f.)
      ENDCASE
   ENDIF

   Text_Eof   := FSEEK( nHANDLE, 0, 2 )  // get eof value
   Bytes_Read := 0
   Line       := 0

   Fseek( nHANDLE, 0 )                // rewind to bof()

   For Bytes_read = 0 to Text_eof

       Bytes_begin := FSEEK( nHANDLE, 0, 1)
       cText       := _FreadLine( nHANDLE )        // this is what extracts the text

       cText := upper( cText )
       cOldText    := cText

    *   msginfo( "cText" )
    *   msginfo( cText )


       DO CASE
       CASE cTEXT = ">EOF<"  .and. Bytes_read = 0
            // do nothing and go on BOF //
       CASE cTEXT = ">EOF<"  .and. Bytes_read > 0
          Exit
       ENDCASE

       bytes_read := FSEEK( nHANDLE, 0, 1 )
       LINE++

       cText := cText+"   Line "+str(Line,5)
       oText:ReFresh()
       SysReFresh()

       cText := cOldText

       nFOUND := AT( cTextFind, cTEXT )

       If nFOUND > 0
      ...
      ...
      ...


//-------------------------------------
Static FUNC _FreadLine( nHANDLE, LINE_LEN )

LOCAL MAXLINE := 512, BUFFER, LINE_END, NUM_BYTES

IF VALTYPE( LINE_LEN ) <> "N"
   LINE_LEN := MAXLINE
ENDIF

BUFFER    := SPACE(LINE_LEN)
NUM_BYTES := FREAD( nHANDLE, @BUFFER, LINE_LEN )
LINE_END  := AT( CHR(13)+CHR(10), BUFFER )

IF LINE_END = 0           // no cr or must be eof() or bof()

    FSEEK( nHANDLE, 0 )    // go back to top
    RETURN( ">EOF<" )

ELSE

   FSEEK ( nHANDLE, (NUM_BYTES * -1) + LINE_END+1, 1 ) // move to next line
   RETURN( SUBSTR(BUFFER, 1, LINE_END-1) )

ENDIF

// eof freadline

 
User avatar
Rick Lipkin
 
Posts: 2633
Joined: Fri Oct 07, 2005 1:50 pm
Location: Columbia, South Carolina USA

Re: Faster way then append from SDF

Postby James Bott » Sat Feb 16, 2019 1:33 am

Marc,

The problem was not going through the database, but reading the text-file. Otherwhise the condition index was a good idea.

Well, I was thinking that you could eliminate the export/import and just use the original file with the index set.

I don't know exactly what you doing with the new file you are creating, but if you are just processing it, then you can just use the original file indexed.

oDBF:setIndex( cWhatever )

Less is more...

James
FWH 18.05/xHarbour 1.2.3/BCC7/Windows 10
User avatar
James Bott
 
Posts: 4840
Joined: Fri Nov 18, 2005 4:52 pm
Location: San Diego, California, USA

Re: Faster way then append from SDF

Postby nageswaragunupudi » Sat Feb 16, 2019 2:52 am

I don't know exactly what you doing with the new file you are creating, but if you are just processing it, then you can just use the original file indexed.


The original file is a text file. Not a dbf file.
Do you think we can create an index on the text file?
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10295
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Re: Faster way then append from SDF

Postby nageswaragunupudi » Sat Feb 16, 2019 4:05 am

Mr. Marc

That's no problem, because I can direcly do the content check, and only at the information I need.


Yes, that is the idea.
Earlier you were spending around 30 seconds to transfer the data from the text file to a dbf file and then starting your process. Now, this 30 secs is reduced to 2 seconds. You can straight away start processing without waiting for the conversion.

Even these 2 seconds time can be reduced further:
1) The function was written for variable length records. Because you are using APPEND FROM .. SDF, I guess the records in the text file are all of a fixed length. If we modify the program to fixed length records of known record length, the reading would be even much faster.
2) Converting the functions from Harbour to C makes it even faster.

But the additional work may not worth just to save 1 second more.
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10295
Joined: Sun Nov 19, 2006 5:22 am
Location: India


Return to FiveWin for Harbour/xHarbour

Who is online

Users browsing this forum: Google [Bot] and 37 guests