Leer Excel grande a un array o DBF

Leer Excel grande a un array o DBF

Postby Enrrique Vertiz » Fri Feb 04, 2022 11:05 pm

Buenas tardes

Tengo FWH 20.04 y necesito leer un excel de 1,000,000 de lineas y casi 30 columnas que manda un cliente y solo lo puede/quiere mandar en Excel, uso la TExcel y claro funciona, pero demora casi 15 horas, algo mas rapido para poder leer registro x registro y guardarlo despues como datos, levantarlo en un array o en un DBF, alguien algun ejemplo funcional, GRACIAS
Enrrique Vertiz Pitta
Lima-Peru
xHb 1.23.1026X, Fwh 24.09, BCC74, MySQL 8.0.X, SQLLIB 1.9m
Enrrique Vertiz
 
Posts: 541
Joined: Fri Oct 07, 2005 2:17 pm
Location: Lima - Peru

Re: Leer Excel grande a un array o DBF

Postby nageswaragunupudi » Mon Feb 07, 2022 8:02 pm

FWH has built-in functions GetExcelRange() and FW_ExcelToDBF() for this purpose.
But your data is huge, better to use a custom approach for this case.

1) Save Excel sheet as CSV and then copy to DBF.
OR
2) Reading row by row is ok. Instead, read in chunks, ranges of 20,000 approx in a loop and save it in dbf.
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10627
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Re: Leer Excel grande a un array o DBF

Postby Enrrique Vertiz » Mon Feb 07, 2022 9:15 pm

Thanks for your explanation, I worked with CSV and it was the fastest
Enrrique Vertiz Pitta
Lima-Peru
xHb 1.23.1026X, Fwh 24.09, BCC74, MySQL 8.0.X, SQLLIB 1.9m
Enrrique Vertiz
 
Posts: 541
Joined: Fri Oct 07, 2005 2:17 pm
Location: Lima - Peru

Re: Leer Excel grande a un array o DBF

Postby Jimmy » Tue Feb 08, 2022 5:33 am

hi,

i can "read" 1.000.000 ROW of Excel Sheet in 1 < Second ! ( with 64 Bit App )

i do "select" hole "Range" and make a Array with same Size (that´s why need 64 Bit )
than i copy /paste "Range" into Array in less than 1 Second.

depend on your HDD / SDD it need Time to "APPEND" Records from Array

Code: Select all  Expand view
     oExcel := CreateObject( "Excel.Application" )

         oExcel:Application:Workbooks:open(ZPATH+cFILE)
      // Make the first one active
         oWorkBook := oExcel:activeWorkBook
         oExcel:Application:Worksheets(1):activate()
      // Speed things up by creating an object containing the cells
         oSheet := oExcel:Worksheets(1):cells
         oWorkBook:workSheets(1):usedRange:Select

         numRows    := oWorkBook:workSheets(1):usedRange:Rows:Count
         numColumns := oWorkBook:workSheets(1):usedRange:Columns:Count

      // make Array with same Size  
         FOR i := 1 TO numRows
            AADD(aExcel,ARRAY(numColumns))
         NEXT
      // convert to A-Z for Excel Syntax
         cEnde := ZAHL2CHR(numColumns)
      // now paste hole Sheet to Array
         aExcel := oSheet:range( "A1:"+cEnde+LTRIM(STR(numRows)) ):value
      // Array to DBF
         Data2Dbf(aExcel)
 
greeting,
Jimmy
User avatar
Jimmy
 
Posts: 1725
Joined: Thu Sep 05, 2019 5:32 am
Location: Hamburg, Germany

Re: Leer Excel grande a un array o DBF

Postby nageswaragunupudi » Tue Feb 08, 2022 7:17 am

Mr. Jimmy

You can simplify your programs A LOT by using FWH built-in functions. When you recommend, they are easier to implement by other programmers.

This code:
Code: Select all  Expand view

     oExcel := CreateObject( "Excel.Application" )

         oExcel:Application:Workbooks:open(ZPATH+cFILE)
      // Make the first one active
         oWorkBook := oExcel:activeWorkBook
         oExcel:Application:Worksheets(1):activate()
      // Speed things up by creating an object containing the cells
         oSheet := oExcel:Worksheets(1):cells
         oWorkBook:workSheets(1):usedRange:Select
 


can be replaced by single line code:

Code: Select all  Expand view

oRange := GetExcelRange( TrueName( cExelFile ) )
 


This code:
Code: Select all  Expand view

         numRows    := oWorkBook:workSheets(1):usedRange:Rows:Count
         numColumns := oWorkBook:workSheets(1):usedRange:Columns:Count

      // make Array with same Size  
         FOR i := 1 TO numRows
            AADD(aExcel,ARRAY(numColumns))
         NEXT
 


can be replaced by this single line code, which is more efficient than your logic
Code: Select all  Expand view

aExcel := xlRangeValue( oRange )
 


For writing all values in a range to DBF:
Code: Select all  Expand view

// Create of open dbf
( cAlias )->( FW_ExcelToDBF( oRange, [cFieldList], [lHasHeaders], [bProgress] )
 


Syntax:
FW_ExcelToDBF() --> lSuccess
Writes the contents of excel range to default alias.
Parameters:
1. [oRange] Defaults to used range in active worksheet if already open by the user.
2. [cFieldList]
a) NIL:
( i) If range has headers in first row: Copies contents of each column to
the field whose name is same as the header of the column. Skips when
no match is found.
(ii) If range has no headers:
Copies contents of excel columns from left to right into fields 1 to last.
- Char: Should be a comma delemited list of field names of the dbf.
( i) If range has headers in first row: Copies contents of columns with header
maching the contents of the list to fields with the same name. Skips when
no match is found.
(ii) If range has no headers:
Copies contents of excel columns from left to right into fields contained
in the list. Skips in case of no match.
- Two-dimensional array: First element of each item contains the field name and
the second element contains the corresponding header in the excel range or
number of the excel column or alpha-column number like "A", "C", "AB", etc.

3) [lHasHeaders]
- .T.: First row of the range is treated as headers and data is from 2nd row
- .F.: The sheet does not have any headers and all rows are treated as data.
- NIL: The function examines contents of 1st and 2nd rows, If 1st row contains
all character values and 2nd row contains atleast one non-characer value,
lHasHeaders defaults to .T.


Now, the suggested code for the original post is this:
Code: Select all  Expand view

// Open or Create dbf
( cAlias )->( FW_ExcelToDBF( GetExcelRange( TrueName( cExcelFile ) ) ) )
 

instead of the 16 line code proposed by you.

This is even much simpler:

1. Open the excel book with Excel application and leave it open.
Then,
Code: Select all  Expand view

// open or create dbf
FW_ExcelToDBF()
 

The function ExcelToDbf() will read the used range from the open workbook and write to default alias.

However what I am not sure, till I test at my end, is whether the huge data of 1000000 rows x 30 fields will create any memory issues. Also in view of the large size, it is desirable to provide progress bar.
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10627
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Re: Leer Excel grande a un array o DBF

Postby nageswaragunupudi » Tue Feb 08, 2022 7:48 am

Enrrique Vertiz wrote:Thanks for your explanation, I worked with CSV and it was the fastest


Yes, this is the FASTEST.
May I know how much time it took for 1,000,000 records?
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10627
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Re: Leer Excel grande a un array o DBF

Postby Jimmy » Tue Feb 08, 2022 10:10 pm

hi Mr. Nages
nageswaragunupudi wrote:You can simplify your programs A LOT by using FWH built-in functions.
When you recommend, they are easier to implement by other programmers.

WOW ... i have to learn much what FiveWin allready have
greeting,
Jimmy
User avatar
Jimmy
 
Posts: 1725
Joined: Thu Sep 05, 2019 5:32 am
Location: Hamburg, Germany

Re: Leer Excel grande a un array o DBF

Postby Enrrique Vertiz » Tue Feb 08, 2022 10:42 pm

Regards Mr Rao
Reading the CSV file by passing it to an array() not even 15 seconds, 780,000 lines, reading as XLSX took a few hours
The recording if it takes 1 hour and a half, because there are validations before recording record by record.
Enrrique Vertiz Pitta
Lima-Peru
xHb 1.23.1026X, Fwh 24.09, BCC74, MySQL 8.0.X, SQLLIB 1.9m
Enrrique Vertiz
 
Posts: 541
Joined: Fri Oct 07, 2005 2:17 pm
Location: Lima - Peru

Re: Leer Excel grande a un array o DBF

Postby nageswaragunupudi » Wed Feb 09, 2022 8:54 pm

For the purpose of testing, I created an Excel table with 800,000 rows and 20 columns.
I could read the data directly from xlsx file into an arry (without creating csv file) in more or less the same time.
I could write the entire 800,000 rows to DBF in about 10 seconds, much less than a minute.
However I did not do any validations row-wise and cell-wise. If we do any such validations, we need to highly optimize the code because this code gets executed 800,000 times.
Still one hour and half seems to be too high when compared to 10 seconds.
Please review your code again.

I give here my test program. You can see the logic. I also attach a video of the execution of the program.
Code: Select all  Expand view

#include "fivewin.ch"

REQUEST DBFCDX

//----------------------------------------------------------------------------//

function Main()

   local cFileXls := "cust100.xlsx"
   local nSecs
   local aData

   if !File( cFileXls )
      ? cFileXls + " not found"
      return nil
   endif

   USE XLIMP NEW VIA "DBFCDX" EXCLUSIVE
   ZAP

   aData    := MsgRun( "Reading Data from XLS", cFileXLS, ;
               { || ReadFromXLS( cFileXLS ) } )

   nSecs    := MsgRun( "00:00:00 0,000,000,000 000 %", "IMPORT", ;
      { |oDlg| WriteToDBF( aData, oDlg ) } )

   ? "Written to DBF in", TRANSFORM( nSecs, "9999 Seconds" )

   XBROWSER ALIAS() SHOW SLNUM
   CLOSE DATA

return nil

//----------------------------------------------------------------------------//

static function ReadFromXls( cFileXls )

   local oRange, aData, lOpened := .f.
   local nSecs    := SECONDS()

   oRange   := GetExcelRange( TrueName( cFileXls ), nil, nil, @lOpened )
   aData    := xlRangeValue( oRange )
   if lOpened
      oRange:Parent:Parent:Close()
   endif

   ? "Read data into array in ", TRANSFORM( SECONDS() - nSecs, "99 Seconds" )

return aData

//----------------------------------------------------------------------------//

static function WriteToDBF( aData, oDlg )

   local nRows, nCols, nRow, nCol, v
   local nSecs    := SECONDS()

   nRows    := Len( aData )
   nCols    := Len( aData[ 1 ] )

   nRow     := 1
   do while nRow <= nRows .and. !Empty( aData[ nRow, 1 ] )
      DBAPPEND()
      for nCol := 1 to nCols
         v  := aData[ nRow, nCol ]
         // your validation code goes here
         FieldPut( nCol, v )
      next
      //
      if nRow % 5000 == 0
         oDlg:cMsg   := SECTOTIME( SECONDS() - nSecs ) + " " + ;
                        TRANSFORM( nRow, "9,999,999,999" ) + ;
                        TRANSFORM( nRow * 100 / nRows, " 999 %" )
         oDlg:Refresh()
         SysRefresh()
      endif
      nRow++
   enddo

return SECONDS() - nSecs

//----------------------------------------------------------------------------//
 


Image
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10627
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Re: Leer Excel grande a un array o DBF

Postby nageswaragunupudi » Thu Feb 10, 2022 3:56 am

NOW
To demonstrate the power of FWH buil-in Excel functions:

In fact, in most cases, it is not necessary to write such lengthy code. This single line of code will do the entire job.
Code: Select all  Expand view
FW_ExcelToDbf()


In the above case, first open the xlsx file with Excel Application on the desktop and leave it open. Then run this small program.
Code: Select all  Expand view
#include "fivewin.ch"

REQUEST DBFCDX

function Main()

   local nSecs

   USE XLIMP NEW VIA "DBFCDX" EXCLUSIVE
   ZAP
   nSecs    := SECONDS()
   FW_ExcelToDBF()   // Single line code
   nSecs    := SECONDS() - nSecs

   XBROWSER Alias() TITLE LTrim( Str( nSecs ) ) SHOW SLNUM

return nil

This single line of code, reads data from the open excel file and write to the dbf. In the above case it took about 35 seconds. This function also does some basic validation of data types and suitable data type conversions.

Image
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10627
Joined: Sun Nov 19, 2006 5:22 am
Location: India


Return to FiveWin for Harbour/xHarbour

Who is online

Users browsing this forum: No registered users and 109 guests