EXTRACT PLAIN TEXT FROM HTML FILE

MarcoBoschi
Posts: 1071
Joined: Thu Nov 17, 2005 11:08 am
Location: Padova - Italy

Post by MarcoBoschi »

Hi,
Please, I need freeware, if any exists, that lets me extract plain text from an HTML file. Other tips are welcome too.

Many Thanks

Marco
Marco Boschi
info@marcoboschi.it
karinha
Posts: 7932
Joined: Tue Dec 20, 2005 7:36 pm
Location: São Paulo - Brasil

Re: EXTRACT PLAIN TEXT FROM HTML FILE

Post by karinha »

João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341

Re: EXTRACT PLAIN TEXT FROM HTML FILE

Post by MarcoBoschi »

8)

Re: EXTRACT PLAIN TEXT FROM HTML FILE

Post by karinha »

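// C:\FWH\SAMPLES\HTML2TXT.PRG
// Extracts the visible text of an HTML file through Internet Explorer
// automation and saves it to Boschi.txt.

#include "FiveWin.ch"

FUNCTION Main()

   LOCAL cFile := ".\GMAP.HTML"

   // Start from a clean output file
   IF FILE( "Boschi.txt" )
      FERASE( "Boschi.txt" )
   ENDIF

   // Run the conversion while the wait message is shown.
   // The function is called directly: it returns NIL, so it must not be
   // passed to WinExec().
   MsgRun( "WAIT... Converting HTML to TEXT. ", ;
           "Please, Wait                     ", ;
           { || CONVERT_HTML2TXT( cFile ) } )

   MemoEdit( MemoRead( "Boschi.txt" ) )

RETURN NIL

FUNCTION CONVERT_HTML2TXT( cFile )

   LOCAL oExplorer := TOLEAuto():New( "InternetExplorer.Application" )
   LOCAL cINNText

   // Note: Navigate2 may need a full path or file:// URL instead of a
   // relative path
   oExplorer:Navigate2( cFile )

   // 4 = READYSTATE_COMPLETE: wait until the page has finished loading
   DO WHILE oExplorer:ReadyState <> 4
      hb_idleSleep( 1 )
   ENDDO

   // InnerText holds the rendered text of the body, without markup
   cINNText := oExplorer:Document:Body:InnerText

   MemoWrit( "Boschi.txt", cINNText )

   oExplorer:Quit()

RETURN NIL

// FIN / END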
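
Another tip, in case Internet Explorer is not available on the machine: a rough sketch in plain Harbour that just drops everything between "<" and ">". StripTags() is a hypothetical helper name, not a FiveWin or Harbour function, and the file names are only reused from the sample above. This is not what the sample does; it does not decode entities such as &amp;, it keeps the contents of script and style blocks, and it will misread a literal "<" or ">" in the text, so treat it as a starting point only.

```harbour
// Sketch only: naive tag stripper, assuming reasonably well-formed HTML.

PROCEDURE Main()

   LOCAL cHtml := MemoRead( "GMAP.HTML" )

   MemoWrit( "Boschi.txt", StripTags( cHtml ) )

RETURN

// StripTags() is a hypothetical helper: it copies every character
// that is not inside a <...> tag.
FUNCTION StripTags( cHtml )

   LOCAL cText  := ""
   LOCAL lInTag := .F.
   LOCAL cChar
   LOCAL n

   FOR n := 1 TO Len( cHtml )
      cChar := SubStr( cHtml, n, 1 )
      DO CASE
      CASE cChar == "<"
         lInTag := .T.          // a tag starts: stop copying
      CASE cChar == ">"
         lInTag := .F.          // the tag ends: resume copying
      CASE ! lInTag
         cText += cChar         // ordinary text outside any tag
      ENDCASE
   NEXT

RETURN cText
```

For example, StripTags( "<p>Hello <b>world</b></p>" ) returns "Hello world".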
 
Regards, saludos.