Wildcard Search

Posted: Fri May 01, 2009 5:05 pm
by Dietmar Jahnel
Hi,

I have strings like
"2007/14/0001"
"2004/14/0877"
"2008/15/0056"
...
in a very large database.
I need to find all records whose field contains "/14/" without using the LOCATE command, because it is too slow.
Does anyone have a fast solution for a wildcard search, something like seek "*/14/*"?

Thanks for any suggestions!
Dietmar

Re: Wildcard Search

Posted: Fri May 01, 2009 5:49 pm
by Armando
Dietmar:

If you are using xHarbour, please have a look at the WildMatch() function.

Here is an example:

Code:

cFiltro := "*" + ALLTRIM(cFiltro) + "*"
(cProducto)->(DBSETFILTER({|| WildMatch(cFiltro, (cProducto)->PRO_DES, (.T.)) },;
                          "WildMatch(cFiltro, (cProducto)->PRO_DES, (.T.))" ))
 

Best regards

Re: Wildcard Search

Posted: Fri May 01, 2009 7:02 pm
by Otto
Hello Dietmar,

I use this function:
Code:


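STATIC aKunden := {}

function SearchFile( suchbeg )
   local nLocation
   local cData
   local nOffset := 1
   local cDBF    := "kunden.dbf"
   local nPos    := 0

   suchbeg := AllTrim( Upper( suchbeg ) )
   cData   := Upper( MemoRead( cDBF ) )   // whole DBF file as one string

   if Len( cData ) < 1
      MsgInfo( "No data to search", "File Error" )
      return nil
   endif

   // Header(), RecSize() and LastRec() refer to the current work area
   select kunden

   do while .t.
      // xHarbour's At() accepts a start position as third parameter
      nPos := At( suchbeg, cData, nOffset )

      if nPos == 0
         exit                              // no further match
      endif

      if nPos <= Header()
         nOffset := Header() + 1           // match inside the file header: skip it
         loop
      endif

      // byte position -> record number
      nLocation := Int( ( nPos - Header() - 1 ) / RecSize() ) + 1
      // continue at the first byte of the next record
      nOffset   := Header() + nLocation * RecSize() + 1

      if nLocation > LastRec()
         exit
      endif

      goto nLocation
      if ! Deleted()
         AAdd( aKunden, getrec() )         // getrec() is a user function, not shown here
      endif
   enddo

return nil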
//------------------------------------------------------------------




Best regards,
Otto

Re: Wildcard Search

Posted: Sun May 03, 2009 7:22 am
by Maurizio
Hello

You can use a temporary index; it is much faster.

index on field->N1 TAG I_TEMP FOR &(ff) TO TEMP TEMPORARY

TEMPORARY
If this option is specified, a temporary index is created which is automatically destroyed when the index is closed.

TO TEMP
The temporary index may be created in memory only.


Maurizio
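
For the original question, the FOR condition could be a substring test. The following is only a minimal sketch, assuming a table docs.dbf with a character field DOCNO (both names are hypothetical) and an RDD that supports temporary indexes, such as DBFCDX. Creating the index still reads every record once, so the gain is in browsing the result afterwards.

```harbour
REQUEST DBFCDX

procedure Main()
   // Hypothetical table and field names: docs.dbf, DOCNO
   local ff := '"/14/" $ field->DOCNO'   // filter expression as a string

   use docs via "DBFCDX" new
   index on field->DOCNO tag I_TEMP for &(ff) to TEMP temporary

   // only records whose DOCNO contains "/14/" are part of this index
   dbGoTop()
   do while ! Eof()
      ? field->DOCNO
      dbSkip()
   enddo

   use
return
```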

Re: Wildcard Search

Posted: Sun May 03, 2009 10:35 am
by fafi
Mr. Maurizio

Do you have a complete sample?

Thanks

Regards
Fafi

Re: Wildcard Search

Posted: Mon May 04, 2009 6:56 am
by Maurizio
Hello
use clients

index on field->ADRESS TAG ADR TO TEMP TEMPORARY FOR field->COUNTRY == "USA"
Browse()

use

Re: Wildcard Search

Posted: Mon May 04, 2009 7:28 am
by Otto
Hello Maurizio,

Thank you for sharing.

Is the index opened automatically after you create it, or do you have to open the index file yourself?
If your example also has a clients.cdx, do you have to close it before you create the temporary index file, or can the two coexist?
How do you change the index order?

For example, you start with:

Use clients new
clients.cdx with the tags name, town, etc. is opened automatically.

How do you go ahead with your example?

Thanks in advance
Otto

Re: Wildcard Search

Posted: Mon May 04, 2009 1:02 pm
by Maurizio
Hello Otto

The index is opened automatically, and you don't have to close the previous index.

// How do you go ahead with your example?

If the natural index is named CLIENT.CDX:

set index to CLIENT

Regards, Maurizio

Re: Wildcard Search

Posted: Mon May 04, 2009 5:10 pm
by nageswaragunupudi
xHarbour's OrdWildSeek( <cWildSeekExprn> )

Example:
OrdWildSeek( "*/14/*" )

Re: Wildcard Search

Posted: Tue May 05, 2009 5:24 am
by anserkk
I don't know whether the soft seek solution will work or not; it is just a hint.
You need to have the DBF indexed on the desired field.

Set Soft On
Seek "/14/"
Set Soft Off
Do While SubStr(Alias->FieldName,6,2) == "14"
....
Skip
Enddo

Mr. Rao's solution using xHarbour's OrdWildSeek( "*/14/*" ) seems to be a good one.

Regards

Anser

Re: Wildcard Search

Posted: Tue May 05, 2009 6:03 am
by Loach
Hello!
Here are a couple of solutions from the xHarbour "TESTS":
1.
Code:

cRegex:="/14/"   // a regular expression, not a wildcard mask: no "*" needed
cRegex:=HB_REGEXCOMP(cRegex)
dbgotop()
while !eof()
   if ordkeyval() HAS cRegEx
      // TA-DA!!! We got it!!!
      ?ordkeyno()
   endif
   dborderinfo(DBOI_SKIPREGEX,,,cRegex)
enddo
 

2.
Code:

cPattern:="*/14/*"
nSec:=secondscpu()
dbgotop()
if !eof() .and. ! WildMatch(cPattern, ordkeyval())
   dborderinfo(DBOI_SKIPWILD,,,cPattern)
endif
while !eof()
   if WildMatch(cPattern, ordkeyval())
      // TA-DA!!! We got it!!!
      ?ordkeyno()
   endif
   dborderinfo(DBOI_SKIPWILD,,,cPattern)
enddo
 


In both cases you need an index on your field.

Re: Wildcard Search

Posted: Tue May 05, 2009 6:43 am
by nageswaragunupudi
anserkk wrote:
Mr. Rao's solution using xHarbour's OrdWildSeek( "*/14/*" ) seems to be a good one.

Mr. Dietmar wanted all rows matching this criterion. In that case Mr. Armando's solution of setting a filter is more appropriate.

Yes, like many other friends suggested, there are many other solutions. But Mr Armando's recommendation is simple and fast enough.

Relating to the functions OrdWildSeek() and WildMatch() I have one question:
I am aware that OrdWildSeek() works both in Harbour and xHarbour.
But for me WildMatch() works only in xHarbour and not in Harbour. Does anyone know an equivalent function in Harbour?
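
For readers on Harbour, one candidate is hb_WildMatch(), which is part of Harbour's runtime library. The following is a minimal sketch, assuming a Harbour build that provides hb_WildMatch( cPattern, cString, lExact ); the sample strings are taken from the first post.

```harbour
procedure Main()
   local cPattern := "*/14/*"

   // hb_WildMatch( cPattern, cString, lExact ) returns a logical value
   ? hb_WildMatch( cPattern, "2007/14/0001", .T. )
   ? hb_WildMatch( cPattern, "2008/15/0056", .T. )
return
```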

Re: Wildcard Search

Posted: Tue May 05, 2009 7:04 am
by Loach
Mr. nageswaragunupudi.
For me "setfilter" is too slow if you want to browse the data, especially with very large databases. It is much more effective to create a CUSTOM INDEX from the array of found records.
PS. About your question, sorry, I use only xHarbour...
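
The custom index idea could look roughly like this. It is only a sketch, assuming the DBFCDX RDD and an already opened table; aFound, ShowFound, the tag FOUND, the file TEMPFOUND and the field DOCNO are hypothetical names.

```harbour
// aFound: array of record numbers collected by one of the searches
// discussed in this thread (hypothetical name).
procedure ShowFound( aFound )
   local nRec

   // CUSTOM creates an empty order; keys are added only on request.
   // ADDITIVE keeps the already opened indexes open.
   index on field->DOCNO tag FOUND to TEMPFOUND custom additive

   for each nRec in aFound
      dbGoto( nRec )
      ordKeyAdd( "FOUND" )   // add the key of the current record
   next

   ordSetFocus( "FOUND" )
   dbGoTop()
   Browse()                  // shows only the records added above
return
```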

Re: Wildcard Search

Posted: Tue May 05, 2009 7:08 am
by Otto
Hello Mr. Rao,

Would you be so kind as to advise me how to handle lower and upper case with OrdWildSeek?
I need a case-insensitive search.

Thanks in advance
Otto
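
One common way to make the search case-insensitive is to index on Upper( field ) and convert the pattern to upper case as well. The following is a sketch under that assumption; clients.dbf, the field NAME and the tag UNAME are hypothetical, and the second parameter of ordWildSeek() is assumed to mean "continue from the current record".

```harbour
REQUEST DBFCDX

procedure Main()
   local cPattern := Upper( "*mueller*" )
   local lFound

   use clients via "DBFCDX" new
   // the index keys are upper case, so the pattern must be upper case too
   index on Upper( field->NAME ) tag UNAME to TEMP temporary

   lFound := ordWildSeek( cPattern )            // first match
   do while lFound
      ? RecNo(), field->NAME
      lFound := ordWildSeek( cPattern, .T. )    // next match after the current record
   enddo

   use
return
```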

Re: Wildcard Search

Posted: Tue May 05, 2009 7:26 am
by Otto
Hello Mr. Rao,

The search function I suggested above is built on advice from Antonio:
http://forums.fivetechsupport.com/viewtopic.php?f=3&t=968&p=3824&hilit=RecSize#p3824

Jeff,

Some years ago we helped a company to test several third-party tools for this purpose and finally implemented our own solution to make the fastest search on all fields of a DBF.

We found, to our great surprise, that the solution was to open the DBF as a standard file with FOpen(), read a bunch of bytes into memory and perform a simple At() to locate a string. Once found, you subtract the DBF header length, then divide the offset by the record size, and you get the record number. At() is an extremely fast function as it is directly performed by the processor.

These days, with 32-bit flat memory, I guess there is no need to read a bunch of bytes at a time, so the entire DBF may be loaded into memory with a MemoRead() of the DBF file, or in several chunks if it is too large, so the code may get simpler.

We compared this approach with other available third-party tools, and we found that ours was the fastest one.

It's worth trying.
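
The arithmetic described in the quote can be sketched as follows. PosToRecNo() and DbfLayout() are hypothetical helper names; the header offsets assume the standard DBF layout (header length in bytes 9-10, record length in bytes 11-12, both 16-bit little-endian).

```harbour
#include "fileio.ch"

// Map a 1-based byte position in the raw DBF file to a record number.
// Returns 0 when the position lies inside the file header.
function PosToRecNo( nPos, nHeaderLen, nRecLen )
   if nPos <= nHeaderLen
      return 0
   endif
return Int( ( nPos - nHeaderLen - 1 ) / nRecLen ) + 1

// Read header length and record length straight from the file.
// Returns { nHeaderLen, nRecLen } or NIL if the file cannot be read.
function DbfLayout( cFile )
   local nHandle := FOpen( cFile, FO_READ )
   local cHead   := Space( 32 )
   local aLayout := nil

   if nHandle != F_ERROR
      if FRead( nHandle, @cHead, 32 ) == 32
         aLayout := { Bin2W( SubStr( cHead, 9, 2 ) ), ;
                      Bin2W( SubStr( cHead, 11, 2 ) ) }
      endif
      FClose( nHandle )
   endif
return aLayout
```

With a header of 98 bytes and records of 20 bytes, positions 99 to 118 belong to record 1 and position 119 is the first byte of record 2. The first byte of each record is the deletion flag, so a hit still has to be checked with Deleted().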


I would like to test OrdWildSeek and compare the speed.
Do you open the DBF file with MemoRead()?
I can't imagine that skipping through the file record by record is the solution.
Do you have any more information?

Thanks in advance
Otto