FW_SetUnicode( .T. ) 2-Byte characters

Posted: Wed Jun 21, 2023 6:33 am
by frose
Hi,

Does anyone know why the 2-byte characters "üöä", "ÄÖÜ" and "ßéÉÊ" are not interpreted correctly when they are typed in during editing?

Code:

FUNCTION Main()

   LOCAL aArray

   HB_CDPSELECT( "UTF8" )

   FW_SetUnicode( .T. )

   aArray := { "üöä", "ÄÖÜ", "ßéÉÊ"}

   XBrowse( aArray, "Unicode 2-Byte Test - FW_SetUnicode( .T. ) - aArray",,,,, !.F., .T.,,, .F., .T. )

RETURN NIL
 
The given characters are displayed correctly:

Image

But if you enter or edit the same characters, they are not interpreted correctly:

Image

What is going wrong?

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Wed Jun 21, 2023 1:14 pm
by frose
This is the result with the example from https://forums.fivetechsupport.com/view ... =3&t=43246
when FW_SetUnicode( .T. ) is set:

Code:

   local oDlg, oGet, oEdit
   local cVar1 := ""
   local cVar2 := ""

   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )

   DEFINE DIALOG oDlg SIZE 300,300 PIXEL TRUEPIXEL

   @  20,20 GET oGet VAR cVar1 SIZE 200,20 PIXEL OF oDlg VARCHAR 20

   @  60,20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @ 100,20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( "|" + cVar1 + "|" + CRLF + "|" + cVar2 + "|" )

   ACTIVATE DIALOG oDlg CENTERED
 
Image

I am at a loss. What is going on?

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Wed Jun 21, 2023 3:54 pm
by nageswaragunupudi
So cVar2 (the EDIT control) is correct, but cVar1 (the GET control) is not?

I need to do more tests at my end.

Did you try keeping FW_SetUnicode( .F. ) (the default) and setting the following?

Code:

   HB_LangSelect("DE")
   HB_SetCodePage("DEWIN")

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Wed Jun 21, 2023 6:07 pm
by frose
nageswaragunupudi wrote: cVar2 using EDIT control is correct but cVar1 using GET control is not correct?
Yes!

I did some more tests with ü (UTF-8 bytes 0xC3 0xBC) and some other 2-byte characters:
- The first 2-byte character typed is OK.
- All following ones are not OK.
- Pasting one or more 2-byte characters from the clipboard works!
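
The byte sequence 0xC3BC can be checked outside FiveWin with a minimal console sketch. It assumes a plain Harbour build and uses only the core functions hb_utf8Chr(), hb_StrToHex(), hb_BLen() and hb_utf8Len(). The character is built from its code point U+00FC, so the encoding of the source file does not matter.

```harbour
// Minimal console sketch (plain Harbour, no FiveWin):
// byte count versus character count of "ü" in UTF-8.
PROCEDURE Main()

   LOCAL cU := hb_utf8Chr( 0xFC )   // "ü" built from its code point U+00FC

   ? hb_StrToHex( cU )              // hex dump of the raw bytes
   ? hb_BLen( cU )                  // length in bytes
   ? hb_utf8Len( cU )               // length in UTF-8 characters

   RETURN
```

The three lines printed should be C3BC, 2 and 1: one character stored in two bytes.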
nageswaragunupudi wrote: Did you try keeping FW_SetUnicode( .f. ) // default
and try setting

Code:

   HB_LangSelect("DE")
   HB_SetCodePage("DEWIN")

I have used DE850 until now and want to use UTF-8 in the future.

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Thu Jun 22, 2023 1:31 am
by nageswaragunupudi
frose wrote: Used DE850 until now,
Is this working perfectly?
Can you please let me see your settings?

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Thu Jun 22, 2023 5:23 am
by frose
nageswaragunupudi wrote: Is this working perfectly?
Yes.
nageswaragunupudi wrote: Can you please let me see your settings?
Windows 11 Pro 22H2 22621.1848
Harbour 3.2.0dev (r2008190002)
Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
FWH 23.04 x86

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Thu Jun 22, 2023 12:47 pm
by nageswaragunupudi
I do not have a German keyboard, but I am using a virtual touch keyboard downloaded from Google.
I noticed the same issues.
We are going to look into the issue and solve it.
This may take some time.
I suggest you postpone moving to Unicode for a few days, until we make this work perfectly.

Meanwhile, you can help me with two things:
1. Let me know whether the multiline GET
@ r,c, GET ctext MEMO/TEXT ..
works perfectly with German text when FW_SetUnicode() is .T.

2. Can you paste all the problem German characters here?

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Fri Jun 23, 2023 6:56 am
by frose
nageswaragunupudi wrote: I do not have a German keyboard, but I am using a virtual touch keyboard downloaded from Google.
I noticed the same issues.
I use the 'Comfort On-Screen Keyboard Pro' for this purpose.
nageswaragunupudi wrote: We are going to look into the issue and solve it.
Good to know.
nageswaragunupudi wrote: This may take some time.
I suggest you postpone moving to Unicode for a few days, until we make this work perfectly.
No problem, there are enough other problems (challenges) left when switching from xHarbour.com/DE850 to Harbour/UTF-8.
nageswaragunupudi wrote: Meanwhile, you can help me with two things:
1. Let me know whether the multiline GET
@ r,c, GET ctext MEMO/TEXT ..
works perfectly with German text when FW_SetUnicode() is .T.
OK, I will do that.
nageswaragunupudi wrote: 2. Can you paste all the problem German characters here?
There are only three German umlauts:
  • üÜ
  • öÖ
  • äÄ
and the ß, whose uppercase form is 'SS'.
For more information see https://www.berlitz.com/blog/german-uml ... ng-letters
In UTF-8 there are many other 2-byte characters, used in French, Spanish, Danish, Croatian, etc.
All 2-byte characters I have tested are affected!

One other aspect that (perhaps) fits the topic:
If you switch your Windows machine to 'Beta: Use Unicode UTF-8 for worldwide language support', 2-byte characters in TGet() are handled differently: the first character appears as � (https://www.compart.com/en/unicode/U+FFFD), and all following characters are OK!
After a while I turned the switch 'Beta: Use Unicode UTF-8 for worldwide language support' off again, because it has side effects on other applications. And in my Harbour app it is better to see the misinterpreted characters, e.g. ü, than the �.
In this context this site was and is very helpful for me: https://www.i18nqa.com/debug/utf8-debug.html
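
The "ü" pattern can be reproduced on purpose with a small console sketch. It assumes a plain Harbour build and that the DEWIN codepage corresponds to Windows-1252; it does not claim that TGet() does the same thing internally. hb_StrToUTF8() is told that the two UTF-8 bytes of "ü" are two single-byte DEWIN characters, so each byte becomes a character of its own.

```harbour
// Sketch: reproduce the "ü" effect by decoding UTF-8 bytes as if they
// were single-byte Windows text. Plain Harbour console program.
// Assumption: codepage DEWIN corresponds to Windows-1252.
REQUEST HB_CODEPAGE_DEWIN

PROCEDURE Main()

   LOCAL cUtf8  := hb_utf8Chr( 0xFC )               // "ü" as UTF-8 (bytes C3 BC)
   LOCAL cWrong := hb_StrToUTF8( cUtf8, "DEWIN" )   // each byte read as one
                                                    // DEWIN character

   ? hb_StrToHex( cUtf8 ),  hb_utf8Len( cUtf8 )     // raw bytes, character count
   ? hb_StrToHex( cWrong ), hb_utf8Len( cWrong )    // raw bytes, character count

   RETURN
```

The second line should report two characters instead of one, which is the double-decoding pattern listed on the i18nqa page.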

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Fri Jun 23, 2023 8:18 am
by frose
nageswaragunupudi wrote: Meanwhile you can help me by
1. Let me know if the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
is working perfectly with German lang when FW_SetUnicode() is .T. ?
It works without any problems in TMultiGet().
Image

Code:

   local oDlg
   local oGet
   local oEdit
   local oMemo
   local cVar1 := ""
   local cVar2 := ""
   local cVar3 := ""
   local cVar4 := ""

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )
   
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  20, 20 GET oGet VAR cVar1 SIZE 200,20 PIXEL OF oDlg VARCHAR 20
   
   @  40,20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 GET oGet VAR cVar3 MULTILINE SIZE 200, 50 PIXEL OF oDlg

   @ 120, 20 GET oMemo VAR cVar4 MEMO OF oDlg PIXEL SIZE 400, 100
   
   @ 220, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "|" + cVar1 + "|" + CRLF + "|" + cVar2 + "|" + cVar3 + "|" + cVar4 + "|" )

   ACTIVATE DIALOG oDlg CENTERED
 

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Fri Jun 23, 2023 9:51 am
by frose
Something else I noticed: during editing it sometimes happens that the following 2-byte characters are interpreted CORRECTLY.
So the error is probably related to the length calculation of the preceding characters, somewhere deep inside.
HTH
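
To illustrate what a length miscalculation could look like, here is a minimal console sketch, assuming plain Harbour. CharToBytePos() is a hypothetical helper written for this example; it is not part of FWH or Harbour, and whether TGet() really mixes the two kinds of position is not confirmed.

```harbour
// Sketch: character position versus byte position in a UTF-8 string.
// CharToBytePos() is a hypothetical helper, not part of FWH or Harbour.
PROCEDURE Main()

   LOCAL cText := "M" + hb_utf8Chr( 0xFC ) + "ller"   // "Müller"
   LOCAL n

   ? hb_utf8Len( cText ), hb_BLen( cText )            // characters, bytes

   FOR n := 1 TO hb_utf8Len( cText )
      ? n, CharToBytePos( cText, n )                  // char index, byte offset
   NEXT

   RETURN

// Byte offset (1-based) at which the nPos-th character starts.
STATIC FUNCTION CharToBytePos( cText, nPos )
   RETURN hb_BLen( hb_utf8Left( cText, nPos - 1 ) ) + 1
```

"Müller" has 6 characters but 7 bytes, and from the character after the "ü" onwards the byte offset is one higher than the character index. Code that uses one value where the other is expected would be off by one for every 2-byte character already in the buffer.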

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Fri Jun 23, 2023 1:30 pm
by Uwe.Diemer
Same problem here with Unicode.

I want to move from xHarbour to Harbour.

My GET field blocks: if I type "Müller", it stops at "Mü".

U. Diemer, using ADS Server 12.2

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Fri Jun 23, 2023 4:01 pm
by nageswaragunupudi
Uwe.Diemer wrote: Same problem here with Unicode.

I want to move from xHarbour to Harbour.

My GET field blocks: if I type "Müller", it stops at "Mü".

U. Diemer, using ADS Server 12.2
Is it working well with xHarbour but not with Harbour?

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Sat Jun 24, 2023 6:31 am
by frose
Uwe.Diemer wrote: Same problem here with Unicode.

I want to move from xHarbour to Harbour.

My GET field blocks: if I type "Müller", it stops at "Mü".

U. Diemer, using ADS Server 12.2
I cannot confirm this behavior with Harbour.
Uwe, try the example from this thread.

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Mon Jun 26, 2023 9:15 am
by frose
nageswaragunupudi wrote: Meanwhile you can help me by
1. Let me know if the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
is working perfectly with German lang when FW_SetUnicode() is .T. ?
TEdit() does not work correctly yet. Something happens with <cVar1> and <cVar2>:

Code:

FUNCTION Main()

   LOCAL oDlg
   LOCAL oEdit
   LOCAL cU82Lower
   LOCAL cU82Upper
   LOCAL cVar1
   LOCAL cVar2

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )

   cU82Lower := "lowerüöäßUPPER"
   cU82Upper := "UPPERÄÜÖßlower"
   cVar1     := cU82Lower
   cVar2     := cU82Upper
   
   MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   ACTIVATE DIALOG oDlg CENTERED

   MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   MsgInfo( ;
      "cU82Lower: " + cU82Lower + CRLF + CRLF + ;
      "cU82Upper: " + cU82Upper, ;
      "Test Encoding";
      )
      
RETURN NIL
 
Image

Re: FW_SetUnicode( .T. ) 2-Byte characters

Posted: Mon Jun 26, 2023 10:40 am
by karinha

Code:

// C:\FWH..\SAMPLES\FROSE2UT.PRG

#include "FiveWin.ch"

REQUEST HB_LANG_PT
REQUEST HB_CODEPAGE_PT850

// REQUEST HB_CODEPAGE_UTF8 ???? Harbour? No xHarbour.

// REQUEST HB_CODEPAGE_PTISO
// REQUEST HB_CODEPAGE_UTF8EX

FUNCTION Main()

   LOCAL oDlg
   LOCAL oEdit
   LOCAL cU82Lower
   LOCAL cU82Upper
   LOCAL cVar1
   LOCAL cVar2

   HB_LANGSELECT( 'PT' )     // Default language is now Portuguese
   HB_SETCODEPAGE( "PT850" )

   HB_CDPSELECT( "UTF8" )

   /*
   HB_CDPSELECT( "PTISO" )

   hb_cdpSelect( "UTF8EX" )
   */

   FW_SetUnicode( .T. )

   // cU82Lower := OemToAnsi( LOWER( "lowerüöäßUPPER" ) )
   // cU82Upper := OemToAnsi( UPPER( "UPPERÄÜÖßlower" ) )

   // OR:

   cU82Lower := LOWER( "lowerüöäßUPPER" )
   cU82Upper := UPPER( "UPPERÄÜÖßlower" )

   cVar1     := cU82Lower
   cVar2     := cU82Upper

   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL
 
Regards, saludos.