FW_SetUnicode( .T. ) 2-Byte characters

frose
Posts: 392
Joined: Tue Mar 10, 2009 11:54 am
Location: Germany, Rietberg

Post by frose »

Hi,

does anyone know why these 2-byte characters "üöä", "ÄÖÜ", "ßéÉÊ" are not interpreted correctly when inserted during editing?

Code:

FUNCTION Main()

   LOCAL aArray

   HB_CDPSELECT( "UTF8" )

   FW_SetUnicode( .T. )

   aArray := { "üöä", "ÄÖÜ", "ßéÉÊ"}

   XBrowse( aArray, "Unicode 2-Byte Test - FW_SetUnicode( .T. ) - aArray",,,,, !.F., .T.,,, .F., .T. )

RETURN NIL
 
The given characters are displayed correctly:

Image

But if you enter or edit the same characters, they are not interpreted correctly:

Image

What is going wrong?
Windows 11 Pro 22H2 22621.1848
Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
Harbour 3.2.0dev (r2008190002)
FWH 23.10 x86

Post by frose »

This is the result with the example from https://forums.fivetechsupport.com/view ... =3&t=43246
with FW_SetUnicode( .T. ):

Code:

   local oDlg, oGet, oEdit
   local cVar1 := ""
   local cVar2 := ""

   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )

   DEFINE DIALOG oDlg SIZE 300,300 PIXEL TRUEPIXEL

   @  20,20 GET oGet VAR cVar1 SIZE 200,20 PIXEL OF oDlg VARCHAR 20

   @  60,20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @ 100,20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( "|" + cVar1 + "|" + CRLF + "|" + cVar2 + "|" )

   ACTIVATE DIALOG oDlg CENTERED
 
Image

I'm speechless. What is going on?
nageswaragunupudi
Posts: 10691
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Post by nageswaragunupudi »

cVar2 using EDIT control is correct but cVar1 using GET control is not correct?

I need to do more tests at my end.

Did you try keeping FW_SetUnicode( .F. ) (the default) and setting:

Code:

   HB_LangSelect("DE")
   HB_SetCodePage("DEWIN")
Regards

G. N. Rao.
Hyderabad, India

Post by frose »

nageswaragunupudi wrote:cVar2 using EDIT control is correct but cVar1 using GET control is not correct?
Yes!

I did some more tests with ü (UTF-8 bytes 0xC3 0xBC) and some other 2-byte characters:
- The first 2-byte character is OK
- All the following ones are not OK
- Pasting one or more 2-byte characters from the clipboard works!
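For reference, here is a minimal byte-level sketch of what "ü - 0xC3BC" means. It is written in Python only because the bytes are language-independent; it is not FWH or Harbour code and says nothing about what TGet() does internally. It shows the two UTF-8 bytes of "ü" and what they look like if each byte is read as a separate Windows-1252 character, which is one common way such characters end up looking wrong.

```python
# Illustration only: byte-level view of a 2-byte UTF-8 character.
text = "ü"

utf8_bytes = text.encode("utf-8")
print(utf8_bytes.hex().upper())       # hex dump of the UTF-8 bytes
print(len(text), len(utf8_bytes))     # number of characters vs. number of bytes

# The same two bytes, read as two single-byte Windows-1252 characters
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)
```

The point of the sketch is only that one visible character occupies two bytes, so any code path that handles the bytes one at a time can produce two wrong characters instead of one correct one.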
nageswaragunupudi wrote:Did you try keeping FW_SetUnicode( .f. ) // default
and try setting

Code:

   HB_LangSelect("DE")
   HB_SetCodePage("DEWIN")
I have used DE850 until now and want to use UTF-8 in the future.

Post by nageswaragunupudi »

frose wrote: Used DE850 until now,
Is this working perfectly?
Can you please let me see your settings?

Post by frose »

nageswaragunupudi wrote: Is this working perfectly?
Yes
nageswaragunupudi wrote: Can you please let me see your settings?
Windows 11 Pro 22H2 22621.1848
Harbour 3.2.0dev (r2008190002)
Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
FWH 23.04 x86

Post by nageswaragunupudi »

I do not have a German keyboard, but I am using a virtual touch keyboard downloaded from Google.
I noticed the same issues.
We are going to look into the issue and solve it.
This may take some time.
I suggest you postpone moving to Unicode for a few days, until we make this work perfectly.

Meanwhile you can help me by:
1. Letting me know whether the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
works perfectly with German when FW_SetUnicode() is .T.

2. Pasting all the problem German characters here.

Post by frose »

nageswaragunupudi wrote: I do not have German keyboard, but I am using virtual touch keyboard downloaded from Google.
I noticed the same issues.
I use the 'Comfort On-Screen Keyboard Pro' for this purpose.
nageswaragunupudi wrote: We are going to look into and solve the issue.
Good to know.
nageswaragunupudi wrote: This may take some time.
I suggest you to postpone moving to Unicode for a few days, till we make this work perfectly.
No problem, there are enough other problems (challenges) left when switching from xHarbour.com/DE850 to Harbour/UTF-8.
nageswaragunupudi wrote: Meanwhile you can help me by
1. Let me know if the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
is working perfectly with German lang when FW_SetUnicode() is .T. ?
OK, I will do that.
nageswaragunupudi wrote: 2. Can you paste all problem German characters here?
There are only three German umlauts:
  • üÜ
  • öÖ
  • äÄ
and the ß (whose uppercase form is 'SS').
For more information see https://www.berlitz.com/blog/german-uml ... ng-letters
In UTF-8 there are many other 2-byte characters, used in French, Spanish, Danish, Croatian, etc.
All 2-byte characters I have tested are affected!

One other aspect that (perhaps) fits the topic:
If you switch your Windows machine to 'Beta: Use Unicode UTF-8 for worldwide language support', 2-byte characters in TGet() are handled differently: the first character appears as � (https://www.compart.com/en/unicode/U+FFFD), and all following characters are OK!
After a while I turned the switch 'Beta: Use Unicode UTF-8 for worldwide language support' off again. It has some side effects on other applications. And in my Harbour app it is better to see the misinterpreted characters, e.g. ü, instead of the �.
In this context this site was/is very helpful for me: https://www.i18nqa.com/debug/utf8-debug.html
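As background on the � character, here is a small Python sketch (an illustration only, not FWH code, and not a claim about what TGet() or Windows does internally). It shows one common way U+FFFD appears: a UTF-8 decoder receives only part of a 2-byte sequence and substitutes the replacement character. The helper name decode_lenient is hypothetical.

```python
# Illustration only: how U+FFFD (the replacement character) can arise.
utf8_bytes = "ü".encode("utf-8")   # the two bytes 0xC3 0xBC


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, replacing undecodable bytes with U+FFFD."""
    return data.decode("utf-8", errors="replace")


lead_only = decode_lenient(utf8_bytes[:1])    # lead byte without its continuation byte
trail_only = decode_lenient(utf8_bytes[1:])   # continuation byte without a lead byte
complete = decode_lenient(utf8_bytes)         # both bytes together

print(repr(lead_only), repr(trail_only), repr(complete))
```

Only the complete two-byte sequence decodes to "ü"; either half on its own becomes a single replacement character.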
Last edited by frose on Sun Jun 25, 2023 4:08 pm, edited 1 time in total.

Post by frose »

nageswaragunupudi wrote: Meanwhile you can help me by
1. Let me know if the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
is working perfectly with German lang when FW_SetUnicode() is .T. ?
It works without any problems in TMultiGet() :D
Image

Code:

   local oDlg
   local oGet
   local oEdit
   local oMemo
   local cVar1 := ""
   local cVar2 := ""
   local cVar3 := ""
   local cVar4 := ""

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )
   
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  20, 20 GET oGet VAR cVar1 SIZE 200,20 PIXEL OF oDlg VARCHAR 20
   
   @  40,20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 GET oGet VAR cVar3 MULTILINE SIZE 200, 50 PIXEL OF oDlg

   @ 120, 20 GET oMemo VAR cVar4 MEMO OF oDlg PIXEL SIZE 400, 100
   
   @ 220, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "|" + cVar1 + "|" + CRLF + "|" + cVar2 + "|" + cVar3 + "|" + cVar4 + "|" )

   ACTIVATE DIALOG oDlg CENTERED
 

Post by frose »

Something else I noticed: during editing it sometimes happens that the following 2-byte characters are interpreted CORRECTLY.
So the error is probably related to the length calculation of the previous characters, somewhere deep inside.
HTH
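To make the length-calculation hypothesis concrete, here is a hedged Python sketch (illustration only; it is not the FWH source and does not prove where the bug is). It shows how character counts and byte counts diverge once a 2-byte character is in the buffer, and what happens if a character position is mistakenly used as a byte offset. The helper name char_to_byte_offset is hypothetical.

```python
# Illustration only: character positions vs. byte offsets in UTF-8 text.
name = "Müller"
encoded = name.encode("utf-8")

char_count = len(name)      # counts characters
byte_count = len(encoded)   # counts bytes; "ü" contributes two


def char_to_byte_offset(text: str, char_index: int) -> int:
    """Byte offset in the UTF-8 encoding that corresponds to a character index."""
    return len(text[:char_index].encode("utf-8"))


# Cursor position after "Mü": 2 characters, but 3 bytes.
offset = char_to_byte_offset(name, 2)

# Using the character index as if it were a byte offset cuts "ü" in half.
wrong = encoded[:2].decode("utf-8", errors="replace")
right = encoded[:offset].decode("utf-8")

print(char_count, byte_count, offset, repr(wrong), repr(right))
```

Any mismatch of this kind only shows up after the first multi-byte character, which would be consistent with the observation that the first 2-byte character is fine and the following ones are not.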
Uwe.Diemer
Posts: 98
Joined: Mon Aug 09, 2010 11:00 am

Post by Uwe.Diemer »

Same problem here with Unicode.

I want to move from xHarbour to Harbour.

My GET field blocks: if I type "Müller", it stops at "Mü".

U. Diemer, using ADS Server 12.2

Post by nageswaragunupudi »

Uwe.Diemer wrote: Same problem here with Unicode.

I want to move from xHarbour to Harbour.

My GET field blocks: if I type "Müller", it stops at "Mü".

U. Diemer, using ADS Server 12.2
Is it working well with xHarbour but not with Harbour?

Post by frose »

Uwe.Diemer wrote: Same problem here with Unicode.

I want to move from xHarbour to Harbour.

My GET field blocks: if I type "Müller", it stops at "Mü".

U. Diemer, using ADS Server 12.2
I cannot confirm this behavior for Harbour.
Uwe, try the example from this thread.

Post by frose »

nageswaragunupudi wrote: Meanwhile you can help me by
1. Let me know if the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
is working perfectly with German lang when FW_SetUnicode() is .T. ?
TEdit() does not work correctly yet. Something happens with <cVar1> and <cVar2>:

Code:

FUNCTION Main()

   LOCAL oDlg
   LOCAL oEdit
   LOCAL cU82Lower
   LOCAL cU82Upper
   LOCAL cVar1
   LOCAL cVar2

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )

   cU82Lower := "lowerüöäßUPPER"
   cU82Upper := "UPPERÄÜÖßlower"
   cVar1     := cU82Lower
   cVar2     := cU82Upper
   
   MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   ACTIVATE DIALOG oDlg CENTERED

   MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   MsgInfo( ;
      "cU82Lower: " + cU82Lower + CRLF + CRLF + ;
      "cU82Upper: " + cU82Upper, ;
      "Test Encoding";
      )
      
RETURN NIL
 
Image
karinha
Posts: 7885
Joined: Tue Dec 20, 2005 7:36 pm
Location: São Paulo - Brasil

Post by karinha »

Code:

// C:\FWH..\SAMPLES\FROSE2UT.PRG

#include "FiveWin.ch"

REQUEST HB_LANG_PT
REQUEST HB_CODEPAGE_PT850

// REQUEST HB_CODEPAGE_UTF8 ???? Harbour? No xHarbour.

// REQUEST HB_CODEPAGE_PTISO
// REQUEST HB_CODEPAGE_UTF8EX

FUNCTION Main()

   LOCAL oDlg
   LOCAL oEdit
   LOCAL cU82Lower
   LOCAL cU82Upper
   LOCAL cVar1
   LOCAL cVar2

   HB_LANGSELECT( 'PT' )     // Default language is now Portuguese
   HB_SETCODEPAGE( "PT850" )

   HB_CDPSELECT( "UTF8" )

   /*
   HB_CDPSELECT( "PTISO" )

   hb_cdpSelect( "UTF8EX" )
   */

   FW_SetUnicode( .T. )

   // cU82Lower := OemToAnsi( LOWER( "lowerüöäßUPPER" ) )
   // cU82Upper := OemToAnsi( UPPER( "UPPERÄÜÖßlower" ) )

   // OR:

   cU82Lower := LOWER( "lowerüöäßUPPER" )
   cU82Upper := UPPER( "UPPERÄÜÖßlower" )

   cVar1     := cU82Lower
   cVar2     := cU82Upper

   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL
 
Regards, saludos.
João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341