TGet() - UTF8 encoding fails [Solved]

Posted: Thu Sep 14, 2023 8:26 am
by frose
UTF8 encoding fails in TGet()!

Code:
#include "fivewin.ch"

function Main()

   local oDlg
   local oGet
   local cVar1 := "üäö"

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )

   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL

   @  20, 20 GET oGet VAR cVar1 SIZE 200,20 PIXEL OF oDlg VARCHAR 70 PICTURE "@!70"

   @ 240, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "oGet/TGet():        " + cVar1 + " - " + StrToHex( cVar1, " " ) )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL
 


Image

When the parameters <VARCHAR/lnLimitChars> and/or <PICTURE/cPict> are used, the encoding changes from UTF-8 to Unicode when editing!

Re: TEdit() - UTF8 encoding fails

Posted: Thu Sep 14, 2023 8:01 pm
by nageswaragunupudi
frose wrote: UTF-8 to Unicode

UTF-8 is Unicode.
You probably mean ANSI to UTF8.
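
To make the distinction concrete, here is a minimal sketch, written in Python rather than Harbour so that it runs without FWH. It assumes that "ANSI" in this thread means Windows code page 1252, and the variable names are only illustrative. Unicode assigns each character a code point; UTF-8 and ANSI are two different ways of storing those code points as bytes.

```python
# Same three characters, one set of Unicode code points, two byte encodings.
text = "üäö"

# Unicode code points (independent of any byte encoding)
code_points = [f"U+{ord(ch):04X}" for ch in text]

# UTF-8: two bytes per umlaut
utf8_hex = text.encode("utf-8").hex(" ").upper()

# ANSI, assumed here to be Windows-1252: one byte per umlaut
ansi_hex = text.encode("cp1252").hex(" ").upper()

print(code_points)  # ['U+00FC', 'U+00E4', 'U+00F6']
print(utf8_hex)     # C3 BC C3 A4 C3 B6
print(ansi_hex)     # FC E4 F6
```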

Re: TGet() - UTF8 encoding fails

Posted: Thu Sep 14, 2023 8:28 pm
by frose
Yes, the encoding switches from UTF8 to ANSI.

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Fri Oct 06, 2023 9:44 am
by frose
Dear Mr. Nageswara Rao,

can you confirm the unwanted change of the encoding?
If so, do you plan to correct this behavior?

Many greetings
Frank

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Mon Oct 09, 2023 7:39 am
by nageswaragunupudi
I am looking into this.
Please wait a little.

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Mon Oct 09, 2023 8:55 am
by frose
super, ok :D

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Wed Oct 11, 2023 12:11 am
by nageswaragunupudi
I copied your program as it is and built it with FWH2307, and this is what I got:
Image

However, there is a lot more to discuss about TGet and Umlauts.
Please wait for my next post.

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Wed Oct 11, 2023 6:36 am
by frose
Yes, so far everything is in order.
But when editing, the encoding switches!

nageswaragunupudi wrote: Please wait for my next post.
Ok, I will wait, it is not very urgent. In some places I have switched to TEdit(), but would like to return to TGet().

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Wed Oct 11, 2023 9:06 am
by nageswaragunupudi
frose wrote: But when editing, the encoding switches!


Please try this:
Code:
#include "fivewin.ch"

function Main()

   local oDlg
   local oGet
   local cVar1 := "üäö"

   FW_SetUnicode( .T. )

   DEFINE DIALOG oDlg SIZE 300, 300 PIXEL TRUEPIXEL

   @  20, 20 GET oGet VAR cVar1 SIZE 250,30 PIXEL OF oDlg PICTURE "@!" VARCHAR 70 ;
      ON CHANGE oDlg:Update()

   @  60, 20 SAY cVar1 SIZE 250,30 PIXEL OF oDlg UPDATE

   @ 100, 20 SAY STRTOHEX( cVar1, " " ) SIZE 260,60 PIXEL OF oDlg UPDATE

   @ 200, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "oGet/TGet(): " + Trim( cVar1 ) + CRLF + StrToHex( cVar1, " " ) )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Wed Oct 11, 2023 4:06 pm
by frose
Nothing has changed!

If I put an 'a' at the end of the given characters, then the encoding changes to ANSI:
Image

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Wed Oct 11, 2023 6:12 pm
by nageswaragunupudi
I am running the code I posted.
I do not see any problems here.
Are you using FWH2307?
Image

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Wed Oct 11, 2023 8:56 pm
by frose
I noticed that all hexcodes in your example are ANSI and that there are NO UTF8 2-byte hexcodes!

Probably the encoding is already changed to ANSI before the TGet() was activated!?

Maybe it is the text object to display the hexcode directly?

I'll test it tomorrow 8)
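
The observation above (a hex dump with no 2-byte sequences cannot be UTF-8 umlauts) can be checked mechanically. The following is a rough sketch in Python, not Harbour; `has_utf8_multibyte` is a hypothetical helper name, and the input format assumed is the space-separated output of StrToHex( cVar1, " " ).

```python
def has_utf8_multibyte(hex_dump: str) -> bool:
    """Return True if the hex dump is valid UTF-8 AND contains at least
    one multi-byte sequence (as UTF-8 encoded umlauts would)."""
    raw = bytes.fromhex(hex_dump)  # whitespace between byte pairs is ignored
    try:
        decoded = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Single-byte ANSI umlauts are not valid UTF-8 sequences
        return False
    # Fewer characters than bytes means some character took several bytes
    return len(decoded) < len(raw)


print(has_utf8_multibyte("C3 BC C3 A4 C3 B6"))  # True
print(has_utf8_multibyte("DC C4 D6"))           # False
print(has_utf8_multibyte("FC E4 F6"))           # False
```

Plain ASCII text also yields False here, because it is byte-identical in both encodings and therefore says nothing about which one is in use.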

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Wed Oct 11, 2023 10:33 pm
by nageswaragunupudi
frose wrote: Probably the encoding is already changed to ANSI before the TGet() was activated!?


Yes.

We will discuss how you and other programmers would like the behavior to be.

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Thu Oct 12, 2023 7:30 am
by frose
Please try WITH VARCHAR and PICTURE:
Code:
#include "fivewin.ch"

function Main()

   local oDlg
   local oGet
   local cVar1 := "üäö"

   FW_SetUnicode( .T. )
   
   MsgInfo( "cVar1: " + Trim( cVar1 ) + CRLF + StrToHex( cVar1, " " ) )

   DEFINE DIALOG oDlg SIZE 300, 300 PIXEL TRUEPIXEL

   @  20, 20 GET oGet VAR cVar1 SIZE 250,30 PIXEL OF oDlg PICTURE "@!" VARCHAR 70 ;
      ON CHANGE oDlg:Update()

   @  60, 20 SAY cVar1 SIZE 250,30 PIXEL OF oDlg UPDATE

   @ 100, 20 SAY STRTOHEX( cVar1, " " ) SIZE 260,60 PIXEL OF oDlg UPDATE

   @ 200, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "oGet/TGet(): " + Trim( cVar1 ) + CRLF + StrToHex( cVar1, " " ) )

   ACTIVATE DIALOG oDlg CENTERED

   MsgInfo( "cVar1: " + Trim( cVar1 ) + CRLF + StrToHex( cVar1, " " ) )
RETURN NIL

Image

cVar1 changes WITHOUT editing, but that cannot be right!

And then without VARCHAR and PICTURE, again without editing:
Code:

   @  20, 20 GET oGet VAR cVar1 SIZE 250,30 PIXEL OF oDlg ;
   ON CHANGE oDlg:Update()
 

Image
cVar1 doesn't change, that's OK!

Editing also works; the encoding is and remains UTF8:

Image

---------------------------------
As a reminder: The correct UTF8 hexcodes for 'üäö' are C3BC, C3A4 and C3B6, not DC C4 D6 (see, for example, https://www.charset.org/utf-8).
DC C4 D6 are the ANSI hexcodes.

Re: TGet() - UTF8 encoding fails [Unsolved]

Posted: Thu Oct 12, 2023 11:01 pm
by nageswaragunupudi
frose wrote: As a reminder: The correct UTF8 hexcodes for 'üäö' are C3BC, C3A4 and C3B6, not DC C4 D6 (see, for example, https://www.charset.org/utf-8).
DC C4 D6 are the ANSI hexcodes.


Code:
+---+--------+-----------------+
|STR|ANSI-HEX|UTF8-HEX         |
|üäö|FC E4 F6|C3 BC C3 A4 C3 B6|
|ÜÄÖ|DC C4 D6|C3 9C C3 84 C3 96|
+---+--------+-----------------+


With the picture clause "@!", "üäö" is converted to "ÜÄÖ", and hence hex codes like "DC C4 D6" are correct for the upper-case text in ANSI.
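
The table above can be reproduced independently of FWH. This is a small sketch in Python (not Harbour), again assuming that ANSI means Windows code page 1252. Python's `str.upper()` only stands in for the upper-casing that PICTURE "@!" performs; it does not model TGet itself, and `hex_bytes` is a hypothetical helper name.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Encode text and return its bytes as space-separated upper-case hex."""
    return text.encode(encoding).hex(" ").upper()


lower = "üäö"
upper = lower.upper()  # stand-in for the effect of PICTURE "@!"

for s in (lower, upper):
    print(s, "|", hex_bytes(s, "cp1252"), "|", hex_bytes(s, "utf-8"))
# üäö | FC E4 F6 | C3 BC C3 A4 C3 B6
# ÜÄÖ | DC C4 D6 | C3 9C C3 84 C3 96
```

So "DC C4 D6" shows two changes at once: the text has been upper-cased, and it is stored in ANSI rather than UTF-8. Upper-cased text that was still UTF-8 would read "C3 9C C3 84 C3 96".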