String in unicode

Posted: Wed Nov 15, 2023 1:51 pm
by Natter
Hi,

I have a Unicode string. Latin characters take 1 byte each, and characters from other code pages take 2 bytes each.
How can I get an array of the characters in this string?

Re: String in unicode

Posted: Thu Nov 16, 2023 5:53 am
by Antonio Linares
Dear Yuri,

Please try:

? WideToStr( cUnicode )

Re: String in unicode

Posted: Thu Nov 16, 2023 6:23 am
by nageswaragunupudi
This is a UTF8 encoded string.
You can use HB_UTF8LEN( cString ) to get the number of characters (not bytes) and HB_UTF8SUBSTR( cString, n, 1 ) to get the nth character (not byte).

Example:
Let us take a sample string in Hex: "41C39CC49EE0B095".
This is displayed as : AÜĞక

Code:

#include "fivewin.ch"

function Main()

   local cString := HEXTOSTR( "41C39CC49EE0B095" )
   local nChars, n
   local aChars := {}

   ? cString // --> AÜĞక
   ? "Length in Bytes : ", Len( cString ) // --> 8
   ? "Length in Chars : ", nChars := HB_UTF8LEN( cString ) // --> 4
   for n := 1 to nChars
      aAdd( aChars, HB_UTF8SUBSTR( cString, n, 1 ) )
   next

   XBROWSER aChars // { "A", "Ü", "Ğ", "క" }

   AEval( aChars, { |c,i| aChars[ i ] := { c, STRTOHEX( c ), Len( c ) } } )
   XBROWSER aChars SETUP oBrw:cHeaders := { "CHAR","HEX","BYTES" }

return nil
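
The same approach can be wrapped in a reusable function and run without FiveWin. The following is only a minimal console sketch, assuming a standard Harbour build: UTF8ToChars() is a hypothetical helper name (not part of Harbour or FiveWin), and the sample string is the one from the example above. It prints the hex bytes, byte count and Unicode code point of each character rather than the character itself, since console rendering depends on the terminal.

```harbour
// Console-only sketch: plain Harbour, no FiveWin dependency.
// UTF8ToChars() is a hypothetical helper name chosen for illustration.

procedure Main()

   local cString := hb_HexToStr( "41C39CC49EE0B095" )  // sample string from the example above
   local aChars  := UTF8ToChars( cString )
   local cChar

   for each cChar in aChars
      // hb_StrToHex() shows the raw bytes, hb_BLen() counts bytes,
      // hb_utf8Asc() returns the Unicode code point of the character
      ? hb_StrToHex( cChar ), hb_BLen( cChar ), hb_utf8Asc( cChar )
   next

   return

// Split a UTF-8 encoded string into an array of single characters
function UTF8ToChars( cUtf8 )

   local aChars := {}
   local n

   // hb_utf8Len() counts characters, not bytes
   for n := 1 to hb_utf8Len( cUtf8 )
      AAdd( aChars, hb_utf8SubStr( cUtf8, n, 1 ) )
   next

   return aChars
```

Note that in the sample string the four characters occupy 1, 2, 2 and 3 bytes respectively, so a UTF-8 character is not limited to one or two bytes. This is why the string has to be split by characters rather than by fixed byte widths.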

Re: String in unicode

Posted: Thu Nov 16, 2023 7:00 am
by Natter
Thank you, I will definitely use it!

Re: String in unicode

Posted: Fri Nov 17, 2023 8:01 am
by frose
For me, these two sources are helpful regarding UTF8:

1. 3.5 Codepage API - UTF8 and code page functions built into Harbour: http://www.kresin.ru/en/hrbfaq_3.html#Doc5

2. Unicode conversion functions for Harbour: http://www.hmgextended.com/files/CONTRI ... onvert.prg (unfortunately no longer online) :(

Maybe someone can help 8) :wink:

Re: String in unicode

Posted: Fri Nov 17, 2023 9:00 am
by Antonio Linares

Re: String in unicode

Posted: Fri Nov 17, 2023 9:53 am
by frose
Many thanks, Antonio 8)