[TIP] How to convert a Big Endian Unicode string to ANSI?


5 replies to this topic
SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
VarSetCapacity(U,12,0), NumPut(0x6B005300,U), NumPut(0x6E006100,U,4), NumPut(0x2221,U,8 )
; The above creates Big Endian Unicode string: Skan™ ( 0053006B0061006E21220000 )

VarSetCapacity(A,6,0)
DllCall( "WideCharToMultiByte", Int,0,Int,0,UInt,&U+1,Int,-1,Str,A,Int,6,Int,0,Int,0 )
MsgBox,0,Incorrect,%A% 

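; The following swaps the two bytes of every 16-bit word to convert UTF-16BE to UTF-16LE in place.
; (P:=&U-2)-P adds nothing to the loop count; it is only there to initialise P on the Loop line.
Loop % (VarSetCapacity(U)//2)+(P:=&U-2)-P 
  NumPut( (*(P:=P+2)<<8)+(*(P+1)),P+0,0,"UShort" ) 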

VarSetCapacity(A,6,0)
DllCall( "WideCharToMultiByte", Int,0,Int,0,UInt,&U,Int,-1,Str,A,Int,6,Int,0,Int,0 )
MsgBox,0,Correct,%A%

Is there a better/faster way to handle UTF-16BE? Please help!
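
The one-line byte-swap loop above is compact but hard to read. A more verbose sketch of the same idea follows. SwapBytes16 is a hypothetical helper name (not a built-in and not from the original post), and the sketch assumes the buffer size is an even number of bytes.

```autohotkey
; Hypothetical helper: swap the two bytes of every 16-bit unit in place.
; pBuf   = address of the buffer (pass &Var)
; nBytes = size of the data in bytes (assumed to be even)
SwapBytes16( pBuf, nBytes ) {
  Loop % nBytes // 2
  {
    Offset := (A_Index - 1) * 2
    W := NumGet( pBuf+0, Offset, "UShort" )   ; read one 16-bit unit
    ; move the low byte up and the high byte down, then write it back
    NumPut( ((W & 0xFF) << 8) | (W >> 8), pBuf+0, Offset, "UShort" )
  }
}

; Same test data as above: "Skan™" stored as UTF-16BE
VarSetCapacity(U,12,0), NumPut(0x6B005300,U), NumPut(0x6E006100,U,4), NumPut(0x2221,U,8)
SwapBytes16( &U, 12 )   ; U now holds UTF-16LE
```

Passing the size explicitly avoids relying on VarSetCapacity(U), which reports the variable's capacity rather than the length of the data stored in it.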

SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
Okay, solved!

VarSetCapacity(BE,12,0), NumPut(0x6B005300,BE),NumPut(0x6E006100,BE,4),NumPut(0x2221,BE,8)
; The above creates Big Endian Unicode string: Skan™ ( 0053006B0061006E21220000 )

VarSetCapacity(LE,12,0), LCMAP_BYTEREV := 0x800
; cchSrc and cchDest are counts of 16-bit characters, not bytes: 12 bytes = 6 characters
DllCall( "LCMapStringW", UInt,0, UInt,LCMAP_BYTEREV, Str,BE, UInt,6, Str,LE, UInt,6 )
; http://msdn.microsoft.com/en-us/library/dd318700

VarSetCapacity( Ansi,6,0)
DllCall( "WideCharToMultiByte",Int,0,Int,0, UInt,&LE, Int,-1, Str,Ansi, Int,6,Int,0,Int,0)

MsgBox, % Ansi
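
For reuse, the two calls can be folded into one function. This is only a sketch: UTF16BE_ToAnsi and its parameters are hypothetical names, and it assumes a 32-bit ANSI build of AutoHotkey (pointers are passed as UInt, and the Str result is read as ANSI text).

```autohotkey
; Hypothetical wrapper, assuming a 32-bit ANSI build of AutoHotkey.
; pBE    = address of the UTF-16BE data (pass &Var)
; nChars = number of 16-bit characters, including the terminating null
UTF16BE_ToAnsi( pBE, nChars ) {
  ; Step 1: byte-reverse every 16-bit character (LCMAP_BYTEREV = 0x800)
  VarSetCapacity( LE, nChars*2, 0 )
  DllCall( "LCMapStringW", UInt,0, UInt,0x800, UInt,pBE, Int,nChars, UInt,&LE, Int,nChars )
  ; Step 2: convert UTF-16LE to the system ANSI code page (CP_ACP = 0).
  ; Two bytes per character leaves room for double-byte code pages.
  VarSetCapacity( Ansi, nChars*2, 0 )
  DllCall( "WideCharToMultiByte", Int,0, Int,0, UInt,&LE, Int,nChars, Str,Ansi, Int,nChars*2, Int,0, Int,0 )
  Return Ansi
}

; Usage with the same test data: "Skan™" is 5 characters plus the null = 6
VarSetCapacity(BE,12,0), NumPut(0x6B005300,BE), NumPut(0x6E006100,BE,4), NumPut(0x2221,BE,8)
MsgBox % UTF16BE_ToAnsi( &BE, 6 )
```

The character count is passed explicitly because the data starts with a zero byte, so it cannot be measured as an ordinary string.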


danalec
  • Members
  • 225 posts
  • Last active: Oct 03 2014 05:31 PM
  • Joined: 20 Jul 2006
awesome, thanks very much for it.

SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
Welcome... and may I ask why you need it? Just curious.

Hamlet
  • Members
  • 302 posts
  • Last active: Mar 23 2014 03:37 PM
  • Joined: 22 Jan 2009
Thanks. Lexikos told me to come here, so here I am.

Fine!!

I am working for a translator who handles a bunch of strange files. A computer-aided translation program produces these stupid UCS-2 BE files while handling some ad documents in XML format (funny, their contents are about Red Hat's cloud computing and big data, which is not that old a topic). The file extension is not txt but ttx.

SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
I was wondering where I encountered Big Endian Unicode!
Here: TT_GetFontName()