
# Machine code functions: Bit Wizardry

144 replies to this topic
• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005
Here is Hex2Bin, the inverse of the Bin2Hex function, discussed earlier in this thread. Sometimes manipulating the hex representation of binary data is easier, so in the beginning of a script use Bin2Hex to convert the data to a stream of hex digits, process it, and in the end convert the result back to a binary buffer (e.g. to be saved to a file). The corresponding C function is slightly more complex, because 2 digits have to be combined to form a byte, and the hex digits can be lower case (a..f) or capital letters (A..F). The straightforward algorithm (with typedef unsigned char UInt8;):
```void Hex2Bin0(UInt8 *bin, UInt8 *hex) { // in bin room for ceil(strlen(hex)/2) bytes
UInt8 c, d;
for(;;) {
c = *hex++; if (c == 0) break;
if (c > 96) c -= 87;
else if (c > 64) c -= 55;
else c -= 48;
d = *hex++; if (d == 0) {*bin = c<<4; break;}
if (d > 96) d -= 87;
else if (d > 64) d -= 55;
else d -= 48;
*bin++ = (c<<4)|d;
}
}```
This works, but it has many branches. Mispredicted branches flush the processor's instruction pipeline, which costs speed. With slightly trickier code (relying on the binary representation of the ASCII codes of A..F and a..f), we can make it shorter and faster:
```void Hex2Bin(UInt8 *bin, UInt8 *hex) { // in bin room for ceil(strlen(hex)/2) bytes
UInt8 b, c, d;
for(;;) {
c = *hex++; if (c == 0) break;
b = c >> 6;
*bin = ((c & 15) + b + (b << 3)) << 4;
d = *hex++; if (d == 0) break;
b = d >> 6;
*bin++ |= (d & 15) + b + (b << 3);
}
}```
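The trick can be checked in isolation. The sketch below is portable C, assuming ASCII input that contains only valid hex digits; the helper names `nibble_branchy`, `nibble_branchless` and `nibbles_agree` are made up for illustration and do not come from the thread. In ASCII, the letters A..F and a..f have bit 6 set and a low nibble of 1..6, while 0..9 have bit 6 clear and a low nibble of 0..9, so adding 9 whenever bit 6 is set yields the digit value without any comparison.

```c
#include <stdint.h>

/* Branching version, as in Hex2Bin0: pick the offset by range. */
static uint8_t nibble_branchy(uint8_t c) {
    if (c > 96) return (uint8_t)(c - 87);   /* 'a'..'f' -> 10..15 */
    if (c > 64) return (uint8_t)(c - 55);   /* 'A'..'F' -> 10..15 */
    return (uint8_t)(c - 48);               /* '0'..'9' ->  0..9  */
}

/* Branch-free version, as in Hex2Bin: b is 1 for letters and 0 for
   digits, so b + (b << 3) adds 9 only for letters. */
static uint8_t nibble_branchless(uint8_t c) {
    uint8_t b = (uint8_t)(c >> 6);
    return (uint8_t)((c & 15) + b + (b << 3));
}

/* Returns 1 if both versions agree on every valid hex digit. */
static int nibbles_agree(void) {
    const char *digits = "0123456789abcdefABCDEF";
    const char *p;
    for (p = digits; *p; ++p)
        if (nibble_branchy((uint8_t)*p) != nibble_branchless((uint8_t)*p))
            return 0;
    return 1;
}
```

Neither version validates its input: a character outside 0..9, A..F, a..f silently produces a wrong value.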
The compiled code can be included in AHK with the usual MCode function:
```MCode(Hex2Bin,"568b74240c8a164684d2743b578b7c240c538ac2c0e806b109f6e98ac802cac0e10"
. "4880f8a164684d2741a8ac2c0e806b309f6eb80e20f02c20ac188078a16474684d275cd5b5f5ec3") ; 73 bytes```
After reserving memory for the binary buffer, call it this way:
`DllCall(&Hex2Bin, "UInt",&bin, "UInt",&hex, "CDECL")`
Here is some test code:
```hex = 1089abefFABE5
VarSetCapacity(bin, ceil(StrLen(hex)/2), 99)
DllCall(&Hex2Bin, "UInt",&bin, "UInt",&hex, "CDECL")

VarSetCapacity(S,99)
DllCall("msvcrt\sprintf", "Str",S, "Str","%02X %02X %02X %02X %02X %02X %02X"
, "UChar",*( &bin ), "UChar",*(&bin+1), "UChar",*(&bin+2), "UChar",*(&bin+3)
, "UChar",*(&bin+4), "UChar",*(&bin+5), "UChar",*(&bin+6), "CDECL" )

MsgBox %S%```

• Members
• 1934 posts
• Last active: May 30 2018 08:13 PM
• Joined: 22 Apr 2007
I don't understand. Why do all values for hex give 63 or 00? and how do i write to a file with this?

thanks

• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005

Why do all values for hex give 63 or 00?

Did you run the test code and it showed you 63 and 00 values? It showed me 10 89 AB EF FA BE 50. The hex digits are input, the output is binary.

how do i write to a file with this?

It is used to make one large binary buffer, which you write with one dll call to a file, instead of byte-by-byte, in a loop.

• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005
63 is the hex for the default values in the bin buffer, 99. Their appearing in the result could mean that the dll call failed. Did you include the MCode(Hex2Bin, "568… instruction and do you have the right definition for MCode()?

• Members
• 1934 posts
• Last active: May 30 2018 08:13 PM
• Joined: 22 Apr 2007
Ok. I got the same answer as you for the example, but what am I doing wrong for writing an ico file? I can see it is obviously much too short, and I don't really understand all this machine code lark

```MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
VarSetCapacity(code,StrLen(hex)//2)
Loop % StrLen(hex)//2
NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
}

MCode(Hex2Bin,"568b74240c8a164684d2743b578b7c240c538ac2c0e806b109f6e98ac802cac0e10"
. "4880f8a164684d2741a8ac2c0e806b309f6eb80e20f02c20ac188078a16474684d275cd5b5f5ec3") ; 73 bytes

VarSetCapacity(bin, ceil(StrLen(hex)/2), 99)
DllCall(&Hex2Bin, "UInt",&bin, "UInt",&hex, "CDECL")

VarSetCapacity(S,99)
DllCall("msvcrt\sprintf", "Str",S, "Str","%02X %02X %02X %02X %02X %02X %02X"
, "UChar",*( &bin ), "UChar",*(&bin+1), "UChar",*(&bin+2), "UChar",*(&bin+3)
, "UChar",*(&bin+4), "UChar",*(&bin+5), "UChar",*(&bin+6), "CDECL" )

MsgBox, %S%
StringReplace, S, S, %A_Space%,, All
h := DllCall("CreateFile", "Str", "test.ico", "Uint", 0x40000000, "Uint", 0, "UInt", 0, "UInt", 4, "Uint", 0, "UInt", 0)
Result := DllCall("WriteFile", "UInt", h, "UChar *", S, "UInt", 4286, "UInt *", Written, "UInt", 0)
Return

|  - Open binary file
|  - Read n bytes (n = 0: all)
|  - From offset (offset < 0: counted from end)
|  - Close file
|  data (replaced) <- file[offset + 0..n-1]
*/ ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

{
h := DllCall("CreateFile","Str",file,"Uint",0x80000000,"Uint",3,"UInt",0,"UInt",3,"Uint",0,"UInt",0)
IfEqual h,-1, SetEnv, ErrorLevel, -1
IfNotEqual ErrorLevel,0,Return,0 ; couldn't open the file

m = 0                            ; seek to offset
IfLess offset,0, SetEnv,m,2
r := DllCall("SetFilePointerEx","Uint",h,"Int64",offset,"UInt *",p,"Int",m)
IfEqual r,0, SetEnv, ErrorLevel, -3
IfNotEqual ErrorLevel,0, {
t = %ErrorLevel%              ; save ErrorLevel to be returned
DllCall("CloseHandle", "Uint", h)
ErrorLevel = %t%              ; return seek error
Return 0
}

data =
IfEqual n,0, SetEnv n,0xffffffff ; almost infinite

format = %A_FormatInteger%       ; save original integer format
SetFormat Integer, Hex           ; for converting bytes to hex

Loop %n%
{
if (!result or Read < 1 or ErrorLevel)
break
c += 0                        ; convert to hex
StringTrimLeft c, c, 2        ; remove 0x
c = 0%c%                      ; pad left with 0
StringRight c, c, 2           ; always 2 digits
data = %data%%c%              ; append 2 hex digits
}

IfNotEqual ErrorLevel,0, SetEnv,t,%ErrorLevel%

h := DllCall("CloseHandle", "Uint", h)
IfEqual h,-1, SetEnv, ErrorLevel, -2
IfNotEqual t,,SetEnv, ErrorLevel, %t%

SetFormat Integer, %format%      ; restore original format
Totalread += 0                   ; convert to original format
}```

• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005
What do you want to do? The binary data is in “bin”. In S there is a string, the first 7 bytes of bin, converted to hex, just for testing if the function works. Only 20 bytes of S are used of the reserved 99. Then you write 4286 bytes, starting at the first byte of S. Most of it is just garbage.

You probably need (w/o function declarations):
```MCode(Hex2Bin,"568b74240c8a164684d2743b578b7c240c538ac2c0e806b109f6e98ac802cac0e10"
. "4880f8a164684d2741a8ac2c0e806b309f6eb80e20f02c20ac188078a16474684d275cd5b5f5ec3") ; 73 bytes

len := ceil(StrLen(hex)/2)

VarSetCapacity(bin, len)
DllCall(&Hex2Bin, "UInt",&bin, "UInt",&hex, "CDECL")

h := DllCall("CreateFile", "Str","test.ico", "Uint",0x40000000, "Uint",0, "UInt",0, "UInt",4, "Uint",0, "UInt",0)
Result := DllCall("WriteFile", "UInt",h, "Str",bin, "UInt",len, "UInt*",Written, "UInt",0)
MsgBox %Written%
Return```
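The point of the fix is that the byte count passed to WriteFile must be the length of the binary buffer, ceil(StrLen(hex)/2), not the length of a display string. Here is the same pattern as a portable C sketch. It uses the standard library's `fwrite` instead of the Win32 `WriteFile` call, which is a substitution made only so the sketch runs anywhere, and the names `hex_to_bin` and `write_hex_as_binary` are invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Convert a string of valid hex digits to bytes. An odd trailing digit
   becomes the high nibble of the last byte. Returns the byte count. */
static size_t hex_to_bin(unsigned char *bin, const char *hex) {
    size_t n = 0;
    while (*hex) {
        unsigned char c = (unsigned char)*hex++;
        unsigned char b = (unsigned char)(c >> 6);
        unsigned char hi = (unsigned char)((c & 15) + b + (b << 3));
        unsigned char lo = 0;
        if (*hex) {
            c = (unsigned char)*hex++;
            b = (unsigned char)(c >> 6);
            lo = (unsigned char)((c & 15) + b + (b << 3));
        }
        bin[n++] = (unsigned char)((hi << 4) | lo);
    }
    return n;
}

/* Convert hex to one binary buffer and write it with a single call.
   Returns the number of bytes written (0 on failure or empty input). */
static size_t write_hex_as_binary(const char *path, const char *hex) {
    size_t len = (strlen(hex) + 1) / 2;        /* ceil(strlen(hex)/2) */
    size_t written = 0;
    unsigned char *bin = malloc(len ? len : 1);
    FILE *f;
    if (!bin) return 0;
    hex_to_bin(bin, hex);
    f = fopen(path, "wb");                     /* binary mode matters on Windows */
    if (f) {
        written = fwrite(bin, 1, len, f);      /* one write for the whole buffer */
        fclose(f);
    }
    free(bin);
    return written;
}
```

For the 13-digit test string used earlier, `write_hex_as_binary` writes 7 bytes, matching the 7 values shown by the test code.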

• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005
Btw, the function BinRead is also obsolete. It was written when AHK had no means of manipulating binary data. Now you can read a large portion (or all) of the file into a binary buffer and use Bin2Hex to convert it instantly to a stream of hex digits. It is shorter and several orders of magnitude faster.

• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005
The newest member of the family, floating point comparison, is posted in its own thread, because of its importance.

• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005
Here are two prime number functions, IsPrime(n) and pDivs(n) with the Hex2Bin wrapper for the corresponding machine code function described a few posts ago. They are included in the Popup calculator II vers. 2.0, too.

IsPrime(n) returns true if n is a prime number and false otherwise. pDivs(n) returns a list of all the prime divisors of n, in increasing order, each repeated as needed. Here the Hex2Bin function is used to convert the machine code from hex to binary and to set up the array p used to get the next divisor candidate, not divisible by 2, 3 or 5.
```IsPrime(n) { ; 1 if n is prime, 0 otherwise
Static f, p
If n < 4
Return n > 1
If (f = "") {
Hex2Bin(f,"558bec5151568b75106a02593bf1894dfc72398b45088945f88b450c33d2f7"
. "75fc8955108b45f88b5510f775fc8bc285c0741e83f91d760383e91e8b45140fb604010145fc"
. "03c83975fc76cd33c040eb0233c05ec9c3") ; PrimeCheck: 86 Bytes
Hex2Bin(p,"010601020302010403020102010403020102010403020106050403020102")
}
Return DllCall(&f, "Int64",n, "UInt",round(sqrt(n)), "UInt",&p, "CDECL Int")
}

pDivs(n) { ; comma separated list of prime divisors of n
Static f, p
If n < 4
Return n
If (f = "") {
Hex2Bin(f,"558bec5151568b75106a1e33d2598bc6f7f13b75148bca773a8b45088945f8"
. "8b450c33d2f7f68955fc8b45f88b55fcf775108bc285c0741f8b45180fb6040103f003c883f9"
. "1d897510760383e91e3b751476cc33c0eb028bc65ec9c3") ; NextDiv: 92 Bytes
Hex2Bin(p,"010601020302010403020102010403020102010403020106050403020102")
}
d = 2
Loop {
d := DllCall(&f, "Int64",n, "UInt",d, "UInt",round(sqrt(n)), "UInt",&p, "CDECL UInt")
if (d = 0)
Return s . n
s .= d . ","
n //= d
If (n = 1) {
StringTrimRight s, s, 1
Return s
}
}
}

Hex2Bin(ByRef bin, hex) { ; convert hex stream to binary
Static f
If (f = "") {
VarSetCapacity(f,73,1)
Loop 73
NumPut("0x" . SubStr("568b74240c8a164684d2743b578b7c240c538ac2c0e806b1"
. "09f6e98ac802cac0e104880f8a164684d2741a8ac2c0e806b309f6eb80e20f02c20ac188078a"
. "16474684d275cd5b5f5ec3", 2*A_Index-1,2), f, A_Index-1, "Char")
}
VarSetCapacity(bin, (StrLen(hex)+1)//2, 99)
DllCall(&f, "UInt",&bin, "UInt",&hex, "CDECL")
}```
They are pretty fast up to 15-digit numbers, and even at 19 digits, the largest 64-bit integers, they finish in about half a minute on my 2 GHz Centrino laptop. With much more complex algorithms (like the Miller-Rabin primality test or number field sieving) these running times could be reduced to under a second, but 16..19 digit numbers are processed too seldom with an AHK script to justify the code complexity. The functions try all the suitable divisors up to sqrt(n). There is one optimization: the next candidate divisor is the next larger integer that is not divisible by 2, 3 or 5. For each residue r mod 30 the array entry p[r] tells how large a step is needed to jump over the unwanted numbers. Only 8 of every 30 consecutive integers remain as candidates, which gives a 30/8 = 3.75-fold speedup over testing every number as a divisor of n.
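The step table can be derived rather than typed in. The following C sketch rebuilds the 30 entries and uses them for trial division. It is an illustration under assumptions, not the compiled code from the post: the names `build_steps` and `is_prime_wheel` are hypothetical, the plain 64-bit `%` operator stands in for the inline-assembly UMOD, and the loop bound is checked with `d <= n / d` instead of a precomputed sqrt(n). Two entries are special: p[2] = 1 and p[3] = 2 lead the candidate from 2 through 3 to 5, so that the small primes themselves get tested once.

```c
#include <stdint.h>

/* p[r] = distance from a candidate with residue r (mod 30) to the next
   candidate. Because 30 is a multiple of 2, 3 and 5, divisibility of
   r + s by them depends only on the residue. */
static void build_steps(uint8_t p[30]) {
    int r;
    for (r = 0; r < 30; ++r) {
        int s = 1;
        while ((r + s) % 2 == 0 || (r + s) % 3 == 0 || (r + s) % 5 == 0)
            ++s;
        p[r] = (uint8_t)s;
    }
    p[2] = 1;   /* 2 -> 3: still test the prime 3 once */
    p[3] = 2;   /* 3 -> 5: still test the prime 5 once */
}

/* Trial division over the candidates 2, 3, 5, 7, 11, 13, 17, ... */
static int is_prime_wheel(uint64_t n, const uint8_t p[30]) {
    uint64_t d = 2;     /* current candidate divisor */
    unsigned m = 2;     /* d mod 30 */
    if (n < 4) return n > 1;
    while (d <= n / d) {            /* d*d <= n, without overflow */
        if (n % d == 0) return 0;
        d += p[m];                  /* both updates use the old m */
        m += p[m];
        if (m > 29) m -= 30;
    }
    return 1;
}
```

Built this way, the table reads 1, 6, 1, 2, 3, 2, 1, 4, ... which corresponds to the hex string 0106010203020104... passed to Hex2Bin in the script.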

There is another issue: sqrt(n) is only 53 bit accurate, which could cause divisors of (near) perfect squares to be missed. However, this does not happen: the integer part of sqrt(n) fits in 32 bits, so sqrt(n) is accurate to several decimal places after the point, and rounding it to the nearest integer is safe.

You can test the functions with
```Loop 20
MsgBox % A_Index ": " IsPrime(A_Index)
Loop 20
MsgBox % A_Index ": " pDivs(A_Index)

t := A_TickCount
MsgBox % IsPrime(9223372036854775783) . "`n" . (A_TickCount-t)/1000  ; DELL Inspiron 9300: 34 sec

t := A_TickCount
MsgBox % pDivs((2**31+45)*(2**31+11)) . "`n" . (A_TickCount-t)/1000  ; DELL Inspiron 9300: 23 sec```

The C source code is below, which was compiled with VS'05, creating also Assembly With Machine Code (/FAc) listing, from where the hex stream of the machine code was copied to the script (without any disassembler).
```typedef unsigned char    UChar;
typedef unsigned int     UInt;
typedef unsigned __int64 UInt64;

__forceinline static UInt UMOD(UInt LS, UInt MS, UInt d) { // <- n64%d. (n64=MS|LS)
MS = MS % d; // reduce MS to avoid overflow in div
__asm {
mov eax, LS
mov edx, MS
div d
mov eax, edx
}
}

int PrimeCheck(UInt64 nn, UInt c, UChar* p) { // test if nn is prime, c=sqrt(nn), p=next div offs
UInt m = 2, d = 2, *n = (UInt*)(&nn);      // n[0],n[1] = LS,MS words
for(;;) {
if (d > c) return 1;
if (UMOD(n[0],n[1],d) == 0) return 0;
d += p[m];
m += p[m];
if (m > 29) m -= 30;
}
}

UInt NextDiv(UInt64 nn, UInt d, UInt x, UChar* p) { // next>=d div of nn, x=sqrt(nn), p=next div offs
UInt m = d % 30, *n = (UInt*)(&nn);      // n[0],n[1] = LS,MS words
for(;;) {
if (d > x) return 0;
if (UMOD(n[0],n[1],d) == 0) return d;
d += p[m];
m += p[m];
if (m > 29) m -= 30;
}
}```
The latter two functions are almost identical; in fact, you could use NextDiv in place of PrimeCheck inside IsPrime with a simple change in the calling syntax. The UMOD function is there to prevent the compiler from calling its library function for the 64-bit modulo operator.
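The reduction step in UMOD matters because the x86 `div` instruction with a 32-bit operand divides the 64-bit value in edx:eax and raises a divide error when the quotient does not fit in 32 bits. After `MS = MS % d` the high word is smaller than d, so the quotient always fits, and the remainder is unchanged. The sketch below shows only that arithmetic in portable C; `umod_sketch` is a hypothetical name, d is assumed to be nonzero, and the 64-bit `%` used here is exactly the kind of operation that a 32-bit compiler may hand to a library helper, so this is an illustration and not a drop-in replacement.

```c
#include <stdint.h>

/* Remainder of the 64-bit value (ms:ls) divided by d, d != 0. */
static uint32_t umod_sketch(uint32_t ls, uint32_t ms, uint32_t d) {
    uint64_t n;
    ms %= d;                            /* now ms < d                      */
    n = ((uint64_t)ms << 32) | ls;      /* n < d * 2^32, so n / d < 2^32   */
    return (uint32_t)(n % d);           /* same remainder as the full value */
}
```

The result equals the remainder of the original 64-bit number because subtracting a multiple of d from the high word changes the value by a multiple of d * 2^32, which is itself a multiple of d.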

• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005
The simplest process I know to get machine code from C:

- Install the free VS'05 or VS'08 express compiler from MS: http://www.microsoft...aspx#webInstall
- Create an empty console project
- Write C functions, with their names exported (Project/Properties: Linker command line options / Additional options: "/EXPORT:MyFuncName")
- Set the compiler option to list assembly and machine code (C/C++ / Output Files: Assembler Output: Assembly With Machine Code (/FAc))
- Compile the C project

The compiler generates an "fname.cod" file, where you find the machine code in hex and the corresponding assembly instructions. You can use an editor to strip off unwanted information, but I have been using a short script for this and for nicely formatting the hex stream of the machine code.
```+!z::             ; Shift-Alt-Z: convert machine code listing to hex stream
ClipBoard =
Send ^c
ClipWait 2
IfEqual ErrorLevel,1, Return

Clip := RegExReplace(ClipBoard,"m)[;\\$].*\$")
Clip := RegExReplace(Clip,"m)^.*?\t(.*?)\t.*\$","\$1")
Clip := RegExReplace(Clip,"m)\s")
TrayTip,,% "Bytes = " StrLen(Clip)//2
If StrLen(Clip) < 80
ClipBoard := Clip
Else {
ClipBoard := """" . SubStr(Clip,1,62)
StringTrimLeft Clip, Clip, 62
Loop {
ClipBoard .= """`n. """ . SubStr(Clip,1,76)
StringTrimLeft Clip, Clip, 76
If StrLen(Clip) < 80
Break
}
ClipBoard .= """`n. """ . Clip . """"
}
Return```
Just select the instructions belonging to your function in any editor (Notepad is OK) and press the Shift-Alt-Z hotkey. The formatted hex stream w/o the assembly instructions or comments appears in the clipboard, ready to be pasted to your script.

The difficulty is preventing the compiler from linking in its library functions, which would leave the machine code unable to stand alone. You should use your own versions (mostly written in assembly, like the UMOD function above) with the prefix "__forceinline", but it is still hard to use doubles. chars, ints and __int64s can be handled w/o serious complications.

• Moderators
• 4512 posts
• Last active: May 20 2019 07:41 AM
• Joined: 24 May 2006
Thx for sharing, some very good info.

• Members
• 72 posts
• Last active: Jan 16 2009 10:08 AM
• Joined: 19 Dec 2006
Thanks Laszlo for excellent idea
Thanks as well to Skan for base request

I've finally done what I had in mind since I first read this thread : my own MCode function, which is written in assembly. It's ready to go in a library file...

Of course, it auto-encodes itself at first run using AHK, but all following calls use the machine code.

For what it's worth to you, reader, you can take it as is. I believe I managed to get it compatible with Laszlo's first version.

Note: the source assembly is given as well for those interested.

MCode.ahk
```/*
AHK MCode machine language injector - v1.0
Recommended fileName : MCode.ahk
Author : LHdx 2008/02
Permission is granted to use copy for commercial or non commercial use provided credit to author remains in source code.

WARNING : depending on your usage of this wrapper, you *might* have to check shell32.dll version. It is not done directly in this code

USE AT YOUR OWN RISKS !

Base reference : http://www.autohotkey.com/forum/viewtopic.php?t=21172

Usage : MCode(Destination variable, Source hexa code as string)
Returns the number of encoded bytes.
*/

MCode(ByRef pDestination, pCodeHexa) {
Static lMcode
VarSetCapacity(pDestination, StrLen(pCodeHexa) // 2)
If (lMcode)
Return DllCall(&lMcode, "Str", pCodeHexa, "UInt", &pDestination, "cdecl UInt")
lMcode:="608B7424248B7C242833C9FCAC08C074243C397604245F2C072C30C0E0048AE0AC08C074103C397604245F2C072C3008E0AA41EBD7894C241C61C3"
Loop % StrLen(lMcode)//2
NumPut("0x" . SubStr(lMcode,2*A_Index-1,2), lMcode, A_Index-1, "UChar")
Return DllCall(&lMcode, "Str", pCodeHexa, "UInt", &pDestination, "cdecl UInt")
}

/*
00401050  /$ 60             PUSHAD
00401051  |. 8B7424 24      MOV ESI,DWORD PTR SS:[ESP+24]
00401055  |. 8B7C24 28      MOV EDI,DWORD PTR SS:[ESP+28]
00401059  |. 33C9           XOR ECX,ECX
0040105B  |. FC             CLD
0040105C  |> AC             LODS BYTE PTR DS:[ESI]
0040105D  |> 08C0           OR AL,AL
0040105F  |. 74 24          JE SHORT hex2bin2.00401085
00401061  |. 3C 39          CMP AL,39
00401063  |. 76 04          JBE SHORT hex2bin2.00401069
00401065  |. 24 5F          AND AL,5F
00401067  |. 2C 07          SUB AL,7
00401069  |> 2C 30          SUB AL,30
0040106B  |. C0E0 04        SHL AL,4
0040106E  |. 8AE0           MOV AH,AL
00401070  |. AC             LODS BYTE PTR DS:[ESI]
00401071  |. 08C0           OR AL,AL
00401073  |. 74 10          JE SHORT hex2bin2.00401085
00401075  |. 3C 39          CMP AL,39
00401077  |. 76 04          JBE SHORT hex2bin2.0040107D
00401079  |. 24 5F          AND AL,5F
0040107B  |. 2C 07          SUB AL,7
0040107D  |> 2C 30          SUB AL,30
0040107F  |. 08E0           OR AL,AH
00401081  |. AA             STOS BYTE PTR ES:[EDI]
00401082  |. 41             INC ECX
00401083  |.^EB D7          JMP SHORT hex2bin2.0040105C
00401085  |> 894C24 1C      MOV DWORD PTR SS:[ESP+1C],ECX
00401089  |. 61             POPAD
0040108A  \. C3             RETN

60 8B 74 24 24 8B 7C 24 28 33 C9 FC AC 08 C0 74 24 3C 39 76 04 24 5F 2C 07 2C 30 C0 E0 04 8A E0
AC 08 C0 74 10 3C 39 76 04 24 5F 2C 07 2C 30 08 E0 AA 41 EB D7 89 4C 24 1C 61 C3
*/```
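For readers who do not read x86 assembly, here is a C rendering of what the listing does. It is a reading aid written for this thread, not the author's source: the names `asm_nibble` and `mcode_sketch` are hypothetical, and valid hex input is assumed. The per-digit logic folds lower case to upper case with `AND AL,5F`, closes the 7-character gap between '9' and 'A', and then subtracts '0'. The count kept in ECX is returned as the number of stored bytes.

```c
#include <stddef.h>

/* One digit, mirroring CMP AL,39 / AND AL,5F / SUB AL,7 / SUB AL,30. */
static unsigned char asm_nibble(unsigned char c) {
    if (c > 0x39) {                         /* above '9': a letter      */
        c = (unsigned char)(c & 0x5F);      /* lower case -> upper case */
        c = (unsigned char)(c - 7);         /* 'A' lands right after '9' */
    }
    return (unsigned char)(c - 0x30);
}

/* Whole loop. Parameter order is (source hex, destination), as in the
   DllCall inside MCode(). Returns the number of bytes stored. */
static size_t mcode_sketch(const char *hex, unsigned char *dst) {
    size_t count = 0;
    for (;;) {
        unsigned char hi, lo;
        if (*hex == 0) break;               /* end of string            */
        hi = (unsigned char)(asm_nibble((unsigned char)*hex++) << 4);
        if (*hex == 0) break;               /* odd trailing digit: dropped */
        lo = asm_nibble((unsigned char)*hex++);
        dst[count++] = (unsigned char)(hi | lo);
    }
    return count;
}
```

One difference from the compiled Hex2Bin is visible here: with an odd number of hex digits, this loop exits before storing the last half byte, whereas Hex2Bin stores it as the high nibble of a final byte.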

• Moderators
• 4512 posts
• Last active: May 20 2019 07:41 AM
• Joined: 24 May 2006
Great and definitely needed.

Thank you !

Now if Laszlo does proper documentation we would have really pro library.

• Moderators
• 4713 posts
• Last active: Mar 31 2012 03:17 AM
• Joined: 14 Feb 2005
Azerty: Your assembler code is several bytes shorter than the Hex2Bin function posted earlier, maybe because that one was compiled from C. Of course, you can use Hex2Bin in place of MCode, but yours looks better. Thanks for sharing it.

In case anyone wants to compare performance, here is the odd couple I have been using (Hex2Bin replaced MCode):
```Bin2Hex(addr,len) { ; Bin2Hex(&x,4)
Static fun
If (fun = "")
VarSetCapacity(hex,2*len+1)
VarSetCapacity(hex,-1) ; update StrLen
Return hex
}

Hex2Bin(ByRef bin, hex) { ; Hex2Bin(fun,"8B4C24") = MCode(fun,"8B4C24")
Static fun
If (fun = "") {
h:="568b74240c8a164684d2743b578b7c240c538ac2c0e806b109f6e98ac802cac0e104880f8"
. "a164684d2741a8ac2c0e806b309f6eb80e20f02c20ac188078a16474684d275cd5b5f5ec3"
VarSetCapacity(fun,StrLen(h)//2)
Loop % StrLen(h)//2
NumPut("0x" . SubStr(h,2*A_Index-1,2), fun, A_Index-1, "Char")
}
VarSetCapacity(bin,(StrLen(hex)+1)//2) ; room for ceil(StrLen(hex)/2) bytes
dllcall(&fun, "uint",&bin, "Str",hex, "cdecl")
}```

Edit: be careful! The two machine code functions (inside MCode and Hex2Bin) use different parameter orders.

• Members
• 72 posts
• Last active: Jan 16 2009 10:08 AM
• Joined: 19 Dec 2006
majkinetor & laszlo : thx

yes it's been fully "hand written" to be short (I think it's a "root" function so it needs to have a short "load time").

I'm planning an ASM-written base64 encoder/decoder for AHK so that external dependencies can be coded inline in the main script (I hate having 50 files in a subdir when one is enough). I'll probably post it in this topic, so stay tuned.