Machine code functions: Bit Wizardry


Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005

Can you explain this…r1 := ~res → 0x292f0b53, r2 := res ^ 0xFFFFFFFF → -0xd6d0f4ad?

It is the difference between the 32-bit and 64-bit signed number representations. Their least significant 32 bits are the same.

tic
  • Members
  • 1934 posts
  • Last active: May 30 2018 08:13 PM
  • Joined: 22 Apr 2007
OK, I was one version of AHK behind. I don't know which update changed it, but now it always gives 434084816 no matter what I add to the file.

olfen
  • Members
  • 115 posts
  • Last active: Dec 25 2012 09:48 AM
  • Joined: 04 Jun 2005
Please try if this works:
#NoEnv

MCode(CRC32_Init, "5589E583EC10C745FC2083B8EDC745F400000000817DF4FF0000007F508B45F48945F8C745F008000000837DF0007E238B45F883"
. "E00185C0740D8B45F8D1E83345FC8945F8EB058D45F8D1288D45F0FF08EBD78B45F48D0C85000000008B55088B45F88904118D45F4FF00EBA7C9C3")
MCode(CRC32_Get, "5589E583EC08C745FCFFFFFFFFC745F8000000008B45F83B450C732E8B45080345F80FB6003345FC25FF0000008D0C85000000008"
. "B55108B45FCC1E8083304118945FC8D45F8FF00EBCA8B45FCC9C3")

VarSetCapacity(CRC32LookupTable, 256*4)
DllCall(&CRC32_Init, "uint",&CRC32LookupTable)

SetFormat, Integer, Hex

a = abcdef
res := DllCall(&CRC32_Get, "uint",&a, "uint",StrLen(a), "uint",&CRC32LookupTable)
If (res < 0)
  MsgBox % Abs(res) - 1
Else
  MsgBox % ~res

FileRead, a, %A_AhkPath%
FileGetSize, size, %A_AhkPath%
res := DllCall(&CRC32_Get, "uint",&a, "uint",size, "uint",&CRC32LookupTable)
If (res < 0)
  MsgBox % Abs(res) - 1
Else
  MsgBox % ~res

tic
  • Members
  • 1934 posts
  • Last active: May 30 2018 08:13 PM
  • Joined: 22 Apr 2007
Nope :(

I ran

#NoEnv

MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
   VarSetCapacity(code,StrLen(hex)//2)
   Loop % StrLen(hex)//2
      NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
}

MCode(CRC32_Init, "5589E583EC10C745FC2083B8EDC745F400000000817DF4FF0000007F508B45F48945F8C745F008000000837DF0007E238B45F883"
. "E00185C0740D8B45F8D1E83345FC8945F8EB058D45F8D1288D45F0FF08EBD78B45F48D0C85000000008B55088B45F88904118D45F4FF00EBA7C9C3")
MCode(CRC32_Get, "5589E583EC08C745FCFFFFFFFFC745F8000000008B45F83B450C732E8B45080345F80FB6003345FC25FF0000008D0C85000000008"
. "B55108B45FCC1E8083304118945FC8D45F8FF00EBCA8B45FCC9C3")
 
VarSetCapacity(CRC32LookupTable, 256*4)
DllCall(&CRC32_Init, "uint",&CRC32LookupTable)

FileRead, a, %A_AhkPath%
FileGetSize, size, %A_AhkPath%
res := DllCall(&CRC32_Get, "uint",&a, "uint",size, "uint",&CRC32LookupTable)
If (res < 0)
  MsgBox % Abs(res) - 1
Else
  MsgBox % ~res

and when modified it still will always give me 434084816

olfen
  • Members
  • 115 posts
  • Last active: Dec 25 2012 09:48 AM
  • Joined: 04 Jun 2005

and when modified it still will always give me 434084816

That is 0x19DF9BD0 in decimal form, which is correct. Try adding
SetFormat, Integer, Hex
before calling CRC32_Get.

olfen
  • Members
  • 115 posts
  • Last active: Dec 25 2012 09:48 AM
  • Joined: 04 Jun 2005
Here's an example demonstrating how to generate a list of CRC-32/file-name pairs:
#NoEnv

MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
   VarSetCapacity(code,StrLen(hex)//2)
   Loop % StrLen(hex)//2
      NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
}

MCode(CRC32_Init, "5589E583EC10C745FC2083B8EDC745F400000000817DF4FF0000007F508B45F48945F8C745F008000000837DF0007E238B45F883"
. "E00185C0740D8B45F8D1E83345FC8945F8EB058D45F8D1288D45F0FF08EBD78B45F48D0C85000000008B55088B45F88904118D45F4FF00EBA7C9C3")
MCode(CRC32_Get, "5589E583EC08C745FCFFFFFFFFC745F8000000008B45F83B450C732E8B45080345F80FB6003345FC25FF0000008D0C85000000008"
. "B55108B45FCC1E8083304118945FC8D45F8FF00EBCA8B45FCC9C3")

VarSetCapacity(CRC32LookupTable, 256*4)
DllCall(&CRC32_Init, "uint",&CRC32LookupTable)

SetFormat, Integer, Hex

Loop, *.pdf
{
  FileRead, a, %a_loopfilename%
  FileGetSize, size, %a_loopfilename%
  res := DllCall(&CRC32_Get, "uint",&a, "uint",size, "uint",&CRC32LookupTable)
  res := res < 0 ? Abs(res) - 1 : ~res

  StringUpper, res, res
  StringReplace, res, res, 0X, 0000000
  r .= SubStr(res, -7) . a_tab . a_loopfilename . "`n"
}
a =
Sort, r
FileAppend, %r%, res.txt
MsgBox, Done.


Lexikos
  • Administrators
  • 9844 posts
  • AutoHotkey Foundation
  • Last active:
  • Joined: 17 Oct 2006
As an alternative to true multi-threading support in AutoHotkey, would it be possible to write a machine code function to wrap DllCall? Ideally, it would start a new thread, call the function (from the new thread), then send the return value back to the script (via window messages, I suppose.)

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
These machine code blocks were meant for short, standalone functions. A complex task needs internal subroutines. The linker could generate relative call addresses, which you have to edit, after removing unnecessary code from the binary file. Interfacing the Windows dynamic library load and call mechanism needs some code, so I’d not be surprised to see several pages of hex data in the AHK function. It looks simpler to run a second copy of AHK via the RUN command.

Leon
  • Members
  • 179 posts
  • Last active: May 22 2008 02:41 PM
  • Joined: 27 Aug 2007
Great work and info by everyone.
I'm really working at understanding how all this works.

How can I use this to have JavaScript in my AHK script / compiled exe?
This need only work in XP or later.

I found these hashing algorithms that seem to be great, and I want to include one in a script.

In light of this topic, some of you may find these very useful.

DL / view source code:
MD4
MD5
SHA-1

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
There are many JavaScript compilers out there. Google it. However, they usually produce code with a lot of external function calls (like Java class libraries), so they have to be present at run time. Therefore, JavaScript does not seem to be the best source. I would stick to C, Delphi, VB, if I could.

For the hash algorithms you don’t need to compile anything. They are included in Windows, as discussed here. MD4 is not there, but it is not commonly used any more.

tic
  • Members
  • 1934 posts
  • Last active: May 30 2018 08:13 PM
  • Joined: 22 Apr 2007
I've been playing a bit with olfen's CRC32, but I would like to know how it could be rewritten so that it does not need the entire string at once and instead computes the CRC piece by piece. Currently, if it hashes a film, it will put 700 MB into memory, which is not ideal.

I posted on this a while ago and would like to know how it works and whether it can be used in AHK:

http://www.dominik-r...ce.shtml#rehash

It took less than 2 seconds to get the CRC32 of a 700 MB file, never used more than 25% CPU, and always used 1 MB of RAM.

In comparison, the above machine code takes 11 seconds, sometimes uses 100% CPU, and uses 700 MB of RAM.

Please help!

olfen
  • Members
  • 115 posts
  • Last active: Dec 25 2012 09:48 AM
  • Joined: 04 Jun 2005
The C function would have to be extended to accept a start value as an additional parameter, so it could be passed on between calls that operate on chunks of data.
But I'm not sure whether it can easily be done for binary files, as I haven't tested whether it's possible to read chunks of data from a file, possibly using WinAPI (ReadFile in conjunction with SetFilePointer).

P.S.: I just looked at the source of the C++ Class: CSHA1 on the site you posted, and found that it uses fopen et al. from stdio.h for reading data. I guess rehash does it the same way.
So for comparable performance, the C function would have to be rewritten to also implement file I/O in machine code (using WinAPI would still require slow AHK loops).

P.P.S.: After a bit of testing, I think it would require a lot of work to extract the file I/O functions from msvcrt.dll and integrate them into machine code that can be called from AHK. Maybe it isn't possible at all.
IMO the best option is to create a crc32.dll that is DllCall'ed the conventional way when processing large files.

tic
  • Members
  • 1934 posts
  • Last active: May 30 2018 08:13 PM
  • Joined: 22 Apr 2007
Thank you for having a look, olfen.

I am very interested in this, not specifically in CRC32, but perhaps more in MD5. I think it would be really good to have some way in AHK to compare files and check whether they are identical (in the fastest way possible and without using ridiculously large amounts of memory and CPU). Unfortunately, I do not have nearly enough knowledge to understand what is going on in that C code and to port it to a DLL or use it any other way.

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
Here is a version of Olfen’s CRC32 function, which is somewhat shorter, inverts the bits of the result and can be continued, when only sections of the data are available at a time (e.g. when computing the CRC of a large file, read piece-by-piece). The corresponding C code is
typedef unsigned long uint; // must be 32 bits wide (unsigned long is 32 bits on Windows)

void CRC32_Init(uint* table) { // uint table[256]
  uint i, j, poly = 0xEDB88320, CRC;

  for(i = 0; i < 256; i++) {
    CRC = i;
    for(j = 0; j < 8; j++)
      if(CRC & 1)
        CRC = (CRC >> 1) ^ poly;
      else
        CRC >>= 1;
    table[i] = CRC;
  }
}

uint CRC32(unsigned char* buffer, uint len, uint crc32val, uint* table) { // init: crc32val = 0xFFFFFFFF
  uint i;
  for (i = 0; i < len; i++)
    crc32val = table[(crc32val ^ buffer[i]) & 255] ^ (crc32val >> 8);
  return ~crc32val;
}
Their machine code is converted to AHK functions by
MCode(CRC32_Init,"33c06a088bc85af6c101740ad1e981f12083b8edeb02d1e94a75ec8b542404890c82403d0001000072d8c3")
MCode(CRC32,"558bec33c039450c7627568b4d080fb60c08334d108b55108b751481e1ff000000c1ea0833148e403b450c89551072db5e8b4510f7d05dc3")

When the CRC of a string or a binary buffer is computed, crc32val=-1:
a = ABCDEFGHIJKLMNOPQRSTUVWXYZ ; CRC = 0xabf77822
MsgBox % DllCall(&CRC32, "uint",&a, "uint",StrLen(a), "int",-1, "uint",&CRC32LookupTable, "cdecl uint")

The call is the same when the first section of long data is processed. Subsequent sections need modified calls: the third parameter must be the bitwise-not of the previous CRC value:
a = clip
b = board ; CRC("clipboard") = 0x8fdbc496
c := DllCall(&CRC32, "uint",&a, "uint",StrLen(a), "int",-1, "uint",&CRC32LookupTable, "cdecl uint")
MsgBox % DllCall(&CRC32, "uint",&b, "uint",StrLen(b), "int",~c, "uint",&CRC32LookupTable, "cdecl uint")

This version is very fast, so it can be used to compare large files. First compare the sizes of the files. If they are equal, compare their CRC values. Because the CRCs are only 32 bits long, there is roughly a 40% chance that among 65,536 random files of the same size, at least two will be falsely labeled as equal (the chance reaches 50% at about 77,000 files). We don’t normally have that many files of the same size. If you do, confirm each CRC match with a byte-by-byte comparison or a second, different hash. Recomputing the CRC with another starting value (like 0 or 0x55555555) does not help for files of equal size: the CRC is linear, so two equal-length inputs that collide for one starting value collide for every starting value.

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
Here is a wrapper for the CRC32 function, to make it easier to use, and some test cases.
SetFormat Integer, HEX
a = ABCDEFGHIJKLMNOPQRSTUVWXYZ
MsgBox % CRC32(a)               ; abf77822

a = clip
b = board
c:= CRC32(a)
MsgBox % CRC32(b,0,~c)          ; 8fdbc496

MsgBox % CRC32(a:="1234567890") ; 261daee5

FileRead a, %A_AhkPath%         ; Version 1.0.47.04
FileGetSize size, %A_AhkPath%
MsgBox % CRC32(a,size)          ; 19df9bd0

; ---Include lines below---
CRC32(ByRef Buffer, Bytes=0, Start=-1) {
   Static CRC32, CRC32_Init, CRC32LookupTable
   If (CRC32 = "") {
      MCode(CRC32_Init,"33c06a088bc85af6c101740ad1e981f12083b8edeb02d1e94a75ec8b542404890c82403d0001000072d8c3")
      MCode(CRC32,"558bec33c039450c7627568b4d080fb60c08334d108b55108b751481e1ff000000c1ea0833148e403b450c89551072db5e8b4510f7d05dc3")
      VarSetCapacity(CRC32LookupTable, 256*4)
      DllCall(&CRC32_Init, "uint",&CRC32LookupTable, "cdecl")
   }
   If Bytes <= 0
      Bytes := StrLen(Buffer)
   Return DllCall(&CRC32, "uint",&Buffer, "uint",Bytes, "int",Start, "uint",&CRC32LookupTable, "cdecl uint")
}

MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
   VarSetCapacity(code,StrLen(hex)//2)
   Loop % StrLen(hex)//2
      NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
}

The function CRC32 has three parameters.
- The first one is the name of a buffer, which can contain binary data.
- The second parameter is the length of the data in bytes. If omitted or not positive, StrLen(Buffer) is used internally.
- The third parameter is used for continuing the CRC computation for second or later data sections. If omitted, -1 is used, the standard initial value for CRC32. If an earlier CRC operation (which returned C) is to be continued, pass ~C here. If a CRC different from the standard CRC-32 is needed, you can use any 32-bit integer for initialization.