Opinions wanted : Optimized DllCall()


92 replies to this topic
Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005

The need for speed is absolute: either a faster DllCall or rewriting everything.

You could consider putting the loop in a C function and compiling it to machine code. That way you need only one slow DLL call.

You must have a supermachine to do this test in ? seconds

Nothing fancy. I built it myself: Intel Core 2 Quad Q6600 ($185), 2GB RAM ($40), GeForce 8500 GT 512MB graphics card ($50)... For $500 you can build a similar one, too. For better graphics add another $100.

Pil
  • Members
  • 55 posts
  • Last active: Mar 25 2014 05:00 AM
  • Joined: 26 Feb 2006
Hey, you had the same idea; that's exactly what I am trying to do. My first programming was in QuickBasic, then I went on to XBasic and Visual Basic. I made a DLL with XBasic, which works fine from XBasic but must somehow be different, since it gives the error 0xC0000005. So I tried to rebuild the library with a program called DllGuide, but without success. Maybe a C header is necessary.
My VB version 4 only makes OLE DLLs, which would be something of a detour.
I really only know the very basics of C++ and have the free version of Visual C++ NET 2005 installed. They gave it away free just for working with the .NET Framework, to push developers to work with it, so no Windows API calls there.
Yesterday I received Borland's free C++ compiler, and I think I will read the docs tonight.
It would really be very helpful if someone could show me a code example in C and how to link it as a DLL. My idea is to write a 'buffer' function that accepts as a parameter the number of times the loop should run and returns the address of a string holding the port values. As an example, that would be difficult to test without the hardware. But the ColorPicker code could be used instead: a function that accepts the hDC of the window, the blue value, and a top x and y position as parameters, and then draws a 250 by 250 pixel bitmap.
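
As a rough illustration of the request above, here is a minimal sketch in C: the whole loop runs inside one exported function, so a script would need only a single DllCall. The names `fill_palette` and `PALETTE_API` are hypothetical, the function writes into a caller-supplied buffer instead of drawing to a device context, and the `__declspec(dllexport)` part only matters when building a DLL on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical export macro: only meaningful when building a Windows DLL. */
#ifdef _WIN32
#define PALETTE_API __declspec(dllexport)
#else
#define PALETTE_API
#endif

/* Fill a size*size buffer with colors packed as (blue << 16) | (y << 8) | x,
   the same packing the SetPixel loops in this thread use.
   Returns the number of pixels written. */
PALETTE_API size_t fill_palette(uint32_t *pixels, unsigned size, unsigned blue)
{
    assert(pixels != NULL);
    size_t count = 0;
    for (unsigned y = 0; y < size; ++y) {
        for (unsigned x = 0; x < size; ++x) {
            pixels[(size_t)y * size + x] =
                ((uint32_t)(blue & 0xFF) << 16) |
                ((uint32_t)(y & 0xFF) << 8) |
                (uint32_t)(x & 0xFF);
            ++count;
        }
    }
    return count;
}
```

A script would pass the address of a buffer of at least size*size*4 bytes. Note that DllCall assumes stdcall by default, so a plain C function like this one would need the "Cdecl" option on a 32-bit build unless it is declared `__stdcall`.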

SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
@Pil

Hey, you had the same idea


Laszlo was referring to Machine code functions: Bit Wizardry

I made a DLL with XBasic, which works fine from XBasic but must somehow be different, since it gives the error 0xC0000005


Have you ever tried XBLite? It is a "Windows only" port of XBasic, capable of creating DLLs as tiny as 7k. To the point: I ran into an access violation error when I was experimenting with my first DLL in XBLite. The language expected a CSTRING, but AHK was passing strings by address, resulting in an error. My experience at the xblite forum was pleasant: xblite: How can a DLL function return a "String value" to the calling program ?

My VB version 4 only makes OLE DLLs, which would be something of a detour.
I really only know the very basics of C++ and have the free version of Visual C++ NET 2005 installed. They gave it away free just for working with the .NET Framework, to push developers to work with it, so no Windows API calls there.
Yesterday I received Borland's free C++ compiler, and I think I will read the docs tonight.


I learnt from corrupt that there exists BCX (Basic to C Converter). He created and gave me a DLL ( Basic Code to C DLL ) which works pretty fine in AHK.

FYI. :)

Lexikos
  • Administrators
  • 9844 posts
  • AutoHotkey Foundation
  • Last active:
  • Joined: 17 Oct 2006
FYI, DllCall was not written by Chris.

Marcus Sonntag (Ultra): For the research, design, coding, and testing of DllCall.

DllCall calls GetProcAddress() each time to retrieve the address of the function. Since DllCall can accept a function address in place of a function name, we can speed things up a bit:
SetPixel:=DllCall("GetProcAddress",UInt,DllCall("GetModuleHandle",Str,"gdi32"),Str,"SetPixel")
Loop 255 {
   x++
   z := (b<<16)|x
   y = 0
   loop 17 {
      DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
     ,DllCall(SetPixel,Int,hdc,UChar,x,UChar,y++,Int,(y<<8)|z)
   }
}
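
The gain comes from doing the name lookup once instead of on every call. A minimal, portable sketch of that pattern in C follows; `lookup`, `symbol_table`, `add_one` and `twice` are hypothetical stand-ins for GetProcAddress and a DLL's exported functions, not Windows APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*int_fn)(int);

int add_one(int v) { return v + 1; }
int twice(int v)   { return v * 2; }

/* Hypothetical symbol table standing in for a DLL's export list. */
static const struct { const char *name; int_fn fn; } symbol_table[] = {
    { "add_one", add_one },
    { "twice",   twice   },
};

unsigned long lookups = 0;   /* counts how often a name is resolved */

/* Stand-in for GetProcAddress: a linear search by name. */
int_fn lookup(const char *name)
{
    ++lookups;
    for (size_t i = 0; i < sizeof symbol_table / sizeof symbol_table[0]; ++i)
        if (strcmp(symbol_table[i].name, name) == 0)
            return symbol_table[i].fn;
    return NULL;
}

/* Slow pattern: resolve the name on every iteration. */
int sum_by_name(const char *name, int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i) {
        int_fn fn = lookup(name);
        assert(fn != NULL);
        total += fn(i);
    }
    return total;
}

/* Fast pattern: resolve once, then call through the cached pointer. */
int sum_by_pointer(const char *name, int n)
{
    int_fn fn = lookup(name);
    assert(fn != NULL);
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += fn(i);
    return total;
}
```

Both functions return the same sum; the difference is that `sum_by_name` performs n lookups while `sum_by_pointer` performs one.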
I ran four tests:
1) Pil's original example.
2) Laszlo's optimized version.
3) The version above.
4) SetPixel as a built-in function.

All four tests were run by the same script, in a custom build of AutoHotkey. I ran each test 10 times and averaged the results.

Test   Time (s)   vs. 1    vs. 2    vs. 3
1      0.97
2      0.82       +18%
3      0.58       +67%     +41%
4      0.31       +212%    +164%    +87%

(CPU: Core 2 Duo E4600 2.4 GHz)
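
The percentages appear to be relative speedups, that is, how much faster one test is than an earlier one. A minimal sketch of that arithmetic, assuming the formula slow/fast - 1 (the helper name `speedup_percent` is made up); small differences from the posted figures are to be expected, because the times shown are rounded to two digits.

```c
#include <assert.h>

/* Relative speedup of a faster run over a slower one, in percent.
   Both times must be positive and in the same unit. */
double speedup_percent(double slow_time, double fast_time)
{
    assert(slow_time > 0.0 && fast_time > 0.0);
    return (slow_time / fast_time - 1.0) * 100.0;
}
```

For example, `speedup_percent(0.97, 0.82)` is about 18.3 and `speedup_percent(0.58, 0.31)` is about 87.1, in line with the +18% and +87% entries.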

SKAN
@Pil: OffTopic

Creating and Manipulating a BITMAP is more efficient than SetPixel for this particular requirement:

ATC:=A_TickCount
SetBatchLines -1
DetectHiddenWindows, On

; Creating a 256x256 24bit bitmap file in memory
Id:="BM",Hdr:=54,W:=256,H:=256,Bit:=24,Byt:=W*H*(Bit/8),  VarSetCapacity(BMP,Hdr+Byt,0)
NumPut( NumGet( Id,0,"UShort" ),BMP,0,"UShort" ), NumPut( Hdr+Byt,BMP,2 )
NumPut( Hdr,BMP,10 ), NumPut( 40,BMP,14 ), NumPut( W,BMP,18 ), NumPut( H,BMP,22 )
NumPut( 1,BMP,26,"UShort" ), NumPut( Bit,BMP,28,"UShort" ), NumPut( Byt,BMP,34 )

Gui, 99:+AlwaysOnTop +ToolWindow +LastFound  
Gui99:=WinExist(), hDC:=DllCall("GetDC",UInt,Gui99)
Gui, 99:Add, Text, w256 h256 0x120E hWndhPic gSelColor
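X:=0, B:=0, OffSet := &BMP+54
Loop 256 {                ; Altering RGB values
 X:=X+1, Y:=0
 Loop 256
  ; NumPut writes 4 bytes and returns the address just past them;
  ; subtracting 1 makes each pixel advance by 3 bytes (24-bit)
  Offset := NumPut( ((b<<16)|(y++<<8)|x), Offset+0 ) - 1
}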
hBMP := DllCall( "CreateDIBitmap", UInt,hDC, UInt,&BMP+14, Int,4, UInt,&BMP+NumGet(BMP,10)
              , UInt,&BMP+14, UInt,1 )        
SendMessage, (STM_SETIMAGE:=0x172), (IMAGE_BITMAP:=0x0), hBMP,, ahk_id %hPic%
Gui, 99:Show,, % "ColorPicker [ " (A_TickCount-ATC) "ms ]"
Return

SelColor:
 MouseGetPos, X, Y
 PixelGetColor,Color, X, Y, RGB
 Tooltip % SubStr(Color,-5) 
 SetTimer, TooltipOff, -2000
Return

ToolTipOff:
 ToolTip
Return

GuiClose:
 ExitApp

1) I am not getting the right colors; you will have to tinker with the bit shifting.
2) A bitmap stores its rows bottom-up, so the image comes out flipped upside down. I was too lazy to NumPut the values backwards, so the palette is displayed flipped.
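
Both notes follow from how a 24-bit DIB lays out its pixel data: rows are stored bottom-up when the height is positive, each row is padded to a multiple of 4 bytes, and each pixel is stored as blue, green, red. A minimal C sketch of addressing one pixel under those rules (the helper names `dib_stride` and `dib_put_pixel` are hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes per row of a 24-bit DIB: 3 bytes per pixel, rounded up to 4. */
size_t dib_stride(size_t width)
{
    return (width * 3 + 3) & ~(size_t)3;
}

/* Store one pixel, with (0,0) meaning the top-left corner of the image.
   'bits' points at the pixel data (just past the headers) of a bottom-up
   24-bit DIB of the given width and height. */
void dib_put_pixel(uint8_t *bits, size_t width, size_t height,
                   size_t x, size_t y, uint8_t r, uint8_t g, uint8_t b)
{
    assert(bits != NULL && x < width && y < height);
    /* Bottom-up: the last row in memory is the top row on screen. */
    uint8_t *p = bits + (height - 1 - y) * dib_stride(width) + x * 3;
    p[0] = b;   /* byte order in memory is blue, green, red */
    p[1] = g;
    p[2] = r;
}
```

With W = 256 the stride is 768 bytes, so there is no row padding and stepping 3 bytes per pixel works; writing the rows in (H - 1 - y) order would un-flip the palette.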

Laszlo
Interestingly, LoadLibrary does not bring any speedup. I ran Lexikos's computed-address version on my 2 GHz Centrino XP laptop and got the following:

Original: 0.691s
Opt17X: 0.565s
+LoadLibrary: 0.568s
Computed address: 0.375s

That is, the computed-address trick cuts the run time by roughly another third (0.375/0.565 ≈ 0.66). All the optimizations in the AHK script together accelerate the original by less than twofold (0.375/0.691 ≈ 0.54).

Pil
SKAN:

Laszlo was referring to Machine code functions: Bit Wizardry

Wow, this is interesting. I didn't see this before because I don't have internet at home; I can only use it for a short time in the internet café. But I will download this topic and try to learn from it.

Have you ever tried XbLite?

Yes, but I thought it used the same libraries as XBasic, only shortened. I couldn't make a DLL that AHK could read, so I didn't expect XBLite would do any better. Now I see you managed it, and the sun is going to shine again. I will try this first.

Creating and Manipulating a BITMAP is more efficient than SetPixel

Nice example. I will write a HSB color picker with this.

BCX (Basic to C Converter).

Thanks, your post really helped me.

Lexikos:

Thank you for improving and testing.
A pity that you didn't include the time needed for the Visual Basic version, https://ahknet.autoh...lorPickerVB.exe
which may need https://ahknet.autoh...pil/VB40032.DLL
It is still a lot faster than even the bitmap version.
So I still think that calling an external DLL with a specific function will be the solution. I would like to use Laszlo's machine code, but it will take me a long time to manage that.

Lexikos
I'm working on a feature that essentially glues a bunch of machine code together based on the number and type of parameters. My proof-of-concept benchmarked the same as the equivalent built-in function. The end-product should support the same parameter and return types as DllCall.

I'd like to hear any ideas you may have on syntax/usage. (See also DllCall function declaration, but note it will have to be a #directive, not a command.)

I intend to finish the debugging features I was working on before continuing with this project. I got side-tracked...

Laszlo
We should find out what eats up all that time in DllCalls. If it is the data conversion (from AHK strings to binary and back), typed variables would bring a great speedup.

DllCall's in loops could sometimes be handled with a special (repetition) construct for one of the parameters, like:
..."int", j .. j+3...
or
"int", 1 .. 10 .. 2 ; increment by 2
One could use a loop variable, its current value to be accessed (but not changed?) in other parameters
..."int", i := j .. j+3, "int", i<<8...
If the internal implementation keeps the values in binary form, we would save time. But these look hard to implement.
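
To make the proposal concrete, here is a minimal C sketch of what such a repetition construct would have to do internally: call the function once per value of the range and keep the counter in binary the whole time. `call_range` and `record` are hypothetical names; nothing like this exists in DllCall.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*int_callback)(int value, int derived);

/* Call fn for first, first+step, ... up to and including last.
   The second argument mirrors the "i<<8" parameter from the example.
   Returns the number of calls made. */
size_t call_range(int_callback fn, int first, int last, int step)
{
    assert(fn != NULL && step > 0 && first >= 0);
    size_t calls = 0;
    for (int i = first; i <= last; i += step) {
        fn(i, i << 8);
        ++calls;
    }
    return calls;
}

/* Example callback: remembers the last pair it was given. */
int last_value, last_derived;

void record(int value, int derived)
{
    last_value = value;
    last_derived = derived;
}
```

Under this reading, `1 .. 10 .. 2` corresponds to `call_range(record, 1, 10, 2)`, which makes five calls (1, 3, 5, 7, 9), and `j .. j+3` makes four.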

Timers would need other speedup tools. In one of my applications I have a timer that fires every 10 ms. It reads the raw data of four joystick axes with DllCalls. I could not use shorter (multimedia) timers, because the DllCalls were too slow and the timer routine did not finish before it had to start again. Here binary data types could help.

Lexikos

We should find out what eats up all that time in DllCalls. If it is the data conversion (from AHK strings to binary and back), typed variables would bring a great speedup.

My previous benchmark included a test with SetPixel as a built-in function. It was significantly faster, even though the parameters still needed to be converted from strings to binary.

..."int", i := j .. j+3, "int", i<<8...
If the internal implementation keeps the values in binary form,

Expressions already do that. The values are converted to strings when stored in a variable or returned. Parameters of built-in functions are implemented as expression tokens rather than script variables, so they receive the actual integer, floating-point value, string, variable reference, etc. given by the expression.
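
As a generic illustration only (this is not AutoHotkey's source, and the names `token`, `token_kind` and `token_to_int` are invented), a value carried as a tagged token can be read as an integer without any parsing unless it actually is a string:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical token type: a tag plus the value in its native form. */
typedef enum { TOK_INT, TOK_FLOAT, TOK_STRING } token_kind;

typedef struct {
    token_kind kind;
    union {
        long long   as_int;
        double      as_float;
        const char *as_string;
    } value;
} token;

/* Read a token as an integer. Only the string case needs parsing;
   integers and floats are already in binary form. */
long long token_to_int(const token *t)
{
    assert(t != NULL);
    switch (t->kind) {
    case TOK_INT:    return t->value.as_int;
    case TOK_FLOAT:  return (long long)t->value.as_float;
    case TOK_STRING: return strtoll(t->value.as_string, NULL, 0);
    }
    return 0;
}
```

A function receiving such tokens pays the string-conversion cost only for arguments that were strings to begin with.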

Interestingly, LoadLibrary does not bring any speedup.

It is no surprise with SetPixel, since gdi32 is already loaded.

Laszlo

Expressions already do that.

The point was the iterator ".." in parameters. The dll should be called repeatedly with incremented values in the corresponding parameter, kept in binary between calls.

Lexikos
I see. TBH, it doesn't seem worth the development time and complexity. I wonder if it has uses outside DllCall.

Pil

I'd like to hear any ideas you may have on syntax/usage but note it will have to be a #directive, not a command.

I'm a little confused now. Let me first try to understand what you mean by directives.
I know that in the AHK community there are a lot of people just beginning to program, because it's surely a pleasant place to start. So let me say what I think directives are.
Please correct me if I'm wrong.
A directive is so called because it directs the compiler to do something.
Preprocessor directives are commands executed by the preprocessor phase of the compiler, which runs before your code is compiled into object code; they generally act on your source code in some way before it is compiled. They all start with the # character. AHK has many directives that you can place in your script. I think the most used in all languages is the #Include directive, which puts the content of a text file, as code, in the place where you included it.
In Visual Basic we have the #If ... Then ... #Else directive, which conditionally compiles selected blocks of Visual Basic code and is typically used to compile the same program for different platforms. It can also be used to prevent debugging code from appearing in an executable file. Code excluded during conditional compilation is completely omitted from the final executable file.
Similarly, in C++ you can define a symbol like _DEBUG.
The code between the:
#ifdef _DEBUG
// Code for debugging purposes...
#endif // _DEBUG
is only compiled if the symbol _DEBUG is defined.
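
A complete, if contrived, C version of that pattern is sketched here; `process` and `debug_lines_logged` are made-up names. Compiled without `_DEBUG`, the logging lines do not exist in the program at all.

```c
#include <assert.h>
#include <stdio.h>

int debug_lines_logged = 0;   /* stays 0 unless built with _DEBUG defined */

int process(int value)
{
    assert(value >= 0);
#ifdef _DEBUG
    /* Code for debugging purposes: compiled only when _DEBUG is defined. */
    fprintf(stderr, "process(%d)\n", value);
    ++debug_lines_logged;
#endif /* _DEBUG */
    return value * 2;
}
```

With GCC or Clang the symbol can be defined on the command line with `-D_DEBUG`.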
Now I see that in your custom build, AutoHotkey_L, you also used the #if directive, which creates context-sensitive hotkeys and hotstrings.
Maybe it's not that confusing; I will just have to get used to it.
But is this how you rebuilt AHK your way, using preprocessor directives?

See also DllCall function declaration

I vote yes, because I don't want to wait for something that may never be implemented. And I didn't find anything about #define directives on the forum.
(A macro is a name defined by a #define preprocessor directive; it is replaced by some text that will normally be C++ code, but could also be constants or symbols of some kind.)
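
For readers who have not met C macros, a small sketch of both kinds follows. `PACK_COLOR`, `PALETTE_SIZE` and `last_pixel_color` are invented for illustration and reuse the color packing from earlier in the thread.

```c
#include <assert.h>

/* Function-like macro: the preprocessor pastes this text in place of each use. */
#define PACK_COLOR(b, g, r) (((b) << 16) | ((g) << 8) | (r))

/* Object-like macro: a named constant. */
#define PALETTE_SIZE 256

/* Color of the last pixel (x = 255, y = 255) for a given blue value. */
int last_pixel_color(int blue)
{
    assert(blue >= 0 && blue < PALETTE_SIZE);
    return PACK_COLOR(blue, PALETTE_SIZE - 1, PALETTE_SIZE - 1);
}
```

The parentheses around every macro parameter matter, because the substitution is purely textual.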

I am looking forward to your feature that can replace DllCall. I hope it will come soon.
Great work.
I see that I was wrong when I said that DllCall is the only way to extend AHK.
As Laszlo said:

You don’t have to wait for Chris. Write your own preprocessor, based on regular expressions.

However, I will have to learn a lot before I can do that.

Lexikos
Basically, a directive in AutoHotkey tells AutoHotkey to do something immediately as the directive is processed, before the script is fully loaded. Commands generally do something only if and when they are reached during execution of the script. (There are exceptions - for instance, the presence of certain commands will cause the script to be "persistent" by default.)

DllFunction (or whatever) needs to be a directive, as it must generate pseudo-built-in functions before AutoHotkey resolves and validates function calls. This must always happen before the script begins executing.

But is this the way your rebuild AHK your way, using preprocessor directives?

I extend and modify the C++ source code of AutoHotkey which is available from the download page.

And I didn't find anything about #define directives on the forum.

I believe the wish is to give AutoHotkey the same functionality as C/C++ macros.

We should find out what eats up all that time in DllCalls. If it is the data conversion ...

After doing some more benchmarks with a built-in version, I have the following estimates:

Built-in version:
  • 69% spent in gdi32\SetPixel.
  • 8% spent converting parameters.
  • 23% spent "interpreting" the script.

DllCall version:
  • 24% spent in gdi32\SetPixel.
  • 68% spent in DllCall (interpreting type parameters, converting input parameters, etc.)
  • 8% spent "interpreting" the script.

I think that demonstrates speed of data conversion is not the issue.
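
One way to read these estimates: if the absolute time spent inside gdi32\SetPixel is about the same in both versions (an assumption, based on the work done there being identical), the two shares imply how much longer the DllCall version runs overall. A small C sketch of that arithmetic, with the made-up helper `total_time_ratio`:

```c
#include <assert.h>

/* If a fixed amount of work is share_a of run A and share_b of run B
   (both as fractions between 0 and 1), return
   total time of B / total time of A. */
double total_time_ratio(double share_a, double share_b)
{
    assert(share_a > 0.0 && share_a <= 1.0);
    assert(share_b > 0.0 && share_b <= 1.0);
    return share_a / share_b;
}
```

Here `total_time_ratio(0.69, 0.24)` is about 2.9, so by these estimates the DllCall version would take nearly three times as long as the built-in one.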

Pil
Fortunately my (temporary) solution, a data buffer for the parallel port made in an external DLL, is working very well.
Now I have ten times faster access, so I can go on with my project.
Just to show it with the SetPixel example, here is the (really simple to write) specific DLL code in XBLite:
PROGRAM "mydll" 
VERSION "0.0003" 
IMPORT "gdi32" 

DECLARE FUNCTION MyDll () 
EXPORT 
DECLARE FUNCTION PrintSomething (hdc& ,blue&& ) 
END EXPORT 

FUNCTION MyDll () 
IF LIBRARY(0) THEN RETURN 
END FUNCTION 

FUNCTION PrintSomething (hdc& ,blue&& ) 
FOR x = 0 TO 254 
r = r + 1 
g = 0 
FOR y = 0 TO 254 
g = g + 1 
color = (0 << 16) | (y << 8) | x 
SetPixel (hdc&, x, y, color) 
NEXT y 
NEXT x 
END FUNCTION 

END PROGRAM

Along with Mydll.dll https://ahknet.autoh.../~pil/mydll.dll
you have to use xbl.dll (just 58 KB) https://ahknet.autoh...om/~pil/xbl.dll
So the AHK code uses DllCall just once, and on my computer it's even slightly faster than the VB version.

#NoEnv 
SetBatchLines -1 
hModule := DllCall("LoadLibrary", "str", "Mydll.dll") 
Gui,Add,Text, vmy_text x10 y270 h60 w250, Please wait. 
Gui,Show,w259 h290, ColorPicker 

hDC := DllCall("GetDC", UInt, WinExist("ColorPicker")) 

start_time := a_tickCount 
DllCall( "mydll\PrintSomething", "int",hdc, "uint",blue ) 
needed_time := (a_tickCount - start_time)/1000 

GuiControl,,my_text, This drawing is done in %needed_time% seconds 
Return 

GuiClose: 
ExitApp 
return

with a built-in version

?