AHK v2: converting/optimizing scripts

oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

17 Dec 2018, 14:47

vvhitevvizard wrote:
17 Dec 2018, 10:26
Excellent detective work as always! :thumbup: I poked around the memory content of TCC's compiled C code and noticed that I can clone a function with a static variable, but I must call it once first, and the cloned copy retains the last static variable value of the original function as a side effect. However, I did not look into it further to untangle the mess. It seems like you are already on the path to finding a general solution to the problem!
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

17 Dec 2018, 20:01

oif2003 wrote:
17 Dec 2018, 14:47
It seems like you are already on the path of finding a general solution to the problem!
I noticed one more strange thing as well: in the MCode initializer function we call VirtualProtect with the size of the memory needed to hold our mcode, and we give that memory area PAGE_EXECUTE_READWRITE (0x40) access. That's not right in itself: code should only need PAGE_EXECUTE (0x10), or PAGE_EXECUTE_READ (0x20) if strings and other constants are stored right inside the .text code. With the current setting we can do whatever we want, including storing our data (dynamically changing variables) right inside it, or even replacing the whole content of the mcode placeholder with another mcode at run time.
I was tired yesterday and didn't check it, but the simplest hack would be to allocate a bigger memory region with GlobalAlloc + VirtualProtect, so that all the variables addressed relative to the .data offset fall within that area, making your code work as is:

Code:

	int test(void) {
		static int i = 0;
		return ++i;
	}
2.
But that's bad practice. Better to make the caller initialize a structure holding the data for the callee mcode: VarSetCapacity(buffer, size), then call the mcode with the &buffer address as its parameter each time, so the callee stores its intermediate data there. Many simple single-task system functions work this way: they have no static data, just an alloca-style stack buffer for local variables.

We might also put into that structure the addresses of other DLL functions the mcode needs, so that the mcode can make external DLL calls while remaining truly portable.
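
Point 2 can be sketched in C as follows. This is only an illustration under stated assumptions: the names mcode_ctx, test_ctx, test_ctx_scaled and demo_double are hypothetical, and demo_double merely stands in for the address of a real DLL function that the AHK side would store in the structure.

```c
#include <stdint.h>

/* Hypothetical context block owned by the caller. The AHK side would
   allocate it (e.g. with VarSetCapacity) and pass its address on each call. */
typedef struct {
    int64_t counter;                  /* replaces `static int i` */
    int64_t (*scale)(int64_t value);  /* external function supplied by the caller */
} mcode_ctx;

/* Stand-in for a DLL function whose address the caller would fill in. */
int64_t demo_double(int64_t value) {
    return value * 2;
}

/* Same job as test(), but all state lives in *ctx, so the compiled
   function needs no .data section and no absolute addresses. */
int64_t test_ctx(mcode_ctx *ctx) {
    return ++ctx->counter;
}

/* Calls out through the pointer stored in the context instead of an import table. */
int64_t test_ctx_scaled(mcode_ctx *ctx) {
    return ctx->scale(++ctx->counter);
}
```

Because both functions touch memory only through their argument, the compiler has nothing to relocate; the price is that the caller must keep the structure alive between calls.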

3.
Image
Hmm, I figured out that your code snippet actually has 2 candidates for fixups. In the screenshot above there are 2 uint32 placeholders (filled with 4 zero bytes) that the DLL/ELF loader is supposed to replace with the real address of the value. That's a no-go for portable mcode: such addresses should be relative either to the stack (local vars area) or to the code itself; they shouldn't be absolute. I placed the text cursor on the first such fixup placeholder, at 0x4D. Without proper fixups the code just destroys itself (it overwrites the DWORD at 0x5A, so it never returns). :D
Of course we could recreate the full loader logic and patch these fixup placeholders ourselves, but it's better to design the source code to be portable.

By the way, note that it's 7 machine instructions in total, and 4 of them are plainly redundant. It also seems that TCC mixes 32-bit code into the x64 segment (mov ebp,esp should be mov rbp,rsp): it works, but it is error-prone for addresses above 32 bits and suboptimal for compiled code size.
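
For reference, the loader-style patching mentioned in point 3 boils down to something like the sketch below. It is a simplified assumption, not a real loader: it handles only one kind of fixup (a 32-bit absolute slot to which the load address is added), and the name apply_abs32_fixup and its parameters are made up for illustration. Real COFF and ELF relocations come in several types and carry symbol information.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Adds `base` to the 32-bit placeholder stored at `offset` inside `image`.
   Returns 0 on success, -1 if the placeholder does not fit inside the image. */
int apply_abs32_fixup(uint8_t *image, size_t image_len, size_t offset, uint32_t base) {
    uint32_t slot;
    if (offset > image_len || image_len - offset < sizeof slot)
        return -1;
    memcpy(&slot, image + offset, sizeof slot);  /* placeholder may be unaligned */
    slot += base;                                /* zero placeholder becomes `base` */
    memcpy(image + offset, &slot, sizeof slot);
    return 0;
}
```

A 32-bit slot can only hold an address below 4 GB, which is one more reason why relative addressing is the safer design for x64 mcode.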
nnnik
Posts: 4500
Joined: 30 Sep 2013, 01:01
Location: Germany

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 03:05

Regarding 3: there is a lot of logic for handling relocations inside my framework. If we connect both things, they can work together.
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 14:13

nnnik wrote:
18 Dec 2018, 03:05
there is a lot of logic regarding relocations inside my framework. If we connect both things they can work together.
That's the issue: 1000 lines of code for the COFF format, and we would have to add even more to support ELF. It's a bit too complicated for my case, because I know that if I start mcoding some C algorithm I will end up optimizing it in assembly language, for size and performance. And as long as our mcode chunk sits in an Execute+Write+Read memory page, any badly designed C code with lots of relocations can be manually changed to be flat and relative (no fixup table) by placing everything in 1 segment and loading addresses with lea [relative offset] instead of using absolute addresses. Properly designed mcode would have no relocation troubles at all.

I converted the C lines

Code:

int test(void) {
	static int i = 0;
	return ++i;
}
to this (simplified yet working example):

Code:

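use64
	lea	rcx,[i]		; address of i, encoded relative to the code itself
	mov	rax,[rcx]	; load the counter
	inc	rax
	mov	[rcx],rax	; store it back
	ret			; result is returned in rax
i dq 0				; 64-bit counter stored right after the code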
Compiled asm code -> otest.bin.
It has the same logic (static int64 i=0; i++; return i) but uses only relative addresses: 100% portable code. Its output (using FASM without any command line options) is a tiny binary file, just a chunk of x64 code, that can be loaded with the file.RawRead command.

Code:

f:=FileOpen("otest.bin", "r")
VarSetCapacity(b,f.length)
n:=f.RawRead(b, f.length)
f.Close()
p:=DllCall("GlobalAlloc", "uint",0, "uptr",n, "ptr") ;0 = GMEM_FIXED, returns the block address
;copy all n bytes at once: a loop over n//8 eight-byte chunks drops the tail when n is not a multiple of 8
DllCall("RtlMoveMemory", "ptr",p, "ptr",&b, "uptr",n)
DllCall("VirtualProtect", "ptr",p, "uptr",n, "uint",0x40, "uintp",op) ;0x40 = PAGE_EXECUTE_READWRITE
msgbox(DllCall(p))
msgbox(DllCall(p))
msgbox(DllCall(p))
Output: the three message boxes show 1, 2 and 3.
Attachment: otest.7z (149 Bytes)
Last edited by vvhitevvizard on 18 Dec 2018, 16:24, edited 10 times in total.
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 14:25

But I don't understand why this (without allocating a new memory chunk and copying the data into it) doesn't work:

Code:

VarSetCapacity(b,f.length)
n:=f.RawRead(b, f.length)
DllCall("VirtualProtect", "ptr",&b, "uint",n, "uint",0x40, "uintp",op)
msgbox(DllCall(&b))
Last edited by vvhitevvizard on 18 Dec 2018, 15:40, edited 2 times in total.
oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 14:33

Would it be easier to figure out how to manually load a DLL? As in straight from embedded binary to memory without writing to disk like MCode. If you use a good compiler, there would be less optimization to do, and perhaps that is acceptable.
https://www.joachim-bauch.de/tutorials/ ... om-memory/
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 14:35

oif2003 wrote:
18 Dec 2018, 14:33
Would it be easier to figure out how to manually load a DLL?
Good link.
But again, the issue here is badly designed mcode. Wouldn't it be easier to make the mcode portable in the first place? :) For C: local vars, DLL function pointers passed as args from AHK, no static or global vars (the mcode's non-volatile data is stored in a structure on the AHK side).

For FASM output, I have already simplified it to the extreme: the output is just binary mcode, with no redundant data to process.

And I do believe that if we start mcoding, we'd better go this route: C output -> disassembler (the output format doesn't matter: DLL, EXE, OBJ, ELF) -> removing redundant machine instructions, moving data into the code segment, proceeding with further optimizations -> mcode chunk as a simple binary file or base64 string
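
The last step of that route, turning the finished binary chunk into a base64 string for embedding in a script, could look roughly like this. It is a build-side helper sketch rather than mcode itself (it uses a static lookup table, which is exactly what portable mcode should avoid), and the name base64_encode is an assumption for illustration. It uses the standard base64 alphabet with '=' padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes `len` bytes from `src` as base64 into `dst` and NUL-terminates it.
   `dst` must hold at least 4 * ((len + 2) / 3) + 1 bytes.
   Returns the number of characters written, excluding the NUL. */
size_t base64_encode(const uint8_t *src, size_t len, char *dst) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t i = 0, o = 0;

    while (i + 2 < len) {                        /* full 3-byte groups */
        uint32_t v = ((uint32_t)src[i] << 16) | ((uint32_t)src[i + 1] << 8) | src[i + 2];
        dst[o++] = tbl[(v >> 18) & 63];
        dst[o++] = tbl[(v >> 12) & 63];
        dst[o++] = tbl[(v >> 6) & 63];
        dst[o++] = tbl[v & 63];
        i += 3;
    }
    if (i < len) {                               /* 1 or 2 trailing bytes */
        int two = (i + 1 < len);
        uint32_t v = (uint32_t)src[i] << 16;
        if (two)
            v |= (uint32_t)src[i + 1] << 8;
        dst[o++] = tbl[(v >> 18) & 63];
        dst[o++] = tbl[(v >> 12) & 63];
        dst[o++] = two ? tbl[(v >> 6) & 63] : '=';
        dst[o++] = '=';
    }
    dst[o] = '\0';
    return o;
}
```

Base64 grows the data by about a third, so for a chunk of a few dozen bytes the embedded string stays tiny.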
Last edited by vvhitevvizard on 18 Dec 2018, 14:50, edited 2 times in total.
oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 14:46

If you can reliably load a DLL manually, the advantage of MCode may become marginal as long as the code was written properly using a good compiler (not TCC for this task). I think it is a reasonable trade off that leverages existing solutions. As an interim solution, I would just write the DLL to disk, load it the normal way and get function pointers for speed. The MCode project can take some time to perfect and at least with a DLL you have the C/C++ code you need for generating MCode later. As for the modularity of MCode vs DLL, you can either compile the functions you need for each project, or save them as separate DLLs. If you can embed a single DLL, you can embed any arbitrary number of them.

Perhaps you should examine compiled code from different compilers to see if hand optimization is really necessary?
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 14:57

oif2003 wrote:
18 Dec 2018, 14:46
If you can reliably load a DLL manually, the advantage of MCode may become marginal
I was about to proceed with json.Get optimizations using fixed (hardcoded) mcode chunks, and you say "marginal". What's the point of a DLL storing mcode chunks that have no use for other applications, only for the function they were written for? :)
oif2003 wrote:
18 Dec 2018, 14:46
Perhaps you should examine compiled code from different compilers to see if hand optimization is really necessary?
I've done that many times, back in the 1990s and 2000s. No compiler can create machine code that cannot be further optimized manually. A few years ago I had the idea of optimizing AHK's ImageSearch with SSE4 or AVX instructions, but gave it up for lack of time.
Last edited by vvhitevvizard on 18 Dec 2018, 15:01, edited 2 times in total.
oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:00

We can always use separate DLLs or compile specific version for each task. Given the amount of memory and storage most systems that run AHK v2 have, the savings may not be worth the effort.
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:10

oif2003 wrote:
18 Dec 2018, 15:00
We can always use separate DLLs or compile specific version for each task. Given the amount of memory and storage most systems that run AHK v2 have, the savings may not be worth the effort.
I meant that for json.Get it will be fixed-function logic in mcode, unusable for other tasks. There is no point in putting it into a DLL for reuse and forcing a satellite file to ship with the script: script portability suffers. That's the reason why small scripts base64-encode resources like icons, to keep everything in 1 file. Makes sense :D
Last edited by vvhitevvizard on 18 Dec 2018, 15:12, edited 1 time in total.
oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:11

DLL != satellite file. We can embed them just like MCode if push comes to shove. And there is also potential of being able to load them directly ourselves without ever writing to disk... once someone figures out how to do it reliably.
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:13

oif2003 wrote:
18 Dec 2018, 15:11
DLL != satellite file. We can embed them just like MCode if push comes to shove.
An empty DLL is about 3k of redundant data. I managed to create a 1k empty DLL, but it's rather a hack with zeroed sections that overlap each other.
Last edited by vvhitevvizard on 18 Dec 2018, 15:14, edited 1 time in total.
oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:13

Compression API takes care of most of that. But of course none of this addresses your core concern. Your hand-tuned version will probably always be faster.
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:15

oif2003 wrote:
18 Dec 2018, 15:13
Compression API takes care of most of that.
With such logic we get bloated stuff all over the place. We started mcoding in order to optimize in the first place.
oif2003 wrote:
18 Dec 2018, 15:11
And there is also potential of being able to load them directly ourselves without ever writing to disk... once someone figures out how to do it reliably.
BTW, I would make a simple analyzing AHK function that shows the number of relocation placeholders and imported DLL calls in compiled c/cpp/whatever code designed to be used as mcode. Just to let people check how good or bad their mcode design is. :D
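
As a rough idea of what the relocation half of such an analyzer involves, here is a C sketch for COFF object files (.obj/.o), where each 40-byte section header carries a 16-bit NumberOfRelocations field at offset 32. The function name coff_count_relocations is hypothetical. The sketch does not count imports, does not handle the extended relocation count used when a section has more than 65535 relocations, and does not apply to linked PE images (DLL/EXE), which have a different layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian 16-bit value without assuming alignment. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Sums NumberOfRelocations over all section headers of a COFF object.
   Returns -1 if the buffer is too short for the headers it declares. */
long coff_count_relocations(const uint8_t *obj, size_t len) {
    enum { FILE_HDR = 20, SECT_HDR = 40, NRELOC_OFF = 32 };
    size_t nsect, table, s;
    long total = 0;

    if (len < FILE_HDR)
        return -1;
    nsect = rd16(obj + 2);               /* NumberOfSections */
    table = FILE_HDR + rd16(obj + 16);   /* section table follows the optional header */
    if (table > len || (len - table) / SECT_HDR < nsect)
        return -1;
    for (s = 0; s < nsect; s++)
        total += rd16(obj + table + s * SECT_HDR + NRELOC_OFF);
    return total;
}
```

A result of 0 for the section that holds the function would suggest the code can be copied out and run at any address; anything above 0 means some placeholder still needs a loader.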
Last edited by vvhitevvizard on 18 Dec 2018, 15:22, edited 1 time in total.
oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:18

Not exactly, since decompression and loading to memory are one time overheads. Based on the way you benchmarked the functions, I assume you won't be calling the script multiple times, thus loading/unloading the files over and over.
oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:25

vvhitevvizard wrote:
18 Dec 2018, 15:15
BTW, I would make a simple analyzing AHK function that shows number of relocated placeholders and imported DLL calls for compiled c/cpp/whatever code designed to be used as mcode. Just to let ppl check how good/bad their design of mcode. :D
I like that idea! :D
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 15:32

Could you please test it out (I added the compiled .bin for convenience)
https://www.autohotkey.com/boards/viewt ... 48#p253748
And regarding the next post, maybe you know why we can't just initialize a buffer with VarSetCapacity, load the raw data into it, and use it as a function via &buffer?
The only way it works is by allocating new memory via GlobalAlloc and copying the data over.
oif2003
Posts: 214
Joined: 17 Oct 2018, 11:43

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 16:09

Hey, what starting address were you using from otest.o? I tried 0x40 with varying lengths, loading the full file and calling +0x40, but neither worked. I must be doing something wrong.
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: AHK v2: converting/optimizing scripts

18 Dec 2018, 16:18

oif2003 wrote:
18 Dec 2018, 16:09
Hey, what starting address were you using from otest.o? I tried 0x40 with varying lengths
otest.bin (the compiled asm, not the file produced from the compiled C code)!
Just use the AHK script that follows in the same post. There is no set virtual base address: the code works at any address, since it has only relative memory offsets. And the .bin file is just the mcode itself; it's not ELF or OBJ, and it's stripped of all redundant data. You load it and use its base address (offset 0) as the function pointer.
Last edited by vvhitevvizard on 18 Dec 2018, 16:22, edited 1 time in total.
