Binary Data | VarSetCapacity | VarSetLength | Heap Object

Discuss the future of the AutoHotkey language
lexikos

Re: Binary Data | VarSetCapacity | VarSetLength | Heap Object

07 Jul 2018, 18:21

nnnik wrote: Then are you planning on implementing a hard-coded version for every possible struct that is out there?
I don't know how you came up with that crazy idea. I don't believe I said anything about my plans aside from keeping NumGet/NumPut.
nnnik wrote: Though I want to get back on topic, which is how a user would implement a binding to a COM object with an IUnknown interface, using NumGet and using the stream object (or the struct object, if it is better suited).
I think you are still deviating from the original topic, and that it would be a misuse of the hypothetical binary/struct interface.
nnnik wrote: To me it makes much more sense to read all the values you need from the table once and then put them into variables.
Unless you are working with a singleton object, you would need a unique set of variables for each object/instance. There is no guarantee that any two objects implementing a given interface have the same implementation.
nnnik wrote: If all functions are read then the stream object is superior.
You would read all functions at some point, then cache them somewhere, to be later retrieved and called. In that case, you must compare the retrieval mechanism to NumGet, and add on top of that the initial caching overhead. Caching must be done independently for each object (or if you want to bypass the contract and rely on implementation, for each v-table address), which may not be feasible let alone efficient, depending on where the object comes from and how it is used. The cache must somehow outperform the original v-table (a straightforward array of integers) far enough to outweigh the initial cost of building the cache.
nnnik

08 Jul 2018, 00:31

Well, as soon as you build any abstraction of the object in AutoHotkey, this overhead happens anyway. I don't seriously believe that you want people to use non-descriptive NumGet combinations in their code.
So as soon as we use any sort of object to get a table offset, or to access the table at an offset using NumGet, we have more overhead at runtime than just storing the result of the NumGet inside an object.
The only way around the object is using a set of variables. But maintaining a set of variables for every COM object that you use, and making them consume the super-global scope, is not really an option for me.

When using the objects you have several options:

Offset-Object:
In this design you store the offset of a function in the COM object's v-table inside an object.
Then you use NumGet + DllCall every time you want to call it, e.g.:

Code: Select all

vTablePtr := ...
vTableOffsets := {release:A_PtrSize*2, ...}
DllCall(NumGet(vTablePtr+vTableOffsets["anyMethod"], "Ptr"), ...)
This method brings the full overhead of the NumGet and the object access.
The advantage is that you don't need to prepare anything special for each object.

Pointer-Object: (the thing I thought about when making my last post)
In this design you store the result of the NumGet inside the object.
Then you use DllCall every time you want to call the function that's stored inside, e.g.:

Code: Select all

vTablePtr := ...
functionPtr := {release:numGet(vTablePtr+A_PtrSize*2,"Ptr"), ...}
DllCall(functionPtr["anyMethod"], ...)
This method only needs the overhead of the object access and saves the NumGet.
The disadvantage is that you need to save every function pointer for every COM object inside a new functionPtr object.

Class-Wrapper:
In this design you just use the non-descriptive NumGets with fixed offsets in several class methods.
Then you just use the method every time you want to access the object:

Code: Select all

vTablePtr := ...
wrappedInstance := new comWrapper(vTablePtr)
wrappedInstance.anyMethod(...)

class comWrapper {
	__new(vTablePtr) {
		this.vTablePtr := vTablePtr
	}
	release(..) {
		DllCall(numGet(this.vTablePtr+A_PtrSize*2,"Ptr"),...)
	}
}
This method has a lot of hidden overhead and is slower than the other two, though it is also easier to use.
It has two object accesses and a NumGet.

There are several more designs; however, after thinking about it in depth, it seems that creating a special struct for each v-table is the best option after all.
Helgef

08 Jul 2018, 01:46

I only did hello com, but I used bound funcs, it seemed logical at the time. Today, I'd probably use closures.
Anyway, this seems a bit off topic. I'd very much like to get an answer to the original question, that is, what's the plan? I guess there isn't one.

Cheers.
lexikos

11 Jul 2018, 00:53

Helgef, there is no plan. Only vague intentions.

nnnik, my point remains that the stream/binary object abstraction is unsuitable for COM interfaces. It is no more "descriptive" than NumGet; you still specify the offset, data type and operation, but take longer (more steps) and allocate more resources to do it. You would be adding overhead for no good reason. By contrast, the objects in your last post at least provide some benefit, but are not what we were discussing.

For the record, you are missing the "get v-table from object" step. Given a COM interface pointer pobj and a function bin() which returns a binary/stream object wrapping a given address, a call would be more like this:

Code: Select all

bobj := bin(pobj)
bvt := bin(bobj.ReadUPtr())  ; No seek needed to read at offset 0.
bvt.seek(3*A_PtrSize)
DllCall(bvt.ReadUPtr(), "ptr", pobj)
You can compact it, but still, I cannot see any advantage over NumGet, only disadvantages.
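
For comparison, here is a minimal sketch of the same two steps (object to v-table, v-table to function) done with plain NumGet. It assumes a v2.0 release build and uses Scripting.Dictionary purely as a convenient COM object to obtain a pointer from; only the IUnknown slots (AddRef at index 1, Release at index 2) are called, since they take no arguments beyond the interface pointer.

```autohotkey
#Requires AutoHotkey v2.0

dict := ComObject("Scripting.Dictionary")   ; assumption: any COM object would do here
pobj := ComObjValue(dict)                   ; raw interface pointer held by the wrapper
vtbl := NumGet(pobj, 0, "ptr")              ; first field of the object is the v-table address

; Each v-table slot is one pointer wide, so slot N lives at N * A_PtrSize.
refs := DllCall(NumGet(vtbl, 1 * A_PtrSize, "ptr"), "ptr", pobj, "uint")  ; IUnknown::AddRef
refs := DllCall(NumGet(vtbl, 2 * A_PtrSize, "ptr"), "ptr", pobj, "uint")  ; IUnknown::Release

MsgBox "Reference count after the balanced AddRef/Release: " refs
```

The AddRef and Release calls are balanced, so the wrapper's own reference is left untouched. v2.0 also has a built-in ComCall, which does the v-table lookup internally.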
nnnik

11 Jul 2018, 01:19

Tell me a single good reason why we shouldn't have a method named bin.ReadBin().
And once again, that is assuming you plan to do random access with the binary object, which is not necessary, as we discussed above.
Especially if we use closures as Helgef suggested. The step of getting the v-table from the object is constant time and a one-time effort for both methods, so I neglected to discuss it.

I can see your point about NumGet being better for random access, though. I still think that creating a typed struct would be better for pObjs than NumGet, e.g.:

Code: Select all

struct pObj {
	Ptr[]* vTable;
}

COMObj := (new) pObj( ... ) ;would you use new here?
DllCall(COMObj.vTable[3], ...)
In comparison to that the NumGet option seems like someone had a seizure on the keyboard.
lexikos

11 Jul 2018, 22:06

nnnik wrote: Tell me a single good reason why we shouldn't have a method named bin.ReadBin().
Why should I? I merely provided an example based directly on your own example. Even combining seek + read, it is still more complex than NumGet, still with no apparent advantage.
nnnik wrote: I still think that creating a typed struct would be better for pObjs than NumGet e.g.
For this, I disagree. Having written numerous scripts dealing with COM object pointers, I would never wish to do it the way you presented.
nnnik

12 Jul 2018, 00:11

Why would anybody leave the values in v-Table in the first place?
Flipeador

18 Jul 2018, 16:14

What I propose is:
  • Use VarSetCapacity only for strings and add the Encoding parameter.
  • Add a HeapCreate function (or any other name) that allows us to reserve memory (in an efficient way) and create a Heap object by passing a memory address (the instance variable named ptr will contain the memory address of the buffer).
  • This Heap object would have several methods to work with binary data (including the Read/WriteNum methods). WriteNum method should return the same Heap object, but in the next position depending on the data type written, then we can continue 'concatenating' more WriteNum. We could also use the Array syntax to return the memory address at the specified offset. The reserved memory should not be maintained once all the references to the Heap object have been deleted, to improve the performance we can simply declare the Heap object as static.
  • Maintain the NumGet and NumPut functions (It is still useful to modify a Buffer when there is no need to create a Heap object from it, or in COM).
Regarding struct: it must be discussed in another topic. I would classify it as low priority, it can be added in the future without the need to break any script.
Helgef

19 Jul 2018, 02:00

What would the encoding parameter imply?
Flipeador wrote: to improve the performance we can simply declare the Heap object as static.
I guess that is ok in some cases, but it is not great, for example, in case of an interrupting thread calling the function.

Cheers.
Flipeador

04 Aug 2018, 22:33

Helgef wrote: What would the encoding parameter imply?
I don't know. I think I got confused. Let's forget that parameter.
Helgef wrote: I guess that is ok in some cases, but it is not great, for example, in case of an interrupting thread calling the function.
You're right. I imagine that in these cases we can simply use VarSetCapacity. Or maybe we can also include VarSetLength. No idea.
I just realized a strange behavior with ObjSetCapacity.
According to the documentation, it only accepts two parameters, but:

Code: Select all

obj1 := { Buffer: "" }
obj1.SetCapacity("Buffer", 1)

obj2 := { Buffer: "" }
obj2.SetCapacity("Buffer", 1, 0)      ; "0" ?

obj3 := { Buffer: "" }
obj3.SetCapacity("Buffer", 1, 255)    ; "255" ?

MsgBox NumGet(obj1.GetAddress("Buffer"),"UChar")
     . "`n" . NumGet(obj2.GetAddress("Buffer"),"UChar")   ; it is always zero - behaves exactly like the third parameter of VarSetCapacity (FillByte)
     . "`n" . NumGet(obj3.GetAddress("Buffer"),"UChar")   ; but in this case it seems to have no effect
lexikos

04 Aug 2018, 23:16

Flipeador wrote: I just realized a strange behavior with ObjSetCapacity.
There is some strange behaviour, but you have misinterpreted it.

If the field's capacity is increasing, the new portion (the whole in this case) is not initialized. It will contain whatever realloc left it with, which is probably whatever was there before realloc was called.
realloc wrote: realloc does not zero newly allocated memory in the case of buffer growth.
The parameters have no effect on this, except by affecting the final capacity.

The actual strange behaviour (bug) is that SetCapacity uses its last parameter as the new capacity to set. In your case, that is either the second or third parameter. So obj1's Buffer has a capacity of 1, obj2 has 0, and obj3 has 255.

The error was introduced by commit 0f9f2b3d, which changed the method to tolerate surplus parameters (like dynamic function calls). Prior to this, SetCapacity silently aborted if the parameter count was neither 1 nor 2, so the last parameter was always the correct one. The simplest fix is to change aParam[aParamCount - 1] to aParam[aParamCount > 1].

Fields with a capacity of 0 do not have their own memory; they point to a shared null-terminated string (unless it has been corrupted).
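
The index arithmetic behind the bug and the fix can be sketched as follows. This only mirrors the two C++ index expressions quoted above (zero-based, as in the C++ source); OldIndex and NewIndex are hypothetical names used for illustration, and the script is written for a v2.0 release build.

```autohotkey
#Requires AutoHotkey v2.0

; Zero-based index of the parameter that gets used as the new capacity.
OldIndex(paramCount) => paramCount - 1      ; aParam[aParamCount - 1]: always the last parameter
NewIndex(paramCount) => (paramCount > 1)    ; aParam[aParamCount > 1]: 0 for (capacity), otherwise 1

report := ""
for n in [1, 2, 3]
    report .= n " parameter(s): old index " OldIndex(n) ", fixed index " NewIndex(n) "`n"
MsgBox report
```

With one or two parameters both expressions pick the same index; with a surplus third parameter the old expression picks index 2 (the surplus value) while the fixed one stays at index 1 (the capacity).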
Flipeador

15 Sep 2018, 21:28

I've been thinking more about a Struct object, and I've come to a conclusion: we definitely need a Struct object.
Instead of adding a Heap object, we can simply use: StructCreate("uchar[1024]") or s:=StructCreate()`ns.SetCapacity(1024).
I've been seeing some things about Direct2D, Direct3D and OpenGL, and I've found several structures. Trying to implement everything using VarSetCapacity and NumPut/Get is a headache. Trying to "emulate" structures using classes is horrible.
I think this would be a very good solution, while we see what will be done with VarSetCapacity.
@lexikos Thanks for your last comment. And yes, I misinterpreted it.
lexikos

15 Sep 2018, 23:54

A simple binary buffer object would be much easier to implement, and serves some different purposes than a struct (which has a specific structure). It serves as an immediate replacement for VarSetCapacity, SetCapacity and GetAddress, while still having its place after struct support is added. Struct support can be added after v2.0, which gives it a better chance of being done properly.

I think that the s.SetCapacity(1024) example makes little sense: this is not a struct. But then, I think "Heap object" is also a misnomer; your idea seems to have very little to do with heap data structures or heap memory allocations. If it can accept an address, it can utilize memory from totally different sources.
iseahound

24 Sep 2018, 20:29

Does that mean we're getting memory arrays? +1
lexikos

11 May 2019, 02:16

nnnik wrote (13 Jan 2019, 14:44): I think the best option here is to force the users to define a data type in all three cases and remove a default type.
Perhaps the main reason it isn't that way is historical. The most common calls are probably Int or Ptr for DllCall and UInt or UPtr/Ptr for NumPut/NumGet, but before x64 builds, they were the same thing. It's easy to get DllCall and NumPut/NumGet wrong, and it can be hard to track down the problem when they are wrong, so being explicit is a good thing.

I started working on a buffer object as a very simple replacement for SetCapacity/GetCapacity/GetAddress, and found that I had already made one: the ClipboardAll object; except:
  • Size is read-only, whereas SetCapacity allows a buffer to be resized.
  • It isn't accepted by the "var or address" buffer parameter of any built-in function.
So after making Size read/write, I naturally tested with NumPut, adding a clipboard format to the ClipboardAll object. This was part of the code:

Code: Select all

; Put the format code f, size 4, value 42, and null-terminator (right-to-left)
NumPut(0, NumPut(42, NumPut(4, NumPut(f, c.Ptr, offset, "uint"), "uint"), "uint"), "uint")
My first thought was that this was part of why the File object makes the type part of the method name; like:

Code: Select all

PutUint(0, PutUint(42, PutUint(4, c.Ptr, offset)))
But then it struck me that a simple loop and some minor adjustments is probably all it would take to allow this:

Code: Select all

NumPut("uint", f, "uint", 42, "uint", 0, c.Ptr, offset)
I think something like this was suggested before, in the form of a "make struct" function. Since it would probably be used mostly for structs, perhaps it should pad to default C alignment values by default (with the single-value mode available to override it).
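
To make "default C alignment" concrete, here is a rough sketch of the offsets such padding would produce. AlignedOffsets is a hypothetical helper, not a built-in, and it assumes the common rule that each scalar is aligned to its own size and the struct size is rounded up to the largest member; it is written for a v2.0 release build.

```autohotkey
#Requires AutoHotkey v2.0

; Hypothetical helper: compute naturally aligned offsets for a list of type names.
AlignedOffsets(types*) {
    static sizes := Map("char", 1, "uchar", 1, "short", 2, "ushort", 2,
                        "int", 4, "uint", 4, "float", 4,
                        "int64", 8, "double", 8)
    offsets := [], pos := 0, maxAlign := 1
    for t in types {
        key := StrLower(t)
        size := sizes.Has(key) ? sizes[key] : A_PtrSize   ; anything else is treated as Ptr/UPtr
        pos := (pos + size - 1) // size * size            ; round up to a multiple of size
        offsets.Push(pos)
        pos += size
        maxAlign := Max(maxAlign, size)
    }
    total := (pos + maxAlign - 1) // maxAlign * maxAlign  ; tail padding
    return {offsets: offsets, size: total}
}

layout := AlignedOffsets("uchar", "uint", "ushort")
MsgBox "offsets: " layout.offsets[1] ", " layout.offsets[2] ", " layout.offsets[3]
     . "`nsize: " layout.size   ; offsets 0, 4, 8 and size 12
```

Without padding the same three values would sit at offsets 0, 1 and 5, which is why a struct-oriented mode and a packed single-value mode would behave differently.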

This ordering is a lot better for chaining even without the internal loop, since it puts the type and value together. It is also reminiscent of DllCall, and similar ordering to a type cast or parameter type declaration.

And it reminded me that nnnik suggested the type parameter be made mandatory, although it could be implemented in a way that allows the type to be optional, and even without breaking scripts.

I'm not too sure about how NumGet should look, though. Keep in mind that it would not be intended as a substitute for struct support.


Someone suggested removing the offset parameter. Although I could have used c.Ptr + offset in this case, the offset parameter provides an important utility in other cases: validation. NumPut knows how big each value is, so if you also let it know how big the buffer is (by passing a variable reference or buffer object), it can prevent you from overrunning the buffer.
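
A small sketch of that validation, assuming a v2.0 release build (where Buffer() replaced BufferAlloc) and assuming, as I understand the documented behaviour, that NumPut throws when a write through a Buffer object would pass the end of the buffer:

```autohotkey
#Requires AutoHotkey v2.0

buf := Buffer(8, 0)                    ; 8 bytes, zero-filled
NumPut("uint", 1, "uint", 2, buf)      ; two 4-byte values fill the buffer exactly

try {
    NumPut("uint", 3, buf, 8)          ; offset 8 + 4 bytes would overrun the 8-byte buffer
    MsgBox "Write succeeded (not expected)"
} catch Error as e {
    MsgBox "Write rejected: " e.Message
}

; Passing a bare address such as buf.Ptr + 8 gives NumPut no size to check against.
```

The check is only possible because the buffer object and the offset are passed separately; once they are folded into a single address, the size information is lost.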
swagfag

12 May 2019, 12:09

+1 Heap something

i tried to make a function such as this one:

Code: Select all

makeRect(x, y, w, h) {
	VarSetCapacity(Rect, 16, 0)
	NumPut(x, Rect, 0, "Int")
	NumPut(y, Rect, 4, "Int")
	NumPut(w, Rect, 8, "Int")
	NumPut(h, Rect, 12, "Int")

	return Rect
}
but it doesnt work. basically @Flipeador's second example
u could make it ByRef but if i wanted to cache the struct between function calls, id have to use one of these
https://docs.microsoft.com/en-us/windows/desktop/memory/comparing-memory-allocation-methods
lexikos

12 May 2019, 19:23

I did say in my previous post that I was working on it, didn't I? ;)

You can use this with v2.0-a103:

Code: Select all

makeRect(x, y, w, h) {
    ; VarSetCapacity(Rect, 16, 0)
    Rect := BufferAlloc(16)
    NumPut(x, Rect, 0, "Int")
    NumPut(y, Rect, 4, "Int")
    NumPut(w, Rect, 8, "Int")
    NumPut(h, Rect, 12, "Int")
    return Rect
}
or this:

Code: Select all

makeRect(x, y, w, h) {
    Rect := BufferAlloc(16)
    NumPut("int", x, "int", y, "int", w, "int", h, Rect)
    return Rect
}
For the Buffer object, I implemented only the very bare minimum to replace SetCapacity/GetAddress for now.
swagfag

12 May 2019, 19:55

oh wow, that was fast
cool cool @lexikos
Helgef

13 May 2019, 03:45

lexikos wrote: some minor adjustments is probably all it would take to allow this:

Code: Select all

NumPut("uint", f, "uint", 42, "uint", 0, c.Ptr, offset)
lexikos wrote: Keep in mind that it would not be intended as a substitute for struct support.
So if it is not for structs, then what is it for? I can only imagine arrays as a likely common case, so I wonder why we need to repeat the type. I think I'd prefer numput [...], adr, o, type or puttype [...], adr, o, where [...] could mean an (AHK) array of numbers, multiple parameters, or both; the order could be different.

I think writetype and readtype would be a good replacement for numgetput.

Cheers.
Flipeador

13 May 2019, 04:45

I think, as lexikos said above, it should pad to default C alignment values by default.
The last parameter could be a Boolean value to deactivate the automatic padding.

Also, if several types are specified in NumGet, it could return an Array.
Array := NumGet(Target [, Offset := 0][, Type := "UPtr"] [, Type2, ...] [, Padding? = true])
Something like this:

Code: Select all

Rect := NumGet(Rect, "Int", "Int", "Int", "Int", false)
Left := Rect[1], Top := Rect[2], Right := Rect[3], Bottom := Rect[4]
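
Until something like that exists, the proposed behaviour can be approximated in script. The following is only a sketch for a v2.0 release build: NumGetMany is a hypothetical helper (not a built-in), it reads consecutive values with no padding, and it takes the offset as an explicit second parameter.

```autohotkey
#Requires AutoHotkey v2.0

; Hypothetical helper: read consecutive, unpadded values and return them as an Array.
NumGetMany(target, offset, types*) {
    static sizes := Map("char", 1, "uchar", 1, "short", 2, "ushort", 2,
                        "int", 4, "uint", 4, "float", 4,
                        "int64", 8, "double", 8)
    values := []
    for t in types {
        values.Push(NumGet(target, offset, t))
        key := StrLower(t)
        offset += sizes.Has(key) ? sizes[key] : A_PtrSize   ; Ptr/UPtr are pointer-sized
    }
    return values
}

rect := Buffer(16, 0)
NumPut("int", 10, "int", 20, "int", 110, "int", 220, rect)

r := NumGetMany(rect, 0, "int", "int", "int", "int")
MsgBox "Left " r[1] ", Top " r[2] ", Right " r[3] ", Bottom " r[4]
```

Because the buffer object itself is passed to NumGet on every read, each access is still checked against the buffer's size.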
