Choose naming and syntax for built-in Extract/InsertInteger


100 replies to this topic

Poll: Pass "VarOrAddress, Offset" (20% faster and possibly less error-prone) vs. "&Var + Offset" (fewer parameters and more pure)? (13 member(s) have cast votes)

  1. I strongly prefer passing the single parameter: &Var + Offset (0 votes [0.00%])
  2. I slightly prefer passing the single parameter: &Var + Offset (3 votes [23.08%])
  3. I strongly prefer passing two parameters: VarOrAddress, Offset (4 votes [30.77%])
  4. I slightly prefer passing two parameters: VarOrAddress, Offset (3 votes [23.08%])
  5. No preference. (3 votes [23.08%])

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004

I suppose the Type/Size parameter is mostly omitted in actual usage.

NumSet(&var + 4, 0x8000)
NumGet(&var + 4)

Yes, I see your point. So it would be:
NumGet(Address, Type = "UInt") 
NumPut(Address, Number, Type = "UInt")
At some point, I'll probably reset the poll to a new one that includes your Set/Get names.

One thing I just realized is that if only a pure address is accepted, the script is likely to crash whenever the caller forgets the ampersand (&). This could be solved by putting in an exception handler, but that is likely to significantly impact performance.
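The two calling styles can be sketched with the functions as they later shipped in AutoHotkey v1, NumPut(Number, VarOrAddress, Offset, Type) and NumGet(VarOrAddress, Offset, Type), which need not match any signature proposed in this thread. The buffer name buf is only illustrative.

```autohotkey
; Minimal sketch, assuming the NumGet/NumPut built-ins of AutoHotkey v1.
VarSetCapacity(buf, 8, 0)            ; 8-byte buffer, zero-filled

NumPut(0x8000, buf, 4, "UInt")       ; "VarOrAddress, Offset" style: variable plus offset
a := NumGet(buf, 4, "UInt")          ; reads 4 bytes starting at offset 4 of buf
b := NumGet(&buf + 4, 0, "UInt")     ; "&Var + Offset" style: same memory location

MsgBox % "a = " a ", b = " b         ; both read 32768

; The hazard discussed above: writing buf + 4 without the ampersand would
; treat the buffer's CONTENTS as a number and add 4 to it, which is not a
; valid address. That call is deliberately not executed here.
```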

Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007

One thing I just realized is that if only a pure address is accepted, the script is likely to crash whenever the caller forgets the ampersand (&). This could be solved by putting in an exception handler, but that is likely to significantly impact performance.

Personally, I'll take the risk of a crash.
It's my own fault if I omit the &, and a script needs testing anyway before going into real use.

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004
I'll benchmark the exception handler to find out its impact. If the impact is minimal, maybe it can even throw up an error dialog to help with debugging.

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
I see there are a lot of answers, but I will reply before reading them...
I went for NumGet / NumPut because:
- Extract and, even more so, Insert seem inappropriate. We don't insert a number in a buffer, we write it, and overwrite the existing value: no shift of data.
- The encode / decode pair is too technical and not even correct for me.
OK, now to read the other opinions... :-)

[EDIT 1] I have read Chris' message, and I don't agree...

GetNum(VarOrAddress, Offset = 0, SignedOrFloat? = 0, Size = 4)
PutNum(Number, VarOrAddress, Offset = 0, Size = 4) ; Best parameter order?

For GetNum, I would put the size before the SignedOrFloat? flag: we have to specify a size more often than to use a signed number.
For PutNum, I would put the VarOrAddress param first, for consistency.

The whole reason to support raw addresses instead of simply address-of-variables is flexibility: it seems conceivable the caller may have a raw address from somewhere but not know which variable it points into.

I can't think of a single case where a user would pass a hard-coded numerical address...
And if a DllCall returns a memory address (I do a GlobalAlloc, then a GlobalLock call to get the address of the buffer in my SetClipboardData function), how do we use it? Can we write PutNum(%pMem%, ...) for example?
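One way this could look, sketched with the built-ins as they later shipped in AutoHotkey v1.1 (not with any signature proposed in this thread): a variable that merely contains an address is passed as an expression such as pMem + 0, so that the number it holds is used as the address. The variable names and the 16-byte size are illustrative, and the "Ptr"/"UPtr" types assume v1.1 or later.

```autohotkey
; Minimal sketch, assuming AutoHotkey v1.1+ and its NumGet/NumPut built-ins.
hMem := DllCall("GlobalAlloc", "UInt", 0x42, "UPtr", 16, "Ptr")   ; 0x42 = GHND (movable, zero-init)
pMem := DllCall("GlobalLock", "Ptr", hMem, "Ptr")                 ; raw address of the block

if (pMem)
{
    NumPut(1234, pMem + 0, 4, "UInt")       ; write at raw address + offset 4
    value := NumGet(pMem + 0, 4, "UInt")    ; read it back from the same place
    MsgBox % "Read back: " value
    DllCall("GlobalUnlock", "Ptr", hMem)
}
DllCall("GlobalFree", "Ptr", hMem)
```

No %pMem% dereference is involved: pMem + 0 is an ordinary expression whose result is used as the address.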

Come to think of it, there is a pair of names you didn't provide: GetNum / SetNum. Set seems like a more natural companion to Get, at least for those used to setters and getters, e.g. in Java.

[EDIT 2] I see Sean agrees with me on Get/Set... :-)

So it would be:

NumGet(Address, Type = "UInt") 
NumPut(Address, Number, Type = "UInt")

Looks good, except for the names. ;-)

Exception handler: I have seen that using that in a program bloats it a lot, but it is used in DllCall already, so the evil is already done... :-)
And indeed, I believe a scripting language should catch as many errors as it can, even if that reduces performance; that's the price of ease of use.
Note that I can crash my scripts already, with some bad DllCalls. Once, it was because I forgot a comma between the type and the value: "UInt" var. It took me a long time to understand where the error was... Another evil side effect of implicit concatenation, hiding a syntax error.
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004
Thanks for the comments.

I've deleted the old poll and added a new one to choose among the most popular options of the previous poll. The old poll's results were:
ExtractNum / InsertNum     0%  [ 0 ]
DecodeNum / EncodeNum      0%  [ 0 ]
GetNum / PutNum           14%  [ 2 ]
NumExtract / NumInsert    42%  [ 6 ]
NumDecode / NumEncode      7%  [ 1 ]
NumGet / NumPut           28%  [ 4 ]
Other                      7%  [ 1 ]

Total Votes : 14
Although I haven't voted, I think I like Get/Put better than Get/Set because: 1) Get/Set are visually hard to distinguish from each other; 2) Put seems to more accurately describe what is happening, and is less ambiguous. For example, someone seeing "NumSet()" out of context would have little idea what it does. But seeing "NumPut()" gives a more precise impression: that the function is placing something somewhere rather than just changing some setting or option (which is what "set" implies).

SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
I voted for NumGet / NumSet :)

Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007
I also voted for NumGet/NumSet.
However, I'm a bit concerned that what is actually the same opinion will end up split across two options.

majkinetor
  • Moderators
  • 4512 posts
  • Last active: May 20 2019 07:41 AM
  • Joined: 24 May 2006
NumGet/Set is totally bad, IMO, as they look the same.
I also don't know why we have to poll again. The previous results are obvious.

NumGet and NumPut are also very similar; this case is not very different than Get/Set.

corrupt
  • Members
  • 2558 posts
  • Last active: Nov 01 2014 03:23 PM
  • Joined: 29 Dec 2004
I vote that this is a waste of the energy budget and should either be built in properly with full struct support and user friendly syntax or be taken off the todo list and left in a function...

Choose naming and syntax for built-in Extract/InsertInteger

If everyone is already familiar with the names and able to understand what the topic is about with the brief description that was given then the names should probably not be changed since new names will likely only cause confusion for those who barely understood what the functions did in the first place...

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005

full struct support and user friendly syntax

These functions fully support structs. Propose a more user friendly syntax, if you don't like the current one.

If blocks of parameters were allowed to repeat, like NumGet(Var0,addr0,Type0, Var1,offs1,Type1, Var2,offs2,Type2…), it only saves you retyping the function name, and you can easily create a wrapper like this, if you need.
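Such a wrapper might be sketched as follows. StructRead is a hypothetical name, it is a variation that takes one buffer followed by repeated (Offset, Type) pairs, and it assumes AutoHotkey v1.1.21+ (variadic functions, Push and Length) together with NumGet as it later shipped.

```autohotkey
; Hypothetical helper: read several fields from one buffer in a single call.
; fields* is a flat list of Offset, Type pairs; the result is a 1-based array.
StructRead(ByRef buf, fields*)
{
    out := []
    Loop % fields.Length() // 2
        out.Push(NumGet(buf, fields[2*A_Index - 1], fields[2*A_Index]))
    return out
}

; Usage: fill two members of a 16-byte buffer, then read both back at once.
VarSetCapacity(rc, 16, 0)
NumPut(2,  rc, 0, "Int")
NumPut(50, rc, 8, "Int")
vals := StructRead(rc, 0, "Int", 8, "Int")
MsgBox % vals[1] " " vals[2]
```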

If everyone is already familiar with the names and able to understand what the topic is about with the brief description that was given then the names should probably not be changed since new names will likely only cause confusion for those who barely understood what the functions did in the first place...

There are always new users, and old ones, who start using more advanced features. For them the name changes don't matter. Furthermore, there were no standard functions, only code samples in the Help, which continue to work, so no changes are necessary in existing scripts. The new functions offer much more, much faster, so they are badly needed. Now, many posted scripts use different versions of them, resulting in scripts having several functions of similar purpose but different names.

corrupt
  • Members
  • 2558 posts
  • Last active: Nov 01 2014 03:23 PM
  • Joined: 29 Dec 2004

full struct support and user friendly syntax

These functions fully support structs. Propose a more user friendly syntax, if you don't like the current one.

RECT1 := CreateStruct(Int, Left, Int, Top, Int, Right, Int, Bottom)
RECT1.Left := 2
RECT1.Right := 50
; etc...
; 
; after using in DllCall use something like:
VarSetCapacity(RECT1, -1)
; or better yet... automatically update the RECT structure after the call
; since it would be unlikely to pass a struct to a call and not want it to be updated
; and easy enough to copy it before passing it if that was desired

I don't see why users should be made to suffer through constantly having to specify type values and offsets when this information can be stored and recalled by the language.
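Short of real struct support, the offsets can at least be stored once and recalled by name. The sketch below assumes AutoHotkey v1.1 objects and the NumGet/NumPut built-ins as they later shipped; the RECT_OFFSET table is a hypothetical helper, not part of the language. A Win32 RECT is four 32-bit integers, so its members sit at offsets 0, 4, 8 and 12.

```autohotkey
; Hypothetical offset table for the Win32 RECT structure (four 32-bit Ints).
RECT_OFFSET := {Left: 0, Top: 4, Right: 8, Bottom: 12}

VarSetCapacity(rc, 16, 0)                     ; raw storage for one RECT
NumPut(2,  rc, RECT_OFFSET.Left,  "Int")      ; rc.Left  := 2
NumPut(50, rc, RECT_OFFSET.Right, "Int")      ; rc.Right := 50

MsgBox % "Left = " NumGet(rc, RECT_OFFSET.Left, "Int")
       . ", Right = " NumGet(rc, RECT_OFFSET.Right, "Int")
```

The type still has to be repeated on every call here; storing it alongside the offset would be the next step toward the syntax asked for above.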

corrupt
  • Members
  • 2558 posts
  • Last active: Nov 01 2014 03:23 PM
  • Joined: 29 Dec 2004
Ideally, I'd prefer to see the previous example as something like this instead:
RECT1 := CreateStruct("RECT") 

; look up values for common structs in a separate file

RECT1.Left := 2 

RECT1.Right := 50 

; etc... 

...but I realize that that would reduce portability since another file would need to be included...

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004
I'd like to see better structure support someday too. The main reasons for building in NumGet and NumPut are:

1) Performance: I was surprised at how much faster ExtractInteger is when built-in.
2) Ease-of-use: Users don't have to #include or copy and paste the functions into their scripts.
3) Even when fancier structure support gets added, it seems likely that NumGet/Put will still be useful for low-level tasks or for simple structures in which only one or two members need to be retrieved or stored. Plus it might be a long time before fancier struct support is added, so NumGet/Put will pay a lot of dividends until then.
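A typical "simple structure" case of this kind, sketched with NumGet as it later shipped in AutoHotkey v1.1 (the "Ptr" DllCall type assumes v1.1+): GetCursorPos fills a POINT, which is two 32-bit integers, and the script only needs to read them back.

```autohotkey
; Minimal sketch: read the two members of a POINT filled by the Win32 API.
VarSetCapacity(pt, 8, 0)                 ; POINT = two 32-bit Ints (x, y)
DllCall("GetCursorPos", "Ptr", &pt)      ; the OS writes x and y into the buffer

x := NumGet(pt, 0, "Int")                ; member x at offset 0
y := NumGet(pt, 4, "Int")                ; member y at offset 4
MsgBox % "Cursor is at " x ", " y
```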

corrupt
  • Members
  • 2558 posts
  • Last active: Nov 01 2014 03:23 PM
  • Joined: 29 Dec 2004
Good points... In that case I would vote for StructGet and StructSet since this would primarily be used for structs and be much less cryptic for someone to search for in the documentation.

I have also noticed that most #Include files I have put together recently require ExtractInteger/InsertInteger so I'm not opposed to the idea but was hoping for a more user friendly approach and hoping for separate variables and dot notation...

toralf
  • Moderators
  • 4035 posts
  • Last active: Aug 20 2014 04:23 PM
  • Joined: 31 Jan 2005
In Maj Lil' Builder I found three different versions of these functions. I guess they come from the different modules he included. I'll post them to show that there are some variants. I guess they all have their purpose, and thus the built-in functions should be able to cover their possibilities:

For ExtractInteger:
DecodeInteger(ptr){ 
   Return *ptr | *++ptr << 8 | *++ptr << 16 | *++ptr << 24 
}

ExtractInteger(ByRef pSource, pOffset = 0, pIsSigned = false, pSize = 4){
	Loop %pSize%  ; Build the integer by adding up its bytes.
		result += *(&pSource + pOffset + A_Index-1) << 8*(A_Index-1)
	if (!pIsSigned OR pSize > 4 OR result < 0x80000000)
		return result  ; Signed vs. unsigned doesn't matter in these cases.
	return -(0xFFFFFFFF - result + 1) ; Otherwise, convert the value (now known to be 32-bit) to its signed counterpart
}

ExtractIntegerAtAddr(pSourceAddr, pOffset = 0, pIsSigned = False, pSize = 4) { 
   Loop, %pSize%    
      iResult += *(pSourceAddr + pOffset + A_Index - 1) << 8 * (A_Index - 1) 
   If (pIsSigned && pSize <= 4 && iResult >= 0x80000000) 
      iResult -= 0x100000000 
   Return iResult 
}

And for InsertInteger:
InsertInteger(pInteger, ByRef pDest, pOffset = 0, pSize = 4){
	Loop %pSize%  ; Copy each byte in the integer into the structure as raw binary data.
		DllCall("RtlFillMemory", "UInt", &pDest + pOffset + A_Index-1, "UInt", 1, "UChar", pInteger >> 8*(A_Index-1) & 0xFF)
}

InsertIntegerAtAddr(pInteger, ByRef pDest, pOffset = 0, pSize = 4){
	Loop %pSize%  ; Copy each byte in the integer into the structure as raw binary data.
		DllCall("RtlFillMemory", "UInt", pDest + pOffset + A_Index-1, "UInt", 1, "UChar", pInteger >> 8*(A_Index-1) & 0xFF)
}

EncodeInteger(ref, val, nSize = 4){
   DllCall("RtlMoveMemory", "Uint", ref, "int64P", val, "Uint", nSize)
}

I'm unsure if EncodeInteger and DecodeInteger fit into this, but they seem to be close.
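What the Extract variants above have in common is that they assemble the number byte by byte, lowest byte first, and then optionally reinterpret it as signed. A minimal sketch of that logic next to the built-in as it later shipped in AutoHotkey v1 (the buffer name and test value are illustrative):

```autohotkey
; Store a known 32-bit pattern, then read it back two ways.
VarSetCapacity(buf, 4, 0)
NumPut(0xFFFFFFFE, buf, 0, "UInt")       ; bytes in memory: FE FF FF FF

; Manual little-endian assembly, as in the ExtractInteger variants.
result := 0
Loop 4
    result += *(&buf + A_Index - 1) << 8 * (A_Index - 1)

; Signed reinterpretation, as in ExtractIntegerAtAddr.
signed := result
if (signed >= 0x80000000)
    signed -= 0x100000000

MsgBox % "Manual:   " result " unsigned, " signed " signed`n"
       . "Built-in: " NumGet(buf, 0, "UInt") " unsigned, " NumGet(buf, 0, "Int") " signed"
```

Both lines should report 4294967294 unsigned and -2 signed.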
Ciao
toralf
 