Writing Command-Line Programs in AutoHotkey

[Shambles]
Posts: 20
Joined: 20 May 2014, 21:24

01 Oct 2017, 02:14

Introduction
This tutorial might be of interest to several groups of people:
  • those who want to automate the (un)installation, updating, and (re)configuration of software
  • those extending editors (e.g. Emacs) that can use the standard streams (stdin, stdout, and stderr) to communicate with other programs

The reports of the command-line's death have been greatly exaggerated. Just because most users do not know how to use it any more does not mean it is useless! Command-line interfaces are much easier to automate than graphical user interfaces.

This is written for v1. When I know something needs to be changed for v2, I document it.


Preparation
You will need a way to edit the PE header of a Windows executable. You can use a hex editor like Frhed or a specialized tool like LordPE.

You will need Lexikos' RegisterSyncCallback to handle console control events.


Explanation

The Beginning

Code:

Most of this is not specific to command-line programs. However, a command-line program should not display an icon in the notification area (a.k.a. system tray) and it should not be limited to a single instance.

The "Debugging" section can be commented out when you are not debugging.

The settings I have not drawn attention to so far are what they are planned to be in v2 and are improvements over v1's default behavior.

In v2, you will probably need to set the working directory to A_InitialWorkingDir. This variable does not exist in v1. v1 respects the initial working directory, but v2 currently sets it to A_ScriptDir, which is incorrect for command-line programs.
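A v1 header along the lines this section describes might look like the sketch below. Only #NoTrayIcon and #SingleInstance Off follow directly from the text; the contents of the debugging section and the remaining settings are assumptions chosen for illustration.

```autohotkey
#NoTrayIcon          ; no icon in the notification area
#SingleInstance Off  ; any number of instances may run at once

; Debugging (comment out when not debugging)
#Warn All, StdOut    ; assumption: report warnings on stdout instead of in a dialog

; Assumed examples of settings that match v2's planned defaults
#NoEnv               ; do not fall back to environment variables
SetBatchLines -1     ; run at full speed
SetTitleMatchMode 2  ; a window title may contain the text anywhere
```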

Connecting to the Standard Streams

Code:

This code does what you expect.
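As a minimal sketch of one way to connect to the streams in v1, FileOpen accepts "*" for stdin or stdout and "**" for stderr. The variable names Stdin, Stdout, and Stderr are hypothetical.

```autohotkey
; Stdin, Stdout, and Stderr are hypothetical names chosen for this sketch.
global Stdin  := FileOpen("*", "r")   ; standard input
global Stdout := FileOpen("*", "w")   ; standard output
global Stderr := FileOpen("**", "w")  ; standard error

Stdout.WriteLine("normal output goes to stdout")
Stdout.Read(0)  ; flush the write buffer
Stderr.WriteLine("diagnostics go to stderr")
Stderr.Read(0)  ; flush the write buffer
```

FileOpen returns 0 instead of an object when a stream is unavailable, for example when the process has no console and nothing was redirected, so it might be worth checking the result with IsObject before using it.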

Processing Command-Line Arguments
There is considerable variation in how programs process command-line arguments. I attempt to explain an acceptable way, not the only way.

This section is long and includes many digressions. Rest assured that it is all relevant. I believe it is the easiest to understand when explained in this way.

Algorithms that follow use a specification for your command-line program. An example is below.

Code:

COMMAND must contain the name of your program. It must be the same as what users enter at the command-line to run your program. Be aware that the Windows command-line is case insensitive.

VERSION must contain the version of your program. Please use the Semantic Versioning format.

USAGE_PATTERNS must contain the patterns of arguments that can be used together. "-?" and "-version" must be present. Arguments surrounded by square brackets ([]) are optional. Arguments surrounded by angle brackets (<>) are to be substituted by actual values.

OPTIONS must contain the names of your options (variables that dictate the specifics of your program's operation) and a specification of their option-argument and purpose. "Arg" is the name of the option-argument for options that require an option-argument and "" otherwise. "Csv" is true for an option-argument in comma-separated value format and false otherwise. "Desc" contains the human-readable description of the option. Defaults are depicted as they are in the code above.

OPERANDS must contain the names of your operands (variables that are operated on by your program). "?" and "*" are processed differently. Operands following "?" are optional operands. Other operands are required. "*" is an operand that contains any arguments that are not contained by other operands. "*" can appear at the beginning, in the middle, or at the end of the operands. "?" and "*" may each appear at most once. The names "?" and "*" were inspired by regular expression notation.

You must not name your parameters (options and operands) after any special Object keys (like base), methods (like Clone), or meta-functions (like __Get). That would make it impossible to reliably set and get their values.

The algorithms that use this specification do not check its contents for errors. Be careful when filling out yours!
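To make the shape of such a specification concrete, here is a hypothetical one. The names COMMAND, VERSION, USAGE_PATTERNS, OPTIONS, and OPERANDS and the keys "Arg", "Csv", and "Desc" come from the description above; every value, and the choice of arrays for USAGE_PATTERNS and OPERANDS, is an assumption.

```autohotkey
; A hypothetical specification; all values are assumptions made for illustration.
global COMMAND := "example"
global VERSION := "1.0.0"
global USAGE_PATTERNS := ["[-optimize <level>] [-exclude <names>] [-verbose] <input> [<output>]"
                        , "-?"
                        , "-version"]
global OPTIONS := {"optimize": {"Arg": "level", "Csv": false, "Desc": "optimization level: off, on, or unstable"}
                 , "exclude":  {"Arg": "names", "Csv": true,  "Desc": "comma-separated names to skip"}
                 , "verbose":  {"Arg": "",      "Csv": false, "Desc": "show progress messages"}}
global OPERANDS := ["input", "?", "output"]  ; "output" follows "?", so it is optional
```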

The main procedure provides a framework for understanding the rest of this section.

Code:

Programs should follow the convention of prefixing the first line of their error messages with their command name followed by ": ". The first line should be sufficient to understand the error. Trailing, unprefixed lines are sometimes used to show more information. They are almost exclusively used to tell the user how to use help if they try to use an invalid option. Errors should always be shown on stderr. This makes it easy to redirect and filter error messages. Users often want to do that to find the cause of errors in shell scripts.

Programs should follow the convention of returning an exit status of 0 on success or 1 on error. This makes it easier to detect errors in shell scripts.
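These two conventions can be sketched together as follows. ReportError is a hypothetical helper, and the command name and message are made up for the example.

```autohotkey
; ReportError is a hypothetical helper, not part of the original template.
ReportError(Command, Message)
{
    Stderr := FileOpen("**", "w")               ; "**" opens stderr
    Stderr.WriteLine(Command . ": " . Message)  ; prefixed first line
    Stderr.Close()
}

try
{
    ; Stand-in for the real work of the program.
    throw Exception("cannot open 'input.txt'")
}
catch Err
{
    ReportError("example", Err.Message)
    ExitApp, 1  ; exit status 1 on error
}
ExitApp, 0      ; exit status 0 on success
```

A shell script can then test the exit status (for example with "if errorlevel 1" in a batch file) and redirect stderr separately from stdout.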

The main procedure will be called this way.

Code:

Main(GetArgs())


Code:

GetArgs reads the command-line arguments out of the pseudo-array where they are stored in v1 and packs them into an array. In v2 they are already stored in an array named A_Args.

Pseudo-arrays are not first class. Converting them to values and passing them as arguments instead of reading them directly makes it easier to test our algorithms.

Notice the global declaration of the variable named 0 in the code above. Without that line, reading the variable named 0 inside the function makes AutoHotkey create a new variable named 0, initialized to "", instead of reading the global one! I am not sure if this is a defect in AutoHotkey or just intended, undesirable behavior. Reading the other variables named after integers in GetArgs works as expected. If it were not so, it would be impossible to write GetArgs in AutoHotkey because the number of command-line arguments can vary.
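A function with the behavior described might be sketched like this for v1. It is a reconstruction from the description, not necessarily the original listing.

```autohotkey
; A sketch reconstructed from the description above; v1 only.
GetArgs()
{
    global 0  ; the variable named 0 holds the number of arguments
    Args := []
    Loop, %0%
    {
        ; %A_Index% is a dynamic reference to the variable named 1, 2, 3, ...
        Args.Push(%A_Index%)
    }
    return Args
}
```

Called with no command-line arguments, this should return an empty array.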

Code:

ParseArgs builds a data structure containing the options and operands passed to your program while checking for syntax errors. The data structure has two keys: "Options" and "Operands". Their values are objects with key-value pairs for each option and operand passed to your program.

The value associated with an option that does not require an argument is true.

The value associated with an option that requires a CSV argument is an array.

It is a syntax error to specify an option more than once with different arguments because it is not obvious what should be done in that situation. Arguments must use the same letter case to be considered the same because sometimes letter case matters.

-- is a special option that marks the end of the options. It prevents operands that begin with - or / from being mistaken for options. It is sometimes used in a security context to separate trusted options from untrusted operands. If you are using it that way, make certain you can trust the options! Otherwise, an option that requires an option-argument could appear immediately before -- and consume it as its option-argument.

The value associated with the * operand is an array.

The * operand behaves as consistently as possible, but a corner case might be surprising. An optional operand whose argument was omitted has no key-value pair in the data structure ParseArgs returns. The * operand, by contrast, always has a key-value pair there (possibly holding an empty array) unless an optional operand positioned before it was omitted! This is consistent with how the * operand behaves when it is the last operand and there are no arguments left to fill it. An empty array should be semantically equivalent to a nonexistent key-value pair anyway.
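As an illustration of the returned data structure, consider a hypothetical program whose "exclude" option takes a CSV option-argument, whose "verbose" option takes none, and whose operands are "input" followed by "*". The option and operand names are assumptions.

```autohotkey
; Hypothetical result of parsing:  example -verbose -exclude a,b in.txt
CliInput := {"Options":  {"verbose": true, "exclude": ["a", "b"]}
           , "Operands": {"input": "in.txt", "*": []}}

; "*" is last and nothing was left to fill it, so it holds an empty array.
ExtraCount := CliInput["Operands"]["*"].Length()
```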

Code:

ValidateCliInput(CliInput)
{
    ; See the explanation below.

    return CliInput
}


ValidateCliInput checks for semantic errors in the data structure returned by ParseArgs and, if none were found, returns the data structure unchanged.

The checks involved depend on the program in question. For example, the program described by the example specification should check that CliInput["Options"]["optimize"], if present, is "off", "on", or "unstable", and throw an Exception with a helpful message otherwise.
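A check like that could be sketched as follows. The wording of the message is an assumption, and note that != compares strings case-insensitively under v1's default settings.

```autohotkey
; A sketch of one semantic check; the error message wording is an assumption.
ValidateCliInput(CliInput)
{
    Options := CliInput["Options"]
    if (Options.HasKey("optimize"))
    {
        Level := Options["optimize"]
        ; != is case-insensitive here unless StringCaseSense is turned on.
        if (Level != "off" and Level != "on" and Level != "unstable")
            throw Exception("invalid optimize level: " . Level)
    }
    return CliInput
}
```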

Some checks cannot be performed by ValidateCliInput because doing so would cause TOCTTOU (time of check to time of use) defects. For example, if it checked that a file the program is about to process exists, the file might be moved or deleted before it was opened.

Code:

Exec performs the operation that the program’s users value.

Programs should follow the convention of showing help if the help option is used or showing the version if the version option is used. Those options are checked for in that order and before any others. Any other options are ignored when those options are used.
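The order of those checks can be sketched with a small dispatch function. ChooseAction and the strings it returns are hypothetical; "?" and "version" are the option names used in the usage patterns.

```autohotkey
; ChooseAction is a hypothetical helper, not part of the original Exec.
ChooseAction(Options)
{
    if (Options.HasKey("?"))        ; help wins over everything else
        return "help"
    if (Options.HasKey("version"))  ; version is checked second
        return "version"
    return "run"                    ; otherwise do the real work
}
```

With both options present, ChooseAction({"version": true, "?": true}) yields "help", and any other options are ignored.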

Input should be credible when it reaches Exec, but it must handle some errors to avoid introducing TOCTTOU defects. Most of these errors involve using files, directories, and sockets. They should be handled with EAFP (it is easier to ask forgiveness than it is to get permission). In other words, try it and throw an Exception if it fails.
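A minimal sketch of EAFP applied to reading a file: instead of testing whether the file exists first, try to open it and throw if that fails. ReadWholeFile is a hypothetical helper.

```autohotkey
; ReadWholeFile is a hypothetical helper used only for illustration.
ReadWholeFile(Path)
{
    File := FileOpen(Path, "r")  ; no separate existence check beforehand
    if (!IsObject(File))
        throw Exception("cannot open '" . Path . "' (error " . A_LastError . ")")
    Text := File.Read()
    File.Close()
    return Text
}
```

Because the open itself is the check, there is no window between checking and using in which the file could be moved or deleted.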

Code:

This code does what you expect.

Help should always be shown on stdout. This makes it easy to redirect and filter help messages. Users often want to do that to find an option they cannot remember the name for.

Handling Console Control Events
If you need your program to do something other than exit when Ctrl+C or Ctrl+Break is pressed or the console window is closed, you will need to write code to handle console control events.

Code:

If you tried to use RegisterCallback instead of RegisterSyncCallback, your process would become unstable. HandlerRoutine would run on another thread and probably cause memory corruption.

There are three console control events your process might receive:
  • CTRL_C_EVENT (0) -- Ctrl+C was pressed to terminate your process or to terminate the algorithm your process is running
  • CTRL_BREAK_EVENT (1) -- Ctrl+Break was pressed to terminate your process or to terminate the algorithm your process is running and show debugging information
  • CTRL_CLOSE_EVENT (2) -- the console window was closed to terminate your process

The default console control handler terminates your process when it receives any of these events. Your custom console control handler (HandlerRoutine in the code above) must return true to replace that behavior, or false to have the default handler run after it.

Be aware that CTRL_CLOSE_EVENT is different from the other events in that your process will be terminated as soon as HandlerRoutine returns, no matter what it returns! Also, if HandlerRoutine takes more than 5 to 20 seconds (depending on Windows version) to handle CTRL_CLOSE_EVENT, your process will be terminated anyway! So HandlerRoutine should return false when handling CTRL_CLOSE_EVENT and handle it quickly. It must contain or call code to handle that event because the main thread will not get a chance to run further.

If you want your program to terminate the algorithm it is running when Ctrl+C or Ctrl+Break is pressed, you will need to set CtrlEvent to "Ctrl+C" for CTRL_C_EVENT or "Ctrl+Break" for CTRL_BREAK_EVENT in HandlerRoutine and call PollForConsoleCtrlEvents in your main thread. HandlerRoutine should not set CtrlEvent to "Ctrl+C" if it is already set to "Ctrl+Break". That implies only CTRL_BREAK_EVENT can clobber CTRL_C_EVENT, which is unlikely and should be acceptable to your users. No events can be lost because HandlerRoutine should never reset CtrlEvent. CTRL_CLOSE_EVENT cannot be clobbered or lost because of its nature. Exception handling provides a good way to 'back out' of your algorithm, and you can catch and rethrow exceptions to perform cleanup or undo procedures in a telescoping fashion.
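The arrangement described above might be sketched as follows. The names HandlerRoutine, CtrlEvent, and PollForConsoleCtrlEvents come from the text; the function bodies, the exception message, and the location of RegisterSyncCallback.ahk are assumptions.

```autohotkey
#Include RegisterSyncCallback.ahk  ; Lexikos' function; this file location is an assumption

global CtrlEvent := ""

; Install the handler. RegisterSyncCallback makes it run on the main thread.
DllCall("SetConsoleCtrlHandler"
      , "Ptr", RegisterSyncCallback("HandlerRoutine")
      , "Int", true)

HandlerRoutine(CtrlType)
{
    global CtrlEvent
    if (CtrlType = 0)  ; CTRL_C_EVENT
    {
        if (CtrlEvent != "Ctrl+Break")  ; never clobber Ctrl+Break
            CtrlEvent := "Ctrl+C"
        return true                     ; replace the default behavior
    }
    if (CtrlType = 1)  ; CTRL_BREAK_EVENT
    {
        CtrlEvent := "Ctrl+Break"
        return true
    }
    ; CTRL_CLOSE_EVENT (2): do any quick cleanup here, then let the
    ; default handler terminate the process.
    return false
}

; Call this regularly from long-running code in the main thread.
PollForConsoleCtrlEvents()
{
    global CtrlEvent
    if (CtrlEvent != "")
        throw Exception(CtrlEvent . " was pressed")
}
```

A long-running loop would call PollForConsoleCtrlEvents() once per iteration and let the exception propagate to whatever code performs cleanup.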

Avoid designing your program in a way that could result in data corruption if its process was terminated without having run its cleanup procedure. If power were interrupted, your process was forcibly terminated (e.g. by Task Manager), or similar situations arose, your cleanup procedure would not run.

Compiling and Editing the PE Header
AutoHotkey normally refuses to produce command-line programs, but it can be forced to with some effort.

You must compile your program. Otherwise, there is no PE header to edit.

You must edit the PE header to change the Subsystem field from WINDOWS_GUI (2) to WINDOWS_CUI (3). Otherwise, your program will be unable to attach to the console.

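The Subsystem field is a 16-bit integer stored in little-endian order at offset 372 (0x174). More generally, it lies 92 (0x5C) bytes past the start of the PE signature, whose file offset is stored as a 32-bit integer at offset 60 (0x3C). That information is useful to those using a hex editor to change the field.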
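For those who would rather script the change than use a hex editor, here is a minimal sketch. SetConsoleSubsystem is a hypothetical helper. Rather than hard-coding offset 0x174, it locates the field through the PE signature offset stored at 0x3C, relying on the Subsystem field lying 0x5C bytes past the signature.

```autohotkey
; SetConsoleSubsystem is a hypothetical helper, not part of the original tutorial.
SetConsoleSubsystem(ExePath)
{
    File := FileOpen(ExePath, "rw")
    if (!IsObject(File))
        throw Exception("cannot open '" . ExePath . "'")
    File.Seek(0x3C)                     ; location of the PE signature offset
    PeOffset := File.ReadUInt()
    File.Seek(PeOffset)
    if (File.ReadUInt() != 0x00004550)  ; "PE" followed by two zero bytes
    {
        File.Close()
        throw Exception("not a PE file: '" . ExePath . "'")
    }
    File.Seek(PeOffset + 0x5C)          ; the Subsystem field
    File.WriteUShort(3)                 ; WINDOWS_CUI
    File.Close()
}
```

It would be called on the compiled program, for example SetConsoleSubsystem("example.exe"), ideally on a copy in case something goes wrong.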


Advice
Know that CON, CONIN$, CONOUT$, CONERR$, NUL, wildcards, piping, and redirection exist. Use them. Do not reinvent them.

Consider the design of related command-line programs when designing yours. This should make your program easier to use with those programs. Adopt their good ideas. Avoid their bad ideas. This is how progress is made.

Consider the Microsoft Command Line Standard when designing your program, but be aware that even Microsoft’s programs do not follow it. That is why I suggest considering the design of related programs too.

It might be worthwhile to consider the POSIX and docopt standards when designing your program even though they are not Windows standards. They have some good ideas, like the -- option, that solve problems that exist but have no conventional solution on Windows.

Avoid accidental complexity. Keep the number of options small. Avoid using CSV options when possible. Avoid using optional operands when possible. If you must write a variadic command-line program, position the variadic operand (*) as the last operand when possible because other positions often confuse users.



The Complete Template

Code:

You might need to delete sections that do not apply to your program (e.g. the console control event handling code in a program that uses the default handler).


Conclusion
I wrote this for several reasons:
  • to save others the time and effort it took me to learn how to do this
  • to encourage the AutoHotkey developers to improve support for writing command-line programs
  • to thank the community for helping me

This document is released into the public domain.
Last edited by [Shambles] on 17 Oct 2017, 08:51, edited 7 times in total.
[Shambles]

Addendum

01 Oct 2017, 22:05

Special Thanks
qwerty12 told me why my program became unstable when I used RegisterCallback when setting my console control event handler, directed me to helpful resources, and critiqued this document.

See Also
  • tmplinshi's script for changing the Subsystem field of the PE header
Last edited by [Shambles] on 21 Oct 2017, 01:05, edited 5 times in total.
RUNIE
Posts: 252
Joined: 03 May 2014, 14:50
GitHub: Run1e

Re: Writing Command-Line Programs in AutoHotkey

01 Oct 2017, 22:10

Really cool. Was looking for something like this.
nnnik
Posts: 2283
Joined: 30 Sep 2013, 01:01
Location: Germany

Re: Writing Command-Line Programs in AutoHotkey

02 Oct 2017, 02:24

It might be worth mentioning that certain IDEs eat your stdout, so your code under the section on connecting to the standard streams will not do what you expect when using these IDEs.
[Shambles]

Re: Writing Command-Line Programs in AutoHotkey

02 Oct 2017, 03:24

nnnik wrote: It might be worth mentioning that certain IDEs eat your stdout, so your code under the section on connecting to the standard streams will not do what you expect when using these IDEs.


Please elaborate.

I am having trouble imagining what problems you expect to occur.
nnnik

Re: Writing Command-Line Programs in AutoHotkey

02 Oct 2017, 04:34

Hmm yeah that would only occur when you edit the PE Header of AutoHotkey.exe and then use an IDE to directly launch your Scripts.
