AHKL: IniRead/IniWrite and UTF8


20 replies to this topic
ruespe
  • Members
  • 567 posts
  • Last active: Dec 01 2014 07:59 PM
  • Joined: 17 Jun 2008
If the ini-file is UTF8, IniRead/IniWrite don't work correctly. This is a problem for me, because Compile_AHK can put the script's compiler settings into the script itself, which has to be UTF8 for AHKL.

Example:

[VERSION]
File_Description=Test für UTF8

IniRead,xx,c:\AutoHotKey\test.ini,VERSION,File_Description,abcäöüß
MsgBox,%xx% ;shows Test für UTF8
IniWrite,abcäöüßxx,c:\AutoHotKey\test.ini,VERSION,File_Description2
IniRead,xx,c:\AutoHotKey\test.ini,VERSION,File_Description2,abcäöüß
MsgBox,%xx% ;shows abcäöüßxx
Return

[VERSION]
File_Description=Test für UTF8
File_Description2=abc巼࠸x



fincs
  • Moderators
  • 1662 posts
  • Last active:
  • Joined: 05 May 2007
Ini commands use the legacy Windows 3.1 PrivateProfile functions, which are deprecated (& buggy when dealing with encodings). You are encouraged to use a scripted alternative.

Lexikos
  • Administrators
  • 9844 posts
  • AutoHotkey Foundation
  • Last active:
  • Joined: 17 Oct 2006
I wouldn't call it buggy - it just doesn't support UTF-8. What you're seeing is the result of converting your Unicode string to ANSI. It looks wrong because your editor is interpreting it as UTF-8. If you manually encode the string as UTF-8 using StrPut then write it using WritePrivateProfileStringA, it works. Of course, it would corrupt the string if the INI file was UTF-16, and you would need to use GetPrivateProfileStringA and StrGet to retrieve the UTF-8 string without damaging it.

I suppose that the string is modified/converted only in two cases:
  • Calling WritePrivateProfileStringW on a file without a UTF-16 BOM.
  • Calling WritePrivateProfileStringA on a file with a UTF-16 BOM.
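
A minimal sketch of the StrPut / WritePrivateProfileStringA approach described above, assuming a Unicode build of AutoHotkey_L and an INI file without a UTF-16 BOM. IniWriteUtf8 and IniReadUtf8 are hypothetical helper names, not built-in commands, and the 32768-byte buffer is an assumed upper bound for one value.

```autohotkey
; Hypothetical helpers: store and retrieve a value as raw UTF-8 bytes.
; Pass a full path; the PrivateProfile functions resolve bare file names
; against the Windows directory.

IniWriteUtf8(Value, File, Section, Key) {
    ; Encode the value as null-terminated UTF-8.
    VarSetCapacity(buf, StrPut(Value, "UTF-8"))
    StrPut(Value, &buf, "UTF-8")
    ; The ANSI function receives the bytes as they are.
    ; Section and Key are converted to ANSI, so keep them plain ASCII.
    return DllCall("WritePrivateProfileStringA", "AStr", Section, "AStr", Key
        , "Ptr", &buf, "AStr", File)
}

IniReadUtf8(File, Section, Key, Fallback := "") {
    size := 32768  ; assumed maximum length of one value, in bytes
    VarSetCapacity(buf, size, 0)
    len := DllCall("GetPrivateProfileStringA", "AStr", Section, "AStr", Key
        , "Ptr", 0, "Ptr", &buf, "UInt", size, "AStr", File, "UInt")
    ; Decode the bytes as UTF-8. A missing key and an empty value both
    ; yield the fallback in this sketch.
    return len ? StrGet(&buf, len, "UTF-8") : Fallback
}
```

Usage, with the file and key from the first post (the script itself has to be saved in an encoding AutoHotkey_L recognizes, such as UTF-8 with BOM, for the literal umlauts to survive):

```autohotkey
IniWriteUtf8("abcäöüßxx", "c:\AutoHotKey\test.ini", "VERSION", "File_Description2")
MsgBox % IniReadUtf8("c:\AutoHotKey\test.ini", "VERSION", "File_Description2", "(not found)")
```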

ruespe
  • Members
  • 567 posts
  • Last active: Dec 01 2014 07:59 PM
  • Joined: 17 Jun 2008

If you manually encode the string as UTF-8 using StrPut then write it using WritePrivateProfileStringA, it works. Of course, it would corrupt the string if the INI file was UTF-16, and you would need to use GetPrivateProfileStringA and StrGet to retrieve the UTF-8 string without damaging it.

I suppose that the string is modified/converted only in two cases:
  • Calling WritePrivateProfileStringW on a file without a UTF-16 BOM.
  • Calling WritePrivateProfileStringA on a file with a UTF-16 BOM.

Call me a noob, but I don't understand a single word. I understand that inserting the ini-settings into the UTF8 script doesn't work, because IniRead and IniWrite don't support UTF8. That's bad, but I can live with it. Still, it's one more thing you have to pay attention to when using AHKL, which I wouldn't want to be without anymore. :(

YMP
  • Members
  • 424 posts
  • Last active: Apr 05 2012 01:18 AM
  • Joined: 23 Dec 2006
If you save the script and its INI file as UTF-16, everything should work OK, I think.

majkinetor
  • Moderators
  • 4512 posts
  • Last active: May 20 2019 07:41 AM
  • Joined: 24 May 2006
Actually, I don't see why ini is needed with AHK.
I thought about it, and the best option is to use AHK syntax to store configuration.

So, instead of
[Setup]
;key comment
key1 = value1
...

It's better to use

;[Setup]
;key comment
key1 = value1  

This needs, of course, a library that can update such an AHK file with values, which is trivial.

I am waiting for Lexikos to implement object initializers, and then I don't see why anybody would need to use ini files. The benefit is obvious: the configuration file is evaluated on startup and everything is bound to variables/objects. You can even allow more complex configuration, for instance:

;[Setup]
;key comment
key1 = value1
key2 := A_OSVersion = "WIN_95" ?  "value1" : "value2"

The minor problem is that an AHK config file might prevent the app from loading if it contains syntax errors, but this can be prevented by keeping the last known good version of the file and reverting to it if the user made a mistake (this requires some juggling with includes, but nothing serious; I did something like that in my plugin framework).
The major problem is that it can't be used in compiled scripts, but I personally don't see a need for that in most cases.

Tuncay
  • Members
  • 1945 posts
  • Last active: Feb 08 2015 03:49 PM
  • Joined: 07 Nov 2006

Actually, I don't see why ini is needed with AHK.

First, there are probably existing ini files that need to be parsed.
Second, ini is a well-known format (more or less standard) and very easy to parse.

It could be needed for data exchange with other programs. Why make someone learn a new format that probably has some new traps? Also, the new parser has to be written first, and then the format is only parsable from AHK, because there is no other parser in the world.

It would create problems and restrictions...

This needs, of course, a library that can update such an AHK file with values

The minor problem is that ahk config file might prevent app from loading if there are syntax errors in it

The major problem is that it can't be used in compiled scripts


I suggest that we instead work on what we have and make it Unicode-capable ... of course, IMO.

If you do not want to use ini, there are other well-known standards, some with AHK implementations. Think of XML, JSON... and probably more.



Lexikos
  • Administrators
  • 9844 posts
  • AutoHotkey Foundation
  • Last active:
  • Joined: 17 Oct 2006

If you save the script and its INI file as UTF-16, everything should work OK, I think.

ruespe's use case requires UTF-8. (See the OP.)

;[Setup]
;key comment
key1 = value1

IIRC, this should work even with IniRead/IniWrite:
/*
[Setup]
*/
key1 = value1
key2 = value2
/*
[EndSetup]
*/
The limitation is that new keys might be added in one of the comments. The second comment/section header is there as a "guard" to protect the rest of the script.
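
A hedged illustration of how the layout above might be used, treating the script itself as the INI file. It assumes an uncompiled script (A_ScriptFullPath must point at the .ahk source) and only touches a key that already exists, so nothing new is appended inside the comment.

```autohotkey
/*
[Setup]
*/
key1 = value1
key2 = value2
/*
[EndSetup]
*/

; Read a value back through the INI commands, using this script as the INI file.
IniRead, fromIni, %A_ScriptFullPath%, Setup, key1, (not found)
MsgBox, Variable key1: %key1%`nIniRead key1: %fromIni%

; Update an existing key in place; the new value becomes the
; variable's contents the next time the script starts.
IniWrite, value1b, %A_ScriptFullPath%, Setup, key1
```

Non-ASCII values written this way are still subject to the encoding problem discussed in this thread.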

The major problem is that it can't be used in compiled scripts, but I personally don't see a need for that in most cases.

It might still be useful - the script.ahk can be configured, then compiled to distribute it with a particular configuration.

Also, the new parser has to be written first, and then the format is only parsable from AHK, because there is no other parser in the world.

Are you talking about majkinetor's proposed format? It can be INI-compatible, as shown above. If we were using some other format, there's little reason to invent our own when we can choose from formats like XML, JSON and YAML.

YMP
  • Members
  • 424 posts
  • Last active: Apr 05 2012 01:18 AM
  • Joined: 23 Dec 2006

If you save the script and its INI file as UTF-16, everything should work OK, I think.

ruespe's use case requires UTF-8. (See the OP.)

I've seen it. But what do you mean by 'requires'? Won't UTF-16 work?

Lexikos
  • Administrators
  • 9844 posts
  • AutoHotkey Foundation
  • Last active:
  • Joined: 17 Oct 2006
As already mentioned in the OP, source files for compiled scripts must be UTF-8. UTF-16 will not work.

Learning one
  • Members
  • 1483 posts
  • Last active: Jan 02 2016 02:30 PM
  • Joined: 04 Apr 2009

If the ini-file is UTF8, IniRead/Write doesn't work correctly.

I remember this problem from when I was creating the Radial menu application and RM2module. UTF-8 support was a must for me, so I made this:
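; Read the whole definition file and drop carriage returns.
FileRead, Variables, %SkinDir%\Skin definition.txt
StringReplace, Variables, Variables, `r, ,all
Loop, parse, Variables, `n
{
	Field := A_LoopField
	; Skip blank lines.
	if Field is space
		Continue
	; Strip leading spaces and tabs.
	while (SubStr(Field,1,1) = A_space or SubStr(Field,1,1) = A_Tab)
		StringTrimLeft, Field, Field, 1
	; Skip comment lines.
	if (SubStr(Field, 1, 1) = ";")
		Continue
	; Strip trailing spaces and tabs.
	While (SubStr(Field,0,1) = A_space or SubStr(Field,0,1) = A_Tab)
		StringTrimRight, Field, Field, 1
	; Split at the first "="; skip lines without one.
	EqualPos := InStr(Field, "=")
	if (EqualPos = 0)
		Continue
	; Variable name: the text before "=", with spaces and tabs removed.
	var := SubStr(Field, 1, EqualPos-1)
	StringReplace, var, var, %A_Space%, ,all
	StringReplace, var, var, %A_Tab%, ,all
	if var is space
		Continue
	; Value: the text after "=", with leading spaces and tabs removed.
	val := SubStr(Field, EqualPos+1)
	while (SubStr(val,1,1) = A_space or SubStr(val,1,1) = A_Tab)
		StringTrimLeft, val, val, 1
	if val is space
		val =
	; Assign the value to the variable named by the key.
	%var% := val
}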


Lexikos
  • Administrators
  • 9844 posts
  • AutoHotkey Foundation
  • Last active:
  • Joined: 17 Oct 2006

while (SubStr(Field,1,1) = A_space or SubStr(Field,1,1) = A_Tab)
StringTrimLeft, Field, Field, 1
...
While (SubStr(Field,0,1) = A_space or SubStr(Field,0,1) = A_Tab)
StringTrimRight, Field, Field, 1

Trim() ;)
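
For illustration, a short sketch of what that replacement could look like. It applies to AutoHotkey_L only, since Trim() is not available in AutoHotkey Basic; the sample string is made up.

```autohotkey
; By default Trim() removes spaces and tabs from both ends of the string,
; which covers both while loops quoted above.
Field := "`t  key1 = value1  `t"
Field := Trim(Field)
MsgBox, [%Field%]  ; shows [key1 = value1]
```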

ruespe
  • Members
  • 567 posts
  • Last active: Dec 01 2014 07:59 PM
  • Joined: 17 Jun 2008

I suggest that we instead work on what we have and make it Unicode-capable ... of course, IMO.

1++
Lexikos, wouldn't this be the best solution for compatibility reasons? IniRead and IniWrite Unicode-compatible.

Lexikos
  • Administrators
  • 9844 posts
  • AutoHotkey Foundation
  • Last active:
  • Joined: 17 Oct 2006
One thing I didn't think to point out earlier is that IniRead and IniWrite are merely wrappers around GetPrivateProfileString and WritePrivateProfileString. We're limited to what those two functions can do. They're already Unicode compatible - just not with ANSI/8-bit files. Entirely reinventing IniRead/IniWrite to add UTF-8 support is not something I have any interest in doing; it is also likely to change the behaviour in subtle ways or break some other functionality, such as IniFileMapping.

wouldn't this be the best solution for compatibility-reasons?

In addition to the points made above, it would adversely affect compatibility with other applications (including AutoHotkey) because these standard functions treat INI files as either UTF-16 or ANSI, never UTF-8. These functions are used by many applications, perhaps the majority of applications that use INI files on Windows. Your script should work just fine if all of the following are true:
  • All values are encoded as ANSI inside your "UTF-8" file. (Values written with IniWrite meet this requirement. Text typed in a UTF-8 aware editor does not.)
  • Whatever reads the values treats them as ANSI strings. (IniRead meets this requirement; but if the /*[comment]*/ trick is used to update assignments in a UTF-8 file, the assigned values may be incorrect as they would be interpreted as UTF-8 and "translated" to either UTF-16 or ANSI.)
  • Your editor preserves the values exactly, even though they may appear to be invalid UTF-8.
  • All characters you wish to write are present in your system's ANSI code page.

I suppose these requirements are not specific to AutoHotkey.


If we assume that anything reading a "UTF-8" INI file as ANSI deserves whatever it gets, manually encoding the values as UTF-8 might be an option. Basically what I already suggested with StrPut and WritePrivateProfileStringA, but internal to IniWrite and conditional on the presence of a UTF-8 BOM. I feel like there'll be some hidden negative consequence, so this idea will probably go on the backburner for a while.
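
As a rough sketch of the "conditional on the presence of a UTF-8 BOM" part, a script could check for the BOM itself before deciding how to encode values. HasUtf8Bom is a hypothetical helper name; nothing like it is built into IniWrite.

```autohotkey
; Hypothetical helper: returns true if the file starts with the
; UTF-8 byte order mark (EF BB BF).
HasUtf8Bom(Path) {
    f := FileOpen(Path, "r")
    if !IsObject(f)
        return false
    ; FileOpen skips a detected BOM, so seek back to the very start.
    f.Seek(0, 0)
    VarSetCapacity(bom, 3, 0)
    bytesRead := f.RawRead(bom, 3)
    f.Close()
    if (bytesRead < 3)
        return false
    return NumGet(bom, 0, "UChar") = 0xEF
        && NumGet(bom, 1, "UChar") = 0xBB
        && NumGet(bom, 2, "UChar") = 0xBF
}
```

A script could call this once per file and fall back to the plain IniRead/IniWrite commands when it returns false.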

You are encouraged to use a scripted alternative.

Also, keep in mind that whatever the solution to your (ruespe's) particular problem, Compile_AHK.exe would need to be recompiled with modifications and/or AutoHotkey_L.

Learning one
  • Members
  • 1483 posts
  • Last active: Jan 02 2016 02:30 PM
  • Joined: 04 Apr 2009

while (SubStr(Field,1,1) = A_space or SubStr(Field,1,1) = A_Tab)
StringTrimLeft, Field, Field, 1
...
While (SubStr(Field,0,1) = A_space or SubStr(Field,0,1) = A_Tab)
StringTrimRight, Field, Field, 1

Trim() ;)

:) Yes, I know, but my goal was to make RM2module compatible with both AutoHotkey Basic and AutoHotkey_L, so I used a While loop.