text to html, html to text (recreate AHK's Transform HTML subcommand)

16 Oct 2017, 16:09

A description of the Transform HTML subcommand which is scheduled to be removed in AHK v2.

Note: it does text to html, but there is no equivalent to do html to text. E.g. you use it to prepare text to be stored as html.

Converts String into its HTML equivalent by translating characters whose ASCII values are above 127 to their HTML names (e.g. £ becomes &pound;). In addition, the four characters "&<> are translated to &quot;&amp;&lt;&gt;. Finally, each linefeed (`n) is translated to <br>`n (i.e. <br> followed by a linefeed).
Flag 1: converts certain characters to named expressions. e.g. € is converted to &euro;
Flag 2: converts certain characters to numbered expressions. e.g. € is converted to &#8364;
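
For reference, a minimal sketch that shows the built-in command's output for flags 1 to 3 on a short sample string. It assumes AHK v1.1 Unicode (the Transform command is not available in AHK v2); the sample string and variable names are just for illustration.

```autohotkey
;AHK v1.1 Unicode only (the Transform command is not available in AHK v2)
;Chr() is used so that the script does not depend on the file encoding
vText := Chr(163) "5 < " Chr(8364) "6" ;pound sign and euro sign
vOutput := ""
Loop, 3
{
	;A_Index is used as the flags value (1, 2, 3)
	Transform, vHtml, HTML, % vText, % A_Index
	vOutput .= A_Index ": " vHtml "`r`n"
}
MsgBox, % vOutput
```

With flag 1 the pound and euro signs should come out as named expressions, and with flag 2 as numbered expressions. The < character is escaped in every case.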

Code:

q:: ;attempt at recreating the Transform HTML subcommand for use in AHK v2
vText := ""
Loop, 500
	vText .= Chr(A_Index)
Loop, 4
{
	vFlags := A_Index-1
	vHtml1 := JEE_TransformHtml(vText, vFlags)
	Transform, vHtml2, HTML, % vText, % vFlags
	MsgBox, % (vHtml1 == vHtml2) "`r`n" vHtml1
}
return


;for AHK Unicode only
;0: 5 chars: Chr(10) and "&<>
;1: 5 chars, then 121 chars to named expressions
;2: 5 chars, then Chr(128) and above to numbered expressions
;3: perform mode 1, then mode 2

JEE_TransformHtml(vText, vFlags:=1)
{
	static oArray := Object(StrSplit("160,nbsp;161,iexcl;162,cent;163,pound;164,curren;165,yen;166,brvbar;167,sect;168,uml;169,copy;170,ordf;171,laquo;172,not;173,shy;174,reg;175,macr;176,deg;177,plusmn;178,sup2;179,sup3;180,acute;181,micro;182,para;183,middot;184,cedil;185,sup1;186,ordm;187,raquo;188,frac14;189,frac12;190,frac34;191,iquest;192,Agrave;193,Aacute;194,Acirc;195,Atilde;196,Auml;197,Aring;198,AElig;199,Ccedil;200,Egrave;201,Eacute;202,Ecirc;203,Euml;204,Igrave;205,Iacute;206,Icirc;207,Iuml;208,ETH;209,Ntilde;210,Ograve;211,Oacute;212,Ocirc;213,Otilde;214,Ouml;215,times;216,Oslash;217,Ugrave;218,Uacute;219,Ucirc;220,Uuml;221,Yacute;222,THORN;223,szlig;224,agrave;225,aacute;226,acirc;227,atilde;228,auml;229,aring;230,aelig;231,ccedil;232,egrave;233,eacute;234,ecirc;235,euml;236,igrave;237,iacute;238,icirc;239,iuml;240,eth;241,ntilde;242,ograve;243,oacute;244,ocirc;245,otilde;246,ouml;247,divide;248,oslash;249,ugrave;250,uacute;251,ucirc;252,uuml;253,yacute;254,thorn;255,yuml;338,OElig;339,oelig;352,Scaron;353,scaron;376,Yuml;402,fnof;710,circ;732,tilde;8211,ndash;8212,mdash;8216,lsquo;8217,rsquo;8218,sbquo;8220,ldquo;8221,rdquo;8222,bdquo;8224,dagger;8225,Dagger;8226,bull;8230,hellip;8240,permil;8249,lsaquo;8250,rsaquo;8364,euro;8482,trade", [",",";"])*)
	local vChar,vOrd,vText2

	;replace & before everything else (otherwise the & of the other expressions would be escaped again)
	;replace `n after <> (otherwise the inserted <br> would be escaped)
	vText := StrReplace(vText, "&", "&amp;")
	vText := StrReplace(vText, Chr(34), "&quot;")
	vText := StrReplace(vText, "<", "&lt;")
	vText := StrReplace(vText, ">", "&gt;")
	vText := StrReplace(vText, "`n", "<br>`n")

	;vText2: the non-ASCII characters that still need handling
	vText2 := RegExReplace(vText, "[[:ascii:]]")
	if vFlags
		while !(vText2 = "")
		{
			vChar := SubStr(vText2, 1, 1)
			vOrd := Ord(vChar)
			if (vFlags & 1) && oArray.HasKey(vOrd)
				vText := StrReplace(vText, vChar, "&" oArray[vOrd] ";")
			else if (vFlags & 2)
				vText := StrReplace(vText, vChar, "&#" vOrd ";")
			;remove every occurrence of the character so it is handled only once
			vText2 := StrReplace(vText2, vChar)
		}
	return vText
}

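
The comment about replacing & before everything else can be illustrated with a short sketch. It assumes AHK v1.1 with StrReplace available; the sample string and variable names are just for illustration.

```autohotkey
vText := "a<b"
;wrong order: < is escaped first, then the & of &lt; is escaped a second time
vWrong := StrReplace(StrReplace(vText, "<", "&lt;"), "&", "&amp;")
;right order: & is escaped first, so the & of &lt; is left alone
vRight := StrReplace(StrReplace(vText, "&", "&amp;"), "<", "&lt;")
MsgBox, % vWrong "`r`n" vRight
```

The first line of the message box shows a&amp;lt;b (double-escaped), the second shows a&lt;b.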
Other similar functions:

Code:

	;html to text (function body: vHtml is the input, vText is returned)
	oHTML := ComObjCreate("HTMLFile")
	oHTML.write("<title>" vHtml "</title>")
	vText := oHTML.getElementsByTagName("title")[0].innerText
	oHTML := ""
	return vText

	;text to html (function body: vText is the input, the html is returned)
	oHTML := ComObjCreate("HTMLFile")
	oHTML.getElementsByTagName("title")[0].value := vText
	vHtml := oHTML.getElementsByTagName("title")[0].outerHTML
	oHTML := ""
	return SubStr(vHtml, 15, -10)

Note: I believe that the code below replaces characters 9-13 and 160 with spaces and trims leading/multiple/trailing spaces, and that it replaces ChrW(128) to ChrW(159) with ChrA(128) to ChrA(159).

[To achieve ChrA() in AHK Unicode, see ';get 255 ANSI characters (in AHK Unicode versions)' here:]
jeeswg's characters tutorial - AutoHotkey Community
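
For the html to text direction, here is a minimal sketch that avoids COM, so no whitespace trimming is done. It assumes AHK v1.1 Unicode. The function name HtmlToTextSketch is hypothetical, and it only undoes <br>`n, numbered expressions and the four basic named expressions; other named expressions such as &pound; are left as they are.

```autohotkey
;a minimal sketch, hypothetical function name, AHK v1.1 Unicode assumed
;handles <br>`n, numbered expressions (e.g. &#8364;) and &quot; &amp; &lt; &gt; only
HtmlToTextSketch(vHtml)
{
	static oNamed := {quot: Chr(34), amp: "&", lt: "<", gt: ">"}
	vHtml := StrReplace(vHtml, "<br>`n", "`n")
	vText := "", vPos := 1
	;single pass, so that decoded characters are never decoded a second time
	while vFound := RegExMatch(vHtml, "&(?:#(\d+)|(quot|amp|lt|gt));", vMatch, vPos)
	{
		;copy the text before the match, then the decoded character
		vText .= SubStr(vHtml, vPos, vFound-vPos)
		vText .= (vMatch1 = "") ? oNamed[vMatch2] : Chr(vMatch1)
		vPos := vFound + StrLen(vMatch)
	}
	;append whatever follows the last match
	return vText . SubStr(vHtml, vPos)
}
```

E.g. HtmlToTextSketch("a &lt; b &amp;amp; &#8364;") gives a < b &amp; followed by the euro sign: the &amp;amp; is decoded once only.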

Other similar code:

Code:

;convert html special characters - Ask for Help - AutoHotkey Community

document := ComObjCreate("HTMLFile")
MsgBox % document.body.outerText

;How to transform like "&#8364; " this code into character? - AutoHotkey Community

  doc := ComObjCreate("HTMLfile")
  return doc.body.innerText
text/list/table functions - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 89#p135289
Transform's HTML subcommand: char 8218 - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 14&t=38422