URL Encoding


16 replies to this topic
schandl
  • Members
  • 28 posts
  • Last active: Mar 28 2015 04:03 PM
  • Joined: 26 Apr 2005
I am wondering if there is an easy way to URL-encode a string, as described e.g. at http://www.blooberry...urlencoding.htm. So if I have a string

xxx yyy{zzz

I would get

xxx%20yyy%7Bzzz

The hard way would be to use a lot of StringReplace, but maybe there is something like
URLencode, outString, inString

Bernd

Serenity
  • Members
  • 1271 posts
  • Last active:
  • Joined: 07 Nov 2004
You could use the Transform command, e.g.:
string = "an example"
transform, encoded, html, %string%
msgbox, %string%`n%encoded%

It doesn't seem to encode spaces though. Maybe Transform could be updated to work on ASCII values below 127?

HTML, String: Converts String into its HTML equivalent by translating characters whose ASCII values are above 127 to their HTML names (e.g. £ becomes &pound;). In addition, the four characters "&<> are translated to &quot;&amp;&lt;&gt;. Finally, each linefeed (`n) is translated to <br>`n (i.e. <br> followed by a linefeed).



BoBo
  • Guests
  • Last active:
  • Joined: --
Isn't there the option to use the JavaScript convert function which is embedded in the above-mentioned page (check the link)?

shimanov
  • Members
  • 610 posts
  • Last active: Jul 18 2006 08:35 PM
  • Joined: 25 Sep 2005
You could use the JavaScript "escape" function or AHK:

SetFormat, Integer, hex

text := "xxx yyy{zzz"

loop, parse, text
	if A_LoopField in %A_Space%,{
	{
		token := Asc( A_LoopField )
		StringTrimLeft, token, token, 2
		code = %code%`%%token%
	}
	else
		code = %code%%A_LoopField%

MsgBox, text = %text%`n`ncode = %code%
return


schandl
  • Members
  • 28 posts
  • Last active: Mar 28 2015 04:03 PM
  • Joined: 26 Apr 2005
Serenity, the transform function does something different. It encodes for HTML - URL encoding is something else.
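
To make the difference concrete, here is a minimal sketch. It is written in Python rather than AHK purely so it can be run as-is, and it only uses the standard library functions `html.escape` and `urllib.parse.quote`; the variable names are illustrative.

```python
import html
from urllib.parse import quote

text = "xxx yyy{zzz"

# HTML escaping only rewrites characters that are special in markup
# (&, <, > and quotes), so this string comes back unchanged.
html_encoded = html.escape(text)

# URL (percent) encoding replaces unsafe characters with %XX.
# safe="" means that "/" would be encoded as well.
url_encoded = quote(text, safe="")

print(html_encoded)  # xxx yyy{zzz
print(url_encoded)   # xxx%20yyy%7Bzzz
```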

BoBo and shimanov, what do you mean by using JavaScript? Can I embed JavaScript into AHK? Or should I look at the code and do something similar in AHK?

shimanov, in your AHK code, is it correct that I would have to add all "unsafe" characters to the line with the first if? I assume you used the check for A_Space and { as an example.

Bernd

shimanov
  • Members
  • 610 posts
  • Last active: Jul 18 2006 08:35 PM
  • Joined: 25 Sep 2005

I assume you used the check for A_Space and { as an example.


That was your example. Then I countered with my own.

However, I could also use the codec, so here it is:

code1 := EncodeURL( "xxx yyy{zzz" )
text1 := DecodeURL( code1 )

code2 := EncodeURL( "http://www.autohotkey.com/forum/viewtopic.php?p=39572#39572" )
text2 := DecodeURL( code2 )

code3 := EncodeURL( "http://www.autohotkey.com/forum/viewtopic.php?p=39572#39572", false )
text3 := DecodeURL( code3 )

MsgBox, %code1%`n%text1%`n`n%code2%`n%text2%`n`n%code3%`n%text3%
return

EncodeURL( p_data, p_reserved=true, p_encode=true )
{
	old_FormatInteger := A_FormatInteger
	SetFormat, Integer, hex

	unsafe = 
		( Join LTrim
			25000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F20
			22233C3E5B5C5D5E607B7C7D7F808182838485868788898A8B8C8D8E8F9091929394
			95969798999A9B9C9D9E9FA0A1A2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6
			B7B8B9BABBBCBDBEBFC0C1C2C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8
			D9DADBDCDDDEDF7EE0E1E2E3E4E5E6E7E8E9EAEBECEDEEEFF0F1F2F3F4F5F6F7F8F9
			FAFBFCFDFEFF
		)
		
	if ( p_reserved )
		unsafe = %unsafe%24262B2C2F3A3B3D3F40
	
	if ( p_encode )
		loop, % StrLen( unsafe )//2
		{
			StringMid, token, unsafe, A_Index*2-1, 2
			StringReplace, p_data, p_data, % Chr( "0x" token ), `%%token%, all 
		}
	else
		loop, % StrLen( unsafe )//2
		{
			StringMid, token, unsafe, A_Index*2-1, 2
			StringReplace, p_data, p_data, `%%token%, % Chr( "0x" token ), all
		}
		
	SetFormat, Integer, %old_FormatInteger%

	return, p_data
}

DecodeURL( p_data )
{
	return, EncodeURL( p_data, true, false )
}


random swede
  • Guests
  • Last active:
  • Joined: --
Hi,
I'm trying to decode a string (passed to AHK as a parameter) from tv-browser (tv-browser.org). The string has first been fed through a urlencode function in tv-browser: "urlencode {urlencode(title, "utf-8")} Encodes the parameter as a URL"

It sounds like shimanov's DecodeURL function could help me here, but I can't get it to work. Any suggestions?

Here's a sample of the encoded string tv-browser outputs (the text is in Swedish, in case you wonder why you can't read it even after decoding :lol: ):

Del+9+av+12.+Christopher+g%C3%B6r+ett+%C3%B6verraskande+tillk%C3%A4nnagivande+och+Paulies+sn%C3%A5lhet+sl%C3%A5r+tillbaka+p%C3%A5+honom+sj%C3%A4lv.+Tony+k%C3%A4nner+sig+upplivad+av+en+gammaldags+st%C3%B6t%2C+medan+Carmela+b%C3%B6rjar+misst%C3%A4nka+att+Adriana+inte+%C3%A4r+alls+%C3%A4r+%22f%C3%B6rsvunnen%22.+Amerikansk+serie+om+en+maffiafamilj.+Fr%C3%A5n+2006.+Bredbild+.%0A%28

trik
  • Members
  • 1317 posts
  • Last active: Jun 11 2010 11:48 PM
  • Joined: 15 Jul 2007
The code (listed above my post) looks to be a thread with the user names of those who have posted in it.

random swede
  • Guests
  • Last active:
  • Joined: --
Huh?

If you are referring to this:

Del+9+av+12.+Christopher+g%C3%B6r+ett+%C3%B6verraskande+tillk%C3%A4nnagivande+och+Paulies+sn%C3%A5lhet+sl%C3%A5r+tillbaka+p%C3%A5+honom+sj%C3%A4lv.+Tony+k%C3%A4nner+sig+upplivad+av+en+gammaldags+st%C3%B6t%2C+medan+Carmela+b%C3%B6rjar+misst%C3%A4nka+att+Adriana+inte+%C3%A4r+alls+%C3%A4r+%22f%C3%B6rsvunnen%22.+Amerikansk+serie+om+en+maffiafamilj.+Fr%C3%A5n+2006.+Bredbild+.%0A%28

Then, as I stated, it is a sample of the encoded string that tv-browser (the application, not the website) outputs. In this case the sample is program information for an episode of the TV show The Sopranos, in Swedish.

Now back to my question.

YMP
  • Members
  • 424 posts
  • Last active: Apr 05 2012 01:18 AM
  • Joined: 23 Dec 2006
I tried it this way. But the output is UTF-8 text, so it didn't show correctly in the message box. Only after I saved it to a file and opened that in Notepad could I read the letters with diacritics.
Input=
(Join %
Del+9+av+12.+Christopher+g%C3%B6r+ett+%C3%B6verraskande+
tillk%C3%A4nnagivande+och+Paulies+sn%C3%A5lhet+sl%C3%A5r+
tillbaka+p%C3%A5+honom+sj%C3%A4lv.+Tony+k%C3%A4nner+sig+
upplivad+av+en+gammaldags+st%C3%B6t%2C+medan+Carmela+
b%C3%B6rjar+misst%C3%A4nka+att+Adriana+inte+%C3%A4r+alls+
%C3%A4r+%22f%C3%B6rsvunnen%22.+Amerikansk+serie+om+en+
maffiafamilj.+Fr%C3%A5n+2006.+Bredbild+.%0A%28
)

Output := DecodeURL(Input) 

MsgBox, %Output%

;FileDelete, C:\test\output.txt
FileAppend, %Output%, C:\test\output.txt   ; Save to a file.
Run, Notepad.exe C:\test\output.txt

return

;-----------------------------------------------------------------------

EncodeURL( p_data, p_reserved=true, p_encode=true ) 
{ 
   old_FormatInteger := A_FormatInteger 
   SetFormat, Integer, hex 

   unsafe = 
      ( Join LTrim 
         25000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F20 
         22233C3E5B5C5D5E607B7C7D7F808182838485868788898A8B8C8D8E8F9091929394 
         95969798999A9B9C9D9E9FA0A1A2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6 
         B7B8B9BABBBCBDBEBFC0C1C2C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8 
         D9DADBDCDDDEDF7EE0E1E2E3E4E5E6E7E8E9EAEBECEDEEEFF0F1F2F3F4F5F6F7F8F9 
         FAFBFCFDFEFF 
      ) 
       
   if ( p_reserved ) 
      unsafe = %unsafe%24262B2C2F3A3B3D3F40 
    
   if ( p_encode ) 
      loop, % StrLen( unsafe )//2 
      { 
         StringMid, token, unsafe, A_Index*2-1, 2 
         StringReplace, p_data, p_data, % Chr( "0x" token ), `%%token%, all 
      } 
   else 
      loop, % StrLen( unsafe )//2 
      { 
         StringMid, token, unsafe, A_Index*2-1, 2 
         StringReplace, p_data, p_data, `%%token%, % Chr( "0x" token ), all 
      } 
       
   SetFormat, Integer, %old_FormatInteger% 

   return, p_data 
} 

DecodeURL( p_data ) 
{ 
   return, EncodeURL( p_data, true, false ) 
}


random swede
  • Guests
  • Last active:
  • Joined: --
Thanks YMP.

I found UTF82Ansi(zString) here:
Helper script to convert from one to another codepages
http://www.autohotke...topic17343.html

These three steps seem to do what I want:

Output := DecodeURL(Input)
Output := UTF82Ansi(Output)
StringReplace, output, output,+,%A_Space%,all

Well, there are some characters left undecoded at the very end, but apart from that it works. I will test-drive it some more now.
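
For comparison, a minimal sketch of the same pipeline (percent-decoding, UTF-8 interpretation, and "+" to space) in Python rather than AHK, using the standard library's `urllib.parse.unquote_plus`. The sample is a fragment of the tv-browser string above; the variable names are illustrative. One detail worth noting, as an observation rather than something reported in this thread: if "+" is turned into a space only after percent-decoding, a literal plus sign that was sent as %2B also becomes a space.

```python
from urllib.parse import unquote, unquote_plus

# Fragment of the tv-browser sample: "+" stands for a space and
# non-ASCII letters are UTF-8 bytes written as %XX pairs.
sample = "Christopher+g%C3%B6r+ett+%C3%B6verraskande+tillk%C3%A4nnagivande"

# unquote_plus turns "+" into spaces first, then decodes %XX as UTF-8.
decoded = unquote_plus(sample, encoding="utf-8")
print(decoded)  # Christopher gör ett överraskande tillkännagivande

# Order matters when the data contains an encoded plus sign (%2B).
tricky = "1%2B1+%3D+2"
plus_first = unquote_plus(tricky)               # "1+1 = 2"
decode_first = unquote(tricky).replace("+", " ")  # "1 1 = 2" (plus sign lost)
print(plus_first)
print(decode_first)
```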

random swede
  • Guests
  • Last active:
  • Joined: --
Back after some tests... These encoded characters slip through: %28c%29
That should decode to (c), so decoding of the characters ( and ) is running into some problem.

YMP
  • Members
  • 424 posts
  • Last active: Apr 05 2012 01:18 AM
  • Joined: 23 Dec 2006
It seems you just have to add needed codes to the Unsafe variable:
if ( p_reserved ) 
  unsafe = %unsafe%24262B2C2F3A3B3D3F402829  ; 28 and 29 added.
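
An alternative to extending the table is to decode every %XX pair regardless of its value, so no code can slip through. The following is only a sketch of that idea in Python (not AHK, so that it can be run directly); `percent_decode` is a hypothetical helper name, and it assumes the encoded data is UTF-8, as in the tv-browser case.

```python
import re

def percent_decode(data, encoding="utf-8"):
    """Replace every %XX escape with its byte, then decode the bytes as text."""
    raw = re.sub(
        rb"%([0-9A-Fa-f]{2})",                   # any two hex digits after "%"
        lambda m: bytes([int(m.group(1), 16)]),  # the byte they stand for
        data.encode(encoding),
    )
    return raw.decode(encoding, errors="replace")

print(percent_decode("%28c%29"))        # (c)
print(percent_decode("sn%C3%A5lhet"))   # snålhet
print(percent_decode("%2520"))          # %20
```

Because the substitution is a single left-to-right pass, an encoded percent sign (%25) is not decoded a second time, which a sequence of table-driven replacements has to guard against by handling % last.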


Yure
  • Guests
  • Last active:
  • Joined: --
This is the code of my urlencode and urldecode functions. They are not optimized or anything, but they work fine for me.
hex(n){
	f:=n//16
	s:=Mod(n, 16)
	if (f=10)
		f=A
	else if (f=11)
		f=B
	else if (f=12)
		f=C
	else if (f=13)
		f=D
	else if (f=14)
		f=E
	else if (f=15)
		f=F
	if (s=10)
		s=A
	else if (s=11)
		s=B
	else if (s=12)
		s=C
	else if (s=13)
		s=D
	else if (s=14)
		s=E
	else if (s=15)
		s=F
	return "%" . f . s
}
urlencode(s){
	;http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
	;must be first or we are in trouble
	StringReplace, s, s, % chr(37), % hex(37), All
	;ASCII control characters 00-1F hex (0-31 decimal) and 7F (127 decimal)
	loop, 32
		StringReplace, s, s, % chr(A_Index-1), % hex(A_Index-1), All
	StringReplace, s, s, % chr(127), % hex(127), All
	;Non-ASCII characters 80-FF hex (128-255 decimal)
	loop, 128
		StringReplace, s, s, % chr(A_Index+127), % hex(A_Index+127), All
	;"Reserved characters"
	StringReplace, s, s, % chr(36), % hex(36), All
	StringReplace, s, s, % chr(38), % hex(38), All
	StringReplace, s, s, % chr(43), % hex(43), All
	StringReplace, s, s, % chr(44), % hex(44), All
	StringReplace, s, s, % chr(47), % hex(47), All
	StringReplace, s, s, % chr(58), % hex(58), All
	StringReplace, s, s, % chr(59), % hex(59), All
	StringReplace, s, s, % chr(61), % hex(61), All
	StringReplace, s, s, % chr(63), % hex(63), All
	StringReplace, s, s, % chr(64), % hex(64), All
	;"Unsafe characters"
	StringReplace, s, s, % chr(32), % hex(32), All
	StringReplace, s, s, % chr(34), % hex(34), All
	StringReplace, s, s, % chr(60), % hex(60), All
	StringReplace, s, s, % chr(62), % hex(62), All
	StringReplace, s, s, % chr(35), % hex(35), All
	StringReplace, s, s, % chr(123), % hex(123), All
	StringReplace, s, s, % chr(125), % hex(125), All
	StringReplace, s, s, % chr(124), % hex(124), All
	StringReplace, s, s, % chr(92), % hex(92), All
	StringReplace, s, s, % chr(94), % hex(94), All
	StringReplace, s, s, % chr(126), % hex(126), All
	StringReplace, s, s, % chr(91), % hex(91), All
	StringReplace, s, s, % chr(93), % hex(93), All
	StringReplace, s, s, % chr(96), % hex(96), All
	return s
}
urldecode(s){
	;ASCII control characters 00-1F hex (0-31 decimal) and 7F (127 decimal)
	loop, 32
		StringReplace, s, s, % hex(A_Index-1), % chr(A_Index-1), All
	StringReplace, s, s, % hex(127), % chr(127), All
	;Non-ASCII characters 80-FF hex (128-255 decimal)
	loop, 128
		StringReplace, s, s, % hex(A_Index+127), % chr(A_Index+127), All
	;"Reserved characters"
	StringReplace, s, s, % hex(36), % chr(36), All
	StringReplace, s, s, % hex(38), % chr(38), All
	StringReplace, s, s, % hex(43), % chr(43), All
	StringReplace, s, s, % hex(44), % chr(44), All
	StringReplace, s, s, % hex(47), % chr(47), All
	StringReplace, s, s, % hex(58), % chr(58), All
	StringReplace, s, s, % hex(59), % chr(59), All
	StringReplace, s, s, % hex(61), % chr(61), All
	StringReplace, s, s, % hex(63), % chr(63), All
	StringReplace, s, s, % hex(64), % chr(64), All
	;"Unsafe characters"
	StringReplace, s, s, % hex(32), % chr(32), All
	StringReplace, s, s, % hex(34), % chr(34), All
	StringReplace, s, s, % hex(60), % chr(60), All
	StringReplace, s, s, % hex(62), % chr(62), All
	StringReplace, s, s, % hex(35), % chr(35), All
	StringReplace, s, s, % hex(123), % chr(123), All
	StringReplace, s, s, % hex(125), % chr(125), All
	StringReplace, s, s, % hex(124), % chr(124), All
	StringReplace, s, s, % hex(92), % chr(92), All
	StringReplace, s, s, % hex(94), % chr(94), All
	StringReplace, s, s, % hex(126), % chr(126), All
	StringReplace, s, s, % hex(91), % chr(91), All
	StringReplace, s, s, % hex(93), % chr(93), All
	StringReplace, s, s, % hex(96), % chr(96), All
	;must be last or we are in trouble
	StringReplace, s, s, % hex(37), % chr(37), All
	;return the decoded string (without this the function returns an empty string)
	return s
}


  • Guests
  • Last active:
  • Joined: --
See Titan's post here