RegEx can't handle Unicode Characters from 55296 to 56319

Report problems with documented functionality
Guest4589

21 Feb 2017, 21:51

Code:

	;RegEx can't handle Unicode Characters from 55296 to 56319

Text := Chr(55295)

msgbox, % ""
. "Text = " . Text . "`r`n`r`n"
. RegExMatch(Text, Chr(55295)) . "`r`n"
. RegExMatch(Text, "\" Chr(55295)) . "`r`n"
. RegExMatch(Text, "\Q" Chr(55295)) . "`r`n"
. "______________`r`n"
. RegExReplace(Text, Chr(55295), "@") . "`r`n"
. RegExReplace(Text, "\" Chr(55295), "@") . "`r`n"
. RegExReplace(Text, "\Q" Chr(55295), "@") . "`r`n"

;__________________________________________________________

Text := Chr(55296)

msgbox, % ""
. "Text = " . Text . "`r`n`r`n"
. RegExMatch(Text, Chr(55296)) . "`r`n"
. RegExMatch(Text, "\" Chr(55296)) . "`r`n"
. RegExMatch(Text, "\Q" Chr(55296)) . "`r`n"
. "______________`r`n"
. RegExReplace(Text, Chr(55296), "@") . "`r`n"
. RegExReplace(Text, "\" Chr(55296), "@") . "`r`n"
. RegExReplace(Text, "\Q" Chr(55296), "@") . "`r`n"
kon
Posts: 1756
Joined: 29 Sep 2013, 17:11

Re: RegEx can't handle Unicode Characters from 55296 to 56319

22 Feb 2017, 14:30

fileformat.info says U+D7FF (Chr(55295)) is not a valid Unicode character.

Code:

http://www.fileformat.info/info/unicode/char/d7ff/index.htm
    "U+D7FF is not a valid unicode character."
    
https://unicode-table.com/en/blocks/hangul-jamo-extended-b/
https://unicode-table.com/en/blocks/high-surrogates/
    "Isolated surrogate code points have no general interpretation; consequently, no character code charts or names lists are provided for this range."
https://unicode-table.com/en/blocks/high-private-use-surrogates/
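
For reference, the range in the title (55296 to 56319) is U+D800 to U+DBFF, the high-surrogate block linked above, and 56320 to 57343 (U+DC00 to U+DFFF) is the low-surrogate block. Below is a minimal sketch that classifies a code by range. SurrogateKind is a hypothetical helper written for this post, not a built-in function, and it only checks the numeric ranges. It does not show what the RegEx functions do internally.

```autohotkey
; Hypothetical helper (not part of AutoHotkey): classify a UTF-16 code unit by range.
SurrogateKind(Code)
{
    if (Code >= 0xD800 && Code <= 0xDBFF)   ; 55296 to 56319
        return "high surrogate"
    if (Code >= 0xDC00 && Code <= 0xDFFF)   ; 56320 to 57343
        return "low surrogate"
    return "not a surrogate"
}

msgbox, % ""
. "55295 = " . SurrogateKind(55295) . "`r`n"
. "55296 = " . SurrogateKind(55296) . "`r`n"
. "56319 = " . SurrogateKind(56319) . "`r`n"
. "56320 = " . SurrogateKind(56320)
```

A check like this could be used to skip or pre-filter lone surrogates before passing text to RegExMatch or RegExReplace, assuming the isolated surrogate is what triggers the difference between the two test cases.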
