
Automatic coloring of comments in the code


59 replies to this topic
majkinetor
  • Moderators
  • 4512 posts
  • Last active: Jul 29 2016 12:40 AM
  • Joined: 24 May 2006
Good work Titan.

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004

Here is a test version that includes an example page: ahk_syntax.zip

It looks great. The coloring of comments can be decided via poll. Gray or green might be preferable (though perhaps green is unworkable unless the rest of the [code] section were changed to blue, black, or purple). The ideal colors might be purple or dark blue for code sections with gray comments.

Thanks for working on it.

I don't think a 2kb javascript will do much harm but if you want I could gzip it inside a PHP wrapper with special headers to suggest a permanent cache?

Permanent cache might be a bad idea because if the .js file ever changes, wouldn't every user have to do a force-refresh (Control-F5) to update it? For images, I think I've set this domain to expire them after 4 weeks or so.

Chris, I played with [#CommentFlag]. Perhaps you should add to the help that it must be used only once, and before any comment...

The directive works (even multiple times) if you have at least one non-blank line before actually using the new comment flag. That has been a bug for a long time now; the fact that no one has reported it probably indicates that hardly anyone uses it.

Thanks.

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
var newComm = /^\s*#CommentFlag\s+(\S+)\s/mi.exec(html);
if (newComm != null)
	comm = newComm[1];
8) Obstinate...

I see Chris's answer in preview... I think we will stick to the first directive found... :-)

[EDIT] Switched from /regexp/mi(html) to /regexp/mi.exec(html) to be nice to IE...
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")

polyethene
  • Members
  • 5519 posts
  • Last active: May 17 2015 06:39 AM
  • Joined: 26 Oct 2012

Permanent cache might be a bad idea because if the .js file ever changes, wouldn't every user have to do a force-refresh (Control-F5) to update it?

If your server supports the If-Modified-Since HTTP header (which I'm very sure it does) browsers will check for newer versions. You can set the cache-control to a week or a few days instead.
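To make the caching suggestion concrete, here is a minimal sketch of the conditional-request logic. It is written as a plain JavaScript function for illustration only, not as the PHP wrapper mentioned in the thread. The function name `cacheResponse`, the one-week lifetime, and the shape of the header object are all assumptions.

```javascript
// Hypothetical helper: decides between 200 and 304 for a static file.
// requestHeaders: object with lower-cased header names (assumed shape).
// lastModified: Date of the file's last change.
function cacheResponse(requestHeaders, lastModified) {
  var ONE_WEEK = 7 * 24 * 60 * 60; // seconds; an assumed cache lifetime
  var headers = {
    "Cache-Control": "public, max-age=" + ONE_WEEK,
    "Last-Modified": lastModified.toUTCString()
  };
  var since = Date.parse(requestHeaders["if-modified-since"] || "");
  // HTTP dates have one-second resolution, so compare whole seconds.
  var fileTime = Math.floor(lastModified.getTime() / 1000) * 1000;
  if (!isNaN(since) && fileTime <= since) {
    return { status: 304, headers: headers }; // the browser's copy is still good
  }
  return { status: 200, headers: headers };   // send the file body again
}
```

With this approach a changed .js file is picked up once the cached copy expires and the browser revalidates, without anyone needing a forced refresh.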

var newComm = /^\s*#CommentFlag\s+(\S+)\s/mi.exec(html);

What's wrong with /\n\s*#CommentFlag\s/i? I don't want to use the exec method because it can use a lot of memory.
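One practical difference between the two patterns: the \n form needs a line break before the directive, so it cannot match a #CommentFlag on the very first line of a block, while ^ with the m flag can. A small sketch using made-up sample strings:

```javascript
var withM = /^\s*#CommentFlag\s+(\S+)\s/mi;  // pattern using ^ and the m flag
var withNewline = /\n\s*#CommentFlag\s/i;    // the \n-anchored alternative

// Invented samples: directive on the first line, and on a later line.
var firstLine = "#CommentFlag //\nMsgBox Hi // note\n";
var laterLine = "x := 1\n#CommentFlag //\nMsgBox Hi // note\n";

var firstWithM = withM.test(firstLine);             // true
var firstWithNewline = withNewline.test(firstLine); // false: no \n precedes the directive
var laterWithM = withM.test(laterLine);             // true
var laterWithNewline = withNewline.test(laterLine); // true
```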

majkinetor
  • Moderators
  • 4512 posts
  • Last active: Jul 29 2016 12:40 AM
  • Joined: 24 May 2006
Just a note: the second /* comment (spanning 2 lines) isn't colored in Opera; in IE it is. The first isn't colored in either, but that is by design so far.

polyethene
  • Members
  • 5519 posts
  • Last active: May 17 2015 06:39 AM
  • Joined: 26 Oct 2012

Just a note: the second /* comment (spanning 2 lines) isn't colored in Opera; in IE it is.

I'm not sure how well PCRE is supported in different browsers; I heard Opera has some serious flaws.

That would be a drawback; but if it goes berserk, you can instead have the script detect Opera and abort.

I'll download Opera and try to support it.

majkinetor
  • Moderators
  • 4512 posts
  • Last active: Jul 29 2016 12:40 AM
  • Joined: 24 May 2006
It is pretty bad not to support the best browser.

polyethene
  • Members
  • 5519 posts
  • Last active: May 17 2015 06:39 AM
  • Joined: 26 Oct 2012

It is pretty bad not to support the best browser.

Firefox is fully supported, and always will be.

Opera lacks good support for regex, alert(/\/\*(.*)\n\*\//g.exec(html)[0]); seems to stop the execution :?
I will continue to try to support Opera but if it becomes too difficult I might just leave it. When I discover how to regex for 2+ multilines I'll post an updated version.
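A possible explanation for both symptoms, offered as a sketch rather than a diagnosis of Opera: in JavaScript regular expressions `.` does not match line breaks, so `(.*)\n\*\/` can only cross a single newline, and when nothing matches, `exec` returns `null`, so indexing it with `[0]` throws and halts the script. A character class such as `[\s\S]` with a non-greedy quantifier is a common workaround. The sample string is made up, and this is not the code from ahk_syntax.zip.

```javascript
// Invented sample: one four-line block comment and one two-line block comment.
var sample = "a := 1\n/*\nfirst\nsecond\n*/\nb := 2\n/* x\n*/\n";

var twoLineOnly = /\/\*(.*)\n\*\//g;  // pattern from the post above
var anyLength = /\/\*[\s\S]*?\*\//g;  // [\s\S] also matches line breaks

var twoLineMatches = sample.match(twoLineOnly); // finds only the two-line comment
var allMatches = sample.match(anyLength);       // finds both comments

// Guard against null before indexing a match result.
var none = /\/\*(.*)\n\*\//.exec("no comment here");
var firstMatch = none === null ? "" : none[0];
```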

majkinetor
  • Moderators
  • 4512 posts
  • Last active: Jul 29 2016 12:40 AM
  • Joined: 24 May 2006
regexing multilines was always a problem

Not because it was so difficult, but because regex programs usually don't have good support for this.

I did it with sed though, since it has a kind of mini assembler in which you can say: when you find the next line, execute this regex again :).

polyethene
  • Members
  • 5519 posts
  • Last active: May 17 2015 06:39 AM
  • Joined: 26 Oct 2012

regexing multilines was always a problem

Not because it was so difficult, but because regex programs usually don't have good support for this.

In that case I guess I should parse /* */ manually instead. That's unless you or anyone else can tell me what exp to use in pcre for multilines.
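For illustration, manual parsing could look roughly like the sketch below, which scans for /* and */ with indexOf instead of a regex. The function name `colorBlockComments` and the "MLC" class are hypothetical, and the sketch simplifies matters by not checking that the delimiters start a line.

```javascript
// Wraps every /* ... */ block in a span.
// An unterminated comment is treated as running to the end of the input.
function colorBlockComments(text) {
  var out = "";
  var pos = 0;
  while (true) {
    var start = text.indexOf("/*", pos);
    if (start === -1) break;                     // no more block comments
    var end = text.indexOf("*/", start + 2);
    end = (end === -1) ? text.length : end + 2;  // include the closing */
    out += text.substring(pos, start) +
           '<span class="MLC">' + text.substring(start, end) + "</span>";
    pos = end;
  }
  return out + text.substring(pos);              // append the remaining text
}
```

Because it never relies on `.` crossing a line break, the loop handles comments of any number of lines.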

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005

var newComm = /^\s*#CommentFlag\s+(\S+)\s/mi.exec(html);

What's wrong with /\n\s*#CommentFlag\s/i? I don't want to use the exec method because it can use a lot of memory.

Nothing wrong with the \n vs. ^, I just wanted to make a point... :-P
Otherwise, I find it more elegant to get the comment delimiter by capture rather than by complex string manipulations.

I haven't yet seen anything on the cost of exec usage. Can you point me to some reference on the topic? I find it a pity not to use a handy function because of a poor (?) implementation.

What I did:
function SyntaxColoring()
{
	if (!document.getElementsByTagName)	// Old browser, perhaps Netscape Navigator
		return;	// Don't try anything...

	var container = "pre", containerClass = "code"; // container and class name
	var e = document.getElementsByTagName(container);
	for (var i = 0; i < e.length; i++)	// Loop on all PRE blocks
		if (e[i].className == containerClass)  // That's the right class (we never know...)
		{
			var html = e[i].innerHTML;
			var prevLen = html.length;
			var commentDelimiter = ";";
			var newCommentDelimiter = // Can fail if symbol is followed by &nbsp;... Bah.
					/^(?:\s|&nbsp;)*#CommentFlag(?:\s|&nbsp;)+(\S+)\s/mi.exec(html);
			if (newCommentDelimiter != null)	// Escape regex metacharacters in a custom flag
				commentDelimiter = newCommentDelimiter[1].replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

			var regex = new RegExp("(\\s|&nbsp;|^)(" + commentDelimiter + ".*)$", "mg");
			html = html.replace(regex, '$1<span class="SLC">$2</span>');
			html = html.replace(
					/^((?:\s|&nbsp;)*\/\*(?:.|\r|\n)*?^(?:\s|&nbsp;)*\*\/)/mg,
					'<span class="MLC">$1</span>');
			if (e[i].outerHTML)	// That's IE...
			{
				var wrap = e[i].outerHTML;
				// Have to replace the whole PRE section
				e[i].outerHTML = wrap.substr(0, wrap.indexOf(">") + 1) + html + "</" + container + ">";
			}
			else
			{
				// Just replace the content of the section
				e[i].innerHTML = html;
			}
		}
}
window.onload = SyntaxColoring;
<style type="text/css">
.MLC	/* Multiple Line Comment */
{
  background-color: #EEE;
  color: #888;
}
.SLC	/* Single Line Comment */
{
  color: #AAA;
}
</style>
I stick to my idea of spans with classes... And I reworked the code to use my coding conventions, so I find my way there. But of course, the final code [style] will be yours.
I'm working on the multiline stuff too.

[EDIT] Saw the last two messages. I think searching for /* and */ separately is probably the way to go, indeed. It confirms that REs (not pcre, that's a library probably not used in JS implementations...) cannot do everything and the kitchen sink, but often need to be used with some procedural code around.

[EDIT again] Well, I was wrong here, a simple RE seems to be enough. Updated code above.
Note that block comments tolerate stuff on the same line: after /*, it is part of the comment, after */, it is executed.

[E] Oh, doesn't work on Titan's sample, back to the drawing board... OK, just missed the non-greedy quantifier.

[E] I missed the first line comment when it started the code block. Fixed & updated.

[E] Updated to handle &nbsp; better. The #CommentFlag detection can still fail if the symbol is followed by more than one space (i.e. if followed, in HTML, by a &nbsp;), but the probability is rather low, so I won't correct this.

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
Bump, done...
I might try and convert this to GreaseMonkey...

polyethene
  • Members
  • 5519 posts
  • Last active: May 17 2015 06:39 AM
  • Joined: 26 Oct 2012
In the new ahk_syntax.zip 1.0b:
  • Compatible with IE and Netscape from version 4 onward
  • Compatible with all versions of Firefox and Gecko-based browsers
  • Compatible with new versions of Opera
  • Reduced overheads and predeclared vars
  • New parser for multiline comments
Once I get confirmation that it works I'll post the PHP wrapper.

polyethene
  • Members
  • 5519 posts
  • Last active: May 17 2015 06:39 AM
  • Joined: 26 Oct 2012

Nothing wrong with the \n vs. ^, I just wanted to make a point... :-P

AutoHotkey parses scripts by \n so I chose to do the same.

Otherwise, I find it more elegant to get the comment delimiter by capture rather than by complex string manipulations.

Bad coding styles do exist :?

I haven't yet seen anything on the cost of exec usage. Can you point me to some reference on the topic?

exec, like StringSplit, creates an array of all the matches, which consumes more memory than needed and slows down the script.

What I did so far:

I'll take a look thanks.

I stick to my idea of spans with classes...

A whole new CSS stylesheet doesn't seem worth it for just two styles.

And I reworked the code to use my coding conventions, so I find my way there.

I've tried to keep my coding style standard so others can understand it.

I think searching for /* and */ separately is probably the way to go, indeed.

Done, see my previous post.

(not pcre, that's a library probably not used in JS implementations...)

Doesn't javascript use pcre for regex like PHP?

[EDIT again]Note that block comments tolerate stuff on the same line: after /*, it is part of the comment, after */, it is executed.

I know, check syntax.html in the zip.

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005

Otherwise, I find it more elegant to get the comment delimiter by capture rather than by complex string manipulations.

Bad coding styles do exist :?

Thank you. :cry:

I haven't yet seen anything on the cost of exec usage. Can you point me to some reference on the topic?

exec, like StringSplit, creates an array of all the matches, which consumes more memory than needed and slows down the script.

In my code, I have only one capture and no g, so it has only one match. No problem here.
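For reference, a sketch of what exec returns for a pattern without the g flag: a single array holding the full match plus one entry per capture group, or null when nothing matches. The sample string is invented.

```javascript
// Invented sample containing two directives; only the first one is reported.
var html = "#CommentFlag //\nMsgBox Hi // greeting\n#CommentFlag --\n";
var re = /^\s*#CommentFlag\s+(\S+)\s/mi; // one capture group, no g flag

var result = re.exec(html);
// result[0] is the whole match, result[1] the captured delimiter.
var delimiter = result === null ? ";" : result[1];
var arrayLength = result === null ? 0 : result.length;
```

Here the array has two elements regardless of how many directives the block contains, which also fits the idea of sticking to the first directive found.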

I stick to my idea of spans with classes...

A whole new CSS stylesheet doesn't seem worth it for just two styles.

Why whole new? Just add them to the existing stylesheet. Of course, that will be Chris' final choice, when/if he adopts your solution.
Well, I admit that the classes make more sense for the PHP solution than for the JS one: PHP sends fewer characters this way, whereas the fact that the HTML generated by JS is bigger has no real impact.

And I reworked the code to use my coding conventions, so I find my way there.

I've tried to keep my coding style standard so others can understand it.

Standard? There is no standard, just preferences and good taste.
Nothing wrong with your code style! I wasn't criticizing it.

(not pcre, that's a library probably not used in JS implementations...)

Doesn't javascript use pcre for regex like PHP?

For Mozilla, I don't know, but JS doesn't use (full) PCRE syntax anyway, so I guess it has an engine of its own. And for MS, I am almost sure it doesn't. It even differs on $ handling in multiline REs: in Mozilla, which probably follows ECMAScript rules, it "matches immediately before a line break character." For IE, it matches after...
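A small sketch of the behaviour the ECMAScript rules describe: with the m flag, $ matches just before a line break (or at the end of the input), and `.` does not consume the line break, so a single-line comment match stops short of the newline. The sample text and the `<c>` tag are made up, and the sketch cannot show what an engine that deviates from these rules would do.

```javascript
// Invented two-line sample with a ";" comment on each line.
var text = "one ; first\ntwo ; second";

var matches = text.match(/;.*$/mg);                  // one match per line
var wrapped = text.replace(/(;.*)$/mg, "<c>$1</c>"); // closing tag lands before \n
```

If the closing tag ever ended up after the line break instead, that would point to the kind of engine difference described above.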