
Source Code Uncommentor in C#

15 Sep 2009
One of the first C# applications to remove comments from source files written in multiple C-style languages (C, C++, Java and C#)

Introduction

Are you a developer who finds it easier to read code than comments, only to discover how hard the code is to analyse with the comments in the way, and how simple a function or procedure turns out to be without them? Perhaps you have wanted to re-document 20 pages (or more) of code from scratch, knowing how tedious it is to remove the existing comments line by line. Or maybe you are pondering how to "water down" your code before transmitting it, to reduce network transfer times.

What happens to the comments in your code when you compile your program? This article offers some insight into these questions and presents a tool that strips existing comments from an ASCII source code file.

How it works

Compilers do not interpret comments; they simply skip over them. Free-form text left unmarked in the source would, however, cause errors during compilation. To avoid this, C-style languages use /* */ or // to delimit comments. These tokens tell the compiler or interpreter not to "read" what follows. For example:

int x;    //commentary about an unknown alien x, in one line

Or

int x;    /* Tell me more about the 
    stars and the moon, in an essay */

Ever wonder what happens to those pesky comments the moment you hit that compile button? They get dumped! Well, I mean they stay in the source file and never reach the output. Surprised? As mentioned earlier, comments are not for the compiler! Having said that, it may be possible to store comments in a binary's metadata section (although I don't know of a compiler that implements this, and it would probably come with a performance trade-off).

Take this code snippet in Java, for example:

            
/**
 * {@link Class#getSimpleName()} is not GWT compatible yet, so we
 * provide our own implementation.
 */
@VisibleForTesting
static String simpleName(Class<?> clazz) {
    String name = clazz.getName();

    // we want the name of the inner class all by its lonesome
    int start = name.lastIndexOf('$');

    // if this isn't an inner class, just find the start of the
    // top level class name.
    if (start == -1) {
        start = name.lastIndexOf('.');
    }

    return name.substring(start + 1);
}
        

When you compile the code, the compiler effectively sees it as:

 

@VisibleForTesting
static String simpleName(Class<?> clazz) {
    String name = clazz.getName();

    int start = name.lastIndexOf('$');

    if (start == -1) {
        start = name.lastIndexOf('.');
    }

    return name.substring(start + 1);
}

The comments are stripped on the fly. That is all the compiler needs to generate object and/or machine code! Comments are discarded early, during lexical analysis (or preprocessing in C and C++), before any code is generated, and the step is normally invisible to developers.

The application I present here implements this functionality. Given a text source file, it strips off the comments, leaving compilable code behind. This comes in handy when you wish to re-document your own or someone else's code without having to remove the comments manually, line by line. In the example screenshot, all single-line, multiline and even Javadoc comments are removed. Likewise, in C#, XML comments are also removed.

Algorithm Overview

The basic rule is that when a single-line comment token (//) is found in a line of code, the program ignores the rest of that line, up to the newline ('\n'), and resumes reading at the next line.

When a multiline comment token (/*) is found, the program ignores everything until the matching */ is found. In other words, a '\n' ends a single-line comment and a */ ends a multiline comment, and either one returns the state to normal.
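The two rules can be pictured as a small three-state scanner. What follows is only a minimal sketch of that idea, not the code from the Uncommenter class: the names CommentRulesSketch and Strip are made up for illustration, it works on a whole string rather than line by line, and it deliberately ignores string literals.

```csharp
using System.Text;

// Hypothetical illustration of the two rules; not the article's Uncommenter class.
public static class CommentRulesSketch
{
    private enum State { Normal, LineComment, BlockComment }

    // Applies only the "//" and "/* */" rules. String literals are NOT handled.
    public static string Strip(string source)
    {
        var output = new StringBuilder();
        var state = State.Normal;

        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            char next = i + 1 < source.Length ? source[i + 1] : '\0';

            switch (state)
            {
                case State.Normal:
                    if (c == '/' && next == '/') { state = State.LineComment; i++; }
                    else if (c == '/' && next == '*') { state = State.BlockComment; i++; }
                    else output.Append(c);
                    break;

                case State.LineComment:
                    // A newline ends the comment; the newline itself is kept.
                    if (c == '\n') { output.Append(c); state = State.Normal; }
                    break;

                case State.BlockComment:
                    // "*/" ends the comment; both characters are skipped.
                    if (c == '*' && next == '/') { state = State.Normal; i++; }
                    break;
            }
        }
        return output.ToString();
    }
}
```

Whitespace around a removed comment is left alone, so stripping a block comment from the middle of a line leaves the spaces on either side of it behind.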

In this implementation, a StreamReader is used to read the ASCII source code. Because the code is processed line by line using the ReadLine() method, detecting and handling comment delimiters becomes slightly more difficult. Many compilers written in C, such as gcc, scan the code character by character for performance reasons. However, what we are developing is nothing close to a full-fledged compiler, so line-by-line processing should be adequate.

I was able to keep it to just two methods: the main method, doUncomment(), and an internal method that handles string literals. Both are static, so there is no need to create an instance. For more information, please refer to the Uncommenter class.
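The extra difficulty with line-by-line reading is that a multiline comment can start on one line and end on another, so some state has to survive between ReadLine() calls. Here is a rough sketch of one way to do that, under my own assumptions: LineByLineSketch, Strip and StartsAt are hypothetical names, the method takes any TextReader (a StreamReader is one), and string literals are again left out.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical illustration; not the article's Uncommenter class.
public static class LineByLineSketch
{
    // True if 'token' occurs in 's' exactly at position 'pos'.
    private static bool StartsAt(string s, int pos, string token)
    {
        return pos + token.Length <= s.Length
            && string.CompareOrdinal(s, pos, token, 0, token.Length) == 0;
    }

    // Reads lines and returns them without comments, one '\n' per input line.
    // String literals are NOT handled here.
    public static string Strip(TextReader reader)
    {
        var output = new StringBuilder();
        bool inBlock = false;   // carried over from one ReadLine() call to the next
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            var kept = new StringBuilder();
            int pos = 0;

            while (pos < line.Length)
            {
                if (inBlock)
                {
                    int end = line.IndexOf("*/", pos, StringComparison.Ordinal);
                    if (end < 0) { pos = line.Length; }        // comment runs on to the next line
                    else { inBlock = false; pos = end + 2; }   // resume after "*/"
                }
                else if (StartsAt(line, pos, "//"))
                {
                    pos = line.Length;                         // rest of the line is a comment
                }
                else if (StartsAt(line, pos, "/*"))
                {
                    inBlock = true;
                    pos += 2;
                }
                else
                {
                    kept.Append(line[pos]);
                    pos++;
                }
            }
            output.Append(kept.ToString()).Append('\n');
        }
        return output.ToString();
    }
}
```

For a quick try without a file, a StringReader can stand in for the StreamReader, for example LineByLineSketch.Strip(new StringReader("int a; /* x\ny */ int b;")). Lines that consisted only of comments come out as empty lines.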

Using the code

To use the code, add the using directive

 using UcommenterCS;

Then simply call the static method to do the work. For example

Uncommenter.doUncomment("src.cpp");	//specify full path

That's it.

If, however, you wish to run it standalone "out of the box", or would simply like to try it out, I have included a compiled binary which is just as good. It is in the /bin folder. To use it, issue the following command

 UcommenterCS <source.cpp/c/cs/java/h/js>

Parsing capabilities

Comment delimiters that appear inside string literals must not be treated as comments. This application handles that case correctly: it ignores comment delimiters that appear between a pair of double quotes (" and ").

Planned functionality includes detecting and warning about unterminated block comments and string literals, with the option of aborting should any be found.
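To make the string rule concrete, here is a small sketch of how a single line might be scanned for the first real comment delimiter. It is my own illustration rather than the internal method from the Uncommenter class: StringAwareSketch and FindCommentStart are hypothetical names, and the sketch assumes ordinary double-quoted strings with backslash escapes only (no character literals such as '"', and no C# verbatim strings).

```csharp
// Hypothetical illustration; not the article's internal string-literal method.
public static class StringAwareSketch
{
    // Returns the index where a comment ("//" or "/*") starts on this line,
    // or -1 if there is none. Delimiters between double quotes are skipped.
    public static int FindCommentStart(string line)
    {
        bool inString = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];

            if (inString)
            {
                if (c == '\\') i++;                  // skip the escaped character, e.g. \"
                else if (c == '"') inString = false; // closing quote
            }
            else if (c == '"')
            {
                inString = true;                     // opening quote
            }
            else if (c == '/' && i + 1 < line.Length &&
                     (line[i + 1] == '/' || line[i + 1] == '*'))
            {
                return i;                            // delimiter outside any string
            }
        }
        return -1;
    }
}
```

With this check, the // inside "http://example.com" is skipped, while a // that follows the closing quote on the same line is still reported as the start of a comment.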
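This check is not in the tool yet, so the following is purely a sketch of what it could look like. Everything in it is hypothetical: the ScanProblem enum, the UnterminatedSketch class and its Check method are names I made up for illustration. It also assumes that an ordinary string literal may not span lines, which holds for regular strings in C, C++, Java and C# but not for C# verbatim strings.

```csharp
// Hypothetical illustration of the planned check; not part of the current tool.
public enum ScanProblem { None, UnterminatedBlockComment, UnterminatedString }

public static class UnterminatedSketch
{
    public static ScanProblem Check(string source)
    {
        bool inString = false, inBlock = false, inLine = false;

        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            char next = i + 1 < source.Length ? source[i + 1] : '\0';

            if (inLine)
            {
                if (c == '\n') inLine = false;                 // newline ends a // comment
            }
            else if (inBlock)
            {
                if (c == '*' && next == '/') { inBlock = false; i++; }
            }
            else if (inString)
            {
                if (c == '\\') i++;                            // skip the escaped character
                else if (c == '"') inString = false;
                else if (c == '\n') return ScanProblem.UnterminatedString;
            }
            else if (c == '"') inString = true;
            else if (c == '/' && next == '/') { inLine = true; i++; }
            else if (c == '/' && next == '*') { inBlock = true; i++; }
        }

        // Anything still open at end of input was never terminated.
        if (inBlock) return ScanProblem.UnterminatedBlockComment;
        if (inString) return ScanProblem.UnterminatedString;
        return ScanProblem.None;
    }
}
```

A caller could then decide whether to print a warning and carry on, or to stop before writing any output, depending on the value returned.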

Because I'm not a compiler linguist, I cannot think of every scenario in which the code may fail. However, if you are up for a challenge, you are welcome to try to break my code. If you manage it, please do let me know.

History

  • 1st version - 9th July
  • 2nd update (repackaged under a different class name and namespace, and changed to static methods) - 14th September

I plan to write a Windows Forms version in the not too distant future. A Java version is also in the works.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
