Question

I used the following code to count the number of comments in a piece of code:

StringTokenizer stringTokenizer = new StringTokenizer(str);
int x = 0;
boolean exists = false;

while (stringTokenizer.hasMoreTokens()) {
    if (exists == false && stringTokenizer.nextToken().contains("/*")) {
        exists = true;

    } else if (exists == true && stringTokenizer.nextToken().contains("*/")) {

        x++;
        exists = false;

    }
}

System.out.println(x);

It works if comments have spaces:

e.g.: "/* fgdfgfgf */ /* fgdfgfgf */ /* fgdfgfgf */".

But it does not work for comments without spaces:

e.g.: "/*fgdfgfgf *//* fgdfgfgf*//* fgdfgfgf */".


Solution 2

new StringTokenizer(str, "\n") splits str into lines instead of using the default delimiter set " \t\n\r\f" (space, tab, newline, carriage return, and form feed).

StringTokenizer stringTokenizer = new StringTokenizer(str,"\n");

This makes the newline character the only delimiter used for tokenizing.
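To see why the delimiter set matters here, the following minimal sketch prints the tokens for the failing input from the question under both delimiter sets. The class name TokenizerDemo and the helper tokens() are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects every token produced for the given input and delimiter set.
    static List<String> tokens(String input, String delimiters) {
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        List<String> result = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "/*fgdfgfgf *//* fgdfgfgf*//* fgdfgfgf */";
        // Default delimiters: space, tab, newline, carriage return, form feed
        System.out.println(tokens(str, " \t\n\r\f"));
        // Newline only: a single-line input comes back as one token
        System.out.println(tokens(str, "\n"));
    }
}
```

With the default delimiters, one of the tokens is "*//*", which holds a closing and an opening marker at once, so a check that looks at one marker per token loses count.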

Using your current approach:

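boolean exists = false;
int x = 0;
String line;
while (stringTokenizer.hasMoreTokens()) {
    line = stringTokenizer.nextToken();

    // A comment opens somewhere on this line
    if (!exists && line.contains("/*")) {
        exists = true;
    }
    // A comment closes on this line; this counts at most one comment per line
    if (exists && line.contains("*/")) {
        x++;
        exists = false;
    }
}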

For multiple comments on one line, I tried using /\\* and \\*/ as patterns in split() and taking the length of the resulting arrays as the number of occurrences, but the lengths were not exact because of uneven splitting.
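One way to avoid the uneven splits might be to match whole comments rather than split on their delimiters. The sketch below is a swapped-in technique, not the split() attempt described above: it counts non-overlapping matches of a non-greedy pattern with java.util.regex. The class name RegexCommentCounter and the method countComments() are hypothetical, and the sketch assumes that "/*" never appears inside a string literal or a // comment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCommentCounter {
    // Non-greedy match from "/*" to the nearest "*/"; DOTALL lets '.' span newlines.
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Counts complete block comments; an unterminated "/*" is not counted.
    static int countComments(String source) {
        Matcher matcher = BLOCK_COMMENT.matcher(source);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countComments("/*fgdfgfgf *//* fgdfgfgf*//* fgdfgfgf */"));
    }
}
```

Because the match is non-greedy, adjacent comments with no space between them are found one at a time instead of being merged into a single match.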

In my opinion, single or multiple comments can be arranged as follows:

COMMENT=/* .* */
A = COMMENT;
B = CODE;
C = AB/BA/ABA/BAB/AAB/BAA/A;

OTHER TIPS

Using the StringUtils class from Apache Commons Lang, you can achieve this very easily:

    String str = "Your String";
    // Every closed comment ends with "*/", so counting that marker is enough;
    // no check for "/*" is needed.
    int x = StringUtils.countMatches(str, "*/");
    System.out.println(x);
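If pulling in Commons Lang is not an option, roughly the same count can be had with String.indexOf alone. This is only a sketch of the idea, not the library's implementation; DelimiterCounter and countOccurrences() are hypothetical names.

```java
class DelimiterCounter {
    // Counts non-overlapping occurrences of needle in text, scanning left to right.
    static int countOccurrences(String text, String needle) {
        if (text.isEmpty() || needle.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(needle, from)) != -1) {
            count++;
            from += needle.length(); // continue after the match just found
        }
        return count;
    }

    public static void main(String[] args) {
        String str = "/*fgdfgfgf *//* fgdfgfgf*//* fgdfgfgf */";
        System.out.println(countOccurrences(str, "*/"));
    }
}
```

Like the countMatches version, this counts closing markers only, so it assumes every "*/" in the input really ends a comment.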

This reminds me of flip-flops in Ruby/Perl/Awk et al. There is no need to use a StringTokenizer. You just need to keep states to count the number of lines with comments.

  1. You are inside a comment block. You start printing or collecting all the characters. As soon as you encounter a complete */, you toggle the comment-block switch and move to state 2.

  2. You are outside a comment block. You reject everything until you encounter a /*, which takes you back to state 1.

Something like this (the snippet uses Reader, FileReader, File, IOException, and FileNotFoundException from java.io):

public static int countLines(Reader reader) throws IOException {
    int commentLines = 0;

    boolean inComments = false;
    StringBuilder line = new StringBuilder();

    // Read one character at a time, remembering the previous one to spot "/*" and "*/"
    for (int ch = -1, prev = -1; ((ch = reader.read())) != -1; prev = ch) {
        if (inComments) {
            if (prev == '*' && ch == '/') { //flip state
                inComments = false;
            }

            if (ch != '\n' && ch != '\r') {
                line.append((char)ch);
            } 

            if (!inComments || ch == '\n' || ch == '\r') {
                String actualLine = line.toString().trim();
                //ignore lines which only have '*/' in them
                commentLines += actualLine.length() > 0 && !"*/".equals(actualLine) ? 1 : 0;
                line = new StringBuilder();
            }
        } else {
            if (prev == '/' && ch == '*') { //flip state
                inComments = true;
            }
        }
    }

    return commentLines;
}

public static void main(String[] args) throws FileNotFoundException, IOException {
    System.out.println(countLines(new FileReader(new File("/tmp/b"))));
}

The program above ignores empty comment lines and lines that contain only /* or */. It also has to ignore a nested /* inside a comment, which a StringTokenizer-based approach may fail to do.

Example file /tmp/b

#include <stdio.h>
int main()
{
    /* /******* wrong!  The variable declaration must appear first */
    printf( "Declare x next" );
    int x;

    return 0;
}

For this file, the program returns 1.
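The program above counts comment lines, while the original question asked for the number of comments. The same two-state idea can be adapted for that; the following is a minimal sketch working on a String rather than a Reader. BlockCommentCounter and countComments() are hypothetical names, and the sketch assumes only /* ... */ comments matter (no // comments, no string literals containing comment markers).

```java
class BlockCommentCounter {
    // Walks the text pairwise and flips state on "/*" and "*/".
    static int countComments(String source) {
        int count = 0;
        boolean inComment = false;
        int i = 0;
        while (i < source.length() - 1) {
            char current = source.charAt(i);
            char next = source.charAt(i + 1);
            if (!inComment && current == '/' && next == '*') {
                inComment = true;
                i += 2; // skip the opener so "/*/" is not read as a closer
            } else if (inComment && current == '*' && next == '/') {
                inComment = false;
                count++; // one complete comment has been closed
                i += 2;
            } else {
                i++; // a "/*" seen while already inside a comment is ignored
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countComments("/*fgdfgfgf *//* fgdfgfgf*//* fgdfgfgf */"));
    }
}
```

Since the scan works character by character, it does not depend on whitespace between comments, which is the case the tokenizer-based code in the question gets wrong.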

Licensed under: CC-BY-SA with attribution