Question

Is there an easy way to remove System.out.println statements only from the for loop blocks in a file?

Before:

for( ... ) {
...
System.out.println("string");
...
System.out.println("string");
...
}

After:

for( ... ) {
...
... 
...
}

Solution

This is tricky: which closing brace closes the for loop? Either you parse the whole code, or you use some heuristic. In the solution below, I require the indentation of the closing brace to be the same as the indentation of the for keyword:

$ perl -nE'
    if( /^(\s*)for\b/ .. /^$ws\}/ ) {
      $ws = $1 // $ws;
      /^\s*System\.out\.println/ or print;
    } else { print }'

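This uses the flip-flop operator COND1 .. COND2, which becomes true when COND1 matches and stays true up to and including the line where COND2 matches. The script can be used as a simple filter: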

$ perl -nE'...' <source >processed

or with backup functionality:

$ perl -i.bak -nE'...' source

(creates file source.bak as backup).

Only tested against the example input; not against a sensible test suite.
This script passes the GLES Prateek Nina test.

To run this script on all Java files in a directory, do

$ perl -i.bak -nE'...' *.java
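
For readers less familiar with Perl's flip-flop operator, the indentation heuristic can be sketched in Python, the language used elsewhere on this page. This is only an approximation of the one-liner, not a port of it; the function name strip_printlns is made up for illustration.

```python
import re

FOR_RE = re.compile(r"^(\s*)for\b")
PRINTLN_RE = re.compile(r"^\s*System\.out\.println")


def strip_printlns(lines):
    """Drop println lines between a `for` line and the closing brace
    that has the same indentation as that `for` keyword."""
    result = []
    indent = None  # indentation of the enclosing `for`, or None when outside
    for line in lines:
        if indent is None:
            match = FOR_RE.match(line)
            if match:
                indent = match.group(1)  # remember how far the `for` is indented
            result.append(line)
            continue
        if PRINTLN_RE.match(line):
            continue  # inside a for block: skip the println line
        result.append(line)
        if line.startswith(indent + "}"):
            indent = None  # a brace at the for's indentation ends the block
    return result
```

As with the Perl version, a nested for loop does not change the remembered indentation, so the block only ends at the brace that lines up with the outermost for.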

Edit

On Windows systems, the quote character around the script has to be changed from ' to ". Also, the Windows command prompt does not expand wildcards, so we have to do all globbing ourselves:

> perl -nE"if(/^(\s*)for\b/../^$ws\}/){$ws=$1//$ws;/^\s*System\.out\.println/ or print}else{print}BEGIN{@ARGV=$#ARGV?@ARGV:glob$ARGV[0]}" *.java
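
The BEGIN block in that one-liner expands the wildcard inside the script, because the shell has not done so. The same idea, sketched in Python with the standard glob module (expand_args is a hypothetical helper name):

```python
import glob


def expand_args(args):
    """Expand each argument as a glob pattern.

    An argument that matches nothing is kept as-is, so a plain file name
    that does not exist still produces a sensible error later on.
    """
    files = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        files.extend(matches if matches else [arg])
    return files
```

Unlike the one-liner, which only globs when exactly one argument is given, this sketch expands every argument.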

Edit 2

Here is an implementation of the brace-counting algorithm I outlined in the comments. This solution does backups as well. The command line arguments will be interpreted as glob expressions.

#!/usr/bin/perl
use strict; use warnings;

clean($_) for map glob($_), @ARGV;

sub clean {
    local @ARGV = @_;
    local $^I = ".bak";
    my $depth = 0;
    while (<>) {
        $depth ||= /^\s*for\b/ ? "0 but true" : 0;
        my $delta = ( ()= /\{/g ) - ( ()= /\}/g );
        $depth += $delta if $depth && $delta;
        $depth = 0 if $depth < 0;
        print unless $depth && /^\s*System\.out\.println/;
    }
    return !!1;
}

This doesn't handle comments either (braces inside a comment are counted like any other). It will only recognize System.out.println statements that start a new line.

Example usage: > perl thisScript.pl *.java.
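
The brace-counting logic may be easier to follow without Perl's "0 but true" trick. A rough Python sketch of the same idea follows; the names are invented for illustration, and it shares the script's limitations (braces in comments and string literals are counted too).

```python
import re

FOR_RE = re.compile(r"^\s*for\b")
PRINTLN_RE = re.compile(r"^\s*System\.out\.println")


def strip_printlns_by_depth(lines):
    """Drop println lines from a `for` line until its braces balance again."""
    result = []
    in_for = False  # True from a `for` line until the brace count returns to zero
    depth = 0       # unclosed braces seen since that `for` line
    for line in lines:
        if not in_for and FOR_RE.match(line):
            in_for, depth = True, 0
        if in_for:
            delta = line.count("{") - line.count("}")
            depth += delta
            # Only leave the loop on a line that actually changed the depth,
            # so a `for (...)` whose brace is on the next line stays open.
            if delta and depth <= 0:
                in_for, depth = False, 0
        if in_for and PRINTLN_RE.match(line):
            continue  # inside a for block: skip the println line
        result.append(line)
    return result
```

Because an if block inside the loop raises and lowers the depth symmetrically, its closing brace does not end the for block early.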

Here is a test file with pseudo-java syntax that I used for testing. All lines marked with XXX will be gone once the script has run.

/** Java test suite **/

bare block {
    System.out.println(...); // 1 -- let stand
}

if (true) {
    for (foo in bar) {
        System.out.println; // 2 XXX
        if (x == y) {
            // plz kill this
            System.out.println // 3 XXX
        } // don't exit here
        System.out.println // 4 XXX
    }
}

for (...) {
    for {
        // will this be removed?
        System.out.println // 5 XXX
    }
}

/* pathological cases */

// indentation
for (...) { System.out.println()/* 6 */} 

// indentation 2
for (...)
{
    if (x)
    {
        System.out.println // 7 XXX
    }}

// inline weirdness
for (...) {
    // "confuse" script here
    foo = new baz() {void qux () {...}
    };
    System.out.println // 8 XXX
}

Statement № 1 should stay, and does. Statement № 6 should be removed, but these scripts are incapable of doing so.

OTHER TIPS

I'd suggest a two-step approach: use the static code analyser PMD to locate the problem statements, then a simple script to remove the lines. All source and configuration is included below, with Python and Groovy versions of the removal script.

PMD has an extension mechanism that allows new rules to be added as XPath expressions over the Java syntax tree. In my implementation below, I use:

        //WhileStatement/Statement/descendant-or-self::
            Statement[./StatementExpression/PrimaryExpression/PrimaryPrefix/Name[@Image="System.out.println"]]
        |
        //ForStatement/Statement/descendant-or-self::
            Statement[./StatementExpression/PrimaryExpression/PrimaryPrefix/Name[@Image="System.out.println"]]

The benefits of using this approach are:

  • No regular expressions
  • Graphical editor to develop and refine rules - you could fine-tune the rules I have given to deal with any other scenarios you have
  • Handles all weird formatting of the Java source - PMD uses the JavaCC compiler to understand the structure of all valid Java files
  • Handles where the System.out.println is in a conditional within the loop - to any depth
  • Handles where statements are split over multiple lines.
  • Can be used within Eclipse and IntelliJ

Instructions

  1. Create a PMD rule in a custom ruleset.

    • Create a directory rulesets/java somewhere on the CLASSPATH - it could be under the working directory if "." is on the path.
    • In this directory, create a ruleset XML file called custom.xml containing:

      <?xml version="1.0"?>
      <ruleset name="Custom"
          xmlns="http://pmd.sourceforge.net/ruleset/2.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://pmd.sourceforge.net/ruleset/2.0.0 http://pmd.sourceforge.net/ruleset_2_0_0.xsd">
      
          <description>Detecting System.out.println's</description>
          <rule name="LoopedSystemOutPrintlns"
                message="System.out.println() statements in a for or while loop"
                language="java"
                class="net.sourceforge.pmd.lang.rule.XPathRule">
            <description>
               Find System.out.println() statements in for or while loops.
            </description>
            <priority>1</priority>
            <properties>
              <property name="xpath">
              <value>
                <![CDATA[
      //WhileStatement/Statement/descendant-or-self::
          Statement[./StatementExpression/PrimaryExpression/PrimaryPrefix/Name[@Image="System.out.println"]]
      |
      //ForStatement/Statement/descendant-or-self::
          Statement[./StatementExpression/PrimaryExpression/PrimaryPrefix/Name[@Image="System.out.println"]]
                ]]>
                </value>
              </property>
            </properties>
          </rule>
      </ruleset>
      
    • Create a rulesets.properties file containing the following line:

      rulesets.filenames=rulesets/java/custom.xml
      
    • Great! PMD has now been configured with your new rule, which identifies all occurrences of System.out.println inside any for or while loop anywhere in your code. Your ruleset is now called 'java-custom' because it's 'custom.xml' in the directory 'java'.

  2. Run PMD on your codebase selecting just your ruleset, java-custom. Use the XML report to get both the starting and ending lines. Capture the result in the file "violations.xml":

    $ pmd -d <SOURCEDIR> -f xml -r java-custom > violations.xml
    

    Produces a file similar to:

    <?xml version="1.0" encoding="UTF-8"?>
    <pmd version="5.0.1" timestamp="2013-01-28T11:22:25.688">
    <file name="SOURCEDIR/Example.java">
    <violation beginline="7" endline="11" begincolumn="13" endcolumn="39" rule="LoopedSystemOutPrintlns" ruleset="Custom" class="Example" method="bar" priority="1">
    System.out.println() statements in a for or while loop
    </violation>
    <violation beginline="15" endline="15" begincolumn="13" endcolumn="38" rule="LoopedSystemOutPrintlns" ruleset="Custom" class="Example" method="bar" priority="1">
    System.out.println() statements in a for or while loop
    </violation>
    <violation beginline="18" endline="18" begincolumn="13" endcolumn="38" rule="LoopedSystemOutPrintlns" ruleset="Custom" class="Example" method="bar" priority="1">
    System.out.println() statements in a for or while loop
    </violation>
    <violation beginline="20" endline="21" begincolumn="17" endcolumn="39" rule="LoopedSystemOutPrintlns" ruleset="Custom" class="Example" method="bar" priority="1">
    System.out.println() statements in a for or while loop
    </violation>
    </file>
    </pmd>
    

    You can use this report to check that PMD has identified the correct statements.
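
    One way to do that check is to list the flagged line ranges per file before deleting anything. The following is a minimal sketch, assuming the report has no XML namespace (as in the sample above; a namespaced report would need qualified tag names). The function name violation_ranges is made up for illustration.

```python
import xml.etree.ElementTree as ET


def violation_ranges(root):
    """Map each file name in a parsed PMD report to its sorted (beginline, endline) pairs."""
    ranges = {}
    for file_el in root.findall("file"):
        spans = [
            (int(v.get("beginline")), int(v.get("endline")))
            for v in file_el.findall("violation")
        ]
        ranges[file_el.get("name")] = sorted(spans)
    return ranges


# Possible usage against the report produced in step 2:
# for name, spans in violation_ranges(ET.parse("violations.xml").getroot()).items():
#     print(name, spans)
```

    Comparing these ranges against the source by eye is cheap insurance, since the removal step works purely on line numbers.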

  3. Create a Python script (NOTE: a Groovy alternative is given at the bottom of the Answer) to read in the violations XML file and process the source files

    • Create a file called remover.py in your working directory
    • Add the following Python to it:

      from xml.etree.ElementTree import ElementTree
      from os import rename, path
      from sys import argv
      
      def clean_file(source, target, violations):
          """Read file from source outputting all lines, *except* those in the set
          violations, to the file target"""
          # The with statement closes both files even if an error occurs
          with open(source, 'r') as infile, open(target, 'w') as outfile:
              for num, line in enumerate(infile, start=1):
                  if num not in violations:
                      outfile.write(line)
      
      
      def clean_all(pmd_xml):
          """Read a PMD violations XML file; for each file identified, remove all 
          lines that are marked as violations"""
          tree = ElementTree()
          tree.parse(pmd_xml)
          for file in tree.findall("file"):
              # Create a list of lists. Each inner list identifies all the lines
              # in a single violation.
              violations = [ range(int(violation.attrib['beginline']), int(violation.attrib['endline'])+1) for violation in file.findall("violation")]
              # Flatten the list of lists into a set of line numbers
              violations = set( i for j in violations for i in j )
      
              if violations:
                  name = file.attrib['name']
                  bak  = name + ".bak"
                  rename(name, bak)
                  clean_file(bak, name, violations)
      
      if __name__ == "__main__":
          if len(argv) != 2 or not path.exists(argv[1]):
              exit(argv[0] + " <PMD violations XML file>")
          clean_all(argv[1])
      
  4. Run the Python script. This will rename matching files by adding a ".bak", then rewrite the Java file without the offending lines. This may be destructive so ensure your files are properly backed up first. In particular, don't run the script twice in a row - the second time round will naively remove the same line numbers, even though they have already been removed:

    $ python remover.py violations.xml
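
    A second run is risky for another reason as well: on POSIX systems os.rename silently replaces an existing file, so the original .bak backup would be overwritten by the already-cleaned source. One possible guard, sketched here with a hypothetical helper name, is to check for leftover backups before doing anything:

```python
import os


def files_already_cleaned(names, suffix=".bak"):
    """Return the names whose backup file already exists, which suggests an earlier run."""
    return [name for name in names if os.path.exists(name + suffix)]


# Possible usage before calling clean_all (names taken from the PMD report):
# leftovers = files_already_cleaned(["SOURCEDIR/Example.java"])
# if leftovers:
#     raise SystemExit("Backups already exist for: " + ", ".join(leftovers))
```

    Regenerating violations.xml after each run avoids the stale line numbers; the guard only protects the backups.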
    

EDIT

For those who prefer a more Java-oriented script to remove System.out.println statements from the violations.xml, I present the following Groovy:

    def clean_file(source, target, violations) {
        new File(target).withWriter { out ->
            new File(source).withReader { reader ->
                def i = 0
                while (true) {
                    def line = reader.readLine()
                    if (line == null) {
                        break
                    }  else {
                        i++
                        if(!(i in violations)) {
                            out.println(line)
                        }
                    }
                }
            }
        }
    }

    def linesToRemove(file_element) {
        Set lines = new TreeSet()
        for (violation in file_element.children()) {
            def i = Integer.parseInt(violation.@beginline.text())
            def j = Integer.parseInt(violation.@endline.text())
            lines.addAll(i..j)
        }
        return lines
    }

    def clean_all(file_name) {
        def tree = new XmlSlurper().parse(file_name)
        for (file in tree.children()) {
            def violations = linesToRemove(file)
            if (violations.size() > 0) {
                def origin = file.@name.text()
                def backup = origin + ".bak"
                new File(origin).renameTo(new File(backup))
                clean_file(backup, origin, violations)
            }
        }
    }

    clean_all("violations.xml")

As a general observation, the System.out.println calls themselves are not necessarily the problem. If your statements are of the form "Calling method on " + obj1 + " with param " + obj2 + " -> " + (obj1.myMethod(obj2)), the real cost may be both the string concatenation (better done with StringBuffer/StringBuilder) and the method call.

Edit:

1. nested for loops corrected

2. .java files now fetched recursively

Note:

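By default the cleaned output is written to a separate .txt file next to each source file. When you are confident with the code, replace this line in the final loop of application.pl: open( hanw , "+>".$file.".txt" );

with this line, so that each source file is overwritten in place: open( hanw , "+>".$file );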

application.pl

use strict;
use File::Find qw( finddepth );
our $root = "src/";
our $file_data = {};
our @java_files;

finddepth( sub {
  if( $_ eq '.' || $_ eq '..' ) {
    return;
  } else {
    if( /\.java$/i ) {
      push( @java_files , $File::Find::name );
    }
  }
} , $root );

sub clean {
  my $file = shift;
  open( hanr , $file );
  my @input_lines = <hanr>;
  my $inside_for = 0;

  foreach( @input_lines ) {
    if( $_ =~ /(\s){0,}for(\s){0,}\((.*)\)(\s){0,}\{(\s){0,}/ ) {
      $inside_for++;
      push( @{$file_data->{$file}} , $_ );
    } elsif( $inside_for > 0 ) {
        if( $_ =~ /(\s){0,}System\.out\.println\(.*/ ) {
        } elsif( $_ =~ /(\s){0,}\}(\s){0,}/ ) {
          $inside_for--;
          push( @{$file_data->{$file}} , $_ );
        } else {
          push( @{$file_data->{$file}} , $_ );
        }
    } else {
      push( @{$file_data->{$file}} , $_ );
    }
  }
}

foreach ( @java_files ) {
  $file_data->{$_} = [];
  clean( $_ );
}

foreach my $file ( keys %$file_data ) {
  open( hanw , "+>".$file.".txt" );
  foreach( @{$file_data->{$file}} ) {
    print hanw $_;
  }
}
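
For comparison with the Python used elsewhere on this page, the recursive search that application.pl performs with File::Find can be sketched with os.walk. This is an illustrative equivalent, not part of the script; find_java_files is a made-up name.

```python
import os


def find_java_files(root):
    """Return the paths of all files under root whose name ends in .java (case-insensitive)."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".java"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


# Possible usage, mirroring $root = "src/" in application.pl:
# for path in find_java_files("src"):
#     print(path)
```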

data1.java

class Employee {
  /* code */
  public void Employee() {
    System.out.println("string");
    for( ... ) {
      System.out.println("string");
      /* code */
      System.out.println("string");
      for( ... ) {
        System.out.println("string");
        /* code */
        System.out.println("string");
      }
    }
  }
}

for( ... ) {
  System.out.println("string");
  /* code */
  System.out.println("string");
}

data2.java

for( ... ) {
  /* code */
  System.out.println("string");
  /* code */
  System.out.println("string");
  /* code */
  for( ... ) {
    System.out.println("string");
    /* code */
    System.out.println("string");
    for( ... ) {
      System.out.println("string");
      /* code */
      System.out.println("string");
    }
  }
}

public void display() {
  /* code */
  System.out.println("string");
  for( ... ) {
    System.out.println("string");
    /* code */
    System.out.println("string");
    for( ... ) {
      System.out.println("string");
      /* code */
      System.out.println("string");
    }
  }
}

data1.java.txt

class Employee {
  /* code */
  public void Employee() {
    System.out.println("string");
    for( ... ) {
      /* code */
      for( ... ) {
        /* code */
      }
    }
  }
}

for( ... ) {
  /* code */
}

data2.java.txt

for( ... ) {
  /* code */
  /* code */
  /* code */
  for( ... ) {
    /* code */
    for( ... ) {
      /* code */
    }
  }
}

public void display() {
  /* code */
  System.out.println("string");
  for( ... ) {
    /* code */
    for( ... ) {
      /* code */
    }
  }
}


So you are looking to parse Java. A quick Google search reveals javaparser, a Java 1.5 parser written in Java.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow