How can I filter a text file with the best performance? (Java)

https://stackoverflow.com/questions/2083750

21-09-2019

Question

I work in a small office, and we have an application that generates a big text file with 14000 lines.

After each run I must filter it by hand, which is really tedious.

I want to write a Java application so that I can handle it as quickly as possible.

Please help me. I wrote an application with Scanner (with some help, of course :) ), but it is not good because it is very slow.

For example, this is my file:

SET CELL:NAME=CELL:0,CELLID=3;
SET LSCID:NAME=LSC:0,NETITYPE=MDCS,T32=5,EACT=FILTER-NOFILTER-MINR-FILTER-NOFILTER,ENSUP=GV2&NCR,MINCELL=6,MSV=PFR,OVLHR=9500,OTHR=80,BVLH=TRUE,CELLID=3,BTLH=TRUE,MSLH=TRUE,EIHO=DISABLED,ENCHO=ENABLED,NARD=NAP_STLP,AMH=ENABLED(3)-ENABLED(6)-ENABLED(9)

and I want this output (filtered):

CELLID :  3
ENSUP  :  GV2&NCR
ENCHO  :  ENABLED
MSLH   :  TRUE
------------------------
Count of CELLID : 2

Which solution is the best and the fastest?

This is my source code:

public static void main(String[] args) throws FileNotFoundException {
    Scanner scanner = new Scanner(new File("i:\\1\\2.txt"));
    scanner.useDelimiter(";|,");
    Pattern words = Pattern.compile("(CELLID=|ENSUP=|ENCHO=)");

    while (scanner.hasNextLine()) {
        String key = scanner.findInLine(words);

        while (key != null) {
            String value = scanner.next();
            if (key.equals("CELLID=")) {
                System.out.print("CELLID:" + value + "\n");
            } else if (key.equals("ENSUP=")) {
                System.out.print("ENSUP:" + value + "\n");
            } else if (key.equals("ENCHO=")) {
                System.out.print("ENCHO:" + value + "\n");
            }
            // continue with else-ifs for other keys
            key = scanner.findInLine(words);
        }
        scanner.nextLine();
    }
}

Thank you very much indeed ...

Solution

Since your code has performance issues, you first need to find the bottleneck. You can profile it with the profiler that comes with your IDE.

However, your code is not computation-heavy but I/O-intensive, both in reading the file and in writing output with System.out.print, so that is where I would suggest improving first, starting with the file I/O.

Replace this line of code

Scanner scanner = new Scanner(new File("i:\\1\\2.txt"));

with these lines of code

File file = new File("i:\\1\\2.txt");
BufferedReader br = new BufferedReader( new FileReader(file)  );
Scanner scanner = new Scanner(br);
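The same buffering idea can be applied to the output side, since many small System.out.print calls are costly too. The following is only a minimal sketch and not part of the original answer: the class name BufferedOutputSketch, the method writePairs, and the 64 KB buffer size are assumptions chosen for illustration.

```java
import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.Writer;

public class BufferedOutputSketch {

    // Hypothetical helper: writes one "KEY:value" line per pair through a
    // buffered writer, so the target sees a few large writes, not many small ones.
    static void writePairs(Writer target, String[][] pairs) {
        // 64 KB is an arbitrary buffer size picked for this sketch
        PrintWriter out = new PrintWriter(new BufferedWriter(target, 64 * 1024));
        for (String[] pair : pairs) {
            out.print(pair[0]);
            out.print(':');
            out.print(pair[1]);
            out.print('\n');
        }
        // Flush once at the end; the target is not closed because the caller owns it
        out.flush();
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"CELLID", "3"}, {"ENSUP", "GV2&NCR"}, {"ENCHO", "ENABLED"}
        };
        writePairs(new OutputStreamWriter(System.out), pairs);
    }
}
```

Whether this helps noticeably depends on how much output is produced; measuring before and after is the only reliable check.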

Let us know if this helps.

Since the previous solution did not help much, I made a few more changes to improve your code. You may have to correct parsing errors, if any. I was able to display the output of parsing 392832 lines in approximately 5 seconds. The original solution takes more than 50 seconds.

Changes are as below:

Use of StringTokenizer instead of Scanner
Use of BufferedReader for reading file
Use of StringBuilder to buffer output

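import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;

public class FileParse {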

    private static final int FLUSH_LIMIT = 1024 * 1024;
    private static StringBuilder outputBuffer = new StringBuilder(
            FLUSH_LIMIT + 1024);
    // Not final: it is incremented in processToken
    private static long countCellId = 0;

    public static void main(String[] args) throws IOException {
        long start = System.currentTimeMillis();
        String fileName = "i:\\1\\2.txt";
        File file = new File(fileName);
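        // try-with-resources closes the reader even if parsing throws
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = br.readLine()) != null) {
                // StringTokenizer treats each character of the second argument
                // as a separate delimiter; it is not a regular expression
                StringTokenizer st = new StringTokenizer(line, ";|, ");
                while (st.hasMoreTokens()) {
                    String token = st.nextToken();
                    processToken(token);
                }
            }
        }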
        flushOutputBuffer();
        System.out.println("----------------------------");
        System.out.println("CELLID Count: " + countCellId);
        long end = System.currentTimeMillis();
        System.out.println("Time: " + (end - start));
    }

    private static void processToken(String token) {
        if (token.startsWith("CELLID=")) {
            String value = getTokenValue(token);
            outputBuffer.append("CELLID:").append(value).append("\n");
            countCellId++;
        } else if (token.startsWith("ENSUP=")) {
            String value = getTokenValue(token);
            outputBuffer.append("ENSUP:").append(value).append("\n");
        } else if (token.startsWith("ENCHO=")) {
            String value = getTokenValue(token);
            outputBuffer.append("ENCHO:").append(value).append("\n");
        }
        if (outputBuffer.length() > FLUSH_LIMIT) {
            flushOutputBuffer();
        }
    }

    private static String getTokenValue(String token) {
        int start = token.indexOf('=') + 1;
        int end = token.length();
        String value = token.substring(start, end);
        return value;
    }

    private static void flushOutputBuffer() {
        System.out.print(outputBuffer);
        outputBuffer.setLength(0); // reuse the buffer instead of reallocating it
    }

}
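The if/else chain above grows with every key you add. As a hedged alternative, the wanted keys can be kept in a set and the label padded to match the aligned output shown in the question. This is a sketch, not the answer's code: the class name KeyFilterSketch, the method filterLine, and the format string are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;

public class KeyFilterSketch {

    // Keys to keep; taken from the desired output in the question
    private static final Set<String> WANTED =
            new HashSet<>(Arrays.asList("CELLID", "ENSUP", "ENCHO", "MSLH"));

    // Hypothetical helper: returns one formatted line per wanted KEY=value token
    static List<String> filterLine(String line) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ";, ");
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            int eq = token.indexOf('=');
            if (eq < 0) {
                continue; // not a KEY=value token, for example "SET"
            }
            String key = token.substring(0, eq);
            if (WANTED.contains(key)) {
                // Left-justify the key in 6 columns so the colons line up
                result.add(String.format("%-6s :  %s", key, token.substring(eq + 1)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "SET LSCID:NAME=LSC:0,ENSUP=GV2&NCR,CELLID=3,MSLH=TRUE,ENCHO=ENABLED";
        for (String out : filterLine(line)) {
            System.out.println(out);
        }
    }
}
```

Note that this matches the whole key exactly, whereas startsWith("CELLID=") matches a prefix; for the sample lines both behave the same. String.format is slower than plain appends, so for very large files it may be worth padding by hand.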

Update on ENSUP and MSLH:

To me it looks like you have switched ENSUP and MSLH in the if statement, as below. Hence you see the "MSLH" value for "ENSUP" and vice versa.

} else if (token.startsWith("MSLH=")) {
    String value = getTokenValue(token);
    outputBuffer.append("ENSUP:").append(value).append("\n");
} else if (token.startsWith("ENSUP=")) {
    String value = getTokenValue(token);
    outputBuffer.append("MSLH:").append(value).append("\n");
}
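One way to make this kind of mix-up harder is to derive both the prefix test and the printed label from the same key, so they cannot drift apart. The sketch below is an assumption-laden illustration, not the asker's or the answer's code: LabelFixSketch and appendIfMatches are hypothetical names.

```java
public class LabelFixSketch {

    // Hypothetical helper: appends "KEY:value\n" when the token starts with
    // "KEY=", using the same key for the test and for the label.
    static boolean appendIfMatches(StringBuilder out, String token, String key) {
        String prefix = key + "=";
        if (!token.startsWith(prefix)) {
            return false;
        }
        out.append(key).append(':')
           .append(token.substring(prefix.length()))
           .append('\n');
        return true;
    }

    public static void main(String[] args) {
        String[] keys = {"MSLH", "ENSUP"};
        StringBuilder out = new StringBuilder();
        for (String token : new String[] {"MSLH=TRUE", "ENSUP=GV2&NCR", "T32=5"}) {
            for (String key : keys) {
                if (appendIfMatches(out, token, key)) {
                    break; // stop at the first key that matches this token
                }
            }
        }
        System.out.print(out);
    }
}
```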

OTHER TIPS

Simple text filtering is probably easier to write in Perl (my choice because I've been using it for years) or Python (what I recommend to new people because it's a more modern language).

Several solutions to a similar problem using Java Scanner or StreamTokenizer were recently discussed here.
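The linked discussion is not reproduced here, so the following is only a rough sketch of how java.io.StreamTokenizer might be configured for this file format, not a copy of any of those solutions. The class name StreamTokenizerSketch and the method cellIds are hypothetical, and treating only whitespace, ',' and ';' as separators is an assumption based on the sample lines in the question.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class StreamTokenizerSketch {

    // Hypothetical helper: collects the value of every CELLID=... token
    static List<String> cellIds(Reader reader) {
        StreamTokenizer st = new StreamTokenizer(reader);
        st.resetSyntax();              // drop the default number, quote and comment rules
        st.wordChars('!', 255);        // every printable character is part of a word...
        st.whitespaceChars(0, ' ');    // ...except control characters and space,
        st.whitespaceChars(',', ',');  // the comma,
        st.whitespaceChars(';', ';');  // and the semicolon, which separate tokens
        List<String> values = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD && st.sval.startsWith("CELLID=")) {
                    values.add(st.sval.substring("CELLID=".length()));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        String sample = "SET CELL:NAME=CELL:0,CELLID=3;\n"
                + "SET LSCID:NAME=LSC:0,T32=5,CELLID=3,BTLH=TRUE";
        List<String> ids = cellIds(new StringReader(sample));
        System.out.println("Count of CELLID : " + ids.size());
    }
}
```

For a real file, the StringReader would be replaced by a BufferedReader over the file, as in the accepted solution.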

Licensed under: CC-BY-SA with attribution
