Question

I have a .csv file with lines like:

30-11-2013 ;30-11-2013 ;SUMMARY ;0.0 ;200.0 ;2800.0 ;2800.0
31-12-2013 ;31-12-2013 ;SUMMARY ;0.0 ;200.0 ;3000.0 ;3000.0
02-01-2014 ;02-01-2014 ;TRANSF ;0.0 ;300.0 ;3300.0 ;3300.0
02-01-2014 ;02-02-2014 ;TRANSF ;0.0 ;300.0 ;3600.0 ;3600.0
03-01-2014 ;03-01-2014 ;TRANSF ;0.0 ;300.0 ;3900.0 ;3900.0

I have a Scanner reading those lines, and I need to write a while (scanner.hasNext(somePattern)) loop whose condition is true if the next line starts with a date like 30-11-2013.

What should somePattern be?

P.S. Inside the loop the scanner goes through all the lines, so I need to know that the next token starts with a date.


Solution

Pattern is the class that handles regular expressions. In your case you want to check whether the data at the start of a line is two digits, a hyphen, two digits, a hyphen, and four digits. A regex that describes this can look like

^\d{2}-\d{2}-\d{4}

where ^ represents the start of a line, \d represents a single digit, and {x} says how many times the element before it must appear.
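To see what each piece of the regex does, here is a minimal sketch that tries it on a few strings. The class and method names (DateTokenCheck, startsWithDate) are made up for illustration. Note that the regex only checks the shape of the text, so something like 99-99-9999 would also pass.

```java
import java.util.regex.Pattern;

public class DateTokenCheck {
    // Same regex as above, written as a Java string literal (each backslash is doubled).
    static final Pattern DATE = Pattern.compile("^\\d{2}-\\d{2}-\\d{4}");

    // Hypothetical helper: true when the text begins with digits shaped like dd-MM-yyyy.
    static boolean startsWithDate(String text) {
        // Without flags, ^ only matches at the very start of the text.
        return DATE.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(startsWithDate("30-11-2013 ;30-11-2013 ;SUMMARY")); // true
        System.out.println(startsWithDate("x02-01-2014 ;02-02-2014 ;TRANSF")); // false
        System.out.println(startsWithDate("3-1-2014 ;03-01-2014 ;TRANSF"));    // false
    }
}
```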

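So you can try the following Pattern (instances of the Pattern class are created via the compile method, which besides the regex accepts a combination of flags that slightly change the default regex behaviour). Keep in mind that Scanner.hasNext(Pattern) tests the next complete token, which with the default whitespace delimiter is the leading date of each line.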

Pattern pattern = Pattern.compile("^\\d{2}-\\d{2}-\\d{4}", Pattern.MULTILINE);

I added the MULTILINE flag so that ^ matches at the start of each line, not just at the start of the entire input, which is its default meaning.
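The effect of the flag is easy to check outside of Scanner. The sketch below counts how many times the same regex is found in a three-line string, with and without MULTILINE. The class name MultilineCaret and the helper countMatches are hypothetical names used only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineCaret {
    // Hypothetical helper: counts how many times the pattern is found in the text.
    static int countMatches(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "30-11-2013 ;SUMMARY\r\n31-12-2013 ;SUMMARY\r\n02-01-2014 ;TRANSF";
        String regex = "^\\d{2}-\\d{2}-\\d{4}";

        // Default: ^ matches only at the start of the whole input.
        System.out.println(countMatches(Pattern.compile(regex), text)); // 1
        // MULTILINE: ^ also matches right after each line terminator.
        System.out.println(countMatches(Pattern.compile(regex, Pattern.MULTILINE), text)); // 3
    }
}
```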

DEMO

// Requires: import java.util.Scanner; import java.util.regex.Pattern;
String input =
        "30-11-2013 ;30-11-2013 ;SUMMARY ;0.0 ;200.0 ;2800.0 ;2800.0\r\n"
        + "31-12-2013 ;31-12-2013 ;SUMMARY ;0.0 ;200.0 ;3000.0 ;3000.0\r\n"
        + "02-01-2014 ;02-01-2014 ;TRANSF ;0.0 ;300.0 ;3300.0 ;3300.0\r\n"
        // this line starts with 'x', so its first token is not a date
        + "x02-01-2014 ;02-02-2014 ;TRANSF ;0.0 ;300.0 ;3600.0 ;3600.0\r\n"
        + "03-01-2014 ;03-01-2014 ;TRANSF ;0.0 ;300.0 ;3900.0 ;3900.0";
Scanner scanner = new Scanner(input);

Pattern pattern = Pattern.compile("^\\d{2}-\\d{2}-\\d{4}",
        Pattern.MULTILINE);
// hasNext(pattern) tests the next token without consuming it;
// the loop ends at the first line whose first token is not a date
while (scanner.hasNext(pattern)) {
    System.out.println(scanner.nextLine());
}

Output:

30-11-2013 ;30-11-2013 ;SUMMARY ;0.0 ;200.0 ;2800.0 ;2800.0
31-12-2013 ;31-12-2013 ;SUMMARY ;0.0 ;200.0 ;3000.0 ;3000.0
02-01-2014 ;02-01-2014 ;TRANSF ;0.0 ;300.0 ;3300.0 ;3300.0
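Note that the loop above stops at the first line that does not start with a date, so the last line of the input is never reached. If you would rather skip such lines and keep going, one possible variation (not what the question asked for, just a sketch) is to read every line with hasNextLine() and test each one with Matcher.lookingAt(). The names DateLineFilter and dateLines are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class DateLineFilter {
    // No MULTILINE needed here: each line is matched on its own.
    static final Pattern DATE_START = Pattern.compile("^\\d{2}-\\d{2}-\\d{4}");

    // Hypothetical helper: returns only the lines that begin with a dd-MM-yyyy shaped date.
    static List<String> dateLines(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                // lookingAt() matches from the start of the line but not necessarily to its end.
                if (DATE_START.matcher(line).lookingAt()) {
                    result.add(line);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input =
                "30-11-2013 ;30-11-2013 ;SUMMARY ;0.0 ;200.0 ;2800.0 ;2800.0\r\n"
                + "x02-01-2014 ;02-02-2014 ;TRANSF ;0.0 ;300.0 ;3600.0 ;3600.0\r\n"
                + "03-01-2014 ;03-01-2014 ;TRANSF ;0.0 ;300.0 ;3900.0 ;3900.0";
        // Prints the first and third lines; the x-prefixed line is skipped.
        dateLines(input).forEach(System.out::println);
    }
}
```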

OTHER TIPS

If your file is formatted this regularly, all you need is a simple list of character patterns representing each part of the line:

^\d\d-\d\d-\d\d\d\d ;\d\d-\d\d-\d\d\d\d ;[A-Z]+ ;\d+[.]\d+ ;\d+[.]\d+ ;\d+[.]\d+ ;\d+[.]\d+$

Of course, you need to escape the backslashes to put this expression in a Java string.

The documentation for the Pattern class provides a great cheat-sheet of the pattern elements. \d matches a single digit; \d+ matches one or more digits.
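For reference, here is a sketch of a whole-line pattern written as a Java string literal, with every backslash doubled. It assumes the third column is an uppercase word such as SUMMARY or TRANSF, and the names FullLineCheck and isRecord are made up. Because such a pattern spans several whitespace-separated tokens, it is applied here to a complete line (for example one returned by scanner.nextLine()) rather than passed to hasNext(pattern).

```java
import java.util.regex.Pattern;

public class FullLineCheck {
    // Two dates, an uppercase word (assumed), and four decimal numbers, separated by " ;".
    static final Pattern LINE = Pattern.compile(
            "^\\d\\d-\\d\\d-\\d\\d\\d\\d ;\\d\\d-\\d\\d-\\d\\d\\d\\d ;[A-Z]+ ;"
            + "\\d+[.]\\d+ ;\\d+[.]\\d+ ;\\d+[.]\\d+ ;\\d+[.]\\d+$");

    // Hypothetical helper: true when the entire line has the expected layout.
    static boolean isRecord(String line) {
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(isRecord("30-11-2013 ;30-11-2013 ;SUMMARY ;0.0 ;200.0 ;2800.0 ;2800.0")); // true
        System.out.println(isRecord("02-01-2014 ;02-01-2014 ;TRANSF ;0.0 ;300.0 ;3300.0 ;3300.0"));  // true
        System.out.println(isRecord("x02-01-2014 ;02-02-2014 ;TRANSF ;0.0 ;300.0 ;3600.0 ;3600.0")); // false
    }
}
```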

Licensed under: CC-BY-SA with attribution