Question

When parsing a YYYYMMdd date, e.g. 20120405 for 5th April 2012, what is the fastest method?

int year = Integer.parseInt(dateString.substring(0, 4));
int month = Integer.parseInt(dateString.substring(4, 6));
int day = Integer.parseInt(dateString.substring(6));

vs.

int date = Integer.parseInt(dateString);
year = date / 10000;
month = (date % 10000) / 100; 
day = date % 100;

The month uses mod 10000 because date % 10000 leaves MMdd, and dividing that result by 100 gives MM.
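
To make the arithmetic concrete, here is a minimal sketch that works through 20120405 step by step. The class name YyyymmddSplit and the split helper are made up for illustration, and no validation is done:

```java
final class YyyymmddSplit {
    /** Splits an int of the form YYYYMMdd into {year, month, day}. No validation. */
    static int[] split(int date) {
        int year = date / 10000;           // 20120405 / 10000 = 2012
        int month = (date % 10000) / 100;  // 20120405 % 10000 = 405, then 405 / 100 = 4
        int day = date % 100;              // 20120405 % 100 = 5
        return new int[] {year, month, day};
    }

    public static void main(String[] args) {
        int[] parts = split(Integer.parseInt("20120405"));
        System.out.println(parts[0] + "-" + parts[1] + "-" + parts[2]); // prints 2012-4-5
    }
}
```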

The first example performs three substring operations and three integer parses; the second performs a single parse followed by division and modulo arithmetic.

Which is faster? Is there an even faster method?


Solution

As you see below, the performance of the date processing is only relevant when you look at millions of iterations. You should instead choose a solution that is easy to read and maintain.

Although you could use SimpleDateFormat, it is not thread-safe, so sharing one instance between threads should be avoided. The best solution is to use the great Joda-Time classes:

private static final DateTimeFormatter DATE_FORMATTER = new DateTimeFormatterBuilder()
     .appendYear(4,4).appendMonthOfYear(2).appendDayOfMonth(2).toFormatter();
...
Date date = DATE_FORMATTER.parseDateTime(dateOfBirth).toDate();

If we are talking about your math functions, the first thing to point out is that there were bugs in your math code, which I've fixed. That's the problem with doing it by hand. That said, the versions that parse the string only once will be the fastest. A quick test run shows that:

year = Integer.parseInt(dateString.substring(0, 4));
month = Integer.parseInt(dateString.substring(4, 6));
day = Integer.parseInt(dateString.substring(6));

Takes ~800ms while:

int date = Integer.parseInt(dateString);
year = date / 10000;
month = (date % 10000) / 100; 
day = date % 100;
total += year + month + day;

Takes ~400ms.

However, again, you need to take into account that this is after 10 million iterations. This is a perfect example of premature optimization. I'd choose the one that is the most readable and the easiest to maintain. That's why the Joda-Time answer is the best.

OTHER TIPS

SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
Date date = format.parse("20120405");
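
Since a SimpleDateFormat instance should not be shared between threads, one common workaround is to keep one instance per thread. The following is only a sketch of that idea: the class name ThreadSafeDateParsing and the choice to rethrow as IllegalArgumentException are my own assumptions, not something from the answers here.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

final class ThreadSafeDateParsing {
    // One formatter per thread, so no instance is ever used by two threads at once.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat f = new SimpleDateFormat("yyyyMMdd");
                f.setLenient(false); // reject impossible dates such as 20120231
                return f;
            });

    /** Parses a yyyyMMdd string; malformed input is reported as IllegalArgumentException. */
    static Date parse(String s) {
        try {
            return FORMAT.get().parse(s);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyyMMdd date: " + s, e);
        }
    }

    public static void main(String[] args) {
        Calendar c = Calendar.getInstance();
        c.setTime(parse("20120405"));
        // Calendar months are zero-based, hence the + 1.
        System.out.println(c.get(Calendar.YEAR) + " "
                + (c.get(Calendar.MONTH) + 1) + " "
                + c.get(Calendar.DAY_OF_MONTH)); // prints 2012 4 5
    }
}
```

Turning leniency off is optional, but without it SimpleDateFormat silently rolls a date like 20120231 over into March.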

I did a quick benchmark test where both methods were executed 1 million times each. The results clearly show that the modulo method is much faster, as Dilum Ranatunga predicted.

Time t = new Time(); // Time is a simple stopwatch helper class (not shown)
t.startTiming();
for(int i=0;i<1000000;i++) {
    int year = Integer.parseInt(dateString.substring(0, 4));
    int month = Integer.parseInt(dateString.substring(4, 6));
    int day = Integer.parseInt(dateString.substring(6));
}
t.stopTiming();
System.out.println("First method: "+t.getElapsedTime());

Time t2 = new Time();
t2.startTiming();
for(int i=0;i<1000000;i++) {
    int date = Integer.parseInt(dateString);
    int y2 = date / 10000;          // YYYY
    int m2 = (date % 10000) / 100;  // MM
    int d2 = date % 100;            // dd
}
t2.stopTiming();
System.out.println("Second method: "+t2.getElapsedTime());

The results don't lie (in ms).

First method: 129
Second method: 53

The second will certainly be faster, once you change mod to % and add missing semicolons and fix the divisor in the year calculation. That said, I'm finding it hard to picture the application where this is a bottleneck. Just how many times are you parsing YYYYMMdd dates into their components, without any need to validate them?

How about (but it would parse an invalid date without saying anything...):

public static void main(String[] args) throws Exception {
    char zero = '0';
    int yearZero = zero * 1111;
    int monthAndDayZero = zero * 11;
    String s = "20120405";
    int year = s.charAt(0) * 1000 + s.charAt(1) * 100 + s.charAt(2) * 10 + s.charAt(3) - yearZero;
    int month = s.charAt(4) * 10 + s.charAt(5) - monthAndDayZero;
    int day = s.charAt(6) * 10 + s.charAt(7) - monthAndDayZero;
}
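
If the silent acceptance of bad input is a concern, the same character arithmetic can be combined with a few cheap checks. This is a rough sketch rather than a tuned implementation; FastDateParser, parse and daysInMonth are hypothetical names, and I have not benchmarked this version.

```java
final class FastDateParser {
    /** Parses YYYYMMdd into {year, month, day}; throws IllegalArgumentException on bad input. */
    static int[] parse(String s) {
        if (s == null || s.length() != 8) {
            throw new IllegalArgumentException("Expected 8 characters: " + s);
        }
        int value = 0;
        for (int i = 0; i < 8; i++) {
            int digit = s.charAt(i) - '0'; // same char arithmetic as above, one digit at a time
            if (digit < 0 || digit > 9) {
                throw new IllegalArgumentException("Non-digit at index " + i + ": " + s);
            }
            value = value * 10 + digit;
        }
        int year = value / 10000;
        int month = (value / 100) % 100;
        int day = value % 100;
        if (month < 1 || month > 12 || day < 1 || day > daysInMonth(year, month)) {
            throw new IllegalArgumentException("Not a calendar date: " + s);
        }
        return new int[] {year, month, day};
    }

    /** Days in the given month, using the Gregorian leap-year rule. */
    static int daysInMonth(int year, int month) {
        switch (month) {
            case 2:
                return ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0) ? 29 : 28;
            case 4: case 6: case 9: case 11:
                return 30;
            default:
                return 31;
        }
    }
}
```

For example, FastDateParser.parse("20120405") returns {2012, 4, 5}, while "20120231" and "2012040a" are rejected with an exception.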

Doing a quick and dirty benchmark with 100,000 iterations warm up and 10,000,000 timed iterations, I get:

  • 700ms for your first method
  • 350ms for your second method
  • 10ms with my method.
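
For anyone who wants to reproduce this kind of comparison, here is a minimal harness along the lines described: a warm-up pass, then a timed pass, with the results fed into a sink so the JIT cannot discard the work. It is a sketch under my own assumptions (the class and method names are invented), and numbers from hand-rolled loops like this are only rough; a dedicated harness such as JMH gives more trustworthy results.

```java
import java.util.function.ToIntFunction;

final class ParseBenchmark {
    static volatile long sink; // written after each run so the loop body is not dead code

    static int substringSum(String s) {
        return Integer.parseInt(s.substring(0, 4))
                + Integer.parseInt(s.substring(4, 6))
                + Integer.parseInt(s.substring(6));
    }

    static int moduloSum(String s) {
        int date = Integer.parseInt(s);
        return date / 10000 + (date % 10000) / 100 + date % 100;
    }

    static int charAtSum(String s) {
        int year = s.charAt(0) * 1000 + s.charAt(1) * 100 + s.charAt(2) * 10 + s.charAt(3) - '0' * 1111;
        int month = s.charAt(4) * 10 + s.charAt(5) - '0' * 11;
        int day = s.charAt(6) * 10 + s.charAt(7) - '0' * 11;
        return year + month + day;
    }

    /** Runs the parser the given number of times and returns the elapsed nanoseconds. */
    static long timeNanos(ToIntFunction<String> parser, String input, int iterations) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += parser.applyAsInt(input);
        }
        long elapsed = System.nanoTime() - start;
        sink = total;
        return elapsed;
    }

    public static void main(String[] args) {
        String input = "20120405";
        // Warm-up: give the JIT a chance to compile each method before timing.
        timeNanos(ParseBenchmark::substringSum, input, 100_000);
        timeNanos(ParseBenchmark::moduloSum, input, 100_000);
        timeNanos(ParseBenchmark::charAtSum, input, 100_000);

        int n = 10_000_000;
        System.out.println("substring: " + timeNanos(ParseBenchmark::substringSum, input, n) / 1_000_000 + " ms");
        System.out.println("modulo:    " + timeNanos(ParseBenchmark::moduloSum, input, n) / 1_000_000 + " ms");
        System.out.println("charAt:    " + timeNanos(ParseBenchmark::charAtSum, input, n) / 1_000_000 + " ms");
    }
}
```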

I believe the mod method will be faster. By calling the functions you're creating variables and allocating instances, which makes for a heavier solution.

Mod is a standard math operator and is likely very well optimized.

But as Hunter McMillen said: "You should look at the Calendar class API."

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow