Question

We all know the effect that frequently thrown exceptions can have on the performance of our applications, so we should stay away from things like using exceptions for control flow. That said, I must confess that I never cared much about this when coding. I've been working mostly on the Java platform, but lately I have been working on .NET and just found this handy method: public static bool TryParse(string s,out int result), which lets you convert a string to an int without raising an exception. I have been using it ever since. I just wanted to ask you about your preferences regarding the use of public static bool TryParse(string s,out int result) or public static int ToInt32(string value).

And from the Java point of view, I'd just point out that it lacks a similar method, although we could approximate one with something like:

boolean isInteger = Pattern.matches("^\\d*$", myString);

Thanks.


Solution

Yes, Java is missing a similar method, although without out parameters it's actually pretty difficult to express (while wanting to return a primitive). Generally, though, in C# you should use TryParse if you expect the value to not be an integer sometimes, and ToInt32 otherwise; this way the "exceptional" situation is treated as such.
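One way to get a TryParse-like shape in Java without out parameters is to return an OptionalInt. The following is only a minimal sketch: the class and method names are hypothetical, and it still relies on Integer.parseInt throwing internally, so it changes how the call site reads rather than what the parse costs.

```java
import java.util.OptionalInt;

final class TryParseSketch {
    // Hypothetical helper: returns an empty OptionalInt instead of throwing.
    static OptionalInt tryParseInt(String s) {
        if (s == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            // Invalid or out-of-range input is reported as "no value".
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        OptionalInt parsed = tryParseInt("42");
        if (parsed.isPresent()) {
            System.out.println("parsed " + parsed.getAsInt());
        }
        System.out.println(tryParseInt("abc").isPresent());
    }
}
```

The caller checks isPresent() in the same place where C# code would check the bool returned by TryParse.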

In particular, if performance is your main reason for wanting TryParse, the regex matches method you posted is considerably worse. The performance "expense" of exceptions (which is, in reality, quite small) is dwarfed by how much misusing them can obscure the control flow.

OTHER TIPS

I don't know about C#, but in Java exceptions are only expensive when they're actually thrown, but then they're very expensive indeed. If you expect a significant fraction of the strings to be invalid, it's worth your while to validate them first, even if you use a regex.

But don't use String.matches() or Pattern.matches() to apply the regex; those methods recompile the regex every time you call them. Instead, compile the regex ahead of time and save it as a Pattern object, then do your validating with that. In my tests, parsing a list of 10,000 strings of which 20% were invalid, pre-validating with a Pattern is almost twice as fast as using Integer.parseInt() alone and catching the exceptions.
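A minimal sketch of the precompiled approach follows. The class name, the helper names, and the pattern itself are illustrative assumptions, not the code used in the timing test mentioned above.

```java
import java.util.regex.Pattern;

final class PrecompiledValidation {
    // Compiled once and reused; the pattern is an illustrative assumption.
    private static final Pattern INT_LIKE = Pattern.compile("-?\\d{1,10}");

    // Cheap pre-check that rejects most invalid strings without an exception.
    static boolean looksLikeInt(String s) {
        return s != null && INT_LIKE.matcher(s).matches();
    }

    // Sums every input that parses as an int, skipping the rest.
    static long sumValid(String[] inputs) {
        long sum = 0;
        for (String s : inputs) {
            if (!looksLikeInt(s)) {
                continue;
            }
            try {
                sum += Integer.parseInt(s);
            } catch (NumberFormatException e) {
                // Still possible: ten digits can exceed Integer.MAX_VALUE.
            }
        }
        return sum;
    }
}
```

The try/catch stays in place because the pre-check only filters out the common failures; it does not guarantee that parseInt will succeed.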

However, this discussion only applies if you're doing a lot of conversions in a tight loop. If you're only doing them once in a while, like when you accept user input, letting Integer.parseInt() do the validating is fine. And if you do choose to validate with a regex, you'll need a much better regex than ^\d*$ - that regex will match the empty string as well as "numbers" larger than Integer.MAX_VALUE, and it won't match negative numbers at all.

For that purpose in Java you can use the well-known StringUtils class (from commons-lang), which has an isNumeric method.

You can take a look at the code they wrote for that method:

public static boolean isNumeric(String str) {
  if (str == null) {
    return false;
  }
  int sz = str.length();
  for (int i = 0; i < sz; i++) {
    if (!Character.isDigit(str.charAt(i))) {
      return false;
    }
  }
  return true;
}

I am not saying that this is the most efficient way to do it, but it is another alternative that doesn't use a regex. Good luck!
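Note that a digits-only check answers a slightly different question than "will Integer.parseInt succeed?". The sketch below uses a local copy of the loop shown above plus a hypothetical parses helper to compare the two on a few inputs.

```java
final class IsNumericCaveats {
    // Local copy of the loop shown above, so the sketch is self-contained.
    static boolean isNumeric(String str) {
        if (str == null) {
            return false;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isDigit(str.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: true if Integer.parseInt accepts the string.
    static boolean parses(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Empty string, a negative number, and an 11-digit number
        // are the cases where the two checks disagree.
        String[] samples = {"123", "", "-5", "99999999999"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" isNumeric=" + isNumeric(s)
                    + " parses=" + parses(s));
        }
    }
}
```

With this version of the loop, the empty string and over-long digit strings pass the check but fail to parse, while "-5" fails the check but parses fine.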

And from the Java point of view, I'd just point out that it lacks a similar method, although we could approximate one with something like:

boolean isInteger = Pattern.matches("^\\d*$", myString);

Predicting whether Integer.parseInt(myString) would throw an exception takes more work. The string could start with a -. Also, an int cannot have more than 10 significant digits. So a more reliable expression would be ^-?0*\d{1,10}$. But even this expression wouldn't predict every exception, because it still accepts ten-digit values that lie outside the int range.
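To make the remaining imprecision concrete, here is a small sketch that compares the expression against Integer.parseInt. The class and method names are hypothetical; the "+5" case assumes Java 7 or later, where parseInt accepts a leading plus sign.

```java
import java.util.regex.Pattern;

final class RegexVersusParseInt {
    private static final Pattern QUITE_ACCURATE = Pattern.compile("^-?0*\\d{1,10}$");

    static boolean regexAccepts(String s) {
        return QUITE_ACCURATE.matcher(s).matches();
    }

    static boolean parseIntAccepts(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Boundary values and a leading '+' show where the two disagree.
        String[] samples = {"2147483647", "2147483648", "-2147483648", "+5"};
        for (String s : samples) {
            System.out.println(s + " regex=" + regexAccepts(s)
                    + " parseInt=" + parseIntAccepts(s));
        }
    }
}
```

The regex accepts "2147483648" although parseInt rejects it, and it rejects "+5" although parseInt accepts it.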

It is possible to write a fully reliable regular expression, but it would be very long. It's also possible to implement a method that determines precisely whether parseInt would throw an exception. It could look like this:

static boolean wouldParseIntThrowException(String s) {
    if (s == null || s.length() == 0) {
        return true;
    }

    char[] max = Integer.toString(Integer.MAX_VALUE).toCharArray();
    int i = 0, j = 0, len = s.length();
    boolean maybeOutOfBounds = true;

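    char first = s.charAt(0);
    // Integer.parseInt also accepts a leading '+' since Java 7.
    if (first == '-' || first == '+') {
        if (len == 1) {
            return true; // s is just "-" or "+"
        }
        i = 1;
        if (first == '-') {
            max[max.length - 1]++; // 2147483647 -> 2147483648
        }
    }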
    while (i < len && s.charAt(i) == '0') {
        i++;
    }
    if (max.length < len - i) {
        return true; // too long / out of bounds
    } else if (len - i < max.length) {
        maybeOutOfBounds = false;
    }
    while (i < len) {
        char digit = s.charAt(i++);
        if (digit < '0' || '9' < digit) {
            return true;
        } else if (maybeOutOfBounds) {
            char maxdigit = max[j++];
            if (maxdigit < digit) {
                return true; // out of bounds
            } else if (digit < maxdigit) {
                maybeOutOfBounds = false;
            }
        }
    }
    return false;
}

I don't know which version is more efficient, though. Which checks are reasonable depends mostly on the context.

In C#, to check whether a string can be converted you would use TryParse, and if it returns true, the converted value comes back as a byproduct. This is a neat feature, and I don't see a problem with simply reimplementing parseInt to return null instead of throwing an exception.
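A reimplementation along those lines could be sketched as follows. This is a hypothetical, simplified parser (base 10, ASCII digits, optional leading minus sign only), not a drop-in replacement for everything Integer.parseInt accepts.

```java
final class NullableParse {
    // Hypothetical reimplementation: returns null instead of throwing.
    static Integer parseIntOrNull(String s) {
        if (s == null || s.isEmpty()) {
            return null;
        }
        int i = 0;
        boolean negative = false;
        if (s.charAt(0) == '-') {
            if (s.length() == 1) {
                return null; // a lone "-" is not a number
            }
            negative = true;
            i = 1;
        }
        long value = 0; // a long accumulator keeps the overflow check simple
        for (; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return null; // not an ASCII digit
            }
            value = value * 10 + (c - '0');
            if (value > (long) Integer.MAX_VALUE + 1) {
                return null; // already beyond the magnitude of any int
            }
        }
        if (negative) {
            value = -value;
        }
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            return null;
        }
        return (int) value;
    }
}
```

No exception is created on the failure path, which is the point of reimplementing the parse rather than wrapping Integer.parseInt.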

But if you don't want to reimplement the parsing method it could still be nice to have a set of methods at hand that you can use depending on the situation. They could look like this:

private static final Pattern QUITE_ACCURATE_INT_PATTERN = Pattern.compile("^-?0*\\d{1,10}$");

static Integer tryParseIntegerWhichProbablyResultsInOverflow(String s) {
    Integer result = null;
    if (!wouldParseIntThrowException(s)) {
        try {
            result = Integer.parseInt(s);
        } catch (NumberFormatException ignored) {
            // never happens
        }
    }
    return result;
}

static Integer tryParseIntegerWhichIsMostLikelyNotEvenNumeric(String s) {
    Integer result = null;
    if (s != null && s.length() > 0 && QUITE_ACCURATE_INT_PATTERN.matcher(s).find()) {
        try {
            result = Integer.parseInt(s);
        } catch (NumberFormatException ignored) {
            // only happens if the number is too big
        }
    }
    return result;
}

static Integer tryParseInteger(String s) {
    Integer result = null;
    if (s != null && s.length() > 0) {
        try {
            result = Integer.parseInt(s);
        } catch (NumberFormatException ignored) {
            // invalid input: fall through and return null
        }
    }
    return result;
}

static Integer tryParseIntegerWithoutAnyChecks(String s) {
    try {
        return Integer.parseInt(s);
    } catch (NumberFormatException ignored) {
    }
    return null;
}

I just wanted to ask you about your preferences regarding the use of public static bool TryParse(string s,out int result) or public static int ToInt32(string value).

Yes, I use TryParse except where I expect the value to always be valid. I find that it reads cleaner than using the exceptions. Even if I want an exception, I usually want to customize the message or throw my own custom exception; hence, I use TryParse and manually throw an exception.

In both Java and C#, I try to catch the minimum set of exceptions possible. In Java, that can mean catching NullPointerException and NumberFormatException separately around a parsing call (Double.parseDouble(...), for example, throws NullPointerException for a null string, while Integer.parseInt(...) reports null as a NumberFormatException); alternatively, I can catch "Exception" and risk catching something unintended. With TryParse in C#, I don't worry about that at all.
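Which exception a null argument produces depends on the parsing method, so it is worth checking for the one you actually call. A small sketch, with a hypothetical thrownBy helper, that reports the exception type for two standard JDK methods:

```java
final class NullArgumentExceptions {
    // Hypothetical helper: runs the action and returns the simple name
    // of the runtime exception it throws, or "none".
    static String thrownBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(thrownBy(() -> Integer.parseInt(null)));
        System.out.println(thrownBy(() -> Double.parseDouble(null)));
    }
}
```

Integer.parseInt reports a null string as a NumberFormatException, whereas Double.parseDouble throws a NullPointerException, so a single catch clause covers the former but not the latter.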

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow