Question

I have a String that contains formatted currency values such as 45,890.00, with multiple values separated by commas, like 45,890.00,12,345.00,23,765.34,56,908.50.

I want to extract and process all the currency values, but could not figure out the correct regular expression for this. This is what I have tried:

public static void main(String[] args) {
    String currencyValues = "45,890.00,12,345.00,23,765.34,56,908.50";
    String regEx = "\\.[0-9]{2}[,]";
    String[] results = currencyValues.split(regEx);
    //System.out.println(Arrays.toString(results));
    for(String res : results) {
        System.out.println(res);
    }
}

The output of this is:

45,890 //removing the decimals as the reg ex is exclusive
12,345
23,765
56,908.50

Could someone please help me with this one?

Solution

You need a regex "look-behind", (?<=regex), which asserts that the preceding text matches but does not consume it, so the decimals are kept in the split results:

String regEx = "(?<=\\.[0-9]{2}),";

Here's your test case now working:

public static void main(String[] args) {
    String currencyValues = "45,890.00,12,345.00,23,765.34,56,908.50";
    String regEx = "(?<=\\.[0-9]{2}),"; // Using the regex with the look-behind
    String[] results = currencyValues.split(regEx);
    for (String res : results) {
        System.out.println(res);
    }
}

Output:

45,890.00
12,345.00
23,765.34
56,908.50
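Since the question also asks about processing the values, here is a minimal sketch of one way to go from the split strings to numbers. It assumes every token has the form shown above (comma-grouped digits with exactly two decimals); the class name CurrencySplitDemo and the helpers parseAll and total are made up for illustration and are not part of the original answer.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

class CurrencySplitDemo {
    // Split on commas that are preceded by ".dd", then drop the
    // grouping commas so BigDecimal can parse each token.
    static List<BigDecimal> parseAll(String input) {
        List<BigDecimal> amounts = new ArrayList<>();
        for (String token : input.split("(?<=\\.[0-9]{2}),")) {
            amounts.add(new BigDecimal(token.replace(",", "")));
        }
        return amounts;
    }

    // Example of "processing": add up all parsed amounts.
    static BigDecimal total(List<BigDecimal> amounts) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal amount : amounts) {
            sum = sum.add(amount);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<BigDecimal> amounts =
                parseAll("45,890.00,12,345.00,23,765.34,56,908.50");
        System.out.println(amounts);        // [45890.00, 12345.00, 23765.34, 56908.50]
        System.out.println(total(amounts)); // 138908.84
    }
}
```

BigDecimal is used here rather than double so that the two decimal places survive exactly; a token that does not fit the assumed format would make the BigDecimal constructor throw a NumberFormatException.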

OTHER TIPS

You could also use a different regular expression to match the pattern that you're searching for (then it doesn't really matter what the separator is):

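 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 String currencyValues = "45,890.00,12,345.00,23,765.34,56,908.50,55.00,345,432.00";
 // Match the values themselves: an optional "ddd," group, then 1-3 digits, a dot and 2 decimals
 Pattern pattern = Pattern.compile("(\\d{1,3},)?\\d{1,3}\\.\\d{2}");
 Matcher m = pattern.matcher(currencyValues);
 while (m.find()) {
    System.out.println(m.group());
 }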

prints

45,890.00
12,345.00
23,765.34
56,908.50
55.00
345,432.00

Explanation of the regex:

  • \\d matches a digit
  • \\d{1,3} matches 1-3 digits
  • (\\d{1,3},)? optionally matches 1-3 digits followed by a comma.
  • \\. matches a dot
  • \\d{2} matches 2 digits.

However, I would also say that using a comma both as the thousands separator and as the separator between values is probably not the best design and is likely to cause confusion.

EDIT:

As @tobias_k points out: \\d{1,3}(,\\d{3})*\\.\\d{2} would be a better regex, as it would correctly match:

  • 1,000,000,000.00

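and it won't incorrectly match the following as a whole value (although find() would still pick up the trailing 00.00 as a substring):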

  • 1,00.00
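The improved pattern can be tried out with a sketch like the following. The class name GroupedAmountMatcher and the helpers findAmounts and isAmount are hypothetical names chosen for this example; only the regex itself comes from the edit above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupedAmountMatcher {
    // 1-3 leading digits, any number of ",ddd" groups, then ".dd"
    private static final Pattern AMOUNT =
            Pattern.compile("\\d{1,3}(,\\d{3})*\\.\\d{2}");

    // Scan the input and collect every substring that matches the pattern.
    static List<String> findAmounts(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = AMOUNT.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // True only if the whole string is one well-formed amount.
    static boolean isAmount(String candidate) {
        return AMOUNT.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(findAmounts("45,890.00,12,345.00,55.00,1,000,000,000.00"));
        // [45,890.00, 12,345.00, 55.00, 1,000,000,000.00]
        System.out.println(isAmount("1,000,000,000.00")); // true
        System.out.println(isAmount("1,00.00"));          // false
        System.out.println(findAmounts("1,00.00"));       // [00.00]
    }
}
```

The last line shows the difference between matches() and find(): find() looks for the pattern anywhere in the input, so validating a single value is better done with matches().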

All of the above solutions assume that every value in the string is a decimal value with comma grouping. What if the currency value string looks like this:

String str = "1,123.67aed,34,234.000usd,1234euro";

Here not all values are decimals. There should be a way to decide whether each value is a decimal or an integer.
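One possible approach, offered only as a sketch and not taken from any of the answers above, is to capture the fractional part in its own optional group and check whether it took part in the match. This assumes each amount is immediately followed by a currency code made of letters, as in the sample string; MixedCurrencyParser and describe are hypothetical names.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MixedCurrencyParser {
    // group 1: integer part, either comma-grouped ("34,234") or plain ("1234")
    // group 2: optional fractional part, including the dot
    // group 3: currency code (assumed to be letters right after the amount)
    private static final Pattern TOKEN =
            Pattern.compile("(\\d{1,3}(?:,\\d{3})+|\\d+)(\\.\\d+)?([A-Za-z]+)");

    static List<String> describe(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            boolean hasFraction = m.group(2) != null;
            String digits = m.group(1).replace(",", "")
                    + (hasFraction ? m.group(2) : "");
            BigDecimal amount = new BigDecimal(digits);
            String kind = hasFraction ? "decimal" : "integer";
            result.add(amount + " " + m.group(3) + " " + kind);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : describe("1,123.67aed,34,234.000usd,1234euro")) {
            System.out.println(line);
        }
        // 1123.67 aed decimal
        // 34234.000 usd decimal
        // 1234 euro integer
    }
}
```

The currency code is what makes the value boundaries unambiguous here; without such a suffix, a string of integers separated by commas could not be told apart from a single comma-grouped number.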

Licensed under: CC-BY-SA with attribution