Question

I need some assistance constructing a regular expression in a ColdFusion application. I apologize if this has been asked. I have searched, but I may not be asking for the correct thing.

I am using the following to search an email subject line for an issue number:

reMatchNoCase("[0-9]{5}", mailCheck.subject)

The issue number contains only numeric values, and should be exactly 5 digits. This is working except in cases where I have a longer number that appears in the string, such as 34512345. It takes the first 5 digits of that string as a valid issue number as well.

What I want is to retrieve only 5 digit numbers, nothing shorter or longer. I am then placing these into a list to be looped over and processed. Do I perhaps need to include spaces before and after in the regex to get the desired result?

Thank you.

Solution

The general way to prevent content from appearing directly before or after a match is to put a negative lookbehind before the match and a negative lookahead after it. For digits, that would be:

(?<!\d)\d{5}(?!\d)

(Where \d is the shorthand for [0-9])
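To illustrate how that pattern behaves in an engine that does support lookbehind, here is a minimal sketch using Java's java.util.regex. Java is used only because it is easy to run and test. The class and method names (LookaroundDemo, findIssueNumbers) and the sample subjects are made up for the example and are not part of any CF API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class LookaroundDemo {
    // Five digits with no digit immediately before or after them.
    private static final Pattern FIVE_DIGITS =
            Pattern.compile("(?<!\\d)\\d{5}(?!\\d)");

    // Collects every match of the pattern in the given subject line.
    static List<String> findIssueNumbers(String subject) {
        List<String> found = new ArrayList<>();
        Matcher m = FIVE_DIGITS.matcher(subject);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The 8-digit run is skipped entirely; only the standalone number matches.
        System.out.println(findIssueNumbers("Re: issue 12345 and ref 34512345")); // prints [12345]
        // Letters are not digits, so the lookbehind lets this one through.
        System.out.println(findIssueNumbers("abc12345")); // prints [12345]
    }
}
```

Note that abc12345 still yields a match here, since the lookarounds only exclude neighbouring digits.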

CF's regex supports lookaheads but unfortunately not lookbehinds, so that wouldn't work directly in rematch. However, that probably doesn't matter in this case, because you likely don't want, for example, abc12345 to match either. What you more likely want is:

\b\d{5}\b

Where \b is a "word boundary": roughly, it checks for a change between a "word character" and a non-word character (or vice versa), with the start and end of the string also counting as boundaries. So in this case the first \b checks that there is NOT one of [a-zA-Z0-9_] before the first digit, and the second \b checks that there isn't one after the fifth digit. A \b does not add any characters to the match (i.e. it is a zero-width assertion).

Since digits have no case, you don't need the NoCase variant of the function and can simply write:

rematch( '\b\d{5}\b' , mailCheck.subject )

The benefit of this over simply checking for spaces is that the result is exactly the five digits (no need to trim). The downside is that it also matches inside values such as [12345] or 3.14159^2 (where it would return 14159), which is probably not what you want.
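The behaviour described above can be checked with a small sketch. It again uses Java's java.util.regex as a stand-in, on the assumption that \b behaves the same way for plain ASCII subjects as it does in CF's rematch. WordBoundaryDemo and findIssueNumbers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class WordBoundaryDemo {
    // Five digits with a word boundary on each side.
    private static final Pattern FIVE_DIGITS = Pattern.compile("\\b\\d{5}\\b");

    // Collects every match of the pattern in the given subject line.
    static List<String> findIssueNumbers(String subject) {
        List<String> found = new ArrayList<>();
        Matcher m = FIVE_DIGITS.matcher(subject);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findIssueNumbers("Issue 12345 fixed"));         // prints [12345]
        // Longer digit runs and letter-prefixed numbers are both rejected.
        System.out.println(findIssueNumbers("ref 34512345 and abc12345")); // prints []
        // Punctuation counts as a boundary, so these still match.
        System.out.println(findIssueNumbers("[12345] 3.14159^2"));         // prints [12345, 14159]
    }
}
```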

To check for spaces, or the start/end of the string, you can do:

rematch( '(?:^| )\d{5}(?= |$)' , mailCheck.subject )

Then use trim on each result to remove spaces.
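As a rough sketch of the match-then-trim approach, the same pattern can be exercised in Java's java.util.regex (used here only as a testable stand-in for rematch; SpaceDelimitedDemo and findIssueNumbers are hypothetical names). The leading space is part of each match, which is why the trim step is needed. The trailing space is only looked at, not consumed, so two issue numbers separated by a single space are both found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class SpaceDelimitedDemo {
    // Five digits preceded by start-of-string or a space,
    // and followed by a space or end-of-string.
    private static final Pattern FIVE_DIGITS =
            Pattern.compile("(?:^| )\\d{5}(?= |$)");

    // Collects every match and trims the leading space the pattern may capture.
    static List<String> findIssueNumbers(String subject) {
        List<String> found = new ArrayList<>();
        Matcher m = FIVE_DIGITS.matcher(subject);
        while (m.find()) {
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findIssueNumbers("12345 67890"));                // prints [12345, 67890]
        System.out.println(findIssueNumbers("[12345] 3.14159^2 34512345")); // prints []
    }
}
```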

If that's not what you're after, go ahead and provide more details.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow