Question

I'm trying to match Medicare Eligibility EDI Example Responses.

I have a string that looks like this:

LN:SMITHbbbbbbbbFN:SAMANTHAbbBD:19400515PD:1BN:123456PN:9876543210GP:ABCDEFGHIJKLMNOID:123456789012345bbbbbPC:123PH:8005551212CD:123456PB:123ED:20060101TD:2070101LC:NFI:12345678FE:20070101FT:20080101

I need a set of matches that look like this:

Key | Value
-------------------
LN  | SMITHbbbbbbbb
FN  | SAMANTHAbb
BD  | 19400515
... etc

I've been dealing with this all day, and I can't seem to get an acceptable matching scenario. I'm about to program it procedurally with a for loop and finding indexes of colons if I can't figure something out.

I've tried using negative lookahead and I'm not getting anywhere. This is C#, and I'm testing with this tester (.NET), along with The Regex Coach (non-.NET).

I've tried using this:

([\w]{2})\:(?![\w]{2}\:)

But that only matches the keys and their colons, like "LN:", "FN:", etc.

If I use:

([\w]{2})\:(.+?)([\w]{2})\:

It consumes the next matching two character key and colon as well, leading to me only matching every other key/value pair.

Is there a way to match these correctly using regex in .NET, or am I stuck with a more procedural solution? Keep in mind that I can't assume the keys will always be uppercase letters. They could include numbers, but they will always be two characters followed by a colon.

Thanks in advance for any help I can get.


Solution

I think what you want is positive lookahead, not negative, so that you find the key-colon combo ahead of the current position, but you don't consume it. This appears to work for your test example:

([\w]{2})\:(.+?)(?=[\w]{2}\:|$)

Yielding:

LN: SMITHbbbbbbbb
FN: SAMANTHAbb
BD: 19400515
PD: 1
BN: 123456
PN: 9876543210
...

Note: I added the colons in my test output; they aren't captured by the regex.

EDIT: Thanks, Douglas, I've edited the regex to capture end-of-string so the last entry is captured, too.
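
To make the difference concrete, here is a minimal sketch (in JavaScript, since it is easy to run; the same two patterns should be usable with Regex.Matches in .NET) that runs the lookahead pattern and the consuming pattern from the question over a shortened sample. The helper name extractPairs and the shortened sample are my own, for illustration only.

```javascript
// Pattern from this answer: the lookahead checks for the next key without consuming it.
const lookahead = /([\w]{2}):(.+?)(?=[\w]{2}:|$)/g;
// Pattern from the question: the trailing key is part of the match, so it is consumed.
const consuming = /([\w]{2}):(.+?)([\w]{2}):/g;

// Hypothetical helper: collect [key, value] pairs from every match.
function extractPairs(pattern, input) {
  return Array.from(input.matchAll(pattern), (m) => [m[1], m[2]]);
}

// Shortened version of the string in the question.
const sample = "LN:SMITHbbbbbbbbFN:SAMANTHAbbBD:19400515PD:1BN:123456";

console.log(extractPairs(lookahead, sample));
// all five pairs: LN, FN, BD, PD, BN

console.log(extractPairs(consuming, sample));
// only LN and BD: "FN:" and "PD:" were swallowed by the preceding match
```

With the consuming pattern the last pair (BN) is lost as well, because no further key follows it; the |$ alternative in the lookahead is what picks up that final entry.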

OTHER TIPS

This works in JavaScript (I always fire up the Error Console in Firefox to play around with regular expressions) but it should also work fine in .NET:

([^:]{2}):((?:[^:](?!(?:[^:]:)))+)

It uses negative lookahead:

( -> start capturing first token (the label)
    [^:]{2} -> two non-colon characters
) -> end capturing first token
: -> skip the colon
( -> start capturing the second token (the value)
    (?: -> don't capture this group as a token
        [^:](?! -> a non-colon character, not followed by:
                (?: -> don't capture this group
                    [^:]: -> a non-colon, followed by a colon
                ) -> end group
            ) -> end negative lookahead
    )+ -> one or more of this group
) -> end capturing the second token

Test:

"LN:SMITHbbbbbbbbFN:SAMANTHAbbBD:19400515"
    .replace(
        /([^:]{2}):((?:[^:](?!(?:[^:]:)))+)/g,
        "[$1] = [$2]\n")

Yields:

[LN] = [SMITHbbbbbbbb]
[FN] = [SAMANTHAbb]
[BD] = [19400515]
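
If you want the pairs as data rather than a formatted string, a rough sketch along these lines collects them into an object. The helper name toRecord is made up for this example; the pattern is the one above, unchanged.

```javascript
// Same negative-lookahead pattern as above.
const pattern = /([^:]{2}):((?:[^:](?!(?:[^:]:)))+)/g;

// Hypothetical helper: build a plain object mapping each key to its value.
function toRecord(input) {
  const record = {};
  for (const [, key, value] of input.matchAll(pattern)) {
    record[key] = value;
  }
  return record;
}

console.log(toRecord("LN:SMITHbbbbbbbbFN:SAMANTHAbbBD:19400515PD:1"));
// { LN: 'SMITHbbbbbbbb', FN: 'SAMANTHAbb', BD: '19400515', PD: '1' }
```

Note that the one-character value at the end of the string (PD:1) is still picked up: at the end of input the negative lookahead has nothing to reject, so no special end-of-string case is needed.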

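Looking at the link, each field has a fixed length, so you could do something like the following. It only works if every value is padded to its full width; the TD value in the sample string above is seven characters rather than eight, so that string as posted would misalign.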

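using System.Collections.Generic;

// "message" is assumed to hold the raw response string.
int pos = 0;
Dictionary<string, string> parsedResults = new Dictionary<string, string>();

// Each entry is the fixed width of one field's value, in order of appearance.
foreach (int length in new int[] { 13, 10, 8, 1, 6, 10, 15, 20, 3, 10, 6, 3, 8, 8, 1, 8, 8, 8 })
{
    string fieldId = message.Substring(pos, 2);              // two-character key
    string fieldValue = message.Substring(pos + 3, length);  // skip the key and the colon
    parsedResults.Add(fieldId, fieldValue);
    pos += length + 3;                                       // move past key, colon, and value
}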
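
Because a fixed-width parse silently goes wrong if any value is shorter or longer than expected, it may be worth checking that a colon sits where each key should end. Below is a rough sketch of that idea in JavaScript (matching the test earlier in this thread); the widths are copied from the C# loop, while the function name parseFixedWidth and the choice to stop quietly on a truncated message are assumptions of mine.

```javascript
// Field widths copied from the C# loop above.
const WIDTHS = [13, 10, 8, 1, 6, 10, 15, 20, 3, 10, 6, 3, 8, 8, 1, 8, 8, 8];

// Hypothetical helper: fixed-width parse that verifies each key is followed by ':'.
function parseFixedWidth(message, widths) {
  const result = new Map();
  let pos = 0;
  for (const width of widths) {
    if (pos >= message.length) break; // message has fewer fields than widths
    if (message[pos + 2] !== ":") {
      // A missing colon means an earlier value did not have its expected width.
      throw new Error(`expected ':' at index ${pos + 2}`);
    }
    const key = message.slice(pos, pos + 2);
    const value = message.slice(pos + 3, pos + 3 + width);
    result.set(key, value);
    pos += width + 3; // key (2) + colon (1) + value
  }
  return result;
}

const parsed = parseFixedWidth("LN:SMITHbbbbbbbbFN:SAMANTHAbbBD:19400515", WIDTHS);
console.log(parsed.get("FN")); // SAMANTHAbb
```

A string whose first value is not padded, such as "LN:SMITHFN:SAMANTHAbb", makes the check throw at the second field instead of returning shifted values.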
Licensed under: CC-BY-SA with attribution