Question

I am using .NET. I want to match a last name that contains characters other than a-z, A-Z, space, and single quote, and whose length is not between 1 and 40 characters. The string to be matched is XML that looks like this: <FirstName>SomeName</FirstName><LastName>SomeLastName</LastName><Address1>Addre1</Address1>

I wrote a regular expression, but it only matches the valid case, [a-zA-Z'.\s]{1,40}: <LastName>[a-zA-Z'.\s]{1,40}</LastName> (EDIT: the LastName tag was missing.) What I want is the negation of this expression. Is that possible, or should I take a different approach?

Solution

You can have negated character classes. [^abc] matches any character that is NOT a, b, or c. For your case, you might want [^a-zA-Z'.\s]{1,40}
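
As a minimal sketch of the negated class in action, the snippet below uses Python's `re` module purely for illustration (the question targets .NET, whose `Regex` class accepts the same character-class syntax). The helper name `has_disallowed_char` is hypothetical. It searches for a single disallowed character anywhere in the name, which is one reasonable reading of "contains characters other than" the allowed set.

```python
import re

# Negated class: one character that is NOT a letter, apostrophe, dot, or whitespace.
DISALLOWED = re.compile(r"[^a-zA-Z'.\s]")

def has_disallowed_char(name: str) -> bool:
    """Return True if the name contains at least one disallowed character."""
    return DISALLOWED.search(name) is not None

print(has_disallowed_char("O'Brien"))  # False
print(has_disallowed_char("Sm1th"))    # True
```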

Since your data is in XML tags, you will probably want to extract from those first. XML and regular expressions don't always mix well.
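
One way to follow that advice is to parse the XML first and negate the "valid" match in code rather than in the pattern. The sketch below is in Python for illustration only (in .NET the same idea would use an XML parser plus `Regex`); the function name `last_name_is_invalid` and the dummy `<root>` wrapper are assumptions, the wrapper being needed because the sample has several top-level elements.

```python
import re
import xml.etree.ElementTree as ET

# A valid last name per the question: 1-40 letters, apostrophes, dots, or whitespace.
VALID_NAME = re.compile(r"[a-zA-Z'.\s]{1,40}")

def last_name_is_invalid(fragment: str) -> bool:
    """Parse the fragment, then negate the 'valid' match in code."""
    # Wrap the fragment in a dummy root so it parses as one document.
    root = ET.fromstring(f"<root>{fragment}</root>")
    # A missing or empty <LastName> yields "", which fails the {1,40} check.
    last_name = root.findtext("LastName", default="")
    return VALID_NAME.fullmatch(last_name) is None

sample = ("<FirstName>SomeName</FirstName>"
          "<LastName>SomeLastName</LastName>"
          "<Address1>Addre1</Address1>")
print(last_name_is_invalid(sample))  # False
```

Keeping the pattern positive and applying `not` in code tends to be easier to read than a negated regex, and it handles the empty and over-length cases with no extra alternatives.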


If you absolutely must deal with the XML tags in the regex you could try something like this:

<FirstName>([^a-zA-Z'.\s]{1,40})</FirstName><LastName>([^a-zA-Z'.\s]{1,40})</LastName>

Capture group 1 will be the first name, capture group 2 will be the last name.


I misread the original question. If you want to match strings of MORE than 40 characters, the quantifier should be {41,} rather than {1,40}. This ensures you only match strings with more than 40 characters.

OTHER TIPS

You seem to want to know how to negate a pattern match inside the pattern itself, rather than with some "not"-type logic in the host language.

If that's what you really mean, all you need to do is convert your "regex" into "^(?:(?!regex).)*$".

The first is true of any string that contains "regex", and the second is true of any string that does not contain "regex".

If you need to be mindful of multiline input strings, that should be "\A(?:(?!regex)(?s).)*\z" just to be super-careful: \A and \z anchor to the very start and end of the string, and (?s) lets the dot match newlines.
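
The lookahead idiom above can be sketched as follows. This is an illustration in Python's `re` module, not .NET code: Python spells the absolute end-of-string anchor `\Z` where .NET uses `\z`, and the helper name `negate` is hypothetical.

```python
import re

def negate(pattern: str) -> re.Pattern:
    """Build a pattern that matches only strings NOT containing `pattern`."""
    # (?s) lets '.' cross newlines; \A and \Z anchor to the whole string.
    # At every position the lookahead asserts that `pattern` does not start there.
    return re.compile(r"(?s)\A(?:(?!" + pattern + r").)*\Z")

valid_tag = r"<LastName>[a-zA-Z'.\s]{1,40}</LastName>"
no_valid_last_name = negate(valid_tag)

print(bool(no_valid_last_name.match("<LastName>Sm1th</LastName>")))  # True
print(bool(no_valid_last_name.match("<LastName>Smith</LastName>")))  # False
```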

Inside a character class, a leading "^" negates the class. So your expression would read like the following:

[^a-zA-Z'\s]{1,40}

Here is a link to Microsoft's site about negation.

Enjoy

try this pattern

"<LastName>([^a-zA-Z'\s])|(.{41,})</LastName>"

[EDIT] - Removed other stuff. Here's something that worked for all conditions (including empty) in my tests, including having the XML in the tested string.

/^(<LastName><\/LastName>)|(<LastName>.*[^a-zA-Z'\s]+.*<\/LastName>)|(<LastName>(.{41,})<\/LastName>)$/
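
One point worth checking with alternation is that `^` and `$` bind only to the alternative they sit in, so grouping the branches between the tags keeps both tags applied to every branch. The sketch below is one possible regrouping under that assumption, not the tested pattern above. It is written with Python's `re` module for illustration, uses `[^<]` instead of `.` so a match cannot run past the closing tag, keeps the same allowed set (letters, apostrophe, whitespace), and the name `has_invalid_last_name` is hypothetical.

```python
import re

# The three "invalid" branches are grouped so both tags apply to each of them.
INVALID_LAST_NAME = re.compile(
    r"<LastName>(?:"
    r"[^<]*[^a-zA-Z'\s<][^<]*"  # contains at least one disallowed character
    r"|[^<]{41,}"               # longer than 40 characters
    r"|"                        # empty element
    r")</LastName>"
)

def has_invalid_last_name(xml_text: str) -> bool:
    """Return True if the text contains a LastName element that fails the rules."""
    return INVALID_LAST_NAME.search(xml_text) is not None

print(has_invalid_last_name("<LastName>O'Brien</LastName>"))  # False
print(has_invalid_last_name("<LastName>Sm1th</LastName>"))    # True
```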
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow