Question

I can't figure out the correct regex expression for what I am looking for. Essentially what I need is the following.

If a user searches for a street prefix such as N, W, E, or S and includes a wildcard (% or *), the regex should ignore that input. I only want the regex to match N, W, E, or S on their own.

So how do I write the regex to say: if there is a character right next to the letter, ignore it? This is what I have so far.

^(N|S|W|E)\b

But it's picking up N% and other wildcards, which I don't want it to.

Solution

Description

This regex matches only lines that start with a single N, S, E, or W followed either by whitespace and more text or by the end of the line. It relies on the case-insensitive and multiline options being enabled.

^([nsew])\b(?:\s.*?)?$

  • Group 0 will receive the entire matched value
  • Group 1 will receive just the N, S, E, or W

    N Wisconsin Drive
    S Voter Booth
    E Kitten Ave
    W Washington Street
    Noghtington Lane
    Silver Stone Drive
    Edans Expressway
    Wireware Waythrough
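
To see why the original pattern picks up N% while the anchored one does not, here is a minimal Python sketch. The names ORIGINAL, ANCHORED, and prefix_of are made up for illustration, and the flags are assumed to mirror the case-insensitive and multiline options. The word boundary \b succeeds between N and %, so only the trailing $ rejects the wildcard.

```python
import re

# Pattern from the question: \b is satisfied between "N" and "%",
# so "N%" still matches.
ORIGINAL = re.compile(r'^(N|S|W|E)\b')

# Pattern from the answer: after the letter, only whitespace plus more
# text, or the end of the line, is allowed.
ANCHORED = re.compile(r'^([nsew])\b(?:\s.*?)?$', re.IGNORECASE | re.MULTILINE)


def prefix_of(term):
    """Return the street prefix captured in group 1, or None (hypothetical helper)."""
    m = ANCHORED.search(term)
    return m.group(1) if m else None


for term in ["N", "N%", "N*", "N Wisconsin Drive", "Noghtington Lane"]:
    print(repr(term), bool(ORIGINAL.search(term)), prefix_of(term))
```

With this sketch, N% and N* are accepted by the original pattern but rejected by the anchored one, while Noghtington Lane is rejected by both.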

Example

You didn't specify a language, so I picked PHP to demo the regex.

<?php
$sourcestring="N Wisconsin Drive
S Voter Booth
E Kitten Ave
W Washington Street
Noghtington Lane
Silver Stone Drive
Edans Expressway
Wireware Waythrough";

// i = case-insensitive, m = multiline so ^ and $ match at each line break
preg_match_all('/^([nsew])\b(?:\s.*?)?$/im', $sourcestring, $matches);

// $matches[0] holds the full matches, $matches[1] holds capture group 1
print_r($matches);

$matches Array:
(
    [0] => Array
        (
            [0] => N Wisconsin Drive
            [1] => S Voter Booth
            [2] => E Kitten Ave
            [3] => W Washington Street
        )

    [1] => Array
        (
            [0] => N
            [1] => S
            [2] => E
            [3] => W
        )

)

Other suggestions

In that position, ^ anchors the match to the beginning of the string; it only means "not" inside a character class such as [^abc]. You can exclude the wildcards with a negative lookahead.

[NSWE](?!%|\*)
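
A small Python sketch of how this lookahead behaves; NO_WILDCARD and has_plain_prefix are illustrative names, not part of the original suggestion. Note that the pattern is unanchored and has no word boundary, so it rejects N% and N* but still accepts a name that merely starts with one of the letters.

```python
import re

# Match N, S, W, or E only when the next character is not % or *.
NO_WILDCARD = re.compile(r'[NSWE](?!%|\*)')


def has_plain_prefix(term):
    """True if the pattern finds a letter not followed by a wildcard (hypothetical helper)."""
    return NO_WILDCARD.search(term) is not None


for term in ["N%", "N*", "N", "N Wisconsin Drive", "Noghtington Lane"]:
    print(repr(term), has_plain_prefix(term))
```

If names such as Noghtington Lane should be excluded as well, the lookahead would probably need to be combined with ^ and a stricter check on what follows the letter.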

From your question, it sounds like you're allowing the user to enter a search expression and then acting on the search term by passing it directly into a regex function.

If a user enters N% expecting to find an N followed by any number of characters, the regex engine instead treats the % as a literal character and tries to match it. You can correct the user-provided search term by replacing the % with one of the following before using it as a regular expression:

  • .* to greedily match all remaining characters on the line
  • .*? to non-greedily match any number of characters
  • . to match any single character

The same would need to be done if the user entered a *.
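
The replacement described above could be sketched in Python as follows. The function name wildcard_to_regex is hypothetical, and the choice to map both % and * to .* and to escape everything else with re.escape is an assumption about how the search box should behave, not something stated in the question.

```python
import re


def wildcard_to_regex(term: str) -> str:
    """Turn a user search term with % or * wildcards into a regex string.

    Assumption: both wildcards mean "any number of characters"; all other
    characters are escaped so they are matched literally.
    """
    literal_parts = re.split(r'[%*]', term)
    return '.*'.join(re.escape(part) for part in literal_parts)


pattern = wildcard_to_regex("N%")
print(pattern)
print(bool(re.fullmatch(pattern, "N Wisconsin Drive")))
print(bool(re.fullmatch(pattern, "S Voter Booth")))
```

Escaping the literal parts keeps characters such as . or ( in a street name from being interpreted as regex syntax; swapping '.*' for '.*?' or '.' gives the other two behaviours from the list.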

Disclaimer: Depending on the options passed to the regex function, . may or may not match newlines. Depending on your user base, it might be better to simply tell users that the search term needs to adhere to regular expression syntax. This would allow a knowledgeable user to build their own esoteric expressions.
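
As an illustration of the newline caveat, here is how it plays out in Python's re module, where the re.DOTALL flag controls whether . matches a newline. Other regex engines use different option names, so treat this only as one example; the variable names are illustrative.

```python
import re

text = "N Wisconsin Drive\nS Voter Booth"

# By default, . stops at the newline, so .* covers only the first line.
default_match = re.search(r'N.*', text).group(0)

# With re.DOTALL, . also matches the newline and .* runs to the end of the text.
dotall_match = re.search(r'N.*', text, re.DOTALL).group(0)

print(repr(default_match))
print(repr(dotall_match))
```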

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow