Question

I'm matching URLs so I can route requests to controllers/views. A few of the URLs have multiple options, only one of which may be followed by anything further in the URL, and I also need whatever follows to be available as a named group.

Examples:

  • /admin/something #match
  • /admin/something/new #match
  • /admin/something/new/id #fail
  • /admin/something/edit #fail
  • /admin/something/edit/id #match

There are many other possibilities, but that's good enough for an example. Basically, if the URL ends in 'new', nothing can follow, while if it ends in 'edit' it must also have an id to edit (in the examples, 'id' stands for a number of one to five digits).

The regex I've been using so far:

^/admin/something(?:/(?P<action>new|edit(?:/(?P<id>\d{1,5}))))?$

A whitespace-exploded version:

^/admin/something(?:/
    (?P<action>
        new|        # create a new something
        edit(?:/    # edit an old something
                (?P<id>\d{1,5})    # id to edit
            )
        )
    )?    # actions on something are optional
$

But then, if the URL is '/admin/something/edit/id', the 'action' group is 'edit/id'. I've been using a bit of string manipulation in the controller to cut the action down to just the action, but I feel like a positive lookahead would be much cleaner. I just haven't been able to get that to work.
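To make the symptom concrete, here is a minimal sketch that runs the pattern above unchanged. It assumes Python's re module, since the (?P<name>...) syntax is what Python uses; the variable names are only for illustration.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(
    r"^/admin/something(?:/(?P<action>new|edit(?:/(?P<id>\d{1,5}))))?$"
)

m = ORIGINAL.match("/admin/something/edit/42")
# The id group sits inside the action group, so action swallows it too.
print(m.group("action"))  # edit/42
print(m.group("id"))      # 42
```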

The lookahead regex I've been working on (it matches 'new', but not 'edit', with or without an id):

^/admin/something(?:/(?P<action>new|edit(?=(?:/(?P<id>\d{1,5})))))?$

Any tips/suggestions would be much appreciated.

Solution

Your problem lies with the $ at the end. It is a zero-width assertion that the match ends at the end of the line. Your lookahead is also a zero-width assertion (that /id follows edit). It's called a lookahead because it matches the text ahead and then returns to the position where it started, consuming nothing. So the regex fails on ...edit/id because it asserts both that /id follows edit and that edit is at the end of the line, which can't both be true. It fails on ...edit alone because the lookahead asserts that /id follows edit, and nothing does.
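A short sketch of that conflict, assuming Python's re module (the question doesn't name its regex engine, so treat that as a guess). The first two patterns are cut-down fragments for illustration; LOOKAHEAD is the pattern from the question.

```python
import re

# A lookahead succeeds without consuming: the match still ends right after "edit".
peek = re.match(r"/edit(?=/\d{1,5})", "/edit/42")
print(peek.group(0))  # /edit
print(peek.end())     # 5

# Adding $ demands end-of-string at that same position, so nothing can match.
print(re.match(r"/edit(?=/\d{1,5})$", "/edit/42"))  # None

# The full lookahead pattern from the question behaves the same way.
LOOKAHEAD = re.compile(
    r"^/admin/something(?:/(?P<action>new|edit(?=(?:/(?P<id>\d{1,5})))))?$"
)
print(LOOKAHEAD.match("/admin/something/edit/42"))  # None
```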

There are two potential solutions. The first is to simply take out the $. This may not be desirable because then it could match .../edit/id/gobbledygook. The second solution is to use your regex language's method of reusing captured groups. I can't help you there because I don't know what regex you're using. I don't recognize the P<name> syntax for named capturing. You would put whatever you need for that after the <action> group.
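For what it's worth, here is a rough sketch of the first option (just dropping the $), again assuming Python's re module. It shows the named groups filling in as wanted, and also how much else slips through once the end anchor is gone.

```python
import re

# The question's lookahead pattern with the trailing $ removed.
NO_ANCHOR = re.compile(
    r"^/admin/something(?:/(?P<action>new|edit(?=(?:/(?P<id>\d{1,5})))))?"
)

m = NO_ANCHOR.match("/admin/something/edit/42")
print(m.group("action"), m.group("id"))  # edit 42

# Without the anchor, trailing junk is no longer rejected.
print(NO_ANCHOR.match("/admin/something/edit/42/gobbledygook") is not None)  # True
print(NO_ANCHOR.match("/admin/something/new/42") is not None)                # True

# A bare "edit" also "matches": the optional group is skipped, so action is None.
print(NO_ANCHOR.match("/admin/something/edit").group("action"))  # None
```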

OTHER TIPS

^/admin/something
(
    $               |
    /new$           |
    /edit/(\d{1,5})$
)
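The layout above only works if the engine ignores whitespace inside the pattern. As a sketch, assuming Python, that is the re.VERBOSE flag. Two things here are my own adjustments rather than part of the answer: the outer group is made non-capturing so the id is group 1, and the id is taken to be one to five digits as in the question.

```python
import re

ALTERNATION = re.compile(r"""
    ^/admin/something
    (?:
        $                  |   # nothing after "something"
        /new$              |   # "new" must be the last segment
        /edit/(\d{1,5})$       # "edit" must be followed by an id (group 1)
    )
""", re.VERBOSE)

for url in ("/admin/something", "/admin/something/new",
            "/admin/something/new/42", "/admin/something/edit",
            "/admin/something/edit/42"):
    print(url, bool(ALTERNATION.match(url)))
```

Only the third and fourth URLs should print False. Note that this version does not capture the action as a named group, which the question asked for.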

A non-regex way:

$str = "/admin/something";
$s = explode("/",$str);
if ( end($s) == "something" || end($s) == "new" ){
    print "ok\n";
}
if ( strpos($str,"edit" )!==FALSE && is_numeric(end($s)) ){
    print "ok\n";
}
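The same idea in Python, as a sketch only. The function name check_url is made up for illustration, and this version is deliberately stricter than the PHP above: it checks the position of each path segment rather than just looking for "edit" anywhere in the string, and it assumes the id is one to five ASCII digits as in the question.

```python
def check_url(url):
    # "/admin/something/edit/42" -> ["", "admin", "something", "edit", "42"]
    parts = url.split("/")
    if parts[:3] != ["", "admin", "something"]:
        return False
    tail = parts[3:]
    # Nothing after "something", or exactly "new".
    if tail == [] or tail == ["new"]:
        return True
    # Exactly "edit" followed by a 1-5 digit id.
    return (len(tail) == 2 and tail[0] == "edit"
            and tail[1].isascii() and tail[1].isdigit()
            and 1 <= len(tail[1]) <= 5)

print(check_url("/admin/something/edit/42"))  # True
print(check_url("/admin/something/new/42"))   # False
```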

The answer I came to uses parts from both of the above answers to create a regex with lookahead that also stores all the values I want in named groups, without extra clutter such as forward slashes. It matches everything I want it to, and fails everything else. Perfect.

^/admin/something(?:(?:/
                        (?P<action>
                            new$|
                            edit(?=/(?P<id>\d{1,5})$)
                        )
                    )|$)
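For anyone who wants to check it, here is a small sketch that runs this pattern against the examples from the question. It assumes Python's re module with re.VERBOSE so the whitespace is ignored; the helper name route is only for illustration.

```python
import re

FINAL = re.compile(r"""
    ^/admin/something(?:(?:/
        (?P<action>
            new$|
            edit(?=/(?P<id>\d{1,5})$)
        )
    )|$)
""", re.VERBOSE)

def route(url):
    """Return (action, id) for a matching URL, or None if it does not match."""
    m = FINAL.match(url)
    return None if m is None else (m.group("action"), m.group("id"))

print(route("/admin/something"))          # (None, None)
print(route("/admin/something/new"))      # ('new', None)
print(route("/admin/something/new/42"))   # None
print(route("/admin/something/edit"))     # None
print(route("/admin/something/edit/42"))  # ('edit', '42')
```

One thing to be aware of: because the id is only looked ahead at, the overall match ends after "edit", so group 0 is "/admin/something/edit" even though the id group is filled in.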

I wish I could mark more than one as the answer, since they both helped me find the one true path.

Licensed under: CC-BY-SA with attribution