Question

I am trying to parse my URL into different parts using this RegExp:

([\w\\.-]*)

Given the example URL http://www.foo.com/bar/baz, I get these results from preg_match_all():

Array
(
[0] => Array
    (
        [0] => http
        [1] => 
        [2] => 
        [3] => 
        [4] => www.foo.com
        [5] => 
        [6] => bar
        [7] => 
        [8] => baz
        [9] => 
    )

)

It seems that it parses any invalid character into an empty item.
How do I solve this?

Solution

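Because * allows zero repetitions, the pattern also produces an empty match at every position where the next character is not in the class. Use +, which requires at least one character, instead: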

([\w\.-]+)

I assume the extra \ in your RE is because you have it inside a quoted string.
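
To make the difference concrete, here is a minimal sketch that runs both quantifiers against the URL from the question. The variable names ($url, $star, $plus) are just illustrative, and the redundant backslash before the dot is left out since it isn't needed inside a character class.

```php
<?php
$url = 'http://www.foo.com/bar/baz';

// With *, the class may match zero characters, so an empty match is
// reported wherever the next character (":" or "/") is not in the class,
// and once more at the end of the string.
preg_match_all('/[\w.-]*/', $url, $star);

// With +, at least one character is required, so only the non-empty
// runs are reported.
preg_match_all('/[\w.-]+/', $url, $plus);

print_r($plus[0]); // http, www.foo.com, bar, baz
echo count($star[0]), " matches with *\n"; // 10, as in the question
```

If the capture group isn't needed for anything else, it can be dropped as above, because index 0 of the result already holds the full matches.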

OTHER TIPS

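This may do what you want: ([\w.-]+|.) It matches every part of the address: each run of word characters, dots, and hyphens becomes one item, and any other character becomes a one-character item of its own.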
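
A quick sketch of what that alternation returns for the URL in the question, in case the separators matter to you. The names $url and $m are made up for the example.

```php
<?php
$url = 'http://www.foo.com/bar/baz';

// First alternative: a run of word characters, dots, and hyphens.
// Second alternative: any single character that the first one skipped.
preg_match_all('/([\w.-]+|.)/', $url, $m);

print_r($m[1]);
// http, :, /, /, www.foo.com, /, bar, /, baz

// Nothing is dropped, so joining the pieces gives back the original URL.
var_dump(implode('', $m[1]) === $url); // bool(true)
```

Note that . does not match a newline by default, so this only keeps every character for single-line input.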

Are you sure you want \\. ?

In other words, from what you've posted, it looks like you've escaped a backslash instead of the period, as you likely intended. EDIT: For tidiness there is no harm in removing the redundant escaping, but this isn't the actual problem (as pointed out by blixt, thanks).

I highly recommend The Regulator as a regex debugging tool. It is based on .NET regexes, so it isn't ideal for PHP work, but the general point stands: there are tools that let you see the basis on which matching is operating.

I still don't understand what you want with the backslashes in the character class. Can you post the final regex you use in the question, please? And sorry for the distractions this answer has caused!

EDIT: As blixt pointed out, a period inside a character class doesn't act as a metacharacter, contrary to what I suggested.
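
For anyone following the escaping discussion, a small sketch of both points, assuming the pattern lives in a single-quoted PHP string as the accepted answer guesses. The variable names are hypothetical.

```php
<?php
// In a single-quoted PHP string, \\ collapses to one backslash, while \w
// and \. are left alone. Both literals therefore hold the same pattern.
$doubled = '[\w\\.-]';
$single  = '[\w\.-]';
var_dump($doubled === $single); // bool(true)

// Inside a character class the dot is literal, escaped or not.
$dotInClassMatchesLetter = preg_match('/[.]/', 'a'); // 0
$dotInClassMatchesDot    = preg_match('/[.]/', '.'); // 1

// Outside a class the dot is a metacharacter and matches the letter.
$bareDotMatchesLetter = preg_match('/./', 'a');      // 1
```

So the backslash before the dot is harmless but unnecessary, and the empty items come from the * quantifier alone.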

Licensed under: CC-BY-SA with attribution