Question

Consider the following file path:

\\fileserver\share\documents\department\my_project\a_sub_folder\myfile.doc

I need to extract the text "\documents\department\my_project" with a regular expression. Details:

  • Exclude "fileserver" and "share"
  • Limit the match to the first 3 "logical" top-level folders after the share, thereby excluding "\a_sub_folder"
  • Don't include the file name ("myfile.doc")

Using the following regex:

^.*share(?P<folders>\\.+)\\.+

I get this in my "folders" group:

\documents\department\my_project\a_sub_folder

The part that nags me is how to get rid of "\a_sub_folder". I've tried adding repetition operators to the "folders" group, with no effect:

^.*share(?P<folders>\\.+){1,3}\\.+
^.*share(?P<folders>\\.+){1,3}?\\.+

The first of the two above doesn't change the output, while the second one returns an empty "folders" group.

I have a feeling that my regex is fundamentally wrong, but I'm unable to see why. Can anyone please shed some light on this?

thanks :)

/Geir

Solution

How about:

^.*share(?P<folders>(?:\\[^\\]+){1,3})
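
To see why this works where the original did not, here is a minimal Python sketch (Python is assumed because of the `(?P<name>...)` syntax). In the original pattern, `.+` also matches backslashes, so the greedy group runs up to the last backslash in the path. Restricting each folder to `[^\\]+` stops it at the next separator. The helper `top_folders` and the `strict` pattern are hypothetical additions for illustration, not part of the answer above.

```python
import re

# The path from the question, as a raw string so backslashes stay literal.
path = r"\\fileserver\share\documents\department\my_project\a_sub_folder\myfile.doc"

# Original attempt: ".+" also matches backslashes, so the greedy group
# runs up to the last backslash in the path.
original = re.compile(r"^.*share(?P<folders>\\.+)\\.+")

# Suggested pattern: each folder is a backslash followed by one or more
# non-backslash characters, repeated at most three times.
suggested = re.compile(r"^.*share(?P<folders>(?:\\[^\\]+){1,3})")

# Hypothetical variant (not in the answer): the lookahead requires a
# backslash after each folder, so a file name is never captured.
strict = re.compile(r"^.*share(?P<folders>(?:\\[^\\]+(?=\\)){1,3})")


def top_folders(pattern, text):
    """Return the "folders" group, or None if the pattern does not match."""
    match = pattern.match(text)
    return match.group("folders") if match else None


print(top_folders(original, path))   # four folders, as in the question
print(top_folders(suggested, path))  # three folders
print(top_folders(strict, path))     # three folders
```

One thing to watch: if the path has fewer than three folders below the share, the suggested pattern will take the file name as the last "folder". The `strict` variant is one possible way around that, if it matters for your data.
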
Licensed under: CC-BY-SA with attribution