Question

I have recently been learning about the /x modifier from Perl Best Practices, which lets you do useful things like multi-line indentation and inline documentation:

$txt =~ m/^                     # anchor at beginning of line
      The\ quick\ (\w+)\ fox    # fox adjective
      \ (\w+)\ over             # fox action verb
      \ the\ (\w+)\ dog         # dog adjective
      (?:                       # whitespace-trimmed comment:
        \s* \# \s*              #   whitespace and comment token
        (.*?)                   #   captured comment text; non-greedy!
        \s*                     #   any trailing whitespace
      )?                        # this is all optional
      $                         # end of line anchor
     /x;                        # allow whitespace

However, I was unable to do the equivalent for find/replace substitutions. Is there a similar best practice for managing complex substitutions more effectively?

Edit: Take this as an example:

$test =~ s/(src\s*=\s*['"]?)(.*?\.(jpg|gif|png))/${1}something$2/sig;

Is there a similar way that this could be documented using multi-line/whitespace for better readability?

Many thanks

Solution

Since you've chosen not to provide an example of something that doesn't work, I'll offer a few guesses at what you might be doing wrong:

  • Note that the delimiter (in your case /) cannot appear inside any comment in the regex, because Perl treats it as the end of the regex. For example, this:

    s/foo # this is interesting and/or cool
     /bar/x
    

    will not work, because the regex is terminated by the slash between and and or.

  • Note that /x does not apply to the replacement string, only to the regex itself. For example, this:

    s/foo/bar # I love the word bar/x
    

    will replace foo with bar # I love the word bar.

    If you really want to be able to put comments in the replacement-string, then I suppose you could use a replacement-expression instead, using the /e flag. That would let you use the full syntax of Perl. For example:

    s/foo/'bar' # I love the word bar/e
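
Both pitfalls can be reproduced in a few lines. This is only a minimal sketch: the sample strings and variable names are made up for illustration, and the first case switches to brace delimiters (an assumption, not part of the question) so that the comment may contain a slash.

```perl
use strict;
use warnings;

# Pitfall 1: with "/" as the delimiter, a "/" inside an /x comment would end
# the pattern early. With brace delimiters the comment may contain slashes.
my $first = 'foo fighters';
$first =~ s{
    foo   # this is interesting and/or cool
}{bar}x;

# Pitfall 2: /x applies only to the pattern, so without /e the "comment" in
# the replacement is inserted literally.
my $second = 'foo';
$second =~ s/foo/bar # I love the word bar/x;

print "$first\n";    # bar fighters
print "$second\n";   # bar # I love the word bar
```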
    

Here is an example that does work:

$test =~
  s/
    # the regex to replace:
    (src\s*=\s*['"]?)      # src=' or src=" (plus optional whitespace)
    (.*?\.(jpg|gif|png))   # the URI of the JPEG or GIF or PNG image
  /
    # the string to replace it with:
    $1 .                   # src=' or src=" (unchanged)
    'something' .          # insert 'something' at the start of the URI
    $2                     # the original URI
  /sigxe;

OTHER TIPS

If we just add /x, we can easily break up the regular expression portion and add comments to it.

my $test = '<img src = "http://www.somewhere.com/im/alright/jack/keep/your/hands/off/of/my/stack.gif" />';

$test =~ s/
    ( src \s* = \s* ['"]? ) # a src attribute ...
    ( .*? 
      \. (jpg|gif|png)      # to an image file type, either jpeg, gif or png
    )
    /$1something$2/sigx     # put 'something' in front of it
    ;

You have to use the evaluation switch (/e) if you want to break up the replacement, but splitting the match portion across lines works fine with /x alone.

Notice that I did not have to write ${1}: a variable name that starts with a digit can contain only digits, so Perl reads $1something as $1 followed by the text something.
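
The interpolation behaviour can be checked directly. A small sketch, assuming a made-up input string and hypothetical variable names:

```perl
use strict;
use warnings;

my $text = 'src=cat.png';   # hypothetical input

# "$1something" parses as $1 followed by the literal text "something".
(my $bare = $text) =~ s/(src=)(.*)/$1something$2/;

# "${1}something" is the explicit spelling and gives the same result.
(my $braced = $text) =~ s/(src=)(.*)/${1}something$2/;

# Braces do matter for ordinary variables: "$prefixsomething" would name a
# different (undeclared) variable, so the braces are required here.
my $prefix = 'src=';
my $named  = "${prefix}something";

print "$bare\n";     # src=somethingcat.png
print "$braced\n";   # src=somethingcat.png
print "$named\n";    # src=something
```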

For most of my evaluated replacements, I prefer the bracket style of substitution delimiter:

$test =~ s{
      ( src \s* = \s* ['"]? ) # a src attribute ... '
      ( .*? 
        \. (jpg|gif|png)      # to an image file type, either jpeg, gif or png
      )
    }{
        $1 . 'something' . $2
    }sigxe 
    ;

just to make it look more code-like.
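
A side benefit of brace delimiters is that slashes may then appear both in the pattern comments and in the replacement code. The sketch below illustrates this under stated assumptions: the input string and the '/static/' prefix are hypothetical, and the copy-then-substitute idiom is used only to keep the original string intact.

```perl
use strict;
use warnings;

my $html = '<img src="images/cat.png" />';   # hypothetical input

(my $rewritten = $html) =~ s{
    ( src \s* = \s* ['"]? )        # the src= prefix, with an optional quote
    ( .*? \. (?: jpg|gif|png ) )   # path to a JPEG/GIF/PNG file
}{
    $1              # keep the prefix unchanged
    . '/static/'    # hypothetical prefix to insert
    . $2            # the original path
}sigxe;

print "$rewritten\n";   # <img src="/static/images/cat.png" />
```

The comments must still avoid unbalanced braces, since Perl looks for the closing delimiter before it interprets the pattern or the replacement.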

Well

$test =~ s/(src\s*=\s*['"]?)    # first group
        (.*?\.(jpg|gif|png))        # second group
        /${1}something$2/sigx;

should work, and indeed does. Of course, you can't do the same in the replacement part unless you use something like:

$test =~ s/(src\s*=\s*['"]?)    # first group
        (.*?\.(jpg|gif|png))        # second group
        /
        $1              # Get 1st group
        . "something"   # Append ...
        . $2            # Get 2nd group
        /sigxe;

s/foo/bar/

could be written as

s/
   foo     # foo
/
   "bar"   # bar
/xe

  • /x to allow whitespace in the pattern
  • /e to allow code in the replacement expression
Licensed under: CC-BY-SA with attribution