Perl regex issue with brackets where content is multiline

Question 1

You can use this:

#!/usr/bin/perl
use strict;
use warnings;

my $str = "previous content ending with a linebreak
keyword: content
next content

previous content, also ending with a line end
keyword: { content that contains {
nested parenthesis } and may span
multiple lines, closed by matching parenthesis}
next content";

while ($str =~ /\nkeyword:  
            (?| # branch reset: i.e. the two capture groups have the same number
                \s*
                ({ (?> [^{}]++ | (?1) )*+ }) # recursive pattern
              |               # OR
                \h*
                (.*+)   # capture all until the end of line
            )   # close the branch reset group
             /xg ) {

    print "$1\n";
}

This pattern first tries to match content wrapped in curly brackets, which may be nested. If no opening bracket is found, or the brackets are not balanced, the second alternative is tried and matches only the rest of the line (since the dot can't match newlines without the /s modifier).

The branch reset feature (?|..|..) gives the capturing group in each branch of the alternation the same number, so the captured content is always available in $1.
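As a minimal sketch of the difference, the two helpers below match the same input with a plain alternation and with a branch reset group. The names value_plain and value_reset are hypothetical, and the brace pattern is a simplified, non-nested one used only for illustration. Branch reset and \h both need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Plain alternation: the two groups are numbered 1 and 2, so the caller
# has to work out which one actually matched.
sub value_plain {
    my ($line) = @_;
    return unless $line =~ /keyword:\h*(?: (\{[^{}]*\}) | (.*) )/x;
    return defined $1 ? $1 : $2;
}

# Branch reset: each alternative restarts the numbering, so the value
# is always in $1.
sub value_reset {
    my ($line) = @_;
    return unless $line =~ /keyword:\h*(?| (\{[^{}]*\}) | (.*) )/x;
    return $1;
}

for my $line ('keyword: content', 'keyword: {a b}') {
    print value_plain($line), " / ", value_reset($line), "\n";
}
# prints:
# content / content
# {a b} / {a b}
```

Both helpers return the same value; the branch reset version just avoids the defined-or bookkeeping.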

Recursive pattern details:

(                 # open the capturing group 1
    {             # literal opening curly bracket
    (?>           # atomic group: possible content between brackets
        [^{}]++   # all that is not a curly bracket
      |           # OR
        (?1)      # recurse to the capturing group 1 (!here is the recursion!)
    )*+           # repeat the atomic group zero or more times
    }             # literal closing curly bracket
)                 # close the capturing group 1

In this subpattern I use an atomic group (?>...) and the possessive quantifiers ++ and *+ to keep backtracking to a minimum.
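To try the recursive subpattern on its own, here is a small sketch. The helper name first_block is hypothetical, and the literal braces are written escaped here. Recursion with (?1) and possessive quantifiers need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Return the first balanced {...} block found in $text, or undef if
# there is none.
sub first_block {
    my ($text) = @_;
    return $text =~ /(\{(?>[^{}]++|(?1))*+\})/ ? $1 : undef;
}

print first_block('a {b {c} d} e'), "\n";    # prints {b {c} d}

# Unbalanced input: no block ever closes, so the match fails.
my $unbalanced = first_block('a {b {c d');
print defined $unbalanced ? "match\n" : "no match\n";    # prints no match
```

Because the search is not anchored, input such as 'x {a {b} c' still yields the inner block {b}: the attempt at the outer bracket fails, and the engine then retries from the next opening bracket.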

Question 2

How about something like this?

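# /s lets . match newlines, so the braced content may span several lines.
# .* is greedy, so the capture runs up to the last } in $str.
if ($str =~ /keyword:\s*{(.*)}/s) {
    my $key = $1;
    # + rather than *: * can match the empty string, so it would always
    # succeed and the else branch would never run.
    if ($key =~ /([^{}]+)/) {
        print "$1\n";
    }
    else {
        print "$key\n";
    }
}
# No braces after the keyword: capture only the rest of the line.
elsif ($str =~ /keyword:\s*(.*)/) {
    print "$1\n";
}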

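[^{}] matches any single character that is not a brace, so the inner match picks out a run of brace-free characters. Because the leftmost match wins, this is the first such run in the captured text: for {foo{bar{hello}}} it is foo, and it is the innermost text only when the outer levels carry no text of their own.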

The /s modifier lets . match newlines, so .* can span multiple lines. You don't want that for keywords without braces, so that case is handled separately in the elsif branch, without /s.
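A quick way to see the effect of /s is to run the same pattern with and without it. This is only an illustrative sketch; the sample text and variable names are made up for the example.

```perl
use strict;
use warnings;

my $text = "keyword: {first line\nsecond line}";

# Without /s, . stops at the newline, so the closing brace is never reached.
my $without_s = $text =~ /keyword:\s*\{(.*)\}/  ? $1 : '(no match)';

# With /s, . also matches "\n", so .* can run across lines.
my $with_s    = $text =~ /keyword:\s*\{(.*)\}/s ? $1 : '(no match)';

print "without /s: $without_s\n";    # prints without /s: (no match)
print "with /s:    $with_s\n";       # prints both lines of the braced content
```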

Do you need to have the same number of matching braces? For example, should keyword: {foo{bar{hello}}} output {{{hello}}}? If so, I feel like it would be better to stick with counters.
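One possible reading of "counters" is a depth counter that walks the text after the keyword and stops when the braces balance. The sketch below assumes that reading; balanced_prefix is a hypothetical helper name, not something from the question.

```perl
use strict;
use warnings;

# Given text that starts with "{", return the balanced block at the
# start, or undef if the braces never balance.
sub balanced_prefix {
    my ($text) = @_;
    return unless substr($text, 0, 1) eq '{';

    my $depth = 0;
    for my $i (0 .. length($text) - 1) {
        my $char = substr($text, $i, 1);
        $depth++ if $char eq '{';
        $depth-- if $char eq '}';
        # Depth back at zero: the block that opened at index 0 just closed.
        return substr($text, 0, $i + 1) if $depth == 0;
    }
    return;    # ran out of input with braces still open
}

my $str = "keyword: {foo{bar{hello}}} trailing";
if ($str =~ /keyword:\s*(\{.*)/s) {
    my $block = balanced_prefix($1);
    print defined $block ? "$block\n" : "unbalanced\n";    # prints {foo{bar{hello}}}
}
```

Unlike the greedy .* approach, the counter stops at the brace that closes the first one, so text after the block is not swallowed.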

Edit:

For the input

keyword: {multiline 
with nested {parenthesis} }

if you want the output

{multiline with nested {parenthesis} }

I believe that would be

if ($str =~ /keyword:\s*({.*})/s) {
    my $match = $1;
    $match =~ s/\n//g;
    print "$match\n";
}
elsif ($str =~ /keyword:\s*(.*)/) {
    print "$1\n";
}