upgraded from perl 5.8 (32bit) to 5.16 (64bit) - regex performance hit

Question 1

Yes, the regex engine improved greatly after v8. In v10 alone, we saw:

- pattern recursion
- named captures
- possessive quantifiers
- backtrack control verbs like (*FAIL) or (*SKIP)
- the \K operator
- … and some more
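A few of these v10 features can be sketched as follows. This is only an illustration: the sample string and the variable names are made up, and none of it is taken from your code.

```perl
use strict;
use warnings;
use 5.010;

my $line = 'user=alice id=42';    # made-up sample input

# Named captures: results land in %+ instead of $1, $2, ...
my ($user, $id);
if ($line =~ /user=(?<user>\w+)\s+id=(?<id>\d+)/) {
    ($user, $id) = ($+{user}, $+{id});
}

# \K: everything matched before \K is kept out of the replaced text,
# so only the digits after "id=" are substituted.
(my $masked = $line) =~ s/id=\K\d+/XX/;

# Pattern recursion: (?&paren) re-enters the named group "paren",
# which lets a regex match balanced parentheses. The possessive
# [^()]++ never gives characters back once it has consumed them.
my $balanced = qr/(?<paren>\((?:[^()]++|(?&paren))*\))/;
my $ok_balanced   = '(a(b)c)' =~ /\A$balanced\z/ ? 1 : 0;
my $ok_unbalanced = '(a(b)c'  =~ /\A$balanced\z/ ? 1 : 0;

say "user=$user id=$id masked='$masked'";
say "balanced: $ok_balanced, unbalanced: $ok_unbalanced";
```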

Also, more internals were made Unicode-aware.

In v12, the Unicode support was cleaned up. The \p and \X operators in regexes were greatly enhanced.

In v14, the Unicode support was bumped to 6.0. Charnames for the \N operator were improved (see also the charnames pragma). The new character model can treat any unsigned integer as a codepoint. In the regex engine,

- Regexes can now carry character-set modifiers such as /u, /d, /l, /a and /aa.
- Non-destructive substitution with /r was implemented.
- The RE engine is now reentrant, so embedded code can use regexes.
- \p was cleaned up.
- Regex compilation is faster when a switch to Unicode semantics is necessary.
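To make two of the v14 additions concrete, here is a small sketch of /r and /a. The strings are invented for the example, and it needs perl 5.14 or later.

```perl
use strict;
use warnings;
use 5.014;

my $original = 'Hello World';    # made-up sample input

# /r returns the modified copy and leaves $original untouched.
my $changed = $original =~ s/World/Perl/r;

# /a restricts \d, \w, \s and the POSIX classes to ASCII.
# Without it, \d also matches non-ASCII Unicode digits.
my $arabic_three  = "\x{0663}";    # ARABIC-INDIC DIGIT THREE
my $unicode_match = $arabic_three =~ /\A\d\z/  ? 1 : 0;
my $ascii_match   = $arabic_three =~ /\A\d\z/a ? 1 : 0;

say "original='$original' changed='$changed'";
say "\\d without /a: $unicode_match, with /a: $ascii_match";
```

If your data is plain ASCII, /a is one way to tell the engine that you do not need the wider Unicode character classes; whether that makes a measurable difference for your patterns is something to benchmark rather than assume.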

In v16, perl supports almost all of Unicode 6.1. In the regex engine,

- Efficiency of \p charclasses was increased.
- Various regex bugs (often involving case-insensitive matching) were fixed.

Not all of these features come at a price, but Unicode-awareness in particular makes the internals more complicated and slower.

You also cannot simply wave a hand and state that the execution time of a script doubled from perl5 v8 x86 to perl5 v16 x64 purely because of the version change; there are too many variables:

- Were both perls compiled with the same flags?
- Are both perls threaded? (A perl built without threading support is typically somewhat faster.)
- How big are your integers, 64 bit or 32 bit?
- Which compiler optimizations were chosen?
- Did your previous perl have distribution-specific patches applied?

Basically, you have to compare the whole perl -V output.
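If diffing two full perl -V dumps is too noisy, a short script can pull out just the build settings that the questions above are about. This is a sketch using the core Config module; the selection of keys is my own guess at what matters most, not an exhaustive list. It avoids newer syntax so that it should run unchanged under both the old and the new perl.

```perl
use strict;
use warnings;
use Config;

# Build settings that correspond to the questions above:
# threading, integer size, optimization level and compiler flags.
my @keys = qw(version archname usethreads useithreads ivsize optimize ccflags);

# Unset options are undef in %Config, so make them visible.
my %build = map { $_ => (defined $Config{$_} ? $Config{$_} : 'undef') } @keys;

printf "%-12s %s\n", $_, $build{$_} for @keys;
```

Run it once with each perl binary and compare the two outputs. Locally applied patches are not covered here; those are listed in the perl -V output itself.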

If you are hitting a performance ceiling with regexes, they may be the wrong tool for extensive parsing. At the very least, you may use the newer features to optimize the regexes to eliminate some backtracking.
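As a sketch of what "eliminating backtracking" can look like, here is the same pattern written once with ordinary greedy quantifiers and once with possessive ones (available since v10). The pattern and inputs are invented for illustration; on a failed match the possessive version refuses to give back characters it has already consumed, so the engine has fewer alternatives to retry.

```perl
use strict;
use warnings;
use 5.010;

# Dotted number list terminated by a semicolon (hypothetical format).
my $greedy     = qr/\A\d+(?:\.\d+)*;\z/;
my $possessive = qr/\A\d++(?:\.\d++)*+;\z/;

# Both versions must agree on every input; only the amount of
# work done on the failing input differs.
my %result;
for my $input ('10.20.30;', '10.20.30') {
    $result{$input} = {
        greedy     => ($input =~ $greedy     ? 1 : 0),
        possessive => ($input =~ $possessive ? 1 : 0),
    };
    say "$input greedy=$result{$input}{greedy} possessive=$result{$input}{possessive}";
}
```

The rewrite is only safe here because whatever follows each quantified part (a dot or a semicolon) can never be matched by \d. Where the following part overlaps with what the quantifier consumes, a possessive quantifier changes what the regex matches, so check each pattern individually.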

If your parsing code describes a (roughly) context-free language (i.e. you don't use (?{...}), (?=...) or related regex features), and parsing means doing something like generating a tree, then Marpa::R2 might speed things up considerably.

Question 2

If you are looking for better performance, you may also want to make sure that a regex is really what you want. You didn't specify what kind of regexes your system uses, but often you can replace a regex with a built-in function.

Examples:

if (lc($name) eq 'bob') { $bob_count++ }  #Faster
if ($name =~ /^bob$/i)  { $bob_count++ }  #Slower

my $sentiment = "I don't like beans.";
substr($sentiment, 13, 5) = 'broccoli';   #Faster
$sentiment = "I don't like beans.";
$sentiment =~ s/beans/broccoli/;          #Slower

These examples, as well as unpack and index, might not apply to your code, but if they do, you should benchmark them and see whether they help with performance.
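A benchmark along those lines could look like the sketch below, using the core Benchmark module. The labels, the sample data and the iteration count are arbitrary choices of mine; substitute patterns and inputs from your real code, since the relative numbers depend heavily on both.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $name = 'Bob';
my $text = "I don't like beans.";

# First check that each pair of candidates gives the same answer,
# otherwise the timing comparison is meaningless.
my $eq_lc     = lc($name) eq 'bob' ? 1 : 0;
my $eq_regex  = $name =~ /^bob$/i  ? 1 : 0;
my $pos_index = index($text, 'beans');
my $pos_regex = $text =~ /beans/ ? $-[0] : -1;    # $-[0] is the match offset

# Run every candidate the same number of times and print a
# comparison table of rates.
cmpthese(500_000, {
    lc_eq   => sub { my $hit = lc($name) eq 'bob' },
    regex_i => sub { my $hit = $name =~ /^bob$/i },
    index   => sub { my $pos = index($text, 'beans') },
    regex   => sub { my $hit = $text =~ /beans/ },
});
```

A negative first argument to cmpthese (for example -3) means "run each candidate for at least that many CPU seconds" and tends to give steadier numbers than a fixed count. Run the script under both perls to see whether the built-ins actually help on your setup.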