What are the differences between regular expression syntaxes for different tools? [closed]

StackOverflow https://stackoverflow.com/questions/22491944

  •  17-06-2023

Different tools implement regular expressions differently. For example, to match the whole word "bar", or an "f" followed by one or more "o" (as in "foo"):

printf "%s\n" foo bar baz food | grep -o '\<\(fo\+\|bar\)\>'
printf "%s\n" foo bar baz food | awk '/\<(fo+|bar)\>/'
printf "%s\n" foo bar baz food | sed -n '/\<\(fo\+\|bar\)\>/p'
printf "%s\n" foo bar baz food | sed -nr '/\<(fo+|bar)\>/p'

Where are these differences documented?

Solution

Score! I'm so happy to have found this page:
https://www.gnu.org/software/gnulib/manual/html_node/Regular-expression-syntaxes.html

14.8 Regular expression syntaxes

Gnulib supports many different types of regular expressions; although the underlying features are the same or identical, the syntax used varies. The descriptions given here for the different types are generated automatically.

  • awk regular expression syntax
  • egrep regular expression syntax
  • ed regular expression syntax
  • emacs regular expression syntax
  • gnu-awk regular expression syntax
  • grep regular expression syntax
  • posix-awk regular expression syntax
  • posix-basic regular expression syntax
  • posix-egrep regular expression syntax
  • posix-extended regular expression syntax
  • posix-minimal-basic regular expression syntax
  • sed regular expression syntax

Other tips

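It may also be helpful to note that the only difference between the regexes in the question is whether the tool expects a Basic Regular Expression (BRE) or an Extended Regular Expression (ERE). In the BRE form, grouping, alternation and the "one or more" operator are written with backslashes; in the ERE form they are written without.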

BRE (+GNU)

printf "%s\n" foo bar baz food | grep '\<\(fo\+\|bar\)\>'
printf "%s\n" foo bar baz food | sed -n '/\<\(fo\+\|bar\)\>/p'

ERE (+GNU)

printf "%s\n" foo bar baz food | grep -E '\<(fo+|bar)\>'
printf "%s\n" foo bar baz food | sed -nr '/\<(fo+|bar)\>/p'
printf "%s\n" foo bar baz food | awk '/\<(fo+|bar)\>/'

I left out the -o with grep above, so grep prints each matching line in full, as sed and awk do.
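To check that the BRE and ERE spellings really select the same lines, something like the following sketch can be used. It assumes GNU grep and GNU sed are installed; the helper `words` and the variable names are made up for illustration.

```shell
# Hypothetical helper: emits the same four test words used above.
words() { printf '%s\n' foo bar baz food; }

# BRE spelling (GNU): grouping, alternation and "one or more" are backslashed.
bre_out=$(words | grep '\<\(fo\+\|bar\)\>')

# ERE spelling (GNU): the same operators without backslashes.
ere_out=$(words | grep -E '\<(fo+|bar)\>')

# The same ERE through sed; -E selects ERE in GNU sed (as -r does).
sed_out=$(words | sed -n -E '/\<(fo+|bar)\>/p')

if [ "$bre_out" = "$ere_out" ] && [ "$ere_out" = "$sed_out" ]; then
    echo "BRE and ERE forms select the same lines"
fi
```

With GNU tools, all three variables should hold the two lines foo and bar; "food" is rejected because the `\>` anchor requires the match to end at a word boundary.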

It is also worth noting that all of the examples above rely on GNU utilities and on GNU extensions to POSIX regular expressions.

All of the examples use the GNU word-boundary extension:

\< ... \>

In addition, the BRE examples use the GNU extensions:

\+ and \|

Neither is defined for POSIX BRE, so these patterns will probably not work with other implementations of these utilities.
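For comparison, here is a sketch that avoids the GNU extensions and should run on any POSIX grep. It is only equivalent for this particular input, where each word sits on its own line: `-x` anchors the match to the whole line instead of using word boundaries, `foo*` stands in for the "f followed by one or more o" part, and two `-e` patterns stand in for BRE alternation. The helper `words` and the variable names are illustrative, not from the original post.

```shell
# Hypothetical helper: emits the same four test words used above.
words() { printf '%s\n' foo bar baz food; }

# POSIX BRE: no \+ and no \|, so use "foo*" and one -e per alternative.
posix_bre_out=$(words | grep -x -e 'foo*' -e 'bar')

# POSIX ERE: + and | are standard here, so only the word anchors change.
posix_ere_out=$(words | grep -E -x 'fo+|bar')

printf '%s\n' "$posix_bre_out"
```

Both commands select the lines foo and bar. On input with several words per line, `-x` would behave differently from the word-boundary anchors, so this is a workaround for the example data rather than a general replacement.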

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow