Question

What do people mean when they say "Perl is very good at parsing"?

How is Perl any better or more powerful than other scripting languages such as Python or Ruby?

Solution

They mean that Perl was originally designed for processing text files and has many features that make it easy:

  • Perl has many functions for string processing: substr, index, chomp, length, grep, sort, reverse, lc, ucfirst, ...
  • Perl automatically converts between numbers and strings depending on how a value is used. (e.g. you can read the character string '100' from a file and add one to it without needing to do a string-to-integer conversion first)
  • Perl automatically converts between the platform's line-ending convention (e.g. CRLF on Windows) and a logical newline ("\n") within your program.
  • Regular expressions are integrated into the syntax instead of being a separate library.
  • Perl's regular expressions are the "gold standard" for power and functionality.
  • Perl has full Unicode support.
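
A few of the points above can be seen together in a short sketch. The input line and the variable names here are made up for illustration; the functions and operators are standard Perl built-ins.

```perl
use strict;
use warnings;

# A line as it might be read from a file, trailing newline included.
# (Hypothetical sample data.)
my $line = "widgets: 100\n";
chomp $line;                      # remove the trailing newline

# Regular expressions are part of the syntax: match and capture in one step.
my ($name, $count) = $line =~ /^(\w+):\s*(\d+)$/
    or die "unexpected line format: $line";

# $count holds the string '100', but it can be used as a number directly.
my $next = $count + 1;

# Built-in string functions need no imports.
my $label = ucfirst lc $name;

print "$label: $next\n";          # prints "Widgets: 101"
```

No explicit conversion and no regex library import is needed, which is the kind of convenience the list describes.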

Python and Ruby also have good facilities for text processing. (Ruby in particular took much inspiration from Perl, much as Perl has shamelessly borrowed from many other languages.) There's little point in asking which is better. Use what you like.

OTHER TIPS

Don't take a statement of Perl's strengths to be a statement of another language's failings. Perl is good for text processing, but that doesn't mean Ruby or Python suck.

When people talk about Perl being "good for parsing", they're mainly echoing Perl's history; it was invented at a time when heavy-duty text processing wasn't easy. Try doing some of that in C or C++ (Java hadn't been invented yet, either!). Back in the day, Larry Wall was trying to do his work with sed and awk, but kept running into their limitations. He made a tool that made text even easier to work with.

Perl is still very good for text manipulation tasks, but now so are a lot of other languages.

Perl is good for ETL or batch processing as well. It takes a minimal amount of code to pick up a file, push each record through split to get its fields, apply some business logic to the record, and write it back out to disk.
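
As a rough sketch of that read, split, process, write loop: the record layout (name, quantity, unit price), the sample data, and the process_record helper are all hypothetical, and in-memory filehandles stand in for real files so the example runs as-is.

```perl
use strict;
use warnings;

# Hypothetical record layout: name,quantity,unit_price (comma-separated).
sub process_record {
    my ($line) = @_;
    chomp $line;
    my ($name, $qty, $price) = split /,/, $line;
    my $total = $qty * $price;    # stand-in for the real business logic
    return join ',', $name, $qty, $price, sprintf('%.2f', $total);
}

# In-memory filehandles stand in for real input and output files.
my $input  = "bolt,10,0.25\nnut,4,1.50\n";
my $output = '';
open my $in,  '<', \$input  or die "cannot open input: $!";
open my $out, '>', \$output or die "cannot open output: $!";

# Read each record, transform it, and write it back out.
while (my $line = <$in>) {
    print {$out} process_record($line), "\n";
}

close $in;
close $out;
print $output;    # prints "bolt,10,0.25,2.50" and "nut,4,1.50,6.00", one per line
```

With real files, the two scalar references would simply be replaced by file names.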

I suppose that's more data processing than data parsing, but data processing is bulk data parsing.

Perl is very good at text parsing when compared to C/C++/Java.

It's probably because people are used to what it was built for, as described in the Perl documentation (quoted below), so it has become commonplace to associate parsing text files with Perl. That's not to exclude Ruby or Python; Perl is just more of a household name for this, IMHO.

Perl is a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It's also a good language for many system management tasks. The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow