Question

Are Regular Expressions a must for doing programming?


Solution

One could easily go without them but one should (IMHO) know the basics, for 2 reasons.
1) There may come a time where RegEx is the best solution to the problem at hand (see image below)
2) When you see a Regex in someone else's code it shouldn't be 100% mystical.

preg_match('/summarycount">.*?([,\d]+)<\/div>.*?Reputation/s', $page, $rep);

This code is simple enough, but if you don't know RegEx, the first parameter may as well be written in Martian. The RegEx used here is actually pretty simple once you learn the basics. To get that far, head over to http://www.regular-expressions.info/; they have a lot of info about RegEx and its various implementations on different platforms and languages, plus a great tutorial to get you started. After that, check out RegexBuddy. It can help you build regexes, and watching what it does while you build them can help you learn. It was by far the best $39.95 I've ever spent.
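To make the pattern less Martian, here is a rough Python sketch of the same expression, split across lines with a comment on each piece. The `page` string is a made-up fragment (the markup the answer actually scraped is not shown), and Python needs no `\/` escape because it has no pattern delimiters.

```python
import re

# Hypothetical page fragment, invented for illustration only.
page = '''<div class="summarycount">
    12,345</div>
<div>Reputation</div>'''

pattern = re.compile(r'''
    summarycount">   # literal text that ends the opening tag
    .*?              # as few characters as possible (lazy), newlines included
    ([,\d]+)         # group 1: a run of digits and commas
    </div>           # literal closing tag
    .*?              # lazily skip ahead again
    Reputation       # literal word confirming this is the right block
''', re.VERBOSE | re.DOTALL)

match = pattern.search(page)
reputation = match.group(1) if match else None
print(reputation)
```

`re.DOTALL` plays the role of the `/s` flag in the PHP version: it lets `.` match newlines, so the lazy `.*?` can cross line breaks.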




OTHER TIPS

Yes. You can manage without them, but you really should learn at least the basics, as most computing tasks could use them. You will save a lot of pain and hassle in the long run. Regexes are much easier than you think once you get over the initial 'wtf' stage.

I would say no, they are not a must. You can be a perfectly good programmer without knowing them.

I find I use Regular Expressions mostly for one-off tasks of data manipulation rather than for actually putting in application code. They can be handy for validating input data but these days your controls often do that for you anyway.

Not at all. Anything that you can do with regular expressions is entirely possible to do without them.

However, it's a powerful pattern matching system, so some things that are quite easy to accomplish with a simple regular expression pattern take a lot of code to do without it.

For example, this:

s = Regex.Replace(s, "([bcdfghjklmnpqrstvwxz])", "$1o$1");

needs a bit more code to do without a regular expression:

StringBuilder b = new StringBuilder();
foreach (char c in s) {
   if ("bcdfghjklmnpqrstvwxz".IndexOf(c) != -1) {
      b.Append(c).Append('o').Append(c);
   } else {
      b.Append(c);
   }
}
s = b.ToString();

Or, if you are not quite as experienced a programmer, you could easily create something that is even more code and performs horribly:

string temp = "";
for (int i = 0; i < s.Length; i++ ) {
   if (
      s[i] == 'b' || s[i] == 'c' || s[i] == 'd' ||
      s[i] == 'f' || s[i] == 'g' || s[i] == 'h' ||
      s[i] == 'j' || s[i] == 'k' || s[i] == 'l' ||
      s[i] == 'm' || s[i] == 'n' || s[i] == 'p' ||
      s[i] == 'q' || s[i] == 'r' || s[i] == 's' ||
      s[i] == 't' || s[i] == 'v' || s[i] == 'w' ||
      s[i] == 'x' || s[i] == 'z'
   ) {
      temp += s.Substring(i, 1);
      temp += "o";
      temp += s.Substring(i, 1);
   } else {
      temp += s.Substring(i, 1);
   }
}
s = temp;

Let me put it this way, if you have regular expressions in your toolkit, you'll save yourself a lot of time and energy. If you don't have them, you won't know what you're missing out on so you'll still be happy.

As a web developer, I use them very often (input validation, extracting data from a site etc).

EDIT: I realized it might help you to look at some common problems that regex is used for by looking at the regex tag right here on Stack Overflow.

I would say yes.

They're so universally useful that it's a pretty significant handicap to be entirely without the ability to at least read & write simple ones.

Languages that Support Regular Expressions

  • Java
  • Perl
  • Python
  • PHP
  • C#
  • Visual Basic.NET
  • ASP
  • PowerShell
  • JavaScript
  • Ruby
  • Tcl
  • VBScript
  • VB6
  • XQuery
  • XPath
  • XSDs
  • MySQL
  • Oracle
  • PostgreSQL

IDEs and Editors that Support Regular Expressions

  • Eclipse
  • IntelliJ
  • NetBeans
  • Gel
  • Visual Studio
  • UltraEdit
  • JEdit
  • NEdit
  • Notepad++
  • EditPad Pro
  • vi
  • Emacs
  • HAPEdit
  • PSPad

And let's not forget grep and sed!

As an employer, which would you rather have, a good programmer that - once in a while - will have to manually find/replace some set of similar strings across thousands of source files and require hours or days to do it, or a good programmer that - once in a while - spends five, or even ten minutes crafting a regex to accomplish the same thing that runs in the time it takes them to go get some coffee?

Real World Practical Usage in this very Answer

In fact, I actually used a regex in crafting this post. I initially listed the languages that support it in comma-delimited prose. I then rethought it and changed the format to a bulleted list by searching for the expression (\w+), and replacing it with \n* $1 in JEdit. And the more experience you get with them, the more cost-effective they become for shorter and shorter sets of actions.
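A similar search-and-replace can be sketched in Python. This is not the exact JEdit expression: the pattern and replacement below are an assumed variant that also swallows the space after each comma, and the sample list is made up.

```python
import re

prose = "Java, Perl, Python, PHP"

# (\w+) captures one word; ,?\s* consumes the optional comma and the
# whitespace that follows it, so neither ends up in the output.
bulleted = re.sub(r"(\w+),?\s*", r"* \1\n", prose)
print(bulleted)
```

Since `\w` only covers letters, digits and underscore, names such as C# or Visual Basic.NET would need a broader pattern than this sketch uses.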

No. You can program for years without touching regular expressions. Of course, it will mean that in some cases where someone who knows REs would use them, you would do something else. There is always more than one way to solve a particular problem, and regular expressions are just one way (a very efficient, and perhaps therefore popular, way) of expressing patterns.

If you care about developing a career as a software engineer, then yes. I hire software engineers and if they don't know the basics of using regular expressions, or have never heard of them, then I wonder how much experience they actually have across the entire spectrum of programming techniques. What else don't they know?

Most of the comments above say 'no, you can solve the problem in other ways' and they also mostly say the alternatives are more code and take longer to write... now think maintainability and how easy this bespoke code would be to change... Use a regular expression - then it's just a single line of code.

At least knowing that regular expressions exist and what they can be used for is an absolute must. Otherwise you will be in danger of reinventing the wheel in many situations. If you know about their existence you can go into the details once you have to apply them. BTW, the theory behind regular expressions is quite interesting :-)

There is a great book out there written by Jeffrey Friedl called Mastering Regular Expressions. It gave me insight and was a real joy to read.

Even though I do not use regexes that often, they recently came in handy:

  • Input: Some CSV dictionary file with some kind of loose format, multiple translations, sayings, etc.

  • Output: Nice JSON.

  • First thought: Write a short grammar to parse all possible fields and values.

  • First attempt: Wrote a grammar, but there were some rough edges, mainly special cases, which occurred in just 0-1% of the data. Making a grammar that catches all would have been too-much-design.
  • Second attempt: I used a simple grammar catching the main fields and then passed over the rest to a routine, which applied some regular expressions. It was fast, conceptually easier than a full grammar and fun to write, too.

  • Summary: Regular expressions saved me hours and actually helped me see the special cases in the data and how and where they appeared.

Are they worth learning? Yes.

A must? No, but I know almost no one in the field who's not familiar with them.

Difficult to learn? Not at all.

In a word, No.

But they can certainly be the right tool for the right job and are worth learning for those string matching operations where they work best. However, just because you've got a good, big hammer, it doesn't mean you should use it to crack every nut.

No. I'm terrible at regular expressions myself, and still I'm a bad programmer. Wait. What?

On a more serious note: I don't know regular expressions, but hardly ever need them. If I really need one, for instance when I need to validate user input like Dave mentions, I ask a colleague.

There are so many things that are valuable to know / learn as a programmer, but I'd dare say regular expressions are far from being anywhere near the top of that list.

Actually, my feeling is that it is a must...

For example, I was looking at why some of our YouTube videos didn't work... and it turned out the links for those videos are

http://ca.youtube.com/v/raINk2Ii1A4 (not actual URL, just as an example)

instead of

http://www.youtube.com/v/raINk2Ii1A4

Another programmer had earlier used "substr()" to extract the YouTube video ID, and because of the ca.youtube.com portion, the ID was extracted incorrectly.

So my feeling is that regular expressions are very important, and without them hidden bugs can be introduced more often than usual.
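The failure mode can be sketched in Python. How substr() was actually called is not shown above, so `id_by_offset` is a hypothetical reconstruction of a fixed-offset approach, and `id_by_regex` is just one possible pattern, not the fix that was deployed.

```python
import re

urls = [
    "http://www.youtube.com/v/raINk2Ii1A4",
    "http://ca.youtube.com/v/raINk2Ii1A4",
]

def id_by_offset(url):
    # Fixed offset: silently assumes the prefix is always
    # "http://www.youtube.com/v/", whatever the real host is.
    return url[len("http://www.youtube.com/v/"):]

def id_by_regex(url):
    # Anchor on the "/v/" path segment rather than on the host length.
    m = re.search(r"youtube\.com/v/([\w-]+)", url)
    return m.group(1) if m else None

for url in urls:
    print(url, id_by_offset(url), id_by_regex(url))
```

The offset version only works while every URL has exactly the same prefix length; the regex version keeps working when the subdomain changes.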

But I have actually met 3 developers, one a very good web applications developer, one with a Master of Science degree from a prestigious Silicon Valley top university, and one a high-profile master's grad, and it turned out none of them knew regular expressions. That was a bit surprising to me.

Well, in theoretical computer science they are a very strong and useful tool: with them you can define regular languages and relate them to an NFA or even a DFA, and so prove some difficult theorems in computation theory, finite automata, and formal languages. In practical programming they are very useful as well, since they let you perform complex string manipulation relatively easily.

Probably not. But they are really easy to learn. At least the basics (the stuff all the regex engines do) are quickly taught. I learnt it in a chat window from another guy in like 30 minutes...

I guess it is not a must but they will ease your life and save you so much time.

If you don't know how to use regular expressions, you don't know what you are missing. But just watching a person use them to complete a task makes you feel that it is a skill you should definitely have.

No... and Yes,

This is very much like one of those "Should I learn C?" questions. No, regular expressions are never the only way to do something. But they are often a helpful abstraction that simplifies code and can (I really think) even make it more readable. Maybe it's because I love Jeff Friedl's Mastering Regular Expressions, or maybe it's because I do a lot at work in Perl, but for whatever reason regular expressions are my go-to tool. It now seems easier for me to use a regex than most other string manipulation techniques.

Understanding, at least at the lowest level, what regular expressions are and can do is immensely important. If you understand the concepts behind an NFA, then you will understand other problems much better.

As for being good at regular expressions, I would say it's not necessary but really valuable. The fact is that every regular expression engine is different, so even if you've mastered one you may not be able to quickly do the same elsewhere.

Regular expressions are important at least to learn if not to use.

First, you must be able to read and understand others' regular expression code.

Second, basic regular expressions correspond to finite automata (by the Kleene theorem), which makes them fundamentally important for algorithm design.
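To make the correspondence concrete, here is a small Python sketch. The language (strings over a and b that end in "ab"), the state numbering and the function names are all chosen for illustration; the point is only that a hand-written automaton and a regex accept the same strings.

```python
import re
from itertools import product

# DFA for "strings over {a, b} that end in ab".
# State 0: nothing useful seen, 1: last char was 'a', 2: last two were "ab".
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
ACCEPTING = {2}

def dfa_accepts(text):
    state = 0
    for ch in text:
        state = TRANSITIONS[(state, ch)]
    return state in ACCEPTING

# The same language written as a regular expression.
REGEX = re.compile(r"(a|b)*ab")

def regex_accepts(text):
    return REGEX.fullmatch(text) is not None

# Compare the two on every string of length 0..6 over the alphabet.
all_strings = ("".join(p) for n in range(7) for p in product("ab", repeat=n))
agree = all(dfa_accepts(s) == regex_accepts(s) for s in all_strings)
print(agree)
```

An exhaustive check over short strings is of course not a proof; Kleene's theorem is what guarantees that such an equivalent automaton exists for every basic regular expression.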

Actually, there is a cheat sheet skirt for girls

http://store.xkcd.com/xkcd/#RegexCheatSkirt

If you happen to be a girl, this might be a fantastic learning opportunity.

No, when the need does come up you always have two other options.

  1. Ask a friend who knows regexes.

  2. Post the problem on SO.

Depending on your field there are certain problems that lend themselves to regexes - or rather the other way around: the solution /not/ using regular expressions is extremely clumsy. email verification/url verification/minimum password strength/date parsing come to mind.
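Date parsing is probably the easiest of those to sketch. The example below assumes ISO-style YYYY-MM-DD input and uses a made-up helper name, `parse_iso_date`; the regex only checks the shape of the string, and `datetime.date` is left to reject impossible dates.

```python
import re
from datetime import date

# Shape check only: four digits, two digits, two digits, dash-separated.
DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

def parse_iso_date(text):
    m = DATE_RE.fullmatch(text.strip())
    if m is None:
        return None          # wrong shape, e.g. "13/02/2009"
    year, month, day = map(int, m.groups())
    try:
        return date(year, month, day)
    except ValueError:
        return None          # right shape, impossible date, e.g. month 13

print(parse_iso_date("2009-02-13"))
print(parse_iso_date("2009-13-01"))
```

Doing the same with index arithmetic and manual digit checks is possible, but it tends to be the clumsy kind of code described above.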

Must it is not. Though there is some perception that a good programmer should know it, I wouldn't say so. When the time comes and you need it, you'll just use it. Anyway, go six months without using it and you won't remember any of the expression options.

Like everything factual in programming, you learn it, you forget it, you relearn it again.

No.

Depending on what you're trying to achieve, Regex can be useful. But I would hazard that 80% or more of programmers never use Regex, some 15% or so use it only occasionally (and have to Google it), and only a small % of the remainder are actually Regex Ninjas.

I have found Regexr is pretty good for the rare occasions I use Regex.

Also, someone will mention a certain quote from jwz within the next minute or so...

Regular Expressions is a powerful pattern matching language. And it is not limited to text strings. But as always, your code, your call.

Simply, no. It all depends on what your program is set out to achieve.

Of course knowing what a RegExp is and a basic understanding of how they work can be useful in the future.

I agree with the others that it's probably not a must, but it's very helpful to have at least a basic understanding. I have a RegEx cheat sheet posted in my cube that I find very helpful. http://regexlib.com/CheatSheet.aspx

Understanding regular expressions is not a must. However, it is an effective tool for processing text. If you work on projects that manipulate text, you will eventually run across them.

Regular expressions come with a variety of challenges, whether you are using them or just supporting code that has them. Be aware that there are a variety of syntax flavors. Different libraries and languages often have slightly different syntax rules. Regular expressions, as they become more complicated, can easily transition from a simple pattern-matching tool to a piece of magic, write-only code that cannot be easily understood. And, like most text processing tools, they can be difficult to troubleshoot or change (e.g. you have a corner case that no longer fits the features of the tool).

As with all parsing code, I recommend a lot of unit tests. In particular, watch out for edge conditions, repeated text patterns and unusual inputs.
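As a rough idea of what such tests might look like, here is a table-driven sketch in Python. The username rule is invented purely for illustration (3 to 16 characters, letters, digits or underscore, starting with a letter); the part worth copying is the list of edge cases, not the pattern.

```python
import re

# Hypothetical rule: a letter followed by 2-15 letters, digits or underscores.
USERNAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,15}")

def is_valid_username(text):
    # fullmatch avoids the classic trap of "$" tolerating a trailing newline.
    return USERNAME_RE.fullmatch(text) is not None

CASES = [
    ("bob", True),         # shortest allowed
    ("a" * 16, True),      # longest allowed
    ("a" * 17, False),     # one character too long
    ("ab", False),         # one character too short
    ("", False),           # empty input
    ("1abc", False),       # must start with a letter
    ("bob\n", False),      # trailing newline
    ("bob smith", False),  # embedded space
    ("ab_ab_ab", True),    # repeated text pattern
]

failures = [(text, expected) for text, expected in CASES
            if is_valid_username(text) != expected]
print(failures)
```

Each boundary (shortest, longest, one past either limit) gets its own row, so a later change to the pattern that shifts a limit shows up as a named failure.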

Definitely not, I (like many people) have been programming for years without touching them. That said, once you get to know them you start to see where they might have been useful in the past :-)

I'd say - just read up on the basics so you know what RegExes are and what you can do with them, then if you ever find they might be useful you can grab a tutorial / reference website like http://www.regular-expressions.info/ and jump right in.

If you're developing a new product, I would suggest you avoid them, or at the very most use them sparingly and judiciously.

If you're maintaining a product that already uses regexps, you are left with no choice.

It helps to at least be able to recognize a regular expression, so that if you encounter a particularly obfuscated piece of code you know the right search term to find a reference card.

No more so than, say, knowing HTML or being able to use a relational database. Strictly speaking, no, they're not a requirement for doing programming: they might be essential and fundamental in some jobs, and yet irrelevant in others. You're unlikely to use regular expressions (or HTML or SQL, for that matter) while writing a device driver for a new Ethernet chip. In my area I use regular expressions occasionally in production code, and much more often in ad-hoc scripts to massage reports etc. I've worked on one project where they were a central feature (an application to analyse free-form text to look for certain key phrases to produce a compiled rule set).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow