Question

I am new to regular expressions.

I want to do a multiline search. Here is an example of what I want to do:

Suppose I have the following text:

*Project #1:
CVC – Customer Value Creation (Sep 2007 – till now)
Time Warner Cable is the world's leading media and entertainment company, Time Warner Cable (TWC) makes coaxial quiver.
Client   : Time Warner Cable, US.
ETL Tool  : Informatica 7.1.4
Database  : Oracle 9i.
Role   : ETL Developer/Team Lead.
O/S   : UNIX.
Responsibilities:
Created Test Plan and Test Case Book.
Peer reviewed team members Mappings.
Documented Mappings.
Leading the Development Team.
Sending Reports to onsite.
Bug fixing for Defects, Data and Performance related.                                                                                                     
Project #2:
MYER – Sales Analysis system (Nov 2005 – till now)
            Coles Myer is one of Australia's largest retailers with more than 2,000 stores throughout Australia,
Client   : Coles Myer Retail, Australia.
ETL Tool  : Informatica 7.1.3
Database  : Oracle 8i.
Role   : ETL Developer.
O/S   : UNIX.
Responsibilities:
Extraction, Transformation and Loading of the data using Informatica.
Understanding the entire source system.                                                                                     
Created and Run Sessions and Workflows.
Created Sort files using Syncsort Application.*

I want to write a RegEx that first tries to match the word "Project", which can be in either lower or upper case.

If "project" matches, the RegEx should then try to match any one of client, role, or environment. If it matches ANY ONE of these, the match is complete. (The words client, role, and environment can be in any case, and they may or may not be on the same line as the word "project".)

I have written a regular expression for the above task, which looks like this:

^((P|p)roject.*\s*.*((((E|e)nviornment)|((P|p)latform)|((R|r)ole(s)?)|((R|r)esponsibilit(y|ies))|((C|c)lient)|((C|c)ustomer)|((P|p)eriod)))

This RegEx matches Project #1 but does not match Project #2.

Can anyone please tell me what is wrong with this RegEx, or how to write a RegEx for this kind of text?

Solution

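Assuming . does not match newlines (the default), the .*\s*.* part of your pattern can only reach the "Project" line and the next non-blank line. Project #1 matches only because "Customer" happens to appear on the line right after it; in Project #2 the first keyword ("Client") is further down. Use [\s\S]*? to match lazily across any number of lines, and let RegexOptions.IgnoreCase replace the (P|p) alternations. Try this: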

Regex project = new Regex(
   @"^(Project [\s\S]*?" + 
   @"(Environment|Platform|Roles?|Responsibilit(y|ies)|Client|Customer|Period))",
   RegexOptions.ECMAScript | RegexOptions.IgnoreCase | RegexOptions.Multiline);
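
To see how the pattern above behaves, here is a minimal sketch that runs it against a shortened version of the text from the question. The class name ProjectMatcher and the abbreviated sample string are made up for illustration; only the pattern and options come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProjectMatcher
{
    // Same pattern and options as in the answer above.
    public static readonly Regex Project = new Regex(
        @"^(Project [\s\S]*?" +
        @"(Environment|Platform|Roles?|Responsibilit(y|ies)|Client|Customer|Period))",
        RegexOptions.ECMAScript | RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Shortened, illustrative excerpt of the question's text (assumption: "\n" line endings).
    public const string Sample =
        "Project #1:\n" +
        "CVC - Customer Value Creation (Sep 2007 - till now)\n" +
        "Client   : Time Warner Cable, US.\n" +
        "Project #2:\n" +
        "MYER - Sales Analysis system (Nov 2005 - till now)\n" +
        "Coles Myer is one of Australia's largest retailers,\n" +
        "Client   : Coles Myer Retail, Australia.\n" +
        "Role   : ETL Developer.\n";

    public static void Main()
    {
        foreach (Match m in Project.Matches(Sample))
        {
            // Groups[2] is the keyword alternation that ended the lazy match.
            Console.WriteLine("Matched up to keyword: " + m.Groups[2].Value);
        }
    }
}
```

On this sample, the first match stops at "Customer" (on the line after "Project #1:") and the second stops at "Client", three lines below "Project #2:". Because the quantifier is lazy, each match ends at the first keyword it meets.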

OTHER TIPS

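In C#, you can pass the Multiline option as a parameter to the Regex constructor. Note that RegexOptions.Multiline only makes ^ and $ match at the start and end of each line; it does not make . match newlines (RegexOptions.Singleline does that):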

Regex r = new Regex("(var matches = new Array\\([^\\)]*\\);)",  
          RegexOptions.IgnoreCase | RegexOptions.Compiled 
          | RegexOptions.Multiline);

For more code details, see: C# and Regex: How to extract strings between quotation marks
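
As a rough illustration of what these options do in .NET: RegexOptions.Multiline changes where ^ and $ can match, while RegexOptions.Singleline is what lets . cross a newline. The class name OptionsDemo and the two-line input below are invented for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Hypothetical two-line input, not taken from the question.
    public const string Input = "Project #2:\nClient : Coles Myer";

    public static void Main()
    {
        // Multiline alone: ^ matches at line starts, but . still stops at \n.
        Console.WriteLine(Regex.IsMatch(Input, "^Project.*Client", RegexOptions.Multiline));  // False

        // Singleline: . also matches \n, so the pattern can span both lines.
        Console.WriteLine(Regex.IsMatch(Input, "^Project.*Client", RegexOptions.Singleline)); // True

        // Multiline: ^ can anchor at the start of the second line.
        Console.WriteLine(Regex.IsMatch(Input, "^Client", RegexOptions.Multiline));           // True

        // No options: ^ only matches at the very start of the input.
        Console.WriteLine(Regex.IsMatch(Input, "^Client"));                                   // False
    }
}
```

So for a search that has to run from "Project" to a keyword several lines later, Multiline on its own is not enough; the pattern also needs Singleline or a construct such as [\s\S] that matches newlines.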

Since you didn't specify a programming language, here are some commonly used patterns to accomplish this:

/yourRegexpattern/m  <-- the m stands for multiline

You could also use

/yourRegexpattern/im <-- the i stands for case insensitivity

to remove the need for those (P|p) alternations.

In C#, you have to specify these flags in the Regex constructor; autocompletion will list the available RegexOptions values.
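
.NET also accepts inline options at the start of the pattern, so the /im style can be written as (?im) instead of passing RegexOptions. The sketch below compares the two forms; the class name InlineFlagsDemo and the input string are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineFlagsDemo
{
    // Hypothetical input: "PROJECT" is upper case and not on the first line.
    public const string Input = "intro line\nPROJECT #1:\nclient : Time Warner Cable";

    // (?im) switches on IgnoreCase and Multiline inside the pattern itself.
    public static readonly Regex Inline = new Regex(@"(?im)^project\b");

    // The same behavior requested through the constructor.
    public static readonly Regex ViaOptions = new Regex(
        @"^project\b",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    public static void Main()
    {
        Console.WriteLine(Inline.Match(Input).Value);      // PROJECT
        Console.WriteLine(ViaOptions.Match(Input).Value);  // PROJECT
    }
}
```

Both forms find the upper-case "PROJECT" at the start of the second line, with no (P|p) alternation needed.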

Licensed under: CC-BY-SA with attribution