Can Regular Expressions search for groups no matter the order or whether they all exist? [closed]

StackOverflow https://stackoverflow.com/questions/18364654

  •  26-06-2022
  •  | 
  •  

Question

So I want to search for A,B,C,D in a string in any order, but if C doesn't exist I still want it to give me A,B, and D, etc.

To be more specific, here is the exact problem I'm trying to solve. CSV file with lines that look like this:

Name,(W)5555555,(H)5555555,(M)5555555,(P)5555555

However, the W,H,M,P could be in any order. Plus they don't all exist on every line. So it looks more like this:

Name,(W)5555555,(H)5555555,(M)5555555,(P)5555555
Name,(H)5555555,(P)5555555,(W)5555555,(M)5555555
Name,(M)5555555,(H)5555555,,
Name,(P)5555555,,,

What I need to accomplish is to put all items in the correct order so they line up under the correct columns. So the above should look like this when I'm done:

Name,(W)5555555,(H)5555555,(M)5555555,(P)5555555
Name,(W)5555555,(H)5555555,(M)5555555,(P)5555555
Name,,(H)5555555,(M)5555555,
Name,,,,(P)5555555

Edit: It appears I'm a bad Stack Overflow citizen. I didn't get answers fast enough for when my project needed to be done, and therefore forgot to come back and add a correct issues in my post. I ended up writing a python script to do this instead of just using find/replace in BBEdit or Sublime Text 2 like I was originally trying to do.

So I would like a method to do something like this that works in either BBEdit or Sublime Text. Or Vim for that matter. I'll try to keep a better eye on it this time, and I'll respond to the answers that already exist.

Was it helpful?

Solution

If your regex flavor supports lookarounds, this can be done with a simple regex-replace. Since lookaheads do not advance the position of the regex engine's cursor, we can use them to look for multiple patterns somewhere after one particular position. We can capture all these findings and write them back in the replacement string. To make sure that all of them are optional we could simply use ?, but in this case, I'll add an empty alternative to the lookahead - this is necessary to trick the engine when it's backtracking. The pattern could then look like this:

^Name,(?=.*([(]W[)]\d+)|)(?=.*([(]H[)]\d+)|)(?=.*([(]M[)]\d+)|)(?=.*([(]P[)]\d+)|).*

The .* at the end is to make sure that everything gets removed in the replacement.

And the replacement string like this:

Name,$1,$2,$3,$4

Here is a working demo using the ECMAScript flavor. It's a rather limited flavor, so this solution should be adaptable to most environments.

OTHER TIPS

Something like this?

^Name,(\((?:W|H|P|M)\)\d+(?:,)?)*[,]*$

Regular expression visualization

Edit live on Debuggex

Will give you all the matches per row. Then you simple need to allocate each match to the right column.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top