Question

First of all, let me please clarify that I know absolutely nothing about regular expressions, but I need to write a "Tagger Script" for MusicBrainz Picard so that it doesn't mess with the way I format certain aspects of my tracks' titles.

Here's what I need to do:
- Find all substrings inside parentheses
- Then, for those matches that meet given criteria, and those matches only, change the parentheses to brackets

For example, consider this string: DJ Fresh - Louder (Sian Evans) (Flux Pavilion & Doctor P Remix)

It needs to be changed like so: DJ Fresh - Louder (Sian Evans) [Flux Pavilion & Doctor P Remix]

The condition is that if the string within the parentheses contains the sub-string "dj" or "mix" or "version" or "inch", etc... then the parentheses surrounding it need to be changed to brackets.

So, the question is: Is it possible to write a single regular expression that can perform this operation?

Thank you very much in advance.


Solution

Assuming there are no nested parentheses, you can use the following regex to search for the text:

(?i)\((?=[^()]*(?:dj|mix|version|inch))([^()]+)\)

Note that the regex is case-insensitive, due to (?i) in front - make it case-sensitive by removing it.

If your language has a raw-string syntax, such as the r prefix in r'literal_string', use it for the pattern so that the backslashes are not treated as string escapes.

And use the following as replacement:

[$1]
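As a minimal sketch of the search-and-replace above, here is how it could look in Python (an assumption; the question does not say which regex engine the script ends up using). Python's re.sub writes the first captured group as \1 rather than $1, and the function name bracketize is made up for this example.

```python
import re

# Pattern from the answer; (?i) at the start makes it case-insensitive.
PATTERN = re.compile(r"(?i)\((?=[^()]*(?:dj|mix|version|inch))([^()]+)\)")


def bracketize(title: str) -> str:
    """Swap ( ) for [ ] around any parenthesized text containing a keyword."""
    # \1 is Python's spelling of the $1 back-reference used in the answer.
    return PATTERN.sub(r"[\1]", title)


print(bracketize("DJ Fresh - Louder (Sian Evans) (Flux Pavilion & Doctor P Remix)"))
# DJ Fresh - Louder (Sian Evans) [Flux Pavilion & Doctor P Remix]
```

The "DJ" at the start of the title is left alone because it is not inside parentheses, and "(Sian Evans)" is left alone because it contains none of the keywords.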

You can include more keywords by adding them to the (?:dj|mix|version|inch) part, with each keyword separated by |. If a keyword contains any of (, ), [, ], |, ., +, ?, *, ^, $, \, {, } you need to escape that character with a backslash (I'm 99% sure the list is exhaustive). An easier rule of thumb: if a keyword contains only spaces and alphanumeric characters, you can add it to the regex without side effects (but note that the number of spaces must match exactly).
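If the pattern is built in Python (again an assumption), the escaping can be left to re.escape instead of being done by hand. The sketch below assembles the alternation from a keyword list; build_pattern and the extra keyword "feat." are hypothetical and only illustrate a keyword that contains a special character.

```python
import re


def build_pattern(keywords):
    """Build the answer's regex from a list of plain-text keywords."""
    # re.escape backslash-escapes regex metacharacters, so "feat." only
    # matches a literal dot instead of "any character".
    alternation = "|".join(re.escape(k) for k in keywords)
    return re.compile(
        r"\((?=[^()]*(?:" + alternation + r"))([^()]+)\)",
        re.IGNORECASE,  # same effect as the (?i) prefix
    )


pattern = build_pattern(["dj", "mix", "version", "inch", "feat."])
print(pattern.sub(r"[\1]", "Song (feat. Someone) (featuring X)"))
# Song [feat. Someone] (featuring X)
```

Without the escaping, the dot in "feat." would match any character, so "(featuring X)" would be converted as well.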


Dissecting the regex:

  • (?i): Case-insensitive mode

  • \(: ( is a special character in regex, so it must be escaped by prepending \.

  • (?=[^()]*(?:dj|mix|version|inch)): Positive look-ahead (?=pattern):

    • [^()]*: The keyword check must stay within the current pair of parentheses, not outside it or inside some other pair, so I use a negated character class [^characters]. It cannot match ( or ), so it cannot spill outside the current pair. The no-nesting assumption also comes into play here.

    • (?:dj|mix|version|inch): A list of keywords, in a non-capturing group (?:pattern). | means alternation.

  • ([^()]+): The assumption of no nested parentheses makes it easy to match all the characters inside the pair. The text is captured for later replacement, since (pattern) is a capturing group, as opposed to (?:pattern).

  • \): ) is a special character in regex, so it must be escaped by prepending \.
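The breakdown above can also be written directly into the pattern. As a sketch, assuming Python, the re.VERBOSE flag allows whitespace and # comments inside the regex, so each piece carries its own explanation. The name PATTERN and the sample titles are made up for illustration.

```python
import re

PATTERN = re.compile(
    r"""
    \(                            # literal opening parenthesis
    (?=                           # look-ahead: only succeed if ...
        [^()]*                    #   without crossing any parenthesis ...
        (?:dj|mix|version|inch)   #   one of the keywords is found
    )
    ([^()]+)                      # group 1: the text inside the pair
    \)                            # literal closing parenthesis
    """,
    re.IGNORECASE | re.VERBOSE,
)

print(PATTERN.sub(r"[\1]", "Track (12 Inch Version) (Live)"))
# Track [12 Inch Version] (Live)

# With nested parentheses only the innermost pair can match, which is
# why the answer assumes there is no nesting.
print(PATTERN.sub(r"[\1]", "Track (Live (Remix))"))
# Track (Live [Remix])
```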

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow