Question

I'm trying to design a regular expression to extract MIDI note names and octaves from a string, to eventually turn the result into an actual MIDI note value.

MIDI note names are a discrete range from C-2 to G8, corresponding to 0...127, so that:

0 = C-2
1 = C#-2
2 = D-2
...
125 = F8
126 = F#8
127 = G8 
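
To make the mapping above concrete, here is a minimal sketch in Python (used only for illustration; the project itself is Objective-C). It assumes the sharp-only naming shown in the list and that each octave starts at C. The names NOTE_NAMES and midi_to_name are hypothetical helpers, not part of any MIDI library.

```python
# Sharp-only note names within one octave, starting at C (assumed naming).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def midi_to_name(value):
    """Return the note name for a MIDI value in 0..127, e.g. 0 -> 'C-2'."""
    if not 0 <= value <= 127:
        raise ValueError("MIDI note values range from 0 to 127")
    # Twelve semitones per octave; value 0 sits in octave -2.
    octave_index, semitone = divmod(value, 12)
    return f"{NOTE_NAMES[semitone]}{octave_index - 2}"
```

With this convention, the inverse is value = (octave + 2) * 12 + semitone index.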

Here's what I have now:

^([CDEFGAB]#?)(-?[0-8])

This expression matches all the valid MIDI notes with capture groups, and rejects obviously invalid input like K9. The problem is that it also matches bogus input like A10, capturing note = A and octave = 1, which is wrong.
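
The behaviour described here can be reproduced with a short Python sketch. Python's re module is used as a stand-in for ICU, on the assumption that both flavors treat this simple pattern the same way; match_groups is a hypothetical helper name.

```python
import re

# The pattern from the question: anchored at the start only, no end anchor.
ORIGINAL_PATTERN = re.compile(r"^([CDEFGAB]#?)(-?[0-8])")

def match_groups(text):
    """Return the (note, octave) capture groups, or None if nothing matches."""
    match = ORIGINAL_PATTERN.match(text)
    return match.groups() if match else None

print(match_groups("C#-2"))  # a valid note
print(match_groups("K9"))    # rejected: K is not a note letter
print(match_groups("A10"))   # matches the prefix "A1" and ignores the trailing 0
print(match_groups("A-8"))   # matches, although -8 is not a valid octave
```

Because nothing forces the match to reach the end of the string, the trailing 0 in A10 is simply left unmatched.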

EDIT: Of course this also accepts wrong octaves like -8, but I can filter them after the match.

This is part of a project in Objective-C, which uses the ICU regex flavor.

Any suggestions?

Solution

^([CDEFGAB]#?)((?:-[1-2])|[0-8])$

This fixes both problems. First, the $ at the end of the regex anchors the match to the end of the string, so exactly one octave digit may follow the note name; this rejects A10. Then, the segment

((?:-[1-2])|[0-8])

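will match either a negative octave, -1 or -2 (the (?:...) group is non-capturing, so it does not add an extra capture group), or a single digit in the original range 0-8. Note that the pattern still accepts G#8, A8, A#8 and B8, which lie above G8 (127), so those have to be filtered after the match.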
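
As a rough sketch of how the captured groups could be turned into a MIDI value, the following Python example applies the pattern and then does the remaining range check. Python's re is again a stand-in for ICU, and SEMITONES and parse_midi_note are hypothetical names. It also assumes that E# and B#, which the character class lets through, should be rejected because the sharp-only naming does not use them.

```python
import re

# The pattern from the answer: note name, then octave -2..8, then end of string.
NOTE_PATTERN = re.compile(r"^([CDEFGAB]#?)((?:-[1-2])|[0-8])$")

# Semitone offset of each note name within an octave (sharp-only naming assumed).
SEMITONES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
             "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def parse_midi_note(text):
    """Return the MIDI value (0..127) for a note name, or None if invalid."""
    match = NOTE_PATTERN.match(text)
    if match is None:
        return None
    name, octave = match.group(1), int(match.group(2))
    if name not in SEMITONES:
        # E# and B# pass the regex but are rejected here (assumption).
        return None
    value = (octave + 2) * 12 + SEMITONES[name]
    # G#8 through B8 pass the regex but would exceed 127.
    return value if value <= 127 else None
```

Keeping the upper-bound check in code rather than in the regex keeps the pattern short; encoding "nothing above G8" in the expression itself would need a separate alternative for octave 8.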

Licensed under: CC-BY-SA with attribution