Extract root, month letter-year and yellow key from a Bloomberg futures ticker

Question 1

Assuming there is no leading or trailing whitespace and the root contains only uppercase letters, this should work:

^([A-Z]{2,4}|[A-Z]\s)([FGHJKMNQUVXZ]\d{1,2}) (Curncy|Equity|Index|Comdty)$

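The root is in the first group, the month letter and year in the second, and the yellow key in the third. Note that a one-letter root is captured together with the whitespace character that follows it.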

I don't know MATLAB, nor whether its regex engine is Perl-compatible (PCRE). If the pattern fails, try e.g. a literal space instead of \s. Also, drop the ^...$ anchors if you'd like to extract from a larger source text.
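As a minimal sketch of how the pattern above could be applied in MATLAB, assuming regexp accepts it unchanged (the sample tickers and the variable names are illustrative and not taken from the question):

```matlab
% Anchored pattern from above: root, month letter + year, yellow key.
expr = '^([A-Z]{2,4}|[A-Z]\s)([FGHJKMNQUVXZ]\d{1,2}) (Curncy|Equity|Index|Comdty)$';

% Illustrative ticker (assumed sample value).
tok = regexp('MCDZ3 Curncy', expr, 'tokens', 'once');
root      = strtrim(tok{1});   % strtrim drops the pad space of one-letter roots
monthyear = tok{2};
yellowkey = tok{3};

% A one-letter root padded with a space is matched by the second alternative.
tok1  = regexp('C Z3 Comdty', expr, 'tokens', 'once');
root1 = strtrim(tok1{1});
```

With 'tokens' and 'once', regexp returns one cell array holding the three captured groups, so no postprocessing beyond strtrim is needed.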

Question 2

The expression you're feeding regexpi with contains a space and is used as a pattern for 'match'. This is why the matched monthyear string also has a space¹.

If you want to keep it simple and let regexpi do the work for you (instead of postprocessing its output), try a different approach and capture tokens instead of matching, and ignore the intermediate space:

%//     <$1><----------$2---------> <$3>
expr = '(.+)([FGHJKMNQUVXZ]\d{1,2}) (.+)';
tickinfo = regexpi(bbergtickers, expr, 'tokens', 'once');

You can also simplify the expression to the more generic '(.+)(\w{1}\d{1,2})\s+(.+)', if you wish.

Example

bbergtickers = 'MCDZ3 Curncy';
expr = '(.+)([FGHJKMNQUVXZ]\d{1,2})\s+(.+)'; 
tickinfo = regexpi(bbergtickers, expr, 'tokens', 'once');

The result is:

tickinfo =
    'MCD'
    'Z3'
    'Curncy'

¹ This expression is also used as a delimiter for 'split'. Removing the trailing space from it won't help, as it will reappear in the rootyk output instead.
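If bbergtickers holds several tickers in a cell array, the same token approach should carry over. A sketch under that assumption (the second ticker and the variable names tok and parts are made up for illustration):

```matlab
% Illustrative tickers (assumed sample values), one per row.
bbergtickers = {'MCDZ3 Curncy'; 'ESH14 Index'};
expr = '(.+)([FGHJKMNQUVXZ]\d{1,2})\s+(.+)';

% For cell input, each cell of tok holds the three tokens of one ticker.
tok = regexpi(bbergtickers, expr, 'tokens', 'once');

% Stack the per-ticker token cells into an N-by-3 cell array.
parts     = vertcat(tok{:});
root      = parts(:, 1);
monthyear = parts(:, 2);
yellowkey = parts(:, 3);
```

This assumes every ticker matches the expression; a ticker that does not match yields an empty cell, which vertcat silently drops, so the rows would no longer line up with bbergtickers.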

Question 3

Assuming you just want to get rid of the leading and/or trailing spaces, there is a very simple command for that:

monthyear = strtrim(monthyear)

For removing all spaces, you can do:

monthyear(isspace(monthyear))=[]
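The two commands differ only when a space sits inside the string. A small sketch with a made-up value (the variable names are illustrative):

```matlab
% Made-up input with leading, inner and trailing spaces.
monthyear = ' Z 3 ';

% strtrim removes only leading and trailing whitespace.
edgesTrimmed = strtrim(monthyear);

% Logical indexing with isspace removes every whitespace character.
allRemoved = monthyear(~isspace(monthyear));
```

Here edgesTrimmed keeps the inner space ('Z 3'), while allRemoved does not ('Z3').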

Question 4

Here is a completely different approach. It finds the positions of the digits and takes the letter just before the year number:

s = 'MCDZ3 Curncy'
p = regexp(s,'\d')       % indices of the digits
s(min(p)-1)              % month letter
s(min(p)-1:max(p))       % month letter plus year
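The same indices can also yield the root and the yellow key. A hedged sketch, assuming digits occur only in the year part of the ticker (the guard and the variable names are additions for illustration):

```matlab
s = 'MCDZ3 Curncy';
p = regexp(s, '\d');                 % indices of all digits in s

if isempty(p) || min(p) < 2
    % No digit, or no letter in front of the first digit: nothing to split.
    root = ''; monthyear = ''; yellowkey = '';
else
    root      = strtrim(s(1:min(p)-2));       % everything before the month letter
    monthyear = s(min(p)-1:max(p));           % month letter plus year digits
    yellowkey = strtrim(s(max(p)+1:end));     % remainder after the year
end
```

This avoids the month-letter character class entirely, at the cost of misbehaving on tickers whose root or yellow key contains a digit.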