Question

I have a cell array of strings in MATLAB. Some of the strings may be equal. I want to number the strings by their lexicographic order, so that equal strings get the same number.

For example, if I have {'abc','aty','utf8','sport','utf8','abc'}, in the output I want to get the array [1, 2, 4, 3, 4, 1].

Can you suggest an approach?


Solution

Duplicate strings make sort awkward to use here, but unique works on cell arrays of strings: it returns the distinct strings in sorted order, and its optional third output gives, for each element of the original input, the index of that element in the sorted output:

>> a = {'abc' 'aty' 'utf8' 'sport' 'utf8' 'abc'}
a =
{
  [1,1] = abc
  [1,2] = aty
  [1,3] = utf8
  [1,4] = sport
  [1,5] = utf8
  [1,6] = abc
}

>> [b, ~, index] = unique(a)
b =
{
  [1,1] = abc
  [1,2] = aty
  [1,3] = sport
  [1,4] = utf8
}
index =

   1   2   4   3   4   1

If you only want the indices, you can discard the first output as well and use [~, ~, index] = unique(a);
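As a quick sanity check, here is a minimal sketch that verifies the numbering by rebuilding the input from the sorted unique strings. The variable name rebuilt is introduced here only for illustration. The sketch assumes that the orientation of index may differ between environments (the session above shows a row vector, while newer MATLAB releases may return a column vector), so it forces a row explicitly.

```matlab
% Input from the question: duplicates must share the same number.
a = {'abc', 'aty', 'utf8', 'sport', 'utf8', 'abc'};

% b holds the distinct strings in sorted order; index maps each
% element of a to its position in b.
[b, ~, index] = unique(a);

% Force a row vector, since the orientation of the third output
% can differ between MATLAB releases and Octave (assumption).
index = index(:).';

% Indexing b with index should reproduce the original cell array.
rebuilt = reshape(b(index), size(a));

disp(index);
disp(isequal(rebuilt, a));
```

Note that the sort order used by unique is based on character codes, so uppercase letters sort before lowercase ones. If the ranking should ignore case, one option is to call unique(lower(a)) instead.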

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow