Question

I want to apply an operation to a list that spans three elements at a time.

The elements are strings, each consisting of a number followed by characters, like

'234.23432 hel'

So a sample list would look like this

 ['0.234 sil', '0.433 dh', '0.822 ax', '1.122 t', '1.45 r', '1.890 ih', '2.302 p']
 end_point = 2.56

The number in each string is a starting time (the next element's starting time marks the preceding element's end time), and the characters are phonemes. What I'm trying to do is calculate the time span for three phonemes at a time. I start at the first element, '0.234 sil'. Since it has no preceding element, I assume the start point is 0. Then I look at the element two positions ahead, '0.822 ax', so I know sil-dh spans 0 to 0.822. The next would be sil-dh-ax, which spans 0.234 to 1.122, and so on. If there is no element two positions ahead, or the current element is the last one, the end_point value should be used instead. So the second-to-last result would be r-ih-p with the range 1.45 to 2.56, and the last would be ih-p with the range 1.890 to 2.56.

I hope it's understandable. Is there an 'easy' way to accomplish this? Some sort of filter?


Solution

You have to split your data first

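l = ['0.234 sil', '0.433 dh', '0.822 ax', '1.122 t', '1.45 r', '1.890 ih', '2.302 p']
end_point = 2.56
# Split each string into (time, phoneme), then transpose into two tuples.
val, tok = zip(*map(str.split, l))
# Build a list of floats so it can be sliced and concatenated below (needed on Python 3).
val = list(map(float, val))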

then you can combine it the way you like, for example

tok_from_to = ['-'.join(tok[max(i-3, 0): min(i, len(l))]) for i in range(2, len(l)+2)]
# ['sil-dh', 'sil-dh-ax', 'dh-ax-t', 'ax-t-r', 't-r-ih', 'r-ih-p', 'ih-p']
val_from = [0] + val[:-1]
val_to = val[2:] + [end_point]*2

and if you wish, combine back:

list(zip(tok_from_to, val_from, val_to))
# [('sil-dh', 0, 0.822), ('sil-dh-ax', 0.234, 1.122), ('dh-ax-t', 0.433, 1.45), ('ax-t-r', 0.822, 1.89), ('t-r-ih', 1.122, 2.302), ('r-ih-p', 1.45, 2.56), ('ih-p', 1.89, 2.56)]
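
If you prefer an explicit loop over the slicing above, the same windowing can be sketched as a small function. This is only one possible arrangement: the name three_phoneme_spans and its parameters are made up for illustration, and it assumes every entry has exactly the form 'time phoneme'.

```python
def three_phoneme_spans(entries, end_point):
    """Return (label, start, end) for each window of up to three phonemes."""
    # Split each 'time phoneme' string into a float start time and a label.
    pairs = [entry.split() for entry in entries]
    starts = [float(time) for time, _ in pairs]
    labels = [label for _, label in pairs]

    n = len(entries)
    spans = []
    for i in range(n):
        # The window covers the previous, current and next phoneme, clipped at the edges.
        label = '-'.join(labels[max(i - 1, 0):min(i + 2, n)])
        # The window starts where the previous phoneme starts (0 for the first element).
        start = starts[i - 1] if i > 0 else 0.0
        # The window ends where the phoneme two positions ahead starts, else at end_point.
        end = starts[i + 2] if i + 2 < n else end_point
        spans.append((label, start, end))
    return spans


phonemes = ['0.234 sil', '0.433 dh', '0.822 ax', '1.122 t',
            '1.45 r', '1.890 ih', '2.302 p']
spans = three_phoneme_spans(phonemes, 2.56)
print(spans[0])   # ('sil-dh', 0.0, 0.822)
print(spans[-1])  # ('ih-p', 1.89, 2.56)
```

It gives the same tuples as the zip version, just with the edge cases (first element, last two elements) spelled out as conditions instead of list padding.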
Licensed under: CC-BY-SA with attribution