Question

I need to "compress" Python lists that represent signals. The signals look like the following example.

signal = [
    [0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0,1.1], #time values
    [1,1,1,2,3,4,4,4,4,2,1,1] #function values
    ]

After compressing, the signal should look like the following.

signal_compressed = [
    [0.0,0.2,0.3,0.4,0.5,0.8,0.9,1.0,1.1], #time values
    [1,1,2,3,4,4,2,1,1] #function values
    ]

As you can see, if there are runs of constant values, only the first and the last value of each run are stored.

I wrote the following algorithm to do this.

signal_compressed = [[], []]

old_value = None
for index, value in enumerate(signal[1]):
    if value != old_value:
        if index > 0:
            # close the previous constant segment with its last point,
            # unless that point was already stored
            if signal_compressed[0][-1] != signal[0][index - 1]:
                signal_compressed[0].append(signal[0][index - 1])
                signal_compressed[1].append(signal[1][index - 1])
        # store the first point of the new segment
        signal_compressed[0].append(signal[0][index])
        signal_compressed[1].append(value)
        old_value = value

# make sure the very last sample is kept
if signal_compressed[0][-1] < signal[0][-1]:
    signal_compressed[0].append(signal[0][-1])
    signal_compressed[1].append(signal[1][-1])

This algorithm works fine, and for signals with many constant segments it runs quite fast. But if I try to compress signals with no constant segments (e.g. a sine signal or a noise signal), it runs really slowly.

How can I speed up my algorithm while preserving its functionality?


Solution

Here is one way to do it using a generator:

def compress(signal):
    prev_t, prev_val = None, None
    for t, val in zip(*signal):
        if val != prev_val:
            if prev_t is not None:
                yield prev_t, prev_val  # emit the last point of the previous run
            yield t, val                # emit the first point of the new run
            prev_t, prev_val = None, val
        else:
            prev_t, prev_val = t, val   # remember the run's most recent point
    if prev_t is not None:
        yield prev_t, prev_val          # emit the final run's last point

signal = [
    [0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0,1.1], #time values
    [1,1,1,2,3,4,4,4,4,2,1,1] #function values
    ]
print(list(zip(*compress(signal))))

I think it would be more natural to transpose signal, storing it like so:

[(0.0, 1),
 (0.1, 1),
 (0.2, 1),
 (0.3, 2),
 (0.4, 3),
 (0.5, 4),
 (0.6, 4),
 (0.7, 4),
 (0.8, 4),
 (0.9, 2),
 (1.0, 1),
 (1.1, 1)]

This way the two zip(*seq) calls would be unnecessary and the entire processing can be done on the fly.
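For instance, the generator could consume such a transposed signal directly. This is a sketch under that assumption; the name compress_pairs and the pairs variable are mine, not part of the original answer:

```python
def compress_pairs(pairs):
    """Compress an iterable of (time, value) pairs, keeping only the
    first and last point of each constant run."""
    prev_t, prev_val = None, None
    for t, val in pairs:
        if val != prev_val:
            if prev_t is not None:
                yield prev_t, prev_val   # close the previous constant run
            yield t, val                 # start of a new run
            prev_t, prev_val = None, val
        else:
            prev_t, prev_val = t, val    # remember the latest point of the run
    if prev_t is not None:
        yield prev_t, prev_val           # emit the final run's last point

pairs = [(0.0, 1), (0.1, 1), (0.2, 1), (0.3, 2), (0.4, 3), (0.5, 4),
         (0.6, 4), (0.7, 4), (0.8, 4), (0.9, 2), (1.0, 1), (1.1, 1)]
print(list(compress_pairs(pairs)))
# [(0.0, 1), (0.2, 1), (0.3, 2), (0.4, 3), (0.5, 4), (0.8, 4), (0.9, 2), (1.0, 1), (1.1, 1)]
```

Because it is a generator, a downstream consumer can process each kept point as soon as it is produced, without ever materializing the full compressed signal.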

Finally, if this is still too slow for large input, it might be worth looking into using NumPy. Here is an outline of one such solution:

import numpy as np

signal = [
    [0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0,1.1], #time values
    [1,1,1,2,3,4,4,4,4,2,1,1] #function values
    ]

def npcompress(signal):
    sig = np.array(signal)  # note: this upcasts the function values to float
    # indices i where the value changes between sample i and i + 1
    idx = np.flatnonzero(sig[1][1:] != sig[1][:-1])
    # keep both endpoints of every change, plus the first and last sample
    idx_arr = np.union1d(np.union1d(idx, idx + 1), [0, len(sig[1]) - 1])
    return sig.T[idx_arr]

print(npcompress(signal).T)
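One caveat with this outline: np.array(signal) builds a single 2-D float array, so the integer function values come back as floats. A sketch that keeps the two rows as separate arrays avoids the upcast; the function name npcompress_split is my own, not from the original answer:

```python
import numpy as np

def npcompress_split(times, values):
    """Vectorized run compression that keeps times and values in
    separate arrays, preserving each array's original dtype."""
    t = np.asarray(times)
    v = np.asarray(values)
    # indices i where v[i + 1] differs from v[i]
    change = np.flatnonzero(v[1:] != v[:-1])
    # keep the last point of each run and the first point of the next,
    # plus the first and last sample overall
    keep = np.union1d(np.union1d(change, change + 1), [0, len(v) - 1])
    return t[keep], v[keep]

times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
values = [1, 1, 1, 2, 3, 4, 4, 4, 4, 2, 1, 1]
t_c, v_c = npcompress_split(times, values)
print(t_c.tolist())  # [0.0, 0.2, 0.3, 0.4, 0.5, 0.8, 0.9, 1.0, 1.1]
print(v_c.tolist())  # [1, 1, 2, 3, 4, 4, 2, 1, 1]
```

Since every step is a whole-array NumPy operation, the cost per sample stays small even for signals with no constant segments at all.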

OTHER TIPS

You can use itertools.groupby():

from itertools import groupby
from operator import itemgetter

ti = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
fun = [1, 1, 1, 2, 3, 4, 4, 4, 4, 2, 1, 1]

def compress_groupby(ti, func):
    new_time = []
    new_func = []
    # group consecutive equal function values, keeping their indices
    for k, v in groupby(enumerate(func), itemgetter(1)):
        lis = list(v)
        if len(lis) > 1:
            # keep only the first and last point of a constant run
            ind1, ind2 = lis[0][0], lis[-1][0]
            new_time.extend([ti[ind1], ti[ind2]])
            new_func.extend([func[ind1], func[ind2]])
        else:
            new_time.append(ti[lis[0][0]])
            new_func.append(func[lis[0][0]])
    return new_time, new_func

print(compress_groupby(ti, fun))
# ([0.0, 0.2, 0.3, 0.4, 0.5, 0.8, 0.9, 1.0, 1.1],
#  [1, 1, 2, 3, 4, 4, 2, 1, 1])
Licensed under: CC-BY-SA with attribution