Question

I have time series with the following pattern, and I am wondering whether somebody could share a smart trick to remove the leading zeros. I want to remove them because they may negatively affect the selection of forecasting models.

Example time series:

TimeSeries <- ts(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
                   0, 0, 0, 0, 0, 0, 9, 10, 10, 16, 7, 13, 0, 9, 1, 
                   11, 2, 11, 3, 11, 4, 1, 20, 13, 18, 19, 16, 16, 16, 
                   15, 14, 27, 24, 35, 8, 18, 21, 20, 19, 22, 18, 21
),start=c(2001,6),frequency=12)

I can imagine narrowing down the leading run of zeros by testing successive subsets of the time series and then removing the leading subset that contains only zeros. However, that would be cumbersome and is likely to be computationally inefficient.

Is anybody aware of an existing function or procedure that does this efficiently?

Solution

This removes only the leading zeros and keeps any zeros that occur later in the series:

TimeSeries[cumsum(TimeSeries)!=0]
#[1]  9 10 10 16  7 13  0  9  1 11  2 11  3 11  4  1 20 13 18 19 16 16 16 15 14 27 24 35  8 18 21 20 19 22 18 21

Why does this work? The output of cumsum is:

cumsum(TimeSeries)
 [1]   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   9  19  29  45  52  65  65  74  75
[33]  86  88  99 102 113 117 118 138 151 169 188 204 220 236 251 265 292 316 351 359 377 398 418 437 459 477 498

Thus, the cumulative sum is zero only while every value so far has been zero. A zero later in the series leaves the cumulative sum unchanged, but by then it is already non-zero, so that observation is kept.
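The same mechanism is easier to see on a tiny vector. The following is only an illustrative sketch; the names x, running_total and keep are made up for this example and do not appear in the answer.

```r
x <- c(0, 0, 3, 0, 5)        # toy data: two leading zeros, one interior zero

running_total <- cumsum(x)   # 0 0 3 3 8
keep <- running_total != 0   # FALSE FALSE TRUE TRUE TRUE

x[keep]                      # 3 0 5: leading zeros gone, interior zero kept
```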

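If the time series contains negative values, the cumulative sum can return to zero partway through the series, which would wrongly drop later observations. Taking absolute values first avoids this: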

TimeSeries[cumsum(abs(TimeSeries))!=0]
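Note that indexing with `[` returns a plain numeric vector, so the start date and frequency of the series are lost. If those matter for the forecasting step, one possible approach is to locate the first non-zero observation and cut the series there with window(). The sketch below assumes that an all-zero series should yield NULL; the helper name trim_leading_zeros is hypothetical and not part of base R. Testing `x != 0` directly also copes with negative values.

```r
# Example series from the question: 23 leading zeros, then monthly data.
TimeSeries <- ts(c(rep(0, 23), 9, 10, 10, 16, 7, 13, 0, 9, 1,
                   11, 2, 11, 3, 11, 4, 1, 20, 13, 18, 19, 16, 16, 16,
                   15, 14, 27, 24, 35, 8, 18, 21, 20, 19, 22, 18, 21),
                 start = c(2001, 6), frequency = 12)

# Hypothetical helper: drop leading zeros but keep the result a ts object.
trim_leading_zeros <- function(x) {
  stopifnot(is.ts(x))
  nonzero <- which(x != 0)        # positions of non-zero values (NAs are skipped)
  if (length(nonzero) == 0) {
    return(NULL)                  # assumption: nothing to keep if all values are zero
  }
  # Cut the series at the time stamp of the first non-zero observation.
  window(x, start = time(x)[nonzero[1]])
}

Trimmed <- trim_leading_zeros(TimeSeries)
start(Trimmed)       # period of the first retained observation
frequency(Trimmed)   # still 12
```

The result keeps its ts class, so it can be passed on to functions that expect a time series.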

Other suggestions

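TimeSeries[TimeSeries != 0] works for me, though there is probably a better way out there. Note that it drops every zero, including the one in the middle of the series, not just the leading ones: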

> TimeSeries <- ts(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
                   0, 0, 0, 0, 0, 0, 9, 10, 10, 16, 7, 13, 0, 9, 1, 
                   11, 2, 11, 3, 11, 4, 1, 20, 13, 18, 19, 16, 16, 16, 
                   15, 14, 27, 24, 35, 8, 18, 21, 20, 19, 22, 18, 21
),start=c(2001,6),frequency=12)
> TimeSeries[TimeSeries != 0]
 [1]  9 10 10 16  7 13  9  1 11  2 11  3 11  4  1 20 13 18 19 16 16 16 15 14 27
[26] 24 35  8 18 21 20 19 22 18 21

Hope that helps!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow