Question

I have a pandas Series containing runs of numbers separated by NaNs, and I want to get the start and end index of each run. The following code does this:

import numpy as np
import pandas as pd

def get_ranges(d):
    results = []
    start = None
    for i in range(len(d)):
        if np.isnan(d.iloc[i]):
            continue
        if start is None:
            # first non-NaN value of a new group
            start = d.index[i]
        # a group ends at the last element or just before a NaN
        if i == len(d) - 1 or np.isnan(d.iloc[i + 1]):
            results.append((start, d.index[i]))
            start = None
    return pd.DataFrame(results, columns=['start', 'end'])

E.g.:

In [24]: d = pd.Series([0, 1, 4, 2, np.nan, np.nan, np.nan, 4, 2, np.nan, 10, np.nan])

In [25]: get_ranges(d)
Out[25]: 
   start  end
0      0    3
1      7    8
2     10   10

[3 rows x 2 columns]

But it seems like this is something that pandas should be able to do quite easily, possibly using groupby. Is there some built-in method of getting these groups that I'm missing?

Solution 2

I'm not sure whether there is a more convenient way to do this, but here is what I use.

First, get the index labels of the entries that hold numbers rather than NaN:

In [134]: s = d.dropna().index.to_series()

In [135]: s
Out[135]: 
0      0
1      1
2      2
3      3
7      7
8      8
10    10
dtype: int64

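A group starts wherever the gap to the previous label is not 1, and ends wherever the gap to the next label is not 1 (this relies on a consecutive integer index such as the default one):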

In [136]: start = s[s.diff(1) != 1].reset_index(drop=True)

In [137]: end = s[s.diff(-1) != -1].reset_index(drop=True)

Then you can build the result with:

In [138]: pd.DataFrame({'start': start, 'end': end}, columns=['start', 'end'])
Out[138]: 
   start  end
0      0    3
1      7    8
2     10   10

[3 rows x 2 columns]
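
The steps above can be wrapped into one function. This is only a minimal sketch, not part of the original answer: the name `get_ranges_diff` is hypothetical, and it compares positions instead of index labels, on the assumption that you may also want it to work when the index is not a run of consecutive integers.

```python
import numpy as np
import pandas as pd

def get_ranges_diff(d):
    # Hypothetical helper: positions (0..n-1) of the non-NaN entries.
    # Using positions rather than labels keeps the "gap of 1" test
    # valid for any kind of index.
    pos = np.flatnonzero(d.notna().to_numpy())
    if pos.size == 0:
        return pd.DataFrame(columns=['start', 'end'])
    # A break sits between two consecutive non-NaN positions whose gap is not 1.
    breaks = np.flatnonzero(np.diff(pos) != 1)
    starts = pos[np.r_[0, breaks + 1]]        # first position of each run
    ends = pos[np.r_[breaks, pos.size - 1]]   # last position of each run
    return pd.DataFrame({'start': d.index[starts], 'end': d.index[ends]})

d = pd.Series([0, 1, 4, 2, np.nan, np.nan, np.nan, 4, 2, np.nan, 10, np.nan])
ranges = get_ranges_diff(d)
```

For the example Series this should give the same three (start, end) pairs as above; with a non-integer index it returns the labels at the run boundaries.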

OTHER TIPS

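You can use isnull() and cumsum() to create the groupby keys. The running count of NaNs stays constant within each run of numbers, so every run gets its own key: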

import pandas as pd
import numpy as np

nan = np.nan
d = pd.Series([0, 1, 4, 2, nan, nan, nan, 4, 2, nan, 10, nan])

mask = d.isnull()
index = mask.cumsum()
mask = ~mask

# Named aggregation; passing a dict of new names raises
# SpecificationError in pandas >= 1.0.
d[mask].groupby(index[mask]).agg(
    start=lambda s: s.index[0],
    end=lambda s: s.index[-1]).reset_index(drop=True)

output:

   start  end
0      0    3
1      7    8
2     10   10
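
To see why the keys work, it may help to look at them directly. The following is a sketch under assumptions rather than the answer's own code: it groups the index labels themselves (via `Index.to_series()`) and uses the string aggregations `'first'` and `'last'` instead of lambdas; the variable names `keys`, `valid`, and `labels` are made up for illustration.

```python
import numpy as np
import pandas as pd

d = pd.Series([0, 1, 4, 2, np.nan, np.nan, np.nan, 4, 2, np.nan, 10, np.nan])

mask = d.isnull()
keys = mask.cumsum()   # goes up by one at every NaN
# keys for the example: 0 0 0 0 1 2 3 3 3 4 4 5

valid = ~mask
labels = d.index.to_series()[valid]   # index labels of the non-NaN entries

# Each run of numbers shares one key, so first/last per key are its bounds.
ranges = (labels.groupby(keys[valid])
                .agg(start='first', end='last')
                .reset_index(drop=True))
```

Because the NaN rows are dropped before grouping, the keys that belong only to NaNs (1, 2 and 5 here) never form a group.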
Licensed under: CC-BY-SA with attribution