Question

I'm interested in representing a range, similar to Guava's Range type, in Python. Specifically, it should have a start and end, and represent all values between the two (as a first pass, I'm fine with only representing the canonical open-closed range, i.e. [5,10), but proper representation of any open/closed range would be a reasonable feature).

I'm aware of the range() builtin, but my intent is to support arbitrary types (or specifically dates, for my use case).

Looking at Python's type hierarchy, it seems a range could be a Sequence or Set type fairly logically, but I'm unsure which makes more sense, of if it would be better to forgo shoehorning my class into that hierarchy and simply implement the behavior I want.

As a Sequence:

  • Fits the spec fairly well, it's a "finite ordered set".
  • A range can be counted, sliced, and iterated over.
  • However I potentially want to support unbounded ranges, e.g. [0,+∞), so maybe the above isn't true.

As a Set:

  • Slightly less to-spec, as a range is explicitly ordered
  • Conceptually more like a range, as set-theoretic operations like intersection and union make more sense
  • Properly represents that contains checks are efficient

As a separate structure:

  • We lose the benefits of following the patterns the above types (we'd have to define a separate range.slice() method, for instance)
  • But we're more explicit that this structure should not be confused with these types either. The fact that Guava's Range doesn't extend from the Collection API seems to back this argument up.

I'm curious what seems most Pythonic here, and if anyone's made any such data structures themselves.

Was it helpful?

Solution

Here's the implementation I've come up with so far. A Range object represents an arbitrary openClosed range, and is hash-able, contain-able, and iter-able, but is neither a sequence nor a set. The DateRange subclass represents ranges of dates, which primarily simply requires defining the increment argument as timedelta(days=1) rather than simply 1.

class Range:  
  '''
  Represents a range, in the spirit of Guava's Range class.
  Endpoints can be absent, and (presently) all ranges are openClosed.
  There's little reason to use this class directly, as the range()
  builtin provides this behavior for integers.
  '''
  def __init__(self, start, end, increment=1):
    if start and end and end < start:
      raise ValueError("End date cannot be before start date, %s:%s" % (start,end))
    self.start = start
    self.end = end
    self.increment = increment

  def __repr__(self):
    return '[%s\u2025%s)' % (
      self.start or '-\u221E',
      self.end   or '+\u221E'
    )

  def __eq__(self, other):
    return self.start == other.start and self.end == other.end

  def __hash__(self):
    return 31*hash(self.start) + hash(self.end)

  def __iter__(self):
    cur = self.start
    while cur < self.end:
      yield cur
      cur = cur + self.increment

  def __contains__(self, elem):
    ret = True
    if self.start:
      ret = ret and self.start <= elem
    if self.end:
      ret = ret and elem < self.end
    return ret

class DateRange(Range):
  '''A range of dates'''
  one_day = timedelta(days=1)

  @staticmethod
  def parse(daterange):
    '''Parses a string into a DateRange, useful for
    parsing command line arguments and similar user input.
    *Not* the inverse of str(range).'''
    start, colon, end = daterange.partition(':')
    if colon:
      start = strToDate(start) if start else None
      end = strToDate(end) if end else None
    else:
      start = strToDate(start)
      end = start + DateRange.one_day
    return DateRange(start, end)

  def __init__(self, start, end):
    Range.__init__(self, start, end, DateRange.one_day)

def strToDate(date_str):
  '''Parses an ISO date string, such as 2014-2-20'''
  return datetime.datetime.strptime(date_str, '%Y-%m-%d').date()

Some usage examples:

>>> DateRange(datetime.date(2014,2,20), None)
[2014-02-20‥+∞)
>>> DateRange(datetime.date(2014,1,1), datetime.date(2014,4,1))
[2014-01-01‥2014-04-01)
>>> DateRange.parse(':2014-2-20')
[-∞‥2014-02-20)
>>> DateRange.parse('2014-2-20:2014-3-22')
[2014-02-20‥2014-03-22)
>>> daterange = DateRange.parse('2014-2-20:2014-3-2')
>>> daterange
[2014-02-20‥2014-03-02)
>>> datetime.date(2014,1,25) in daterange
False
>>> datetime.date(2014,2,20) in daterange
True
>>> list(daterange)
[datetime.date(2014, 2, 20), datetime.date(2014, 2, 21), datetime.date(2014, 2, 22),
 datetime.date(2014, 2, 23), datetime.date(2014, 2, 24), datetime.date(2014, 2, 25),
 datetime.date(2014, 2, 26), datetime.date(2014, 2, 27), datetime.date(2014, 2, 28),
 datetime.date(2014, 3, 1)]
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top