Nice question. How about this small solution:
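def commonPrefix(a, b):
    # Count how many leading items a and b have in common.
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    return i

def eachWithPrefix(v):
    # Yield (shared prefix length, value), comparing each value with its predecessor.
    p = ''
    for x in v:
        yield commonPrefix(p, x), x
        p = x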
Now you can choose what you want:
list(eachWithPrefix(v))
will return a list of (prefix length, value) pairs, where the prefix length is the number of leading characters the value shares with the previous one, so
print('\n'.join(' '*p + x[p:] for p, x in eachWithPrefix(v)))
will print the second output format you proposed.
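To make the pairs concrete, here is a minimal sketch that repeats the two helpers so it runs on its own. The sample list v is made up for illustration and is not taken from your question.

```python
def commonPrefix(a, b):
    # Count how many leading items a and b have in common.
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    return i

def eachWithPrefix(v):
    # Yield (shared prefix length, value), comparing each value with its predecessor.
    p = ''
    for x in v:
        yield commonPrefix(p, x), x
        p = x

# Hypothetical sample data, only for illustration.
v = ['abc', 'abd', 'abde', 'xyz']

pairs = list(eachWithPrefix(v))
# pairs == [(0, 'abc'), (2, 'abd'), (3, 'abde'), (0, 'xyz')]

# Replace the shared prefix of each value with the same number of spaces.
indented = '\n'.join(' ' * p + x[p:] for p, x in eachWithPrefix(v))
print(indented)
```

The first value always gets a prefix length of 0 because it is compared with the empty string.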
print('\n'.join('\t' * p + '\\'.join(x[p:]) for p, x in eachWithPrefix(x.split('\\') for x in v)))
on the other hand does the same per path component, using \ as the delimiter,
and replaces each omitted component with a tab. This is not quite the format of your first output example, but I guess you get the point.
Try:
print('\n'.join('\\'.join([s if i >= p else ' ' * len(s) for i, s in enumerate(x)]) for p, x in eachWithPrefix(x.split('\\') for x in v)))
This replaces each shared component with a string of spaces of the same length. The output still contains the delimiters, but maybe that's even nicer:
2014\2014-01 Jan\2014-01-01
    \           \2014-01-02
    \           \2014-01-03
    \           \2014-01-04
    \           \2014-01-05
...
    \           \2014-01-31
    \2014-02 Feb\2014-02-01
    \           \2014-02-02
    \           \2014-02-03
...
To remove those delimiters as well, you can use this approach:
print('\n'.join(' ' * len('\\'.join(x[:p])) + '\\'.join(x)[len('\\'.join(x[:p])):] for p, x in eachWithPrefix(x.split('\\') for x in v)))
But this now duplicates some code, so an explicit loop may be nicer here:
for p, x in eachWithPrefix(x.split('\\') for x in v):
    s = '\\'.join(x)      # the full path
    c = '\\'.join(x[:p])  # the part shared with the previous path
    print(' ' * len(c) + s[len(c):])
Or as an easy-to-use generator:
def heirarchy(data, separator=","):
    for p, x in eachWithPrefix(x.split(separator) if separator else list(x) for x in data):
        s = separator.join(x)
        c = separator.join(x[:p])
        yield ' '*len(c) + s[len(c):]
So now heirarchy(data, separator='\\')
yields the lines of exactly your expected output; join them with '\n' to print them.
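As a usage sketch, the whole thing can be run end to end like this. The helpers are repeated so the snippet is self-contained, and the three paths in data are only an assumption modelled on the sample output above, not your real input.

```python
def commonPrefix(a, b):
    # Count how many leading items a and b have in common.
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    return i

def eachWithPrefix(v):
    # Yield (shared prefix length, value), comparing each value with its predecessor.
    p = ''
    for x in v:
        yield commonPrefix(p, x), x
        p = x

def heirarchy(data, separator=","):
    # Blank out the leading components each entry shares with the previous one.
    for p, x in eachWithPrefix(x.split(separator) if separator else list(x) for x in data):
        s = separator.join(x)
        c = separator.join(x[:p])
        yield ' ' * len(c) + s[len(c):]

# Assumed sample input, modelled on the output shown earlier.
data = [
    '2014\\2014-01 Jan\\2014-01-01',
    '2014\\2014-01 Jan\\2014-01-02',
    '2014\\2014-02 Feb\\2014-02-01',
]

lines = list(heirarchy(data, separator='\\'))
print('\n'.join(lines))
```

Since heirarchy is a generator, nothing is computed until you iterate over it, so it also works on large inputs without building the whole list first.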