Question

What's the shortest way to find the index of the first non-whitespace character of a string in Python?

string = "   xyz"

must return index = 3

Solution

>>> s = "   xyz"
>>> len(s) - len(s.lstrip())
3
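
One edge case worth knowing about: for an empty or all-whitespace string, the subtraction yields len(s) instead of raising an error. A minimal sketch that wraps the one-liner in a helper (the name leading_ws_count is made up here for illustration):

```python
def leading_ws_count(s):
    """Number of leading whitespace characters in s.

    This equals the index of the first non-whitespace character,
    or len(s) if the string has no such character.
    """
    # lstrip() with no arguments removes all leading whitespace,
    # so the length difference is the number of characters removed.
    return len(s) - len(s.lstrip())


print(leading_ws_count("   xyz"))  # 3
print(leading_ws_count("xyz"))     # 0
print(leading_ws_count("   "))     # 3 (no non-whitespace character at all)
```

Note that lstrip() builds a new string, so this trades a copy for brevity.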

OTHER TIPS

>>> next(i for i, j in enumerate('   xyz') if j.strip())
3

or

>>> import string
>>> next(i for i, j in enumerate('   xyz') if j not in string.whitespace)
3

In versions of Python before 2.6, which lack the built-in next(), you'll have to call the generator's method instead:

(...).next()
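
Both forms raise StopIteration when the string has no non-whitespace character. The built-in next() accepts a default as a second argument, which avoids that. A sketch, where the helper name first_non_ws and the choice of -1 as the "not found" value are assumptions for illustration, not something the answer specifies:

```python
def first_non_ws(s, default=-1):
    """Index of the first non-whitespace character, or `default` if none."""
    # The generator yields the index of every non-whitespace character;
    # next() takes the first one, or falls back to `default` when the
    # generator is empty (empty or all-whitespace input).
    return next((i for i, ch in enumerate(s) if not ch.isspace()), default)


print(first_non_ws("   xyz"))  # 3
print(first_non_ws("    "))    # -1
print(first_non_ws(""))        # -1
```

The extra parentheses around the generator expression are required once next() takes a second argument.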

Looks like the "regexes can do anything" brigade have taken the day off, so I'll fill in:

>>> tests = [u'foo', u' foo', u'\xA0foo']
>>> import re
>>> for test in tests:
...     print(len(re.match(r"\s*", test, re.UNICODE).group(0)))
...
0
1
1
>>>

FWIW: time taken is O(the_answer), i.e. proportional to the number of leading whitespace characters, not O(len(input_string))
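
The same idea can be sketched with a precompiled pattern and the match object's end() method, which gives the offset directly. This version assumes Python 3, where str patterns are Unicode-aware by default, so re.UNICODE is not needed; the names LEADING_WS and leading_ws_len are illustrative only:

```python
import re

# Compile once and reuse; \s* also matches the empty string,
# so match() never returns None here.
LEADING_WS = re.compile(r"\s*")


def leading_ws_len(s):
    """Length of the leading whitespace run in s."""
    # match() anchors at position 0, so end() is the length of the run.
    return LEADING_WS.match(s).end()


for test in ["foo", " foo", "\xa0foo"]:
    print(leading_ws_len(test))
# prints 0, 1, 1
```

The regex engine stops at the first non-whitespace character, so the rest of the string is never scanned.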

import re
def prefix_length(s):
    # Raw string so that \s reaches the regex engine as written.
    m = re.match(r'\s+', s)
    if m:
        return len(m.group(0))
    return 0

Many of the previous solutions iterate over the string more than once, and some make copies of the data (the string). re.match(), strip(), enumerate(), and isspace() all do extra work behind the scenes. The expressions (with s as the input string and the string module imported)

next(idx for idx, chr in enumerate(s) if not chr.isspace())
next(idx for idx, chr in enumerate(s) if chr not in string.whitespace)

are good choices for testing strings against various kinds of leading whitespace, such as vertical tabs, but that adds cost too.
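
The two tests are not interchangeable. A small sketch, assuming Python 3 str semantics, of where they differ (the function names are hypothetical): isspace() consults Unicode character properties, while string.whitespace contains only the six ASCII whitespace characters.

```python
import string


def first_non_space(s):
    # str.isspace() is Unicode-aware for Python 3 strings.
    return next((i for i, ch in enumerate(s) if not ch.isspace()), len(s))


def first_non_ascii_space(s):
    # string.whitespace is ' \t\n\r\x0b\x0c' (ASCII only).
    return next((i for i, ch in enumerate(s) if ch not in string.whitespace), len(s))


sample = "\t\x0b\xa0 xyz"  # tab, vertical tab, no-break space, space, then text
print(first_non_space(sample))        # 4
print(first_non_ascii_space(sample))  # 2, stops at the no-break space
```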

However, if your string uses only space or tab characters, then the following more basic solution is clear and fast, and it also uses less memory.

def get_indent(astr):

    """Return index of first non-space character of a sequence else False."""

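    # Fail early with a TypeError if astr is not iterable.
    iter(astr)

    # OR for not raising exceptions at all
    # if not hasattr(astr, '__getitem__'): return False

    idx = 0
    while idx < len(astr) and astr[idx] == ' ':
        idx += 1
    # astr[:1] avoids the IndexError that astr[0] raises on an empty string.
    if astr[:1] != ' ':
        return False
    return idx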

Although this may not be the absolute fastest or the simplest visually, one benefit of this solution is that you can easily transfer it to other languages and other versions of Python. It is also likely the easiest to debug, as there is little magic behavior. If you put the body of the function inline in your code instead of in a function, you'd remove the function-call overhead and make this solution similar in bytecode to the other solutions.

Additionally, this solution allows for more variations, such as adding a test for tabs to the loop condition:

or astr[idx] == '\t':

Or you can test the entire data for iterability once, instead of checking whether each line is iterable. Remember that things like ""[0] raise an exception, whereas ""[0:] does not.
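
A sketch of the tab variation as a complete function. The name count_indent and the indent_chars parameter are assumptions added for illustration; this version returns 0 instead of False when there is no indentation.

```python
def count_indent(line, indent_chars=" \t"):
    """Count the leading characters of `line` that appear in `indent_chars`."""
    idx = 0
    # The bounds check comes first, so an empty line never reaches line[idx].
    while idx < len(line) and line[idx] in indent_chars:
        idx += 1
    return idx


print(count_indent("\t  xyz"))  # 3
print(count_indent("xyz"))      # 0
print(count_indent(""))         # 0
print(repr(""[0:]))             # '' (slicing an empty string is safe)
```

Because the loop condition short-circuits, no special case is needed for empty input.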

If you wanted to push the solution to inline you could go the non-Pythonic route:

i = 0
while i < len(s) and s[i] == ' ': i += 1

print(i)
3

>>> string = "   xyz"
>>> next(idx for idx, chr in enumerate(string) if not chr.isspace())
3
>>> string = "   xyz"
>>> list(map(str.isspace, string)).index(False)
3
Licensed under: CC-BY-SA with attribution