Question

Here is the string:

format db "this is string a", 0, 0Ah

And I am trying to split it into this:

format
db
"this is string a"
0
0Ah

Is there any way to do this in Python 2.7?

Thank you!


Solution

Use shlex.split:

s = 'format db "this is string a", 0, 0Ah'

import shlex

shlex.split(s)
Out[18]: ['format', 'db', 'this is string a,', '0,', '0Ah']

The commas stay attached to the tokens because shlex.split only splits on whitespace (it also drops the surrounding quotes), but you can pretty safely rstrip them out:

[x.rstrip(',') for x in shlex.split(s)]
Out[20]: ['format', 'db', 'this is string a', '0', '0Ah']
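
If a quoted string might itself end in a comma, rstrip would eat that too. One possible variation, offered as a sketch rather than a definitive fix, is to configure a shlex.shlex lexer so that commas count as separators outside quotes. The function name split_asm_line is made up for this illustration:

```python
import shlex

def split_asm_line(line):
    # Hypothetical helper: tokenize on whitespace and commas, honoring quotes.
    lexer = shlex.shlex(line, posix=True)
    lexer.whitespace += ','        # treat commas like whitespace between tokens
    lexer.whitespace_split = True  # split only on whitespace, as shlex.split does
    lexer.commenters = ''          # do not treat '#' as a comment marker
    return list(lexer)

print(split_asm_line('format db "this is string a", 0, 0Ah'))
# ['format', 'db', 'this is string a', '0', '0Ah']
```

As with shlex.split, the quotes are removed from the result, but commas inside a quoted string are kept because the lexer only treats them as separators outside quotes.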

OTHER TIPS

I'm sure there will be more elegant answers, but this'll work and preserve the quotes:

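def parse(s):
    # Split off the comma-separated operands; only the first field is scanned for quotes.
    s = s.split(', ')
    stack = []
    new = ''
    inQuotes = False
    for char in s[0]:
        if char == '"':
            # Toggle on every quote so the closing quote ends the quoted run.
            inQuotes = not inQuotes
        if not inQuotes:
            if not char == ' ':
                new += char
            else:
                # A space outside quotes ends the current token.
                stack.append(new)
                new = ''
        else:
            # Inside quotes, keep everything, spaces included.
            new += char
    stack.append(new)
    del s[0]
    stack.extend(s)
    return stack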

>>> s = 'format db "this is string a", 0, 0Ah'
>>> parse(s)
['format', 'db', '"this is string a"', '0', '0Ah']

A regex solution:

import re

data = 'format db "this is string a", 0, 0Ah'
s = re.findall(r'''(?:[^ '"]+|'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*")+''', data)
print [x.rstrip(',') for x in s]

output:

['format', 'db', '"this is string a"', '0', '0Ah']
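
If escaped quotes and single-quoted strings are not a concern, a shorter pattern may be enough. The following is a minimal sketch under that assumption; it matches either a double-quoted run or a run of characters that are neither whitespace nor commas, so no rstrip is needed afterwards. The names TOKEN_RE and tokenize are made up for this illustration:

```python
import re

# Either a double-quoted string (quotes kept), or a run of
# characters that are neither whitespace nor commas.
TOKEN_RE = re.compile(r'"[^"]*"|[^\s,]+')

def tokenize(line):
    # Hypothetical helper: return the tokens in order, skipping separators.
    return TOKEN_RE.findall(line)

print(tokenize('format db "this is string a", 0, 0Ah'))
# ['format', 'db', '"this is string a"', '0', '0Ah']
```

Commas and spaces inside the quotes survive because the quoted alternative is tried first at each position.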
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow