Given a Python string describing object.attribute, how do I separate the attribute's namespace from the attribute?

StackOverflow https://stackoverflow.com/questions/18950344

Question

Given a Python string describing object.attribute, how do I separate the attribute's namespace from the attribute?

Desired Examples:

ns_attr_split("obj.attr") => ("obj", "attr")
ns_attr_split("obj.arr[0]") => ("obj", "arr[0]")
ns_attr_split("obj.dict['key']") => ("obj", "dict['key']")
ns_attr_split("mod.obj.attr") => ("mod.obj", "attr")
ns_attr_split("obj.dict['key.word']") => ("obj", "dict['key.word']")

Note: I understand that writing my own string parser is an option, but I am looking for a more elegant solution. A simple rsplit on '.' is not enough because of the last example above, where a key may itself contain the namespace delimiter.

Solution

I recently discovered the tokenize module, which tokenizes Python source code. Using it, I came up with this little snippet:

import io
import tokenize

def ns_attr_split(s):
  tokens = []
  last_delim = -1

  # Tokenize the expression, tracking the index of the last
  # namespace delimiter ('.') in last_delim.
  # (Python 3: io.StringIO replaces the Python 2 StringIO module.)
  readline = io.StringIO(s).readline
  for i, tok in enumerate(tokenize.generate_tokens(readline)):
    tokens.append(tok[1])
    if tok[1] == '.':
      last_delim = i

  # Join the tokens before the last delimiter into the namespace
  # (empty when the string contains no '.')
  ns = "".join(tokens[:max(last_delim, 0)])

  # Join the tokens after the last delimiter into the attribute
  attr = "".join(tokens[last_delim + 1:])

  return (ns, attr)

This should also work with intermediate indices/keys (e.g. "mod.ns[3].obj.dict['key']").
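
One case the tokenize approach above does not cover is a '.' token inside a subscript, such as "obj.arr[a.b]", where the last '.' is not a namespace delimiter. A minimal sketch of one way to handle that, assuming Python 3 and a single-line expression, is to track bracket depth and slice the original string at the column of the last top-level '.'. The name ns_attr_split_nested is hypothetical, not part of the answer above:

```python
import io
import tokenize

OPENERS = ("(", "[", "{")
CLOSERS = (")", "]", "}")

def ns_attr_split_nested(s):
    """Split s at the last '.' that is not nested inside brackets.

    Assumes s is a single-line Python expression.
    """
    depth = 0
    split_col = None
    for tok in tokenize.generate_tokens(io.StringIO(s).readline):
        # Only operator tokens matter; string contents arrive as one STRING token
        if tok.type != tokenize.OP:
            continue
        if tok.string in OPENERS:
            depth += 1
        elif tok.string in CLOSERS:
            depth -= 1
        elif tok.string == "." and depth == 0:
            # tok.start is (row, column); the column indexes into s
            split_col = tok.start[1]
    if split_col is None:
        return ("", s)
    return (s[:split_col], s[split_col + 1:])

print(ns_attr_split_nested("obj.dict['key.word']"))
print(ns_attr_split_nested("mod.ns[3].obj.arr[a.b]"))
```

Slicing the original string instead of re-joining tokens also keeps any whitespace inside the expression intact.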

OTHER TIPS

Assuming that the namespace consists only of word characters (letters, digits, underscores) and dots, you could first split on /[^\w.]/, then rsplit on the last '.':

>>> import re
>>> ns_attr_split = lambda s: re.split(r"[^\w.]", s, maxsplit=1)[0].rsplit('.', 1)
>>> ns_attr_split("obj.dict['key.word']")
['obj', 'dict']

Obviously this isn't exactly what you want (the subscript is dropped), but the fiddling would be straightforward.
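
One possible version of that fiddling, as a rough sketch: use the regex split only to locate the namespace, then slice the original string so the subscript is kept. The name ns_attr_split_re is hypothetical, and the sketch assumes there are no intermediate subscripts in the namespace (e.g. "mod.ns[3].obj" would not split correctly):

```python
import re

def ns_attr_split_re(s):
    # Everything before the first character that is neither a word char nor '.'
    head = re.split(r"[^\w.]", s, maxsplit=1)[0]
    # The namespace is whatever precedes the last '.' in that head
    ns, sep, _ = head.rpartition(".")
    if not sep:
        # No '.' found before the first subscript: no namespace
        return ("", s)
    # Slice the original string so the subscript stays attached to the attribute
    return (ns, s[len(ns) + 1:])

print(ns_attr_split_re("obj.dict['key.word']"))
print(ns_attr_split_re("mod.obj.attr"))
```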

A fun little regular expression problem...

This code works on all the examples you provided using Python 2.6, and assumes you don't have any intermediate index/key accesses (e.g. "obj['foo'].baz"):

import re
ns_attr_split = lambda s: re.match(r"((?:\w+\.)*\w+)\.(.+)", s).groups()
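
Note that the lambda above raises AttributeError when the string contains no '.', because re.match returns None. A small sketch of a guarded variant using the same pattern, checked against the examples from the question (the name ns_attr_split_safe and the choice to return an empty namespace on no match are assumptions):

```python
import re

# Same pattern as above, compiled once
_NS_ATTR = re.compile(r"((?:\w+\.)*\w+)\.(.+)")

def ns_attr_split_safe(s):
    m = _NS_ATTR.fullmatch(s)
    if m is None:
        # No namespace delimiter found: treat the whole string as the attribute
        return ("", s)
    return m.groups()

for expr in ("obj.attr", "obj.arr[0]", "obj.dict['key']",
             "mod.obj.attr", "obj.dict['key.word']", "attr"):
    print(expr, "=>", ns_attr_split_safe(expr))
```
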
Licensed under: CC-BY-SA with attribution