_ is a perfectly valid variable name, and yes, you can use a variable multiple times in an unpacking operation, so what you've written will work. _ will end up with the last value assigned in the line. Some Python programmers do use it this way.
_ is used for special purposes by some Python interactive shells, which may confuse some readers, so some programmers avoid it for this reason.
There's no way to avoid the allocation with str.split(): it always splits the whole line, and the resulting strings are always allocated. It's just that, in this case, some of them don't live very long. But then again, who does?
You can avoid some allocations with, say, re.finditer():
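import re
fi = re.finditer(r"\S+", line)  # lazy iterator over whitespace-delimited tokens
next(fi)                        # skip token 0
next(fi)                        # skip token 1
var_needed = next(fi).group()   # token 2
next(fi)                        # skip token 3
next(fi)                        # skip token 4
another_var_needed = next(fi).group()  # token 5
# we don't care about the last match so we don't ask for it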
But next() returns a Match object, so it'll be allocated (and immediately discarded, since we're not saving it anywhere). So you really only save the final allocation. If your strings are long, the fact that you're getting a Match object and not a string could save some memory and even time, I guess; I think the matched string is not sliced out of the source string until you ask for it. You could profile it to be sure.
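One rough way to do that profiling is with timeit. This is only a sketch: the test line, the token length, and the helper names with_split and with_finditer are assumptions made for illustration, and the numbers will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical test data: seven long tokens separated by single spaces.
line = " ".join(["x" * 1_000] * 7)

def with_split(text):
    # Allocates all seven strings, keeps two.
    _, _, a, _, _, b, _ = text.split()
    return a, b

def with_finditer(text):
    # Creates a Match object per token visited, but only two strings.
    fi = re.finditer(r"\S+", text)
    next(fi)
    next(fi)
    a = next(fi).group()
    next(fi)
    next(fi)
    b = next(fi).group()
    return a, b

# Both approaches should agree before their timings are compared.
assert with_split(line) == with_finditer(line)

for func in (with_split, with_finditer):
    seconds = timeit.timeit(lambda: func(line), number=1_000)
    print(f"{func.__name__}: {seconds:.4f} s for 1,000 calls")
```

This measures time only; checking memory would need a separate tool such as tracemalloc.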
You could even generalize the above into a function that returns only the desired tokens from a string:
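import re

def get_tokens(text, *toknums):
    """Yield only the tokens of text whose zero-based positions are in toknums."""
    toknums = set(toknums)
    maxtok = max(toknums)
    # \S+ matches a whole token; \S alone would match a single character
    for i, m in enumerate(re.finditer(r"\S+", text)):
        if i in toknums:
            yield m.group()
        elif i > maxtok:
            break  # past the last wanted token, stop scanning

var1, var2 = get_tokens("a b c d e f g", 2, 5)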
But it still ain't exactly pretty.