Question

I understand that we should use %s to concatenate a string rather than + in Python.

I could do any of:

hello = "hello"
world = "world"

print hello + " " + world
print "%s %s" % (hello, world)
print "{} {}".format(hello, world)
print ' '.join([hello, world])

But why should I use anything other than the +? It's quicker to write concatenation with a simple +. Then if you look at the formatting string, you specify the types e.g. %s and %d and such. I understand it could be better to be explicit about the type.

But then I read that using + for concatenation should be avoided even though it's easier to type. Is there a clear reason that strings should be concatenated in one of those other ways?


Solution

  1. Readability. The format string syntax is more readable, as it separates style from the data. Also, in Python, the %s syntax automatically coerces any non-str value to str, whereas concatenation only works between str values: you can't concatenate a str with an int.

  2. Performance. In Python, str is immutable, so the left and right strings have to be copied into a new string for every pair being concatenated. If you concatenate four strings of length 10, you copy (10+10) + ((10+10)+10) + (((10+10)+10)+10) = 90 characters instead of just 40. The cost grows quadratically as the number and size of the strings increase. Java optimizes this case some of the time by transforming the series of concatenations to use StringBuilder, but CPython doesn't.

  3. For some use cases, the logging library provides an API that takes a format string and creates the log entry string lazily (logging.info("blah: %s", 4)). This improves performance when the logging library decides that the current log entry will be discarded by a log filter, because it then never needs to format the string.
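The three points above can be checked with a small sketch. It is written for Python 3, and several things in it are assumptions for illustration only: the helper chars_copied_by_plus, the Noisy class, and the logger name "concat_demo" are hypothetical. The copy count follows the accounting used in point 2 (every + builds a new string), and the log record here is dropped by the logger's level rather than by a filter.

```python
import io
import logging


def chars_copied_by_plus(lengths):
    """Count characters copied when strings of the given lengths are
    joined left to right with +, assuming every + builds a new string."""
    copied = 0
    running = lengths[0]
    for n in lengths[1:]:
        running += n       # length of the new intermediate string
        copied += running  # both operands are copied into it
    return copied


# Point 1: %s coerces non-str values, + does not.
coerced = "%s items" % 4
try:
    "items: " + 4
    plus_failed = False
except TypeError:
    plus_failed = True

# Point 2: four strings of length 10.
plus_cost = chars_copied_by_plus([10, 10, 10, 10])  # 20 + 30 + 40
join_cost = sum([10, 10, 10, 10])                   # each character copied once


# Point 3: lazy formatting in logging.
class Noisy(object):
    """Hypothetical value that counts how often it is turned into a str."""
    formatted = 0

    def __str__(self):
        Noisy.formatted += 1
        return "noisy"


logger = logging.getLogger("concat_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)           # DEBUG records are dropped
logger.propagate = False
logger.addHandler(logging.StreamHandler(io.StringIO()))

logger.debug("value: %s", Noisy())    # discarded, so __str__ never runs
skipped = Noisy.formatted
logger.warning("value: %s", Noisy())  # emitted, so __str__ runs
emitted = Noisy.formatted
```

With "value: " + str(Noisy()) the string would be built before the logger could decide to discard it, which is the cost the lazy form avoids.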

Other tips

Am I the only one who reads left to right?

To me, using %s is like listening to German speakers, where I have to wait until the end of a very long sentence to hear what the verb is.

Which of these is clearer at a quick glance?

"your %s is in the %s" % (object, location)

or

"your " + object + " is in the " + location  

An example clarifying the readability argument:

# class is a reserved word in Python, so the variable is named class_ here
print 'id: ' + id + '; function: ' + function + '; method: ' + method + '; class: ' + class_ + ' -- total == ' + total

print 'id: %s; function: %s; method: %s; class: %s -- total == %s' % \
   (id, function, method, class_, total)

(Note that the second example is not only more readable but also easier to edit: you can change the template on one line and the list of variables on another.)

A separate issue is that the %s code also converts its argument to a string. With + you have to call str() yourself, which is also less readable than a %s code.

Using + should not be avoided in general. In many cases it is the correct approach. Using %s or .join() is preferable only in particular cases, and it is usually quite obvious when they are the better solution.

In your example you are concatenating three strings together, and the version using + is clearly the simplest and most readable, and therefore the recommended one.

%s or .format() are useful if you want to interpolate strings or values in the middle of a larger string. Example:

print "Hello %s, welcome to the computer!" % name

In this case using %s is more readable, since you avoid chopping the first string into multiple segments, especially if you are interpolating multiple values.

.join() is appropriate if you have a variable size sequence of strings and/or you want to concatenate multiple strings with the same separator.
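As a minimal sketch of that case (Python 3, with hypothetical function names and data), compare joining a list of unknown length with one separator against building the same string with +:

```python
def describe(items):
    """Join any number of strings with the same separator."""
    return ", ".join(items)


def describe_with_plus(items):
    """The same result built with +, for comparison."""
    result = ""
    for index, item in enumerate(items):
        if index > 0:
            result = result + ", "  # separator goes between items only
        result = result + item
    return result


shopping = ["eggs", "milk", "bread"]  # hypothetical data
print(describe(shopping))
print(describe_with_plus(shopping))
```

The .join() version needs no special handling for an empty list or a single item, while the + version has to track where separators go.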

Since the word order may change in different languages, the form with %s is imperative if you want to properly support the translation of strings in your software.
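A rough illustration of the translation point, in Python 3. The TEMPLATES catalogue, the "xx" language, and the render helper are all made up for this sketch; a real project would load its templates from translation files. It uses named placeholders, %(name)s, because positional %s fills values in a fixed order, whereas named ones let each template choose its own word order.

```python
# Hypothetical message catalogue, keyed by language code.
TEMPLATES = {
    "en": "your %(item)s is in the %(place)s",
    "xx": "in the %(place)s: your %(item)s",  # made-up language, other word order
}


def render(lang, item, place):
    """Fill the template for lang; the calling code never changes."""
    return TEMPLATES[lang] % {"item": item, "place": place}


print(render("en", "key", "drawer"))
print(render("xx", "key", "drawer"))
```

With "your " + item + " is in the " + place, the word order is fixed in the code, so a translator cannot rearrange it.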

Licensed under: CC-BY-SA with attribution