Aren't modern computers powerful enough to handle Strings without needing to use Symbols (in Ruby)?

https://stackoverflow.com/questions/656441

19-08-2019

Question

Every text I've read about Ruby symbols talks about the efficiency of symbols over strings. But, this isn't the 1970s. My computer can handle a little bit of extra garbage collection. Am I wrong? I have the latest and greatest Pentium dual core processor and 4 gigs of RAM. I think that should be enough to handle some Strings.

Solution

Your computer may well be able to handle "a little bit of extra garbage collection", but what about when that "little bit" takes place in an inner loop that runs millions of times? What about when it's running on an embedded system with limited memory?

There are a lot of places you can get away with using strings willy-nilly, but in some you can't. It all depends on the context.

OTHER TIPS

It's true, you don't need symbols so very badly for memory reasons. Your computer could undoubtedly handle all kinds of gnarly string handling.

But in addition to being faster, symbols have the added advantage (especially with syntax highlighting) of screaming out visually: LOOK AT ME, I AM A KEY OF A KEY-VALUE PAIR. For me, that's a good enough reason to use them.

There are other reasons too, and the performance gain from some of them might be more than you realize, especially for something like comparison.

When comparing two Ruby symbols, the interpreter only has to check whether both references point to the same object. When comparing two distinct strings of the same length, it has to compare their contents byte by byte. That cost can add up if you do a lot of comparisons.
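
A rough way to see the difference is to time both comparisons with the standard Benchmark module. This is only a sketch: the string length and iteration count are arbitrary values picked for illustration, and the absolute numbers will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Two distinct String objects with identical, fairly long contents, so that
# == cannot short-circuit on object identity or on a length mismatch.
str_a = 'x' * 10_000
str_b = 'x' * 10_000

# Two references to the same Symbol object.
sym_a = :zowie
sym_b = :zowie

iterations = 100_000 # arbitrary; large enough to make the gap visible

string_time = Benchmark.realtime do
  iterations.times { str_a == str_b }
end

symbol_time = Benchmark.realtime do
  iterations.times { sym_a == sym_b }
end

puts format('String ==: %.4f s', string_time)
puts format('Symbol ==: %.4f s', symbol_time)
```

For short strings the gap is much smaller, so this matters mostly in hot paths.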

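Symbols have their own memory pitfall, though: before Ruby 2.2 they were never garbage collected. Since Ruby 2.2, symbols created dynamically (for example with String#to_sym) can be collected, but symbols that appear literally in your code stay alive for the life of the process.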

It's worth reading this article: http://www.randomhacks.net/articles/2007/01/20/13-ways-of-looking-at-a-ruby-symbol

It's nice that symbols are guaranteed unique. That has some nice effects you wouldn't get from String: every reference to a given symbol points to the same object, so their object_ids are always equal.
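
A small sketch of that uniqueness: however a symbol with a given name is produced, you get the same object back. The variable names below are made up for the illustration.

```ruby
from_literal = :zowie
from_string  = 'zowie'.to_sym    # converted from a String at runtime
from_quoted  = :"zowie"          # quoted symbol literal
from_interp  = :"zo#{'wie'}"     # symbol built with interpolation

puts from_literal.equal?(from_string)   # prints true
puts from_literal.equal?(from_quoted)   # prints true
puts from_literal.equal?(from_interp)   # prints true

# The equivalent strings are equal by value but are separate objects.
puts 'zowie'.dup.equal?('zowie'.dup)    # prints false
```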

Plus, they have a different meaning and you would want to use them in different areas, but Ruby isn't too strict about that kind of thing anyway, so I can understand your question.

Here's the real reason for the difference: two strings are never the same object. Every string you create is a separate object, even if the content is identical. And most operations on strings will make new string objects. Consider the following:

a = 'zowie'
b = 'zowie'
a == b         #=> true

On the surface, it'd be easy to claim that a and b are the same. Most common sense operations will work as you'd expect. But:

a.object_id    #=> 2152589920 (when I ran this in irb)
b.object_id    #=> 2152572980
a.equal?(b)    #=> false

They look the same, but they're different objects. Ruby had to allocate and set up a String object twice, and the two take up separate spots in memory. And hey! It gets even more fun when you try to modify them:

a += ''        #=> 'zowie'
a.object_id    #=> 2151845240

Here we add nothing to a and leave the content exactly the same -- but Ruby doesn't know that. It still allocates a whole new String object, reassigns the variable a to it, and the old String object sits around waiting for eventual garbage collection. Oh, and the empty '' string also gets a temporary String object allocated just for the duration of that line of code. Try it and see:

''.object_id   #=> 2152710260
''.object_id   #=> 2152694840
''.object_id   #=> 2152681980
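
The reallocation comes from +=, which is shorthand for a = a + '' and therefore builds a new String. As a minimal sketch of the contrast, String#<< appends in place and keeps the same object. The call to dup is only there to guarantee a mutable string regardless of how string literals are configured, and the variable names are illustrative.

```ruby
a = 'zowie'.dup          # .dup guarantees a mutable String for this demo
original = a             # second reference to the same object

a << ''                  # String#<< mutates the receiver in place
same_after_append = a.equal?(original)

a += ''                  # a = a + '' builds a brand-new String
same_after_plus = a.equal?(original)

puts same_after_append   # prints true
puts same_after_plus     # prints false
```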

Are these object allocations fast on your slick multi-Gigahertz processor? Sure they are. Will they chew up much of your 4 GB of RAM? No they won't. But do it a few million times over, and it starts to add up. Most applications use temporary strings all over the place, and your code's probably full of string literals inside your methods and loops. Each of those string literals and such will allocate a new String object, every single time that line of code gets run. The real problem isn't even the memory waste; it's the time wasted when garbage collection gets triggered too frequently and your application starts hanging.
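
You can watch those allocations happen with GC.stat. The sketch below counts objects allocated while a block runs; the allocations helper is a made-up name for this example, and it assumes the file does not enable the frozen_string_literal magic comment (which changes how string literals are allocated).

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

n = 100_000
greeting = 'zowie'.freeze   # allocated once, outside the loops

# A string literal evaluated inside the loop: a new String on every pass.
literal_allocs = allocations { n.times { 'zowie' } }

# Reusing one string created outside the loop: nothing new per pass.
hoisted_allocs = allocations { n.times { greeting } }

# A symbol literal: always the same object, nothing new per pass.
symbol_allocs = allocations { n.times { :zowie } }

puts "literal string: #{literal_allocs}"
puts "hoisted string: #{hoisted_allocs}"
puts "symbol:         #{symbol_allocs}"
```

On a typical run the first count is at least n, while the other two stay close to zero.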

In contrast, take a look at symbols:

a = :zowie
b = :zowie
a.object_id    #=> 456488
b.object_id    #=> 456488
a == b         #=> true
a.equal?(b)    #=> true

Once the symbol :zowie gets made, Ruby never makes another one. Every time you refer to a given symbol, you're referring to the same object. There's no time or memory wasted on new allocations. This can also be a downside if you go too crazy with them: before Ruby 2.2, symbols were never garbage collected, so creating countless symbols dynamically from user input risked an unbounded memory leak. Ruby 2.2 and later can collect dynamically created symbols, but it's still safer not to turn arbitrary user input into symbols. For simple literals in your code, like constant values or hash keys, they're just about perfect.
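
One way to keep user input from minting new symbols is to map it onto a fixed list of symbols that already exist in your code. This is only a sketch: safe_key and the list of allowed keys are hypothetical names invented for the example.

```ruby
# Hypothetical helper: returns the matching allowed symbol, or nil.
# It never calls to_sym on the input, so unknown input creates no symbol.
def safe_key(input, allowed = %i[name email age])
  allowed.find { |key| key.to_s == input.to_s }
end

puts safe_key('email').inspect        # prints :email
puts safe_key('drop_table').inspect   # prints nil
```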

Does that help? It's not about what your application does once. It's about what it does millions of times.

One less character to type. That's all the justification I need to use them over strings for hash keys, etc.

Licensed under: CC-BY-SA with attribution