Question

"abc def ".split(" ")

returns

["abc", "def"]

Thus, I was expecting:

["a", "b", "c", " ", "d", "e", "f", " "].split(" ")

to return

[["a", "b", "c"], ["d", "e", "f"]]

but it returned

[["a", "b", "c"], ["d", "e", "f"], []]

I read the source that implements the split in active_support/core_ext/array/grouping.rb (I am using ActiveSupport 4.0.0 with Ruby 2.0.0-p247). The documentation is only two lines long (http://api.rubyonrails.org/classes/Array.html#method-i-split), and the code is the following:

def split(value = nil, &block)
  inject([[]]) do |results, element|
    if block && block.call(element) || value == element
      results << []
    else
      results.last << element
    end

    results
  end
end

That explains how it does the split.
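To make the source of the extra empty array concrete, here is a minimal sketch that copies the logic above into a standalone method so it runs without ActiveSupport. The name `split_array` is hypothetical and exists only for this illustration; it is not part of ActiveSupport.

```ruby
# Standalone copy of the quoted logic. `split_array` is a hypothetical
# name used for illustration; ActiveSupport defines this as Array#split.
def split_array(array, value = nil, &block)
  array.inject([[]]) do |results, element|
    if (block && block.call(element)) || value == element
      # Every delimiter opens a new group, even when nothing follows it.
      results << []
    else
      results.last << element
    end

    results
  end
end

chars = ["a", "b", "c", " ", "d", "e", "f", " "]

# The trailing " " opens a third group that never receives an element.
with_trailing = split_array(chars, " ")
p with_trailing

# Without the trailing delimiter, no empty group is created.
without_trailing = split_array(chars[0...-1], " ")
p without_trailing
```

Each delimiter unconditionally appends a new empty group, so a delimiter in the last position leaves that group empty.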

Now, is that the intended behavior or is that an ActiveSupport bug?


Solution

This is probably intended behavior rather than a bug. According to the documentation, splitting an array:

Divides the array into one or more subarrays based on a delimiting value or the result of an optional block.

This makes no guarantees about how leading, trailing, or contiguous delimiters are handled.

On the other hand, the Ruby core documentation for String#split states:

If pattern is a String, then its contents are used as the delimiter when splitting str. If pattern is a single space, str is split on whitespace, with leading whitespace and runs of contiguous whitespace characters ignored.

As you can see, the collapsing of leading and contiguous delimiters that you expect applies only when the delimiter is a single space, not for just any string:

 "abc ccc def ".split("c")
 => ["ab", " ", "", "", " def "]
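As a small sketch of the String#split rules involved, the following compares the single-space case with an ordinary delimiter. It uses only core Ruby; the variable names are made up for the example. Note that String#split also drops trailing empty strings for any delimiter when no limit is given, which is a separate rule from the whitespace handling quoted above.

```ruby
# A single space triggers the special whitespace mode: leading whitespace
# and runs of whitespace are ignored.
whitespace_mode = " abc  def ".split(" ")

# Any other delimiter is taken literally, so contiguous delimiters
# produce empty strings in the middle of the result.
literal_mode = "a,,b".split(",")

# With no limit argument, trailing empty strings are dropped.
trailing_dropped = "a,b,,".split(",")

# A negative limit keeps the trailing empty strings.
trailing_kept = "a,b,,".split(",", -1)

p whitespace_mode
p literal_mode
p trailing_dropped
p trailing_kept
```

The last call is the closest String analogue to what the array version does: every delimiter produces a boundary, including the final one.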

When splitting an array, the concept of "whitespace" doesn't really make sense any more. So I think the behavior is sensible, if perhaps counterintuitive at first.
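If you do want the result to look like the whitespace behavior of String#split, one possible approach (a sketch, not something the ActiveSupport documentation prescribes) is to discard the empty groups afterwards. The example starts from the array reported in the question so that it runs without ActiveSupport.

```ruby
# Result reported in the question for the array with a trailing " ".
groups = [["a", "b", "c"], ["d", "e", "f"], []]

# Dropping empty groups removes the one left by the trailing delimiter.
# It would also remove groups left by leading or contiguous delimiters.
compact_groups = groups.reject(&:empty?)
p compact_groups
```

Be aware that this discards every empty group, so it is only appropriate when empty groups carry no meaning in your data.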

Licensed under: CC-BY-SA with attribution