Question

Say I have:

a = ["apple", "pear", ["grapes", "berries"], "peach"]

and I want to sort by:

a.sort_by do |f|
  f.class == Array ? f.to_s : f
end

I get:

[["grapes", "berries"], "apple", "peach", "pear"]

What I actually want is the items in alphabetical order, with array items sorted on their first element:

["apple", ["grapes", "berries"], "peach", "pear"]

or, preferably, I want:

["apple", "grapes, berries", "peach", "pear"]

If the example isn't clear enough, I'm looking to sort the items in alphabetical order.

Any suggestions on how to get there?

I've tried a few things so far but can't seem to get there. Thanks.


Solution

I think this is what you want:

a.sort_by { |f| f.class == Array ? f.first : f }
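If the joined-string form from the question is the preferred result, the sorted array could be mapped afterwards. This is only a sketch of that combination, using the question's sample data; the variable names `sorted` and `joined` are made up for illustration:

```ruby
a = ["apple", "pear", ["grapes", "berries"], "peach"]

# Sort on the first element of nested arrays, on the string itself otherwise.
sorted = a.sort_by { |f| f.is_a?(Array) ? f.first : f }

# Turn nested arrays into comma-separated strings for display.
joined = sorted.map { |f| f.is_a?(Array) ? f.join(", ") : f }

p sorted  # => ["apple", ["grapes", "berries"], "peach", "pear"]
p joined  # => ["apple", "grapes, berries", "peach", "pear"]
```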

OTHER TIPS

I would do

a = ["apple", "pear", ["grapes", "berries"], "peach"]
a.map { |e| Array(e).join(", ") }.sort
# => ["apple", "grapes, berries", "peach", "pear"]

Enumerable#sort_by is clearly the right method, but here's a reminder of how Array#sort would be used here:

  a.sort do |s1,s2| 
    t1 = (s1.is_a? Array) ? s1.first : s1
    t2 = (s2.is_a? Array) ? s2.first : s2
    t1 <=> t2
  end.map {|e| (e.is_a? Array) ? e.join(', ') : e }
    #=> ["apple", "grapes, berries", "peach", "pear"]  

@theTinMan pointed out that sort is quite a bit slower than sort_by here, and gave a reference that explains why. I've been meaning to see how the Benchmark module is used, so took the opportunity to compare the two methods for the problem at hand. I used @Rafa's solution for sort_by and mine for sort.

For testing, I constructed an array of 100 random samples (each with 10,000 random elements to be sorted) in advance, so the benchmarks would not include the time needed to construct the samples (which was not insignificant). 8,000 of the 10,000 elements were random strings of 8 lowercase letters. (Because the strings are built with Array#sample, the 8 letters in each string are distinct.) The other 2,000 elements were 2-tuples of the form [str1, str2], where str1 and str2 were random strings of the same kind. I benchmarked with other parameters, but the bottom-line results did not vary significantly.

require 'benchmark'

# n: total number of items to sort
# m: number of two-tuples [str1, str2] among n items to sort
# n-m: number of strings among n items to sort
# k: length of each string in samples
# s: number of sorts to perform when benchmarking

def make_samples(n, m, k, s)
  s.times.with_object([]) { |_, a| a << test_array(n,m,k) }
end

def test_array(n,m,k)
  a = ('a'..'z').to_a 
  r = []
  (n-m).times { r << a.sample(k).join }
  m.times { r << [a.sample(k).join, a.sample(k).join] }
  r.shuffle!
end

# Here's what the samples look like:    
make_samples(6,2,4,4)
  #=> [["bloj", "izlh", "tebz", ["lfzx", "rxko"], ["ljnv", "tpze"], "ryel"],
  #    ["jyoh", "ixmt", "opnv", "qdtk", ["jsve", "itjw"], ["pnog", "fkdr"]],
  #    ["sxme", ["emqo", "cawq"], "kbsl", "xgwk", "kanj", ["cylb", "kgpx"]],
  #    [["rdah", "ohgq"], "bnup", ["ytlr", "czmo"], "yxqa", "yrmh", "mzin"]]

n = 10000 # total number of items to sort
m = 2000  # number of two-tuples [str1, str2] (n-m strings)
k = 8     # length of each string
s = 100   # number of sorts to perform

samples = make_samples(n,m,k,s)

Benchmark.bm('sort_by'.size) do |bm|
  bm.report 'sort_by' do
    samples.each do |sample|   # block parameter renamed so it no longer shadows the outer s
      sample.sort_by { |f| f.class == Array ? f.first : f }
    end
  end

  bm.report 'sort' do
    samples.each do |sample|   # block parameter renamed so it no longer shadows the outer s
      sample.sort do |s1,s2|
        t1 = (s1.is_a? Array) ? s1.first : s1
        t2 = (s2.is_a? Array) ? s2.first : s2
        t1 <=> t2
      end
    end
  end
end

              user     system      total        real
sort_by   1.360000   0.000000   1.360000 (  1.364781)
sort      4.050000   0.010000   4.060000 (  4.057673)

Though it was never in doubt, @theTinMan was right! I did a few other runs with different parameters, but sort_by consistently thumped sort by similar performance ratios.

Note the "system" time is zero for sort_by. In other runs it was sometimes zero for sort. The values were always zero or 0.010000, leading me to wonder what's going on there. (I ran these on a Mac.)

For readers unfamiliar with Benchmark, Benchmark.bm is a module method that takes a label width as its argument. That width sets the size of the label column, and therefore how far the header row (user system...) is indented. bm.report takes a row label as an argument.
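One way to see where a gap like the one in the timings above can come from is to count how often each block runs. sort_by evaluates its block once per element and then sorts on the stored keys, whereas sort evaluates its block on every comparison. The sketch below is not the benchmark itself: the data, the fixed seed, and the names `key_of`, `key_calls` and `cmp_calls` are assumptions made for illustration.

```ruby
# Hypothetical data: 1,000 items, every fifth one a two-element array.
rng     = Random.new(42)
letters = ('a'..'z').to_a
items = Array.new(1000) do |i|
  str = letters.sample(8, random: rng).join
  i % 5 == 0 ? [str, str.reverse] : str
end

# Key extraction shared by both approaches.
key_of = ->(f) { f.is_a?(Array) ? f.first : f }

key_calls = 0
items.sort_by do |f|
  key_calls += 1            # runs once per element
  key_of.call(f)
end

cmp_calls = 0
items.sort do |f1, f2|
  cmp_calls += 1            # runs once per comparison
  key_of.call(f1) <=> key_of.call(f2)
end

puts "sort_by block calls: #{key_calls}"
puts "sort block calls:    #{cmp_calls}"
```

The first count equals the number of items. The second is much larger for shuffled input, and each of those calls extracts two keys, which is consistent with sort being the slower of the two here.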

You are really close. Just switch .to_s to .first.

irb(main):005:0> b = ["grapes", "berries"]
=> ["grapes", "berries"]
irb(main):006:0> b.to_s
=> "[\"grapes\", \"berries\"]"
irb(main):007:0> b.first
=> "grapes"

Here is one that works:

a.sort_by do |f|
  f.class == Array ? f.first : f
end

Yields:

["apple", ["grapes", "berries"], "peach", "pear"]

a.map { |b| b.is_a?(Array) ? b.join(', ') : b }.sort

# => ["apple", "grapes, berries", "peach", "pear"]

Replace to_s with join.

a.sort_by do |f|
  f.class == Array ? f.join : f
end

# => ["apple", ["grapes", "berries"], "peach", "pear"]

Or more concisely:

a.sort_by {|x| [*x].join }

# => ["apple", ["grapes", "berries"], "peach", "pear"]

The problem with to_s is that it converts your Array to a string that starts with "[":

"[\"grapes\", \"berries\"]"

which sorts before the rest of your strings, because "[" comes before every lowercase letter in character order.

join actually creates the string that you had expected to sort by:

"grapesberries"

which is alphabetized correctly, according to your logic.
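The difference between the two sort keys can be checked directly. This is a small sketch using the array from the question; `to_s_key` and `join_key` are names made up for illustration:

```ruby
b = ["grapes", "berries"]

to_s_key = b.to_s   # inspect-style string, starts with "["
join_key = b.join   # elements concatenated: "grapesberries"

puts to_s_key[0].ord       # 91, the code point of "["
puts "a".ord               # 97, so "[" sorts before any lowercase letter
puts to_s_key < "apple"    # true:  the to_s key sorts ahead of "apple"
puts join_key < "apple"    # false: the join key sorts after "apple"
```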

If you don't want the arrays to remain arrays, then it's a slightly different operation, but you will still use join.

a.map {|x| [*x].join(", ") }.sort

# => ["apple", "grapes, berries", "peach", "pear"]
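One caveat, offered as an observation rather than a correction: sorting on join is not always the same as sorting on the first element, which is what the question literally asks for. The two agree on the question's data, but they can differ when a later element of a nested array affects the comparison. The input below is hypothetical and chosen only to show such a case:

```ruby
# Hypothetical input where the second element of the nested array matters.
a = [["ab", "z"], "abc"]

by_first = a.sort_by { |x| x.is_a?(Array) ? x.first : x }  # keys: "ab",  "abc"
by_join  = a.sort_by { |x| [*x].join }                     # keys: "abz", "abc"

p by_first  # => [["ab", "z"], "abc"]
p by_join   # => ["abc", ["ab", "z"]]
```

Which ordering is right depends on whether the nested array should be ranked by its first element only or by its full joined text.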

Sort a Flattened Array

If you just want all elements of your nested array flattened and then sorted in alphabetical order, all you need to do is flatten and sort. For example:

["apple", "pear", ["grapes", "berries"], "peach"].flatten.sort
#=> ["apple", "berries", "grapes", "peach", "pear"]
Licensed under: CC-BY-SA with attribution