best way to sort an array of objects by category and infinite subcategory

Question 1

You can do it like this:

Code

def doit(data, indent = 2)
  d = data.each_with_object({}) { |h,g| g[h[:id]] = h }
  d.each {|_,h| h[:ancestor_ids] =
    (h[:top_level_category_id] ? d[h[:parent_id]][:ancestor_ids] :[])+[h[:id]]}
   .values
   .sort_by { |h| h[:ancestor_ids] }
   .each { |h| puts ' '*((h[:ancestor_ids].size-1)*indent) + "#{h[:name]}" }
end

Demo

data=[
  {id: 1, name: "parent test 1", parent_id: nil, top_level_category_id: nil},
  {id: 2, name: "test 2", parent_id: 1, top_level_category_id: 1},
  {id: 3, name: "test 3", parent_id: 1, top_level_category_id: 1},
  {id: 4, name: "parent test 4", parent_id: nil, top_level_category_id: nil},
  {id: 5, name: "test 5", parent_id: 3, top_level_category_id: 4},
  {id: 6, name: "test 6", parent_id: 4, top_level_category_id: 4},
  {id: 7, name: "test 7", parent_id: 4, top_level_category_id: 4}
]

doit(data)
parent test 1
  test 2
  test 3
    test 5
parent test 4
  test 6
  test 7

Explanation

What we need to do is add another hash element (whose key I've named :ancestor_ids), whose value is an array of the hash's :id and those of all of its ancestors; i.e., we want to add the following elements to the respective hashes:

:ancestor_ids => [1]
:ancestor_ids => [1,2]
:ancestor_ids => [1,3]
:ancestor_ids => [4]
:ancestor_ids => [1,3,5]
:ancestor_ids => [4,6]
:ancestor_ids => [4,7]

Once we have these, we can use sort_by { |h| h[:ancestor_ids] } to put the elements of the array data in the proper order. (If you are uncertain how the elements of an array are ordered, review Array#<=>.) Also h[:ancestor_ids].size is used to determine the amount of indentation required when displaying the results.

The calculations go like this*:

d = data.each_with_object({}) { |h,g| g[h[:id]] = h }
  #=> {1=>{:id=>1, :name=>"parent test 1",...},
  #    2=>{:id=>2, :name=>"test 2",...},
  #    3=>{:id=>3, :name=>"test 3",...},
  #    4=>{:id=>4, :name=>"parent test 4",...},
  #    5=>{:id=>5, :name=>"test 5",...},
  #    6=>{:id=>6, :name=>"test 6",...},
  #    7=>{:id=>7, :name=>"test 7",...}}

We perform this step to make it easy to find the rows of data that correspond to a record's parent.

e = d.each {|_,h| h[:ancestor_ids] =
    (h[:top_level_category_id] ? d[h[:parent_id]][:ancestor_ids]:[])+[h[:id]]}
  #=> {1=>{:id=>1,...,:ancestor_ids=>[1]},
  #    2=>{:id=>2,...,:ancestor_ids=>[1, 2]},
  #    3=>{:id=>3,...,:ancestor_ids=>[1, 3]},
  #    4=>{:id=>4,...,:ancestor_ids=>[4]}
  #    5=>{:id=>5,...,:ancestor_ids=>[1, 3, 5]},
  #    6=>{:id=>6,...,:ancestor_ids=>[4, 6]},
  #    7=>{:id=>7,...,:ancestor_ids=>[4, 7]}}

This adds the element whose key is :ancestor_ids. We no longer need the keys, so we will extract the values, sort them by :ancestor_ids and display the results:

f = e.values
  #=> [{:id=>1,...,:ancestor_ids=>[1]},
  #    {:id=>2,...,:ancestor_ids=>[1, 2]},
  #    {:id=>3,...,:ancestor_ids=>[1, 3]},
  #    {:id=>4,...,:ancestor_ids=>[4]}
  #    {:id=>5,...,:ancestor_ids=>[1, 3, 5]},
  #    {:id=>6,...,:ancestor_ids=>[4, 6]},
  #    {:id=>7,...,:ancestor_ids=>[4, 7]}}

g = f.sort_by { |h| h[:ancestor_ids] }
  #=> [{:id=>1,...,:ancestor_ids=>[1]},
  #    {:id=>2,...,:ancestor_ids=>[1, 2]},
  #    {:id=>3,...,:ancestor_ids=>[1, 3]},
  #    {:id=>5,...,:ancestor_ids=>[1, 3, 5]},
  #    {:id=>4,...,:ancestor_ids=>[4]}
  #    {:id=>6,...,:ancestor_ids=>[4, 6]},
  #    {:id=>7,...,:ancestor_ids=>[4, 7]}}

indent = 2
g.each { |h| puts ' '*((h[:ancestor_ids].size-1)*indent) + "#{h[:name]}" }
parent test 1
  test 2
  test 3
    test 5
parent test 4
  test 6
  test 7

Points

Do you need the hash element whose key is :top_level_category_id, considering that :parent_id => nil for top level elements?
Production code would raise an exception if, in the calculation of e above, there were no element of d with key h[:parent_id] or the value h[:parent_id] had no key :ancestor_ids.
This answer relies on the assumption that, for each element h of Data that is not top level, h[:id] > h[:parent_id] when h[:parent_id] is not nil. If the rows of Data are not initially ordered by :id, they must be sort_by'ed :id as a first step.

* If you try running this at home, it should work from the command line, but IRB and PRY cannot handle the continued lines that begin with a dot

Question 2

Requires a single pass through the edge list. All nodes must fit in memory together; edge list must constitute an actual tree (that is, there's no checking for forests, proper DAGs, or cycles).

private static final Long DUMMY = null;
Node buildTree( Iterable< ? extends Edge > iedg ) {
    Map< Long, Node > mnod = new HashMap< Long, Node >();

    for ( Edge edg : iedg )
        getNode( mnod, iedg.getParentId() ).addChild(
            getNode( mnod, iedg.getId() ).withName( iedg.getName() )
        );

    return getNode( mnod, DUMMY ).firstChild();
}

private Node getNode( Map< Long, Node > mnod, Long lId ) {
    Node nod = mnod.get( lId );
    if ( null == nod )
        mnod.put( lId, nod = new Node().withId( lId ) );
    return nod;
}