Question

I am trying to fetch top N-1 elements from a List. I have gone through similar posts in SO, like here and here. I have understood the motivations behind the solutions proposed in those posts. However, I think my problem is a bit different.

I have a sorted (descending) list of some elements. Let's assume that it has been sorted in an efficient manner, because that is not the point here.

The head is the topmost that I have got. But, there are 0 or more duplicates in the rest of the list. The attr1 of each of these duplicates is the same as that of head. I want to extract these duplicates (if any). Then I want to sort this (top + duplicates) list on the second attribute, attr2. In other words, I want to sort on a composite key: first by Key_1, and then - for the same Key_1 - by Key_2.

Let's say that an element in the list is a Pair.

case class PairOfAttrs(val int attr1,val int attr2)

/* .. more code here ... */

// Now, we obtain a sorted (descending) list of PairOfAttrs below
// Ordering is based on the value of attr1 (key_1)
val sortedPairsFirstAttr = .. // Seq(Pair1,Pair2,Pair3,Pair4....Pairn)

val top = sortedPairsFirstAttr.head

Obviously, top's attr1 is the highest in the list. To extract the duplicates, I do:

val otherToppers = sortedPairsFirstAttr.tail.filter 
                   (e => e.attr1 == topper.attr1) // Expression(1)

This is the key point, IMHO. I cannot find the duplicates until isolate the top and use its attribute (attr1) during the comparison.

// Then, I create a new list of the toppers only

val semiToppers = if (!otherToppers.empty) {
                        List(topper) ++ otherToppers // Expression(2)
                  }

Then, I sort the resultant list. Ordering is based on the value of attr2 (Key_2)

val finalToppers = semiToppers.sortWith(_._2 < _.2) // Expression(3)

So, effectively, I have sorted the original list using a compound key: sorted descendingly on Key_1 and then, ascendingly on Key_2.

I understand that Experssion(1) can be optimized for a long list; I don't need to travel through the entire list. I can break out of it earlier. I also understand that Expression(2) and Expression(3) can be merged. So, we can keep these two points aside.

My question is if my approach is functionally appropriate and acceptable. If not, then what you think is a better, more idiomatic approach?

This is a learner's question. So, I seek your comments/observations wholeheartedly.

Was it helpful?

Solution

Your solution is working and valid, so I won't comment on it. You already use val and immutable data structures, which is the pure way of doing it. I will instead suggest an alternative way that is a slightly shorter and might also interest you:

Since you preprocess your list anyway, I suggest you to sort it lexicographically in the first place: First by descending attr1, and in case of a tie, by ascending attr2:

scala> val originalList = Seq((1,0), (2,7), (3,1), (1,3), (2,5), (3,4), (3,2))
scala> val lst = originalList.sortBy(x => (-x._1, x._2))
lst: Seq[(Int, Int)] = List((3,1), (3,2), (3,4), (2,5), (2,7), (1,0), (1,3))

Now you just have to take the duplicates from the front and the result is already sorted by attr2:

scala> lst.takeWhile(_._1 == lst.head._1)
res8: Seq[(Int, Int)] = List((3,1), (3,2), (3,4))
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top