문제

Basically I have a very large array of objects and I need to remove 10% of the least fit objects.

Each object has a fitness variable (a double) associated with it. I dont have a number which determines whether or not an object is fit, I just want the least fit of the bunch.

What is the best way of retrieving (sampling) the least fit objects?

One way could be randomly select lets way 20%, sort the data, then remove 10%. But I think this is not a very smart way of doing it.

The other way it to keep the array sorted at all times then remove the first 10%. But I dont think this is very good because you will have to always keep sorting the array whilst inserting/updating which is a big overhead..

도움이 되었습니까?

해결책

Let k be yourCollection.length() * 0.1 and n = yourCollection.length().

Find the k-th smallest element (QuickSelect or Median of 5), where the key is your fitness. Let's call it p. This can be done in O(n).

Then traverse through the collection and remove all the items with fitness less than p.fitness. We've got an O(n) solution.

Or, you can create a heap in O(n) with key=fitness and remove k elements from it in O(k * log(n)).

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top