Question

Quick question. I'm currently designing some database queries to extract reasonably large, but not massive datasets into memory, say approximately 10k-100k records.

So far I've been loading these result sets into a scala.collection.immutable.Seq and have found that it takes an incredibly long time to build the collection, whereas if I switch to a Vector or List, the write into memory takes a fraction of a second.

My question is therefore: why is Seq so slow in this case? And in what cases would Seq be more appropriate than Vector?

Thanks

Solution

It would help if you posted the relevant snippet and the operations you call on the sequence. immutable.Seq is represented using a List (see https://github.com/scala/scala/blob/v2.10.2/src/library/scala/collection/immutable/Seq.scala#L42). My guess is that you've been using :+ on the immutable.Seq, which under the hood appends to the end of the list by copying it (probably giving you quadratic overall performance), and that when you switched to immutable.List directly, you started prepending with :: (giving you linear overall performance).
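To make that guess concrete, here is a minimal sketch of the two build patterns. The object and method names are made up for illustration, and this is only my assumption about what your code does, not a reconstruction of it.

```scala
import scala.collection.immutable

// Hypothetical helper object, for illustration only.
object AppendVsPrepend {
  // Appends with :+ . When the Seq is backed by a List, every :+ copies the
  // elements already present, so n appends cost O(n^2) overall.
  def buildByAppend(n: Int): immutable.Seq[Int] = {
    var acc: immutable.Seq[Int] = immutable.Seq.empty[Int]
    var i = 0
    while (i < n) { acc = acc :+ i; i += 1 }
    acc
  }

  // Prepends with :: (one new cell per element), then reverses once at the
  // end to restore insertion order: O(n) overall.
  def buildByPrepend(n: Int): List[Int] = {
    var acc: List[Int] = Nil
    var i = 0
    while (i < n) { acc = i :: acc; i += 1 }
    acc.reverse
  }
}
```

Both methods produce the same elements in the same order; only the cost of getting there differs.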

Since Seq is just a List under the hood, you should use it when you prepend to the sequence: the cons operator :: creates only one node and links it to the rest of the list, which is as fast as it gets for immutable data structures. Otherwise, if you add to the end and you insist on immutability, you should use a Vector (or the upcoming Conc lists!).
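A small sketch of what "links it to the rest of the list" means in practice, with hypothetical names: the new list reuses the old one as its tail instead of copying it, and appending to a Vector leaves the original Vector untouched.

```scala
// Hypothetical demo object, for illustration only.
object ConsSharing {
  val rest: List[Int] = List(2, 3)

  // :: allocates a single new cell whose tail points at `rest`.
  val extended: List[Int] = 1 :: rest

  // Reference equality: the tail is the very same object, nothing was copied.
  val sharesTail: Boolean = extended.tail eq rest

  // Appending to an immutable Vector returns a new Vector.
  val v: Vector[Int] = Vector(1, 2)
  val v2: Vector[Int] = v :+ 3
}
```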

If you would like a validation of these claims, see this link where the performance of the two operations is compared using ScalaMeter -- lists are 8 times faster than vectors when you add to the beginning.

However, the most appropriate data structure is probably either an ArrayBuffer or a VectorBuilder. These are mutable data structures that resize dynamically, and if you build them using += you will get reasonable performance. This assumes that you are not storing primitives.
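A minimal sketch of both options, assuming the database rows arrive as an Iterator[String]; the `rows` parameter and the object name are placeholders for whatever your query layer actually returns.

```scala
import scala.collection.immutable.VectorBuilder
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper object, for illustration only.
object BuilderDemo {
  // Collects the rows into a mutable, growable ArrayBuffer using += .
  def viaArrayBuffer(rows: Iterator[String]): ArrayBuffer[String] = {
    val buf = new ArrayBuffer[String]()
    rows.foreach(buf += _)
    buf
  }

  // Accumulates with a mutable VectorBuilder, then freezes the contents
  // into an immutable Vector with result().
  def viaVectorBuilder(rows: Iterator[String]): Vector[String] = {
    val builder = new VectorBuilder[String]
    rows.foreach(builder += _)
    builder.result()
  }
}
```

The VectorBuilder variant is handy if the rest of your code wants an immutable collection: the mutation stays local to the method and callers only ever see the finished Vector.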

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow