Question

If you want to create a pipe with more than 22 fields from a smaller one in Scalding, you are limited by Scala tuples, which cannot have more than 22 elements.

Is there a way to use collections instead of tuples? I imagine something like in the following example, which sadly doesn't work:

input.read.mapTo('line -> aLotOfFields) { line: String =>
  (1 to 24).map(_.toString)
}.write(output)

Solution

Actually, you can. It is covered in the FAQ: https://github.com/twitter/scalding/wiki/Frequently-asked-questions#what-if-i-have-more-than-22-fields-in-my-data-set

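import cascading.tuple.Tuple

// Build 24 field names: 'field_1 ... 'field_24
val toFields = (1 to 24).map(f => Symbol("field_" + f)).toList

input
  .read
  .mapTo('line -> toFields) { line: String =>
    // cascading's Tuple takes Object varargs, hence the cast to AnyRef
    new Tuple((1 to 24).map(_.toString).map(_.asInstanceOf[AnyRef]): _*)
  }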

The final map(_.asInstanceOf[AnyRef]) looks ugly, so if you find a better solution, please let me know.
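
One possible way to avoid the explicit cast, offered only as a sketch: a type ascription widens each String to AnyRef at compile time, so no asInstanceOf is needed. The object name WidenToAnyRef and the helper fieldValues are hypothetical and are not part of Scalding or Cascading; the example uses only the Scala standard library.

```scala
object WidenToAnyRef {
  // Hypothetical helper: produces n string values typed as AnyRef.
  // The ascription (i.toString: AnyRef) is a compile-time upcast,
  // so it cannot fail at runtime the way a wrong cast could.
  def fieldValues(n: Int): Seq[AnyRef] =
    (1 to n).map(i => (i.toString: AnyRef))

  def main(args: Array[String]): Unit = {
    val values = fieldValues(24)
    println(values.mkString(","))
  }
}
```

Assuming the same Object varargs constructor of cascading.tuple.Tuple that the answer relies on, the result could then be passed as new Tuple(WidenToAnyRef.fieldValues(24): _*).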

Other tips

Wrap your tuples in case classes. This also makes your code more readable than using tuples and more type-safe than using collections.
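
A minimal sketch of this idea, using only the Scala standard library. All names here (Name, Address, Customer, CustomerParser) and the tab-separated layout are hypothetical. Note that in Scala 2.10 and earlier a single case class is itself limited to 22 fields (the limit was lifted in 2.11), so grouping related fields into nested case classes keeps each class small either way.

```scala
import scala.util.Try

// Hypothetical record types: related fields are grouped so that
// no single class needs anywhere near 22 members.
case class Name(first: String, last: String)
case class Address(street: String, city: String, zip: String)
case class Customer(id: Long, name: Name, address: Address)

object CustomerParser {
  // Parses one tab-separated line with exactly six columns.
  // Returns None if the column count or the numeric id is wrong.
  def parse(line: String): Option[Customer] =
    line.split("\t", -1) match {
      case Array(id, first, last, street, city, zip) =>
        Try(id.toLong).toOption.map { n =>
          Customer(n, Name(first, last), Address(street, city, zip))
        }
      case _ => None
    }
}
```

A function like CustomerParser.parse could be applied to each input line, for example when working with Scalding's typed API, where records are ordinary Scala values rather than named fields.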

License: CC-BY-SA with attribution
Not affiliated with StackOverflow