In the Scalding Fields API, the best approach I can think of for mapping from '* to '* is to use Cascading's TupleEntry (cascading.tuple.TupleEntry):
import com.twitter.scalding._
import cascading.tuple.TupleEntry
// Notice I do not specify the scheme when reading.
// I only know that the first column is 'user_id'; the rest are values that
// I want to double. You can use 'map' or 'mapTo'.
Tsv(args("input"))
  .read
  .map('* -> '*) { fields: TupleEntry =>
    val sz: Int = fields.size()
    // Position 0 is 'user_id'; double every remaining column in place.
    for (i <- 1 until sz) fields.setDouble(i, fields.getDouble(i) * 2.0)
    fields.getTuple()
  }
  .write(Tsv(args("output")))