Problem

I have a table, A, that looks like this:

number    value
 1           A
 1           B
 2           C

I also have a CSV file that contains number as one of its columns. When I do a (Pentaho) database lookup on this table using number from the CSV file, I get output like this:

number     value
  1         A
  2         C

Is there another way in ETL to get output like this instead:

 number    value
  1          A
  1          B
  2          C

Solution

The Database Value Lookup step is designed to return at most one row for any given input value. To get all rows for a key, use a Database Join step, or read all rows from both the table and the CSV file, sort them on the key, and pass them through a Merge Join step.
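To make the difference concrete, here is a minimal Python sketch of the two semantics, using the rows of table A from the question. It is not Pentaho code: the function names `lookup_first` and `join_all` are hypothetical, the CSV keys are assumed to be 1 and 2, and "first match wins" is an assumption about which row a lookup returns when several match.

```python
# Rows of table A from the question: (number, value)
table_a = [(1, "A"), (1, "B"), (2, "C")]
# Assumed key values read from the CSV file (not given in the question)
csv_numbers = [1, 2]


def lookup_first(keys, table):
    """Lookup semantics: at most one row per input key.

    Assumption: the first matching row is the one returned.
    """
    out = []
    for key in keys:
        for number, value in table:
            if number == key:
                out.append((number, value))
                break  # stop after the first match for this key
    return out


def join_all(keys, table):
    """Join semantics: every matching row is emitted for each input key."""
    return [
        (number, value)
        for key in keys
        for number, value in table
        if number == key
    ]


print(lookup_first(csv_numbers, table_a))  # [(1, 'A'), (2, 'C')]
print(join_all(csv_numbers, table_a))      # [(1, 'A'), (1, 'B'), (2, 'C')]
```

The first result matches the output the question reports, and the second matches the output it asks for.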

These correspond roughly to a nested lookup join and a sort-merge join, respectively, and you would choose between them the same way a query optimizer would. As a rule of thumb, if the table and the CSV file have roughly the same number of rows, the Merge Join will be faster; otherwise, use the Database Join step. This will not suit every situation, so experiment if performance is critical.
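The two strategies can be sketched in plain Python to show where the work goes. This is an illustration of the general join algorithms, not of how Pentaho implements its steps; the function names and sample data are hypothetical.

```python
def nested_lookup_join(left_keys, table):
    """For each input key, search the table for matching rows.

    Work grows with (number of input keys) x (cost of one search),
    which is cheap when there are few input keys.
    """
    out = []
    for key in left_keys:
        out.extend(row for row in table if row[0] == key)
    return out


def sort_merge_join(left_keys, table):
    """Sort both inputs by key, then walk them together.

    Both inputs are read in full, so this pays off when they are
    of comparable size.
    """
    left = sorted(left_keys)
    right = sorted(table, key=lambda row: row[0])  # stable sort
    out = []
    j = 0
    for key in left:
        # Skip table rows whose key is smaller than the current input key.
        while j < len(right) and right[j][0] < key:
            j += 1
        # Emit every table row with an equal key; j is left in place so
        # that a repeated input key matches the same rows again.
        k = j
        while k < len(right) and right[k][0] == key:
            out.append(right[k])
            k += 1
    return out


table_a = [(1, "A"), (1, "B"), (2, "C")]
keys = [2, 1]  # assumed CSV keys, deliberately unsorted

print(nested_lookup_join(keys, table_a))
print(sort_merge_join(keys, table_a))
```

Both functions return the same set of rows; only the row order and the amount of work differ, which is why the relative sizes of the two inputs drive the choice.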

License: CC-BY-SA with attribution