Question

I'm parsing delimited files to get information that I will put into a database table.

Now I have a file where I have to merge every two rows to get the information for a single record (one row of the database table): line 1 has some fields of a database row, and line 2 has the remaining fields for the same row.

How can I process two rows at a time?

For example, assume I have a file with 6 rows, corresponding to 3 entries in my database table, which has 9 columns. From the "odd" lines I get columns 1, 3, 4, 5, 8 and 9; from the "even" lines I get the remaining info (columns 2, 6 and 7):

IN  | COLUMN1 | xxxxxxx | COLUMN3 | COLUMN4 | COLUMN5 | xxxxxxx | xxxxxxx | COLUMN8
OUT | xxxxxxx | COLUMN2 | xxxxxxx | xxxxxxx | xxxxxxx | COLUMN6 | COLUMN7 | xxxxxxx
IN  | COLUMN1 | xxxxxxx | COLUMN3 | COLUMN4 | COLUMN5 | xxxxxxx | xxxxxxx | COLUMN8
OUT | xxxxxxx | COLUMN2 | xxxxxxx | xxxxxxx | xxxxxxx | COLUMN6 | COLUMN7 | xxxxxxx
IN  | COLUMN1 | xxxxxxx | COLUMN3 | COLUMN4 | COLUMN5 | xxxxxxx | xxxxxxx | COLUMN8
OUT | xxxxxxx | COLUMN2 | xxxxxxx | xxxxxxx | xxxxxxx | COLUMN6 | COLUMN7 | xxxxxxx
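Outside Talend, the two-rows-at-a-time logic itself can be sketched in plain Python (the `|` delimiter and the field positions are assumptions taken from the example above):

```python
# A sketch of merging each IN/OUT line pair into one record.
# Delimiter and column positions are assumptions from the example.
import csv
import io

SAMPLE = """\
IN|C1|x|C3|C4|C5|x|x|C8
OUT|x|C2|x|x|x|C6|C7|x
"""

def merge_pairs(lines):
    """Yield one merged record per consecutive IN/OUT pair."""
    reader = csv.reader(lines, delimiter="|")
    # zipping the same iterator with itself consumes two rows per step
    for in_row, out_row in zip(reader, reader):
        yield [in_row[1], out_row[2], in_row[3], in_row[4],
               in_row[5], out_row[6], out_row[7], in_row[8]]

records = list(merge_pairs(io.StringIO(SAMPLE)))
```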

Solution

You could try splitting the file into the two types of rows and then using a tMap to join them.

To clarify: you'll want to split the file depending on whether each row is an IN or an OUT row, then use a tMap to join the columns as per your needs.

I've modified your example data slightly, so it looks like this:

+----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
|IN1 |ROW1COLUMN1|xxxxxxx    |ROW1COLUMN3|ROW1COLUMN4|ROW1COLUMN5|xxxxxxx    |xxxxxxx    |ROW1COLUMN8|
|OUT1|xxxxxxx    |ROW1COLUMN2|xxxxxxx    |xxxxxxx    |xxxxxxx    |ROW1COLUMN6|ROW1COLUMN7|xxxxxxx    |
|IN2 |ROW2COLUMN1|xxxxxxx    |ROW2COLUMN3|ROW2COLUMN4|ROW2COLUMN5|xxxxxxx    |xxxxxxx    |ROW2COLUMN8|
|OUT2|xxxxxxx    |ROW2COLUMN2|xxxxxxx    |xxxxxxx    |xxxxxxx    |ROW2COLUMN6|ROW2COLUMN7|xxxxxxx    |
|IN3 |ROW3COLUMN1|xxxxxxx    |ROW3COLUMN3|ROW3COLUMN4|ROW3COLUMN5|xxxxxxx    |xxxxxxx    |ROW3COLUMN8|
|OUT3|xxxxxxx    |ROW3COLUMN2|xxxxxxx    |xxxxxxx    |xxxxxxx    |ROW3COLUMN6|ROW3COLUMN7|xxxxxxx    |
+----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+

The only real addition is that there is now a join key appended to the IN or OUT in the first column, indicating which pairs belong together.

First, you'll want to split the data into its IN and OUT parts with a tMap set up like this:

[Image: Split the file using a tMap]

This simply sends the data down one of two paths depending on whether the Id field begins with "IN" or "OUT".

After this you'll want to recombine it with another tMap set up like:

[Image: Recombine the flows using a tMap]

This joins the two flows on the key extracted from the Id field and uses the appropriate columns from each side in the combined output.

Unfortunately, you can't split a flow with a tMap and then feed it straight back into another tMap. The best bet is to write the two flows to separate places (either database tables or temporary CSV files) and then, once that subjob is complete, read those back in and recombine them with the second tMap.
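Sketched outside Talend, the overall flow is: route each row by its Id prefix into one of two intermediate stores, then join the two stores on the extracted key. The dicts below stand in for the temporary tables or CSV files; none of this is Talend API:

```python
# Route rows on the Id prefix, then join on the extracted key.
# The dicts stand in for the two intermediate outputs.
rows = [
    ["IN1",  "A1", "x",  "A3"],
    ["OUT1", "x",  "A2", "x"],
    ["IN2",  "B1", "x",  "B3"],
    ["OUT2", "x",  "B2", "x"],
]

ins, outs = {}, {}
for row in rows:
    # first tMap stage: split by prefix and extract the numeric key
    if row[0].startswith("IN"):
        ins[row[0][2:]] = row
    else:
        outs[row[0][3:]] = row

# second tMap stage: inner join on the key, picking columns per side
merged = [[ins[k][1], outs[k][2], ins[k][3]] for k in sorted(ins)]
```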

An example job might look like:

[Image: Example job setup]

If you don't have a natural key to join on, you could generate one by taking each output of the first tMap and adding a column whose value is a Numeric.sequence expression.
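As a rough illustration (not Talend code), a per-flow counter plays the role of Numeric.sequence here: the Nth row of each flow gets key N, so matching positions join up:

```python
# Attach a synthetic sequence key to each flow, then join on it.
from itertools import count

in_rows  = [["a1"], ["b1"]]   # illustrative outputs of the first tMap
out_rows = [["a2"], ["b2"]]

in_keyed  = [[seq] + row for seq, row in zip(count(1), in_rows)]
out_keyed = [[seq] + row for seq, row in zip(count(1), out_rows)]

# rows sharing the same generated key are joined together
joined = [ik + ok[1:] for ik, ok in zip(in_keyed, out_keyed)]
```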

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow