Question

I'm looking for a way to split job execution in Talend Studio according to the content of each file row: I'd like to process rows starting with "DEBUG" in one job branch and all other rows in another. Is that possible?

Solution

To do this, use a tMap component. Your job will look like this:

   t*Input--row-->tMap--out1--->tFileOutput*

                      --out2--->tFileOutput*

In the tMap component, the input is on the left and the outputs are on the right. In each output table, select "Activate expression filter" and use the text box to define the filter; only rows that match the filter will be output on that connection. To send all remaining rows to the second output, give it the negated condition. You can have as many output tables and filters as you need.
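
Talend filter expressions are plain Java boolean expressions, so the split can be sketched outside the Studio. The example below is only an illustration: the connection name `row1`, the column name `line`, and the class `DebugSplitSketch` are assumptions, not names taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DebugSplitSketch {

    // Mirrors a possible tMap filter for out1 (row1 and line are assumed names):
    //   row1.line != null && row1.line.startsWith("DEBUG")
    // The filter for out2 would be the negation of the same expression.
    static boolean isDebug(String line) {
        return line != null && line.startsWith("DEBUG");
    }

    // Routes every line to exactly one of two lists, the way two tMap
    // outputs with complementary filters would.
    static List<List<String>> split(List<String> lines) {
        List<String> out1 = new ArrayList<>();
        List<String> out2 = new ArrayList<>();
        for (String line : lines) {
            if (isDebug(line)) {
                out1.add(line);
            } else {
                out2.add(line);
            }
        }
        return Arrays.asList(out1, out2);
    }

    public static void main(String[] args) {
        List<List<String>> result =
                split(Arrays.asList("DEBUG start", "INFO ready", "DEBUG end"));
        System.out.println("out1 = " + result.get(0));
        System.out.println("out2 = " + result.get(1));
    }
}
```

The null check matters in a real job, because calling `startsWith` on a null column value would throw a `NullPointerException` inside the filter.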

Other suggestions

Using tMap works well, but if the number of output streams is not known and fixed in advance, tMap is not a good choice.

In that case, an Iterate link or a tJavaFlex component can help:

Have a look at this tutorial on "how to split a file into many files regarding a key on each record", which explains how to solve this kind of task. It is currently only available in French. The tutorial shows 3 different techniques for achieving this.

In the end I used the tExtractRegexFields component and simply defined a regex that matches the lines. The most important thing (which I didn't know before) is that you can connect components with different types of connections: I right-clicked the component and chose Row > Reject to create the new branch described in the question.
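
To illustrate the idea, here is a minimal sketch of the match/reject decision using `java.util.regex`. The pattern and the class name `RegexRejectSketch` are assumptions made for this example; the regex actually used in the job is not given above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexRejectSketch {

    // Hypothetical pattern: a line starting with DEBUG, with the rest of
    // the line captured as one group.
    static final Pattern DEBUG_LINE = Pattern.compile("^DEBUG\\s*(.*)$");

    // Returns the captured remainder for a matching line, or null when the
    // line does not match (the case that would go to the Reject branch).
    static String extract(String line) {
        Matcher m = DEBUG_LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        for (String line : new String[] {"DEBUG cache miss", "INFO started"}) {
            String rest = extract(line);
            System.out.println(rest != null ? "main: " + rest : "reject: " + line);
        }
    }
}
```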

This can also be done with tFileInputDelimited and tFileOutputDelimited: in the Advanced settings of tFileOutputDelimited, check the option "Split output in several files".

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow