Question

I am working with Pentaho Data Integration (aka Kettle) and I have several Transformations, let's call them A, B, C, D, E. B depends on A, D depends on C and E depends on B and D. In a job I'd like to run A, B and C, D in parallel:

           -> A -> B _
    Start<            \
           -> C -> D----> E

where A and C run in parallel. Is there any way to execute E if and only if both B AND D were successful? Right now, looking at the Job metrics, E gets executed as soon as either B OR D is finished.

Solution

I just found http://forums.pentaho.org/showthread.php?t=75425 and it seems like it's not easily possible to achieve what I want.

OTHER TIPS

You can do something like this:

        /--=--[job]----[set var J1=1]---\ 
[start]----=--[Job]----[set var J2=1]----+--[jscriptstep]--(ok)-->[next steps]
        \--=--[Job]----[set var J3=1]---/        \
                                                 (x)
                                                   \
                                                  [Write to log]

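The JavaScript step contains:

// Read the flags set by the three parallel branches.
var J1 = parent_job.getVariable("J1");
var J2 = parent_job.getVariable("J2");
var J3 = parent_job.getVariable("J3");
// The last expression is the step's result: true only when all three flags are 1.
(J1 * J2 * J3) == 1;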
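
The gate logic in that step can be tried outside Kettle with a stand-in for `parent_job`. The sketch below is plain JavaScript, not Kettle code: `makeParentJob` and `allBranchesDone` are hypothetical names invented for illustration, and the only behaviour assumed is that `getVariable` returns a string for a variable that has been set and `null` for one that has not.

```javascript
// Hypothetical stand-in for Kettle's parent_job object (illustration only).
function makeParentJob(vars) {
  return {
    getVariable: function (name) {
      return Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : null;
    }
  };
}

// Same check as the JavaScript step: the product is 1 only when every flag is "1".
function allBranchesDone(parentJob, names) {
  var product = 1;
  for (var i = 0; i < names.length; i++) {
    product *= parentJob.getVariable(names[i]); // an unset flag (null) coerces to 0
  }
  return product == 1;
}

var flags = ["J1", "J2", "J3"];
console.log(allBranchesDone(makeParentJob({ J1: "1", J2: "1" }), flags));          // false
console.log(allBranchesDone(makeParentJob({ J1: "1", J2: "1", J3: "1" }), flags)); // true
```

As long as one branch has not yet set its flag, the product is 0 and the step takes the failure hop.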

The Write to log step is optional. I use it so that the failure path does not register a red-lined error in the log, with a Log Message such as:

" Waiting :${J1}-${J2}-${J3} "

That way I can see in the log which branch has finished and when.

I believe this can be done, but I don't have jobs big enough to test it properly, and it's awkward. Basically, you'll need 4 separate jobs in addition to your A, B, C, D, and E transformations. Let's call them Control Job, Job A_B, Job C_D, and Parallel Jobs.

You set them up like this:

Control Job: start -> Parallel Jobs -> E
Parallel Jobs:       -> Job A_B
               start<           (Set Start step to run next jobs in parallel)
                     -> Job C_D
Job A_B: start -> A -> B
Job C_D: start -> C -> D

The key is that A -> B and C -> D each need to be in their own job to retain the dependency. Parallel Jobs then makes sure both parallel paths have completed before control proceeds to E.
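
The nesting can be modelled in a few lines of plain JavaScript to see why E waits for both paths. This is only a simulation of the control flow, not Kettle code: `runSequence`, `runParallel`, `step`, and `controlJob` are hypothetical names, and the two branches are executed one after the other here rather than truly concurrently.

```javascript
// Run steps in order; stop at the first failure (like hops that follow on success).
function runSequence(steps, log) {
  for (var i = 0; i < steps.length; i++) {
    log.push(steps[i].name);
    if (!steps[i].run()) return false;
  }
  return true;
}

// Run every branch to completion, then succeed only if all of them succeeded.
function runParallel(branches, log) {
  var results = branches.map(function (branch) { return runSequence(branch, log); });
  return results.every(function (ok) { return ok; });
}

// A named step whose outcome is fixed in advance, for the simulation.
function step(name, ok) {
  return { name: name, run: function () { return ok; } };
}

// Control Job: start -> Parallel Jobs -> E
function controlJob(dSucceeds) {
  var log = [];
  var jobAB = [step("A", true), step("B", true)];
  var jobCD = [step("C", true), step("D", dSucceeds)];
  if (runParallel([jobAB, jobCD], log)) log.push("E");
  return log;
}

console.log(controlJob(true).join(" "));  // A B C D E
console.log(controlJob(false).join(" ")); // A B C D
```

E is reached only after both inner jobs have returned, and only if both reported success.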

Licensed under: CC-BY-SA with attribution