Question

I am developing a web app using Spring MVC. Simply put, a user uploads a file, which can be of different types (.csv, .xls, .txt, .xml), and the application parses this file and extracts data for further processing. The problem is that the format of the file can change frequently, so there must be a quick and easy way to customize the parsing. Being a bit familiar with Talend, I decided to give it a shot and use it as the ETL tool for my app. This short tutorial shows how to run a Talend job from within a Java app: http://www.talendforge.org/forum/viewtopic.php?id=2901 However, jobs created using Talend read from and write to physical files, directories or databases. Is it possible to modify a Talend job so that it can be given a Java object as a parameter and then return a Java object, just like an ordinary Java method? For example, something like:

String[] param = new String[]{"John Doe"};
String talendJobOutput = teaPot.myjob_0_1.myJob.main(param);

where teaPot.myjob_0_1.myJob is the Talend job integrated into my app.


Solution 2

Now that I better understand what you want, I think this is NOT possible, because Talend jobs are architected like a standalone application, with a single "main" entry point much like Java's main() method:

public String[][] runJob(String[] args) {
    int exitCode = runJobInTOS(args);
    String[][] bufferValue = new String[][] { { Integer.toString(exitCode) } };
    return bufferValue;
}

That is to say: the Talend execution entry point accepts only a String array as input and returns nothing as output except the exit code (wrapped in a String[][]).

So you won't be able to link to the Talend-generated code as a library; you can only treat it as an isolated tool that you parameterize (using context variables, see my other response) before launching.
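As a rough sketch of what "parameterize, launch, check the exit code" looks like from the calling side: the GeneratedJob interface and the fake job below are hypothetical stand-ins for the class Talend generates, so the example compiles without the Talend libraries. Only the runJob(String[]) signature is taken from the snippet above. The "--context_param name=value" argument form is, as far as I know, what generated jobs accept, but verify it against your own generated code.

```java
// Hypothetical stand-in for a Talend-generated job class; only the
// runJob(String[]) signature is taken from the generated code shown above.
interface GeneratedJob {
    String[][] runJob(String[] args);
}

class JobLauncherSketch {

    // Runs the job and returns its exit code, which is the only value
    // the entry point reports back to the caller.
    static int launch(GeneratedJob job, String... jobArgs) {
        String[][] result = job.runJob(jobArgs);
        return Integer.parseInt(result[0][0]);
    }

    public static void main(String[] argv) {
        // Fake job: reports exit code 0 when it receives exactly one argument.
        GeneratedJob fakeJob = received ->
                new String[][] { { Integer.toString(received.length == 1 ? 0 : 1) } };

        // "inputFile" is a hypothetical context variable name.
        int exitCode = launch(fakeJob, "--context_param inputFile=/tmp/upload.csv");
        System.out.println("exit code: " + exitCode);
    }
}
```

Anything richer than the exit code (parsed rows, error details) would have to travel through a side channel such as a file or a database table that both the job and the application can read.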

You can see that in the Talend help center and forum, the only integration described is "external" job execution:

Talend knowledge base "Calling a Talend Job from an external Java application" article

Talend Community Forum "Java Object to Talend" topic

Maybe you will have to rethink the architecture of your application if you want to use Talend as the ETL tool for this purpose.

OTHER TIPS

I did something similar, I guess. I created a mapping in Talend using tMap and exported it as a Talend job (a Java SE program). If you include the libraries of that job, you can run the Talend job as described by others.

To pass arbitrary Java objects, you can use the following methods, which are present in every Talend job:

public Object getValueObject() {
    return this.valueObject;
}

public void setValueObject(Object valueObject) {
    this.valueObject = valueObject;
}

Inside your job you have to cast this object. For example, you can pass in a List of HashMaps and use Java reflection to populate rows. Use tJavaFlex or a custom component for that.

Using this method, I can adjust the mapping of my data visually in Talend but still use the generated code as a library in my Java application.
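A minimal sketch of the idea, assuming the job receives a List of Maps and fills row objects by reflection. ValueObjectSketch and Row1Struct are hypothetical stand-ins so that the example runs without Talend: in a real job the getter and setter live on the generated job class, the row type is generated by Talend, and the body of readRows() would sit inside a tJavaFlex.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a generated row type with public fields.
class Row1Struct {
    public String name;
    public Integer age;
}

class ValueObjectSketch {
    // Mirrors the getter/setter pair quoted above.
    private Object valueObject;

    public Object getValueObject() {
        return this.valueObject;
    }

    public void setValueObject(Object valueObject) {
        this.valueObject = valueObject;
    }

    // Roughly what the job side might do: cast the object back to the
    // expected type, then copy each map entry into the same-named field.
    @SuppressWarnings("unchecked")
    List<Row1Struct> readRows() throws ReflectiveOperationException {
        List<Map<String, Object>> input = (List<Map<String, Object>>) getValueObject();
        List<Row1Struct> rows = new ArrayList<>();
        for (Map<String, Object> record : input) {
            Row1Struct row = new Row1Struct();
            for (Map.Entry<String, Object> entry : record.entrySet()) {
                Field field = Row1Struct.class.getField(entry.getKey());
                field.set(row, entry.getValue());
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Object> record = new HashMap<>();
        record.put("name", "John Doe");
        record.put("age", 42);
        List<Map<String, Object>> input = new ArrayList<>();
        input.add(record);

        ValueObjectSketch job = new ValueObjectSketch();
        job.setValueObject(input); // caller side: hand the data to the job
        Row1Struct row = job.readRows().get(0);
        System.out.println(row.name + ", " + row.age);
    }
}
```

The cast is unchecked, so the caller and the job have to agree on the object's type by convention; a mismatch only shows up at run time.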

Now, from the Talend ETL point of view: if you want to parameterize the execution environment of your jobs (for example, the physical directory of the uploaded files), you should use context variables, which can be loaded at execution time from a configuration file, as described here: https://help.talend.com/display/TalendOpenStudioforDataIntegrationUserGuide53EN/2.6.6+Context+settings
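If you would rather keep the configuration file on the application side, one possible approach (a caller-side sketch, not Talend's own context loading) is to read a properties file in Java and turn each entry into a job argument. The class name, the keys uploadDir and fieldSeparator, and the "--context_param name=value" argument form are assumptions; check the last one against your generated job.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

class ContextFileSketch {

    // Reads key=value pairs and turns each one into a job argument.
    static String[] toJobArgs(Reader config) throws IOException {
        Properties props = new Properties();
        props.load(config);
        List<String> jobArgs = new ArrayList<>();
        // A TreeSet gives a stable, alphabetical argument order.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            jobArgs.add("--context_param " + key + "=" + props.getProperty(key));
        }
        return jobArgs.toArray(new String[0]);
    }

    public static void main(String[] argv) throws IOException {
        // Inline stand-in for a file; an application would use a FileReader.
        String config = "uploadDir=/data/uploads\nfieldSeparator=;\n";
        for (String arg : toJobArgs(new StringReader(config))) {
            System.out.println(arg);
        }
    }
}
```

The resulting array can be passed as the String[] argument of the job's entry point.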

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow