Question

I need some help devising a strategy to parse JSON docs within a Talend job (Java job, not Perl). I am using Talend Version: 5.0.2 and developing on a Mac, planning to run on a Linux box.

Unfortunately, I cannot use the tFileInputJSON component because of the format of my files -- each file contains several hundred JSON docs, with a complete JSON doc taking up one line in the file. I think the right solution is to read the file line by line then pass it into a JSON parser and from there send the results to the rest of the job.

As I see it my options are:

a) send the line input to some sort of Java JSON parser. If that's the strategy I need to take, I'd like some advice on how to deal with the output and getting

b) find a Talend component that parses JSON docs, but within a flow as opposed to on a single file in valid JSON format.

I've searched around for this component but can't seem to find it. From my search, it seems even the tFileInputJSON component is relatively new.

I definitely know this is something Java can do pretty easily. My problem is getting the whole thing synced up within the Talend framework.

Anyone have some advice on where I should turn next?

Thanks in advance.


Solution

Have you tried creating a custom routine? In the repository window on the left, open Code, right-click Routines, and create your custom routine. This lets you write a Java function that can then be called from elsewhere in your job (tMap, tJava, and so on). You could read your input file and call the function on each line to do whatever processing you need.

Like any Java function, the routine can then write to a file, print to the screen, or return a list object that you can work on further in another tJava, tJavaFlex, tJavaRow, or other Talend component in your job.
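As a rough illustration of what such a routine could look like, here is a minimal sketch. The class name `JsonLineRoutine` and the method `extractStringFields` are made up for this example, and the regex is only a naive stand-in that handles flat objects with plain string values; a real job would more likely hand the line to a proper JSON library, such as the `JSONObject` constructor used in the other answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical routine. In Talend Studio it would normally be declared as
// "public class JsonLineRoutine" inside the "routines" package.
class JsonLineRoutine {

    // Matches "key":"value" pairs. Assumption: flat objects, string values,
    // no escaped quotes. Numbers, nested objects and arrays are ignored.
    private static final Pattern PAIR =
            Pattern.compile("\"([^\"]+)\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the string fields of one line, in the order they appear.
    public static Map<String, String> extractStringFields(String line) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        if (line == null) {
            return fields; // nothing to parse
        }
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }
}
```

In a tJavaRow, the call might then look something like `output_row.id = JsonLineRoutine.extractStringFields(input_row.line).get("id");`, assuming the incoming schema has a column named `line` and the outgoing one a column named `id`.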

It may feel a little hacky, but you can do a lot just using custom routines.

If you want to go all the way and create your own component, this forum thread (http://www.talendforge.org/forum/viewtopic.php?id=17650) may be a good place to start. Of course, creating components is much more time-consuming, but it may be worthwhile if you think you'll be reusing this code in multiple projects or cases.

Other suggestions

Read the file line by line, and construct a JSONObject for each line:

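import java.io.BufferedReader;
import java.io.FileReader;
import org.json.JSONObject;  // assuming the org.json library is on the job's classpath

final BufferedReader br = new BufferedReader(new FileReader(file));
try {
  String line;
  while ((line = br.readLine()) != null) {         // read until EOF
    final JSONObject json = new JSONObject(line);  // one complete JSON doc per line
    ...
  }
} finally {
  br.close();  // release the file handle even if parsing throws
}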
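
If it helps to keep the file handling separate from the parsing, the same loop can be wrapped in a small helper that returns one string per JSON doc. This is only a sketch under assumptions: the names `JsonLineReader` and `readDocuments` are invented for illustration, blank lines are skipped, and no JSON validation happens here, so each returned string still has to be passed to whatever parser the job uses.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects one string per non-blank line.
class JsonLineReader {

    // Reads the source to EOF and returns each non-blank line, trimmed.
    // Each line is expected to hold one complete JSON doc.
    public static List<String> readDocuments(Reader source) {
        List<String> docs = new ArrayList<String>();
        try {
            BufferedReader br = new BufferedReader(source);
            try {
                String line;
                while ((line = br.readLine()) != null) {
                    String trimmed = line.trim();
                    if (trimmed.length() == 0) {
                        continue; // skip blank lines between docs
                    }
                    docs.add(trimmed);
                }
            } finally {
                br.close(); // always release the underlying reader
            }
        } catch (IOException e) {
            // Wrapped so callers do not have to declare a checked exception.
            throw new IllegalStateException("Could not read JSON lines", e);
        }
        return docs;
    }
}
```

Called as `JsonLineReader.readDocuments(new FileReader(file))`, this loads every doc into memory, which should be fine for a few hundred docs per file but may not suit very large inputs.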
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow