Remote Java program execution using FTP, very large dataset on remote machine - program to data vs data to program

StackOverflow https://stackoverflow.com/questions/2068166

Question

I am developing a Java-based application; its pertinent requirements are listed below:

  • Large datasets exist on several machines on the network. My program needs to (remotely) execute a Java program to process these datasets and fetch the results.

  • A user on a Windows desktop will need to process datasets (several gigabytes) on machine A. My program can reside on the user's machine. The user will run my program from that machine and initiate the dataset processing on the remote machine(s).

  • Instead of pulling the dataset over the network from the remote machine to the local machine, the user will execute the program on the remote machine and fetch only the results.

  • The user may have open access to the other machines, but FTP is a requirement.

  • Data should not be brought over the network to the user's machine.

  • Users run Windows.

My question(s)

  • How can I perform this kind of remote process execution? Any ideas?

  • I am looking at Hadoop; I am working on Windows XP. I was unable to get Hadoop working as a single-node cluster, and I am unable to find good documentation, so I haven't really tested Hadoop. Any comments on whether I am on the right track?

  • Are there any links you have found useful for installing and troubleshooting Hadoop?

Thanks in advance for any responses. Do please let me know if I should provide any more/specific details.

-jv

Solution

Java has an RMI API that you could use, assuming that you can have a Java VM running on your remote machines. That's the lightest-weight solution. The next lightest would be straight socket communication. After that you're getting into EJB servers or web servers, which is probably overkill.
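As a rough illustration of the RMI suggestion, here is a minimal sketch in which the remote object reads a file that is local to it and returns only a small result (a line count). The names `RmiSketch`, `DatasetJob`, `countLines`, and `demo` are made up for this example, and both the "server" and the "client" run in one JVM purely so the sketch is self-contained. In a real setup the server part would run on the machine holding the data, and the client would obtain the registry with `LocateRegistry.getRegistry(host, port)`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class RmiSketch {

    /** Hypothetical remote interface: the caller sends a path and gets back a small summary. */
    public interface DatasetJob extends Remote {
        long countLines(String datasetPath) throws RemoteException;
    }

    /** Runs on the machine that holds the data; only the count travels back over the network. */
    static class DatasetJobImpl implements DatasetJob {
        @Override
        public long countLines(String datasetPath) throws RemoteException {
            try (Stream<String> lines = Files.lines(Paths.get(datasetPath))) {
                return lines.count();
            } catch (IOException e) {
                throw new RemoteException("Cannot read " + datasetPath, e);
            }
        }
    }

    /** Exports the job, looks it up, and invokes it through its stub. */
    static long runOnce(String datasetPath) throws Exception {
        // Demo only: advertise the loopback address in the stub so the call stays on this machine.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // "Server" side: a registry on a free port (a fixed port such as 1099 is typical in practice).
        Registry registry = LocateRegistry.createRegistry(0);
        DatasetJobImpl impl = new DatasetJobImpl();
        try {
            DatasetJob stub = (DatasetJob) UnicastRemoteObject.exportObject(impl, 0);
            try {
                registry.rebind("DatasetJob", stub);

                // "Client" side: look up the stub by name and call it like a local object.
                DatasetJob job = (DatasetJob) registry.lookup("DatasetJob");
                return job.countLines(datasetPath);
            } finally {
                UnicastRemoteObject.unexportObject(impl, true);
            }
        } finally {
            // Unexport so the JVM can exit once the demo is done.
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    /** Demo helper: writes sample lines to a temp file that stands in for the large dataset. */
    static long demo(List<String> sampleLines) {
        try {
            Path dataset = Files.createTempFile("dataset", ".txt");
            try {
                Files.write(dataset, sampleLines);
                return runOnce(dataset.toString());
            } finally {
                Files.deleteIfExists(dataset);
            }
        } catch (Exception e) {
            throw new IllegalStateException("RMI demo failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Lines counted by the remote object: " + demo(Arrays.asList("a", "b", "c")));
    }
}
```

The point of the design is that the dataset path, not the dataset, crosses the network: the heavy reading happens inside `countLines` on the remote side, and only a `long` comes back.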

OTHER TIPS

Have a look at how to write web services with Java 6. It lets you publish a method as a web service with an annotation. A web service client is small and does not require additional software. I found the IntelliJ IDEA IDE easy to use, and it generated a pure Java 6 client.

Then it essentially boils down to making a "normal" method call, and processing the result.
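The annotation approach described here is presumably JAX-WS (`@WebService` plus `Endpoint.publish`), which shipped with the JDK from Java 6 but was removed in Java 11. So that the example runs on a current JDK without extra libraries, the sketch below swaps in the JDK's built-in `com.sun.net.httpserver.HttpServer` and a plain HTTP GET instead of SOAP. It is a stand-in that shows the same shape (a method exposed over HTTP, a client wrapper that looks like a normal call), not the JAX-WS API itself. The names `HttpJobSketch`, `summarize`, `callRemote`, and the `/summarize` path are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class HttpJobSketch {

    /** Hypothetical processing step; a real one would read the dataset local to the server. */
    static String summarize(String datasetName) {
        return "processed:" + datasetName;
    }

    /** Server side: exposes summarize() at /summarize?dataset=NAME on a free loopback port. */
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/summarize", exchange -> {
            String query = exchange.getRequestURI().getQuery();
            String name = (query != null && query.startsWith("dataset="))
                    ? query.substring("dataset=".length())
                    : "";
            byte[] body = summarize(name).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Client side: from the caller's point of view this is just a method returning a String. */
    static String callRemote(int port, String datasetName) throws IOException {
        String encoded = URLEncoder.encode(datasetName, "UTF-8");
        URI uri = URI.create("http://127.0.0.1:" + port + "/summarize?dataset=" + encoded);
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        try (InputStream in = conn.getInputStream()) {
            // readAllBytes requires Java 9 or later.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    /** Demo helper: starts the server, makes one call, and shuts the server down again. */
    static String demo(String datasetName) {
        try {
            HttpServer server = startServer();
            try {
                return callRemote(server.getAddress().getPort(), datasetName);
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("sales-2009"));
    }
}
```

In a real deployment `startServer` would run on the machine that holds the data and `callRemote` on the user's desktop, with the loopback address replaced by the remote host name.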

Keep it simple. Grid software is most likely not what you want.

Licensed under: CC-BY-SA with attribution