Question

I'm using proc_open in PHP to call a Java application, pass it text to process, and read the output text. The Java execution time is quite long, and I found that reading the input takes most of it. I'm not sure whether PHP or Java is at fault.

My PHP code:

$process_cmd = "java -Dfile.encoding=UTF-8 -jar test.jar";

$env = NULL;

$options = ["bypass_shell" => true];
$cwd = NULL;
$descriptorspec = [
    0 => ["pipe", "r"],     //stdin is a pipe that the child will read from
    1 => ["pipe", "w"],     //stdout is a pipe that the child will write to
    2 => ["file", "java.error", "a"]
];

$process = proc_open($process_cmd, $descriptorspec, $pipes, $cwd, $env, $options);

if (is_resource($process)) {

    //feeding text to java
    fwrite($pipes[0], $input);
    fclose($pipes[0]);

    //reading output text from java
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);

    $return_value = proc_close($process);

}

My Java code:

public static void main(String[] args) throws Exception {

    long start;
    long end;

    start = System.currentTimeMillis();

    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    String in;
    String input = "";
    br = new BufferedReader(new InputStreamReader(System.in));
    while ((in = br.readLine()) != null) {
        input += in + "\n";
    }

    end = System.currentTimeMillis();
    log("Input: " + Long.toString(end - start) + " ms");


    start = System.currentTimeMillis();

    org.jsoup.nodes.Document doc = Jsoup.parse(input);

    end = System.currentTimeMillis();
    log("Parser: " + Long.toString(end - start) + " ms");


    start = System.currentTimeMillis();

    System.out.print(doc);

    end = System.currentTimeMillis();
    log("Output: " + Long.toString(end - start) + " ms");

}

I'm passing Java an HTML file of 3800 lines (~200 KB as a standalone file). These are the execution times from the log file, broken down by phase:

Input: 1169 ms
Parser: 98 ms
Output: 12 ms

My question is this: why does input take 100 times longer than output? Is there a way to make it faster?


Solution

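Inspect the read loop in your Java program. Use a StringBuilder to concatenate the data instead of += on a String. Each += copies the whole string accumulated so far, so the cost of the loop grows roughly quadratically with the size of the input: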

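String in;
StringBuilder input = new StringBuilder();
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
while ((in = br.readLine()) != null) {
    // append in two steps so no temporary String is built per line
    input.append(in).append('\n');
}
// later: Jsoup.parse(input.toString())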

Details are covered here: Why using StringBuilder explicitly
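
A further variation, offered as a minimal sketch rather than as part of the original answer: since the program needs the whole document before parsing anyway, it can read stdin in fixed-size chunks instead of line by line, which also keeps the original line endings. The class name StdinSlurp, the method name readAll, and the 8192-character buffer are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StdinSlurp {

    // Reads everything the reader delivers into one String, chunk by chunk.
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8192]; // assumed buffer size
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String input = readAll(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Written to stderr so that stdout stays free for the real output.
        System.err.println("Input: " + elapsedMs + " ms, " + input.length() + " chars");
    }
}
```

Decoding explicitly as UTF-8 here replaces the reliance on -Dfile.encoding=UTF-8 for the input side.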


More generally, to make it faster, consider using an application server (or a simple socket-based server) so that the JVM keeps running permanently. Starting a JVM always carries some overhead, and on top of that the JIT compiler needs time to optimize your code. That effort is lost when the JVM exits.
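
The "simple socket based server" idea could look roughly like the sketch below. It is an illustration under stated assumptions, not a production server: the class name ProcessingServer, port 9000, the one-request-per-connection protocol (the client sends its text, half-closes the connection, then reads the reply), and the placeholder process method are all hypothetical. The real program would call its parser where process is.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class ProcessingServer {

    // Placeholder for the real work (e.g. parsing the HTML); hypothetical.
    static String process(String input) {
        return input.toUpperCase(Locale.ROOT);
    }

    // Handles one client: read until the client half-closes, process, reply.
    static void handle(Socket client) throws IOException {
        try (Socket s = client) {
            InputStream in = s.getInputStream();
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                received.write(buffer, 0, n);
            }
            String input = new String(received.toByteArray(), StandardCharsets.UTF_8);
            OutputStream out = s.getOutputStream();
            out.write(process(input).getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }

    // Accept loop, one client at a time; ends when the server socket is closed.
    static void serve(ServerSocket server) {
        while (!server.isClosed()) {
            try {
                handle(server.accept());
            } catch (IOException e) {
                if (!server.isClosed()) {
                    System.err.println("Client failed: " + e.getMessage());
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int port = 9000; // assumed port
        try (ServerSocket server = new ServerSocket(port)) {
            serve(server);
        }
    }
}
```

On the PHP side, proc_open would then be replaced by a socket client: connect with stream_socket_client, fwrite the input, half-close with stream_socket_shutdown($fp, STREAM_SHUT_WR), and read the reply with stream_get_contents.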

As for the PHP program: try feeding the Java program from the shell, using cat to pipe the data (on a UNIX system such as Linux). Alternatively, rewrite the Java program so that it also accepts the input file as a command-line parameter. Either way, you can then judge whether your PHP code pipes the data fast enough.
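
The command-line-parameter variant might be sketched as follows: the input comes from the file named by the first argument if there is one, otherwise from stdin, and the read time is logged in both cases. The class name InputSource and the method name read are hypothetical, as is the file name in the usage lines.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class InputSource {

    // Returns the file content if a path was given, otherwise drains stdin.
    static String read(String[] args, InputStream stdin) {
        try {
            if (args.length > 0) {
                byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
                return new String(bytes, StandardCharsets.UTF_8);
            }
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = stdin.read(buffer)) != -1) {
                bytes.write(buffer, 0, n);
            }
            return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String input = read(args, System.in);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        String source = args.length > 0 ? "file" : "stdin";
        System.err.println("Input (" + source + "): " + elapsedMs + " ms, "
                + input.length() + " chars");
    }
}
```

Assuming the class is the jar's entry point and input.html is the test document, the two runs to compare would be:

cat input.html | java -jar test.jar
java -jar test.jar input.html

If both are fast while the run started from PHP is slow, the PHP side is the likelier culprit.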

As for the Java program: if you do performance analysis, consider the recommendations in "How do I write a correct micro-benchmark in Java".
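
As a rough illustration of why warm-up matters when timing such code, the sketch below runs each concatenation strategy a few times before measuring it once. It is a hand-rolled timing loop, not a rigorous benchmark; a harness such as JMH is the usual tool for that. The line count of 3800 mirrors the question, while the class name ConcatTiming, the synthetic line content, and the warm-up count of 5 are assumptions.

```java
import java.util.function.IntFunction;

final class ConcatTiming {

    // Builds the text with += on a String, as in the question.
    static String withPlusEquals(int lines) {
        String s = "";
        for (int i = 0; i < lines; i++) {
            s += "line " + i + "\n";
        }
        return s;
    }

    // Builds the same text with a StringBuilder.
    static String withBuilder(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("line ").append(i).append('\n');
        }
        return sb.toString();
    }

    // Runs the task a few times unmeasured, then times a single run.
    static long timeMicros(IntFunction<String> task, int lines, int warmups) {
        for (int i = 0; i < warmups; i++) {
            task.apply(lines);
        }
        long start = System.nanoTime();
        String result = task.apply(lines);
        long elapsed = System.nanoTime() - start;
        // Use the result so the measured call cannot be discarded as dead code.
        if (result.length() < lines) {
            throw new IllegalStateException("unexpected result length");
        }
        return elapsed / 1_000;
    }

    public static void main(String[] args) {
        int lines = 3800; // line count taken from the question
        int warmups = 5;  // assumed warm-up count
        System.err.println("+= on String:  "
                + timeMicros(ConcatTiming::withPlusEquals, lines, warmups) + " us");
        System.err.println("StringBuilder: "
                + timeMicros(ConcatTiming::withBuilder, lines, warmups) + " us");
    }
}
```

The numbers depend on the machine and JVM, so they are only useful for comparing the two strategies within the same run.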

Licensed under: CC-BY-SA with attribution