Should I buffer the InputStream or the InputStreamReader?

https://stackoverflow.com/questions/3459127

27-09-2019

Question

What are the differences (if any) between the following two buffering approaches?

Reader r1 = new BufferedReader(new InputStreamReader(in, "UTF-8"), bufferSize);
Reader r2 = new InputStreamReader(new BufferedInputStream(in, bufferSize), "UTF-8");

Solution

r1 is more efficient. The InputStreamReader itself doesn't have a large buffer. The BufferedReader can be configured with a larger buffer than the InputStreamReader has. The InputStreamReader in r2 would act as a bottleneck.

In a nutshell: you should read the data through a funnel, not through a bottle.

Update: here's a small benchmark program; just copy, paste, and run it. You don't need to prepare any files.

package com.stackoverflow.q3459127;

import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

public class Test {

    public static void main(String... args) throws Exception {

        // Init.
        int bufferSize = 10240; // 10KB.
        int fileSize = 100 * 1024 * 1024; // 100MB.
        File file = new File("/temp.txt");

        // Create file (it's also a good JVM warmup).
        System.out.print("Creating file .. ");
        BufferedWriter writer = null;
        try {
            writer = new BufferedWriter(new FileWriter(file));
            for (int i = 0; i < fileSize; i++) {
                writer.write("0");
            }
            System.out.printf("finished, file size: %d MB.%n", file.length() / 1024 / 1024);
        } finally {
            if (writer != null) try { writer.close(); } catch (IOException ignore) {}
        }

        // Read through funnel.
        System.out.print("Reading through funnel .. ");
        Reader r1 = null;        
        try {
            r1 = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8"), bufferSize);
            long st = System.nanoTime();
            for (int data; (data = r1.read()) > -1;);
            long et = System.nanoTime();
            System.out.printf("finished in %d ms.%n", (et - st) / 1000000);
        } finally {
            if (r1 != null) try { r1.close(); } catch (IOException ignore) {}
        }

        // Read through bottle.
        System.out.print("Reading through bottle .. ");
        Reader r2 = null;        
        try {
            r2 = new InputStreamReader(new BufferedInputStream(new FileInputStream(file), bufferSize), "UTF-8");
            long st = System.nanoTime();
            for (int data; (data = r2.read()) > -1;);
            long et = System.nanoTime();
            System.out.printf("finished in %d ms.%n", (et - st) / 1000000);
        } finally {
            if (r2 != null) try { r2.close(); } catch (IOException ignore) {}
        }

        // Cleanup.
        if (!file.delete()) System.err.printf("Oops, failed to delete %s. Cleanup yourself.%n", file.getAbsolutePath());
    }

}

Results on my Latitude E5500 with a Seagate Momentus 7200.3 hard disk:

Creating file .. finished, file size: 99 MB.
Reading through funnel .. finished in 1593 ms.
Reading through bottle .. finished in 7760 ms.
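
The timings above differ, but the decoded text should not: both wrappings yield the same characters, so the choice is about performance only. The following is a minimal sketch to check this in memory, not part of the original benchmark. The class name SameOutputDemo and its helper methods are hypothetical, and the buffer size of 16 is an arbitrary small value chosen so that multi-byte UTF-8 sequences straddle buffer boundaries.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class SameOutputDemo {

    // Drains a Reader into a String using a char[] chunk.
    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] chunk = new char[1024];
        for (int n; (n = reader.read(chunk, 0, chunk.length)) != -1;) {
            sb.append(chunk, 0, n);
        }
        return sb.toString();
    }

    // The r1 approach: buffer the characters after decoding.
    static String readViaFunnel(byte[] bytes, int bufferSize) throws IOException {
        try (Reader r1 = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8), bufferSize)) {
            return drain(r1);
        }
    }

    // The r2 approach: buffer the bytes before decoding.
    static String readViaBottle(byte[] bytes, int bufferSize) throws IOException {
        try (Reader r2 = new InputStreamReader(
                new BufferedInputStream(new ByteArrayInputStream(bytes), bufferSize), StandardCharsets.UTF_8)) {
            return drain(r2);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a text with 1-, 2- and 3-byte UTF-8 sequences.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("h\u00e9llo w\u00f6rld \u65e5\u672c\u8a9e\n");
        }
        String text = sb.toString();
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

        System.out.println(readViaFunnel(bytes, 16).equals(text));
        System.out.println(readViaBottle(bytes, 16).equals(text));
    }
}
```

Using StandardCharsets.UTF_8 instead of the "UTF-8" string avoids the checked UnsupportedEncodingException; it requires Java 7 or later.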

Other tips

r1 is also more convenient when you read a line-based stream, because BufferedReader supports the readLine method. You don't have to read the content into a char array buffer or read chars one by one. However, you have to cast r1 to BufferedReader or declare the variable with that type explicitly.

I often use this code snippet:

BufferedReader br = ...
String line;
while((line=br.readLine())!=null) {
  //process line
}
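
For a version that can be run as-is, here is a minimal sketch of the same loop. The class name ReadLinesDemo and the helper readLines are made up for illustration, and an in-memory ByteArrayInputStream stands in for whatever stream you actually have. The variable is declared as BufferedReader rather than Reader so that readLine is available without a cast.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ReadLinesDemo {

    // Reads all lines from a UTF-8 encoded stream and closes it afterwards.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line); // line terminators are already stripped
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Sample input mixing \n and \r\n terminators; the last line has none.
        InputStream in = new ByteArrayInputStream(
                "first\nsecond\r\nthird".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(in)); // prints [first, second, third]
    }
}
```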

In response to Ross Studtman's question in the comment above (but also relevant to the OP):

BufferedReader reader = new BufferedReader(new InputStreamReader(new BufferedInputStream(inputStream), "UTF-8"));

The BufferedInputStream is superfluous (and likely harms performance because of the extra copying). This is because the BufferedReader requests characters from the InputStreamReader in large chunks by calling InputStreamReader.read(char[], int, int), which in turn (through StreamDecoder) calls InputStream.read(byte[], int, int) to read a large block of bytes from the underlying InputStream.

You can convince yourself that this is so by running the following code:

new BufferedReader(new InputStreamReader(new ByteArrayInputStream("Hello world!".getBytes("UTF-8")) {

    @Override
    public synchronized int read() {
        System.err.println("ByteArrayInputStream.read()");
        return super.read();
    }

    @Override
    public synchronized int read(byte[] b, int off, int len) {
        System.err.println("ByteArrayInputStream.read(..., " + off + ", " + len + ')');
        return super.read(b, off, len);
    }

}, "UTF-8") {

    @Override
    public int read() throws IOException {
        System.err.println("InputStreamReader.read()");
        return super.read();
    }

    @Override
    public int read(char[] cbuf, int offset, int length) throws IOException {
        System.err.println("InputStreamReader.read(..., " + offset + ", " + length + ')');
        return super.read(cbuf, offset, length);
    }

}).read(); // read one character from the BufferedReader

You will see the following output:

InputStreamReader.read(..., 0, 8192)
ByteArrayInputStream.read(..., 0, 8192)

This demonstrates that the BufferedReader requests a large chunk of chars from the InputStreamReader, which in turn requests a large chunk of bytes from the underlying InputStream.
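
To put a rough number on this, the sketch below counts how many read calls actually reach the underlying stream while a BufferedReader is drained one character at a time, with and without an extra BufferedInputStream in between. The names UnderlyingReadCounter, CountingInputStream, and countReads are hypothetical helpers, not part of any library. The exact counts depend on JDK internals (such as the StreamDecoder buffer size), so no specific values are claimed here; the point is only that they are tiny compared to the number of characters read.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class UnderlyingReadCounter {

    // Hypothetical helper: counts every read call that reaches the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads all characters one by one and returns the number of underlying read calls.
    static int countReads(byte[] data, boolean extraByteBuffer) throws IOException {
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(data));
        InputStream source = extraByteBuffer ? new BufferedInputStream(counting) : counting;
        try (Reader reader = new BufferedReader(new InputStreamReader(source, StandardCharsets.UTF_8))) {
            while (reader.read() != -1) {
                // discard the character; only the call count matters here
            }
        }
        return counting.calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100000];
        Arrays.fill(data, (byte) '0'); // 100,000 ASCII characters

        System.out.println("Without BufferedInputStream: " + countReads(data, false) + " underlying reads");
        System.out.println("With BufferedInputStream:    " + countReads(data, true) + " underlying reads");
    }
}
```

If both counts come out similarly small, that supports the point above: the BufferedReader already shields the underlying stream from per-character reads, so the extra byte buffer adds a copy without saving calls.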

FWIW, if you are opening a file in Java 8, you can use the

Licensed under: CC-BY-SA with attribution
