BufferedReader for a large ByteBuffer?
20-08-2019
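Question
Is there a way to read a ByteBuffer with a BufferedReader without having to turn it into a String first? I want to read through a fairly large ByteBuffer as lines of text, and for performance reasons I want to avoid writing it to disk. Calling toString on the ByteBuffer doesn't work because the resulting String is too large (it throws java.lang.OutOfMemoryError: Java heap space). I would have thought there would be something in the API to wrap a ByteBuffer in a suitable reader, but I can't seem to find anything suitable.
Here's an abbreviated code sample that illustrates what I am doing: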
// input stream is from Process getInputStream()
public String read(InputStream istream) throws IOException
{
    ReadableByteChannel source = Channels.newChannel(istream);
    ByteArrayOutputStream ostream = new ByteArrayOutputStream(bufferSize);
    WritableByteChannel destination = Channels.newChannel(ostream);
    ByteBuffer buffer = ByteBuffer.allocateDirect(writeBufferSize);
    while (source.read(buffer) != -1)
    {
        buffer.flip();
        while (buffer.hasRemaining())
        {
            destination.write(buffer);
        }
        buffer.clear();
    }
    // this data can be up to 150 MB.. won't fit in a String.
    String result = ostream.toString();
    source.close();
    destination.close();
    return result;
}
// after the process is run, we call this method with the String
public void readLines(String text) throws IOException
{
    BufferedReader reader = new BufferedReader(new StringReader(text));
    String line;
    while ((line = reader.readLine()) != null)
    {
        // do stuff with line
    }
}
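Solution
It's not clear why you're using a byte buffer to start with. If you've got an InputStream and you want to read lines from it, why don't you just use an InputStreamReader wrapped in a BufferedReader? What's the benefit of getting NIO involved?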
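A minimal sketch of that direct approach, assuming the process output is UTF-8 text. The class name `DirectLineReader`, the method `countLines`, and the charset are illustrative assumptions and do not come from the question.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads lines straight from the stream, one at a time,
// so the full output never has to be held in memory as a single String.
final class DirectLineReader {
    static int countLines(InputStream in) throws IOException {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) { // charset is an assumption
            String line;
            while ((line = reader.readLine()) != null) {
                // do stuff with line
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for Process.getInputStream()
        InputStream in = new ByteArrayInputStream(
                "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countLines(in)); // prints 3
    }
}
```

Because each line is handled as it arrives, memory use is bounded by the longest line plus the reader's internal buffer rather than by the total output size.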
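Calling toString() on a ByteArrayOutputStream sounds like a bad idea to me even if you had the space for it: if you really must have a ByteArrayOutputStream, it is better to get the data as a byte array and wrap it in a ByteArrayInputStream and then an InputStreamReader. If you really want to call toString(), at least use the overload that takes the name of the character encoding to use; otherwise it will use the system default, which probably isn't what you want.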
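To illustrate the encoding point, here is a small sketch of both options with the charset named explicitly. The class and method names are hypothetical, and UTF-8 is only an assumed encoding for the data.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetSketch {
    // Option 1: toString with a charset name instead of the platform default.
    // Still builds one large String, so it only suits small outputs.
    static String decodeUtf8(ByteArrayOutputStream baos) throws UnsupportedEncodingException {
        return baos.toString("UTF-8");
    }

    // Option 2: wrap the bytes in a reader. Note that toByteArray() copies the data.
    static BufferedReader openReader(ByteArrayOutputStream baos) {
        return new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(baos.toByteArray()), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] bytes = "h\u00e9llo\n".getBytes(StandardCharsets.UTF_8);
        baos.write(bytes, 0, bytes.length);
        System.out.println(decodeUtf8(baos).trim());
        System.out.println(openReader(baos).readLine());
    }
}
```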
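EDIT: Okay, so you really want to use NIO. You're still writing to a ByteArrayOutputStream eventually, so you'll end up with a BAOS with the data in it. If you want to avoid making a copy of that data, you'll need to derive from ByteArrayOutputStream, for instance like this: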
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class ReadableByteArrayOutputStream extends ByteArrayOutputStream
{
    /**
     * Converts the data in the current stream into a ByteArrayInputStream.
     * The resulting stream wraps the existing byte array directly;
     * further writes to this output stream will result in unpredictable
     * behavior.
     */
    public InputStream toInputStream()
    {
        // buf and count are the protected fields inherited from ByteArrayOutputStream
        return new ByteArrayInputStream(buf, 0, count);
    }
}
Then you can create the input stream, wrap it in an InputStreamReader, wrap that in a BufferedReader, and you're away.
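The wrapping chain might look like the sketch below. It repeats the subclass idea as a nested class (named `ReadableBaos` here purely for illustration) so that it compiles on its own, and it assumes UTF-8 data. Collecting the lines into a list is only for the demo; with 150 MB of data each line would be processed inside the loop instead.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class NoCopyDemo {
    // Exposes the internal array of ByteArrayOutputStream without copying it.
    static final class ReadableBaos extends ByteArrayOutputStream {
        InputStream toInputStream() {
            return new ByteArrayInputStream(buf, 0, count); // protected inherited fields
        }
    }

    static List<String> readLines(ReadableBaos out) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(out.toInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // demo only; real code would handle the line here
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        ReadableBaos out = new ReadableBaos();
        byte[] bytes = "one\ntwo\n".getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        System.out.println(readLines(out)); // prints [one, two]
    }
}
```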
Other tips
You could use NIO, but there's no real need for it here. As Jon Skeet suggested:
public byte[] read(InputStream istream) throws IOException
{
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024]; // Experiment with this value
    int bytesRead;
    while ((bytesRead = istream.read(buffer)) != -1)
    {
        baos.write(buffer, 0, bytesRead);
    }
    return baos.toByteArray();
}
// after the process is run, we call this method with the byte array
public void readLines(byte[] data) throws IOException
{
    // no charset is given here, so the platform default encoding is used
    BufferedReader reader = new BufferedReader(new InputStreamReader(new ByteArrayInputStream(data)));
    String line;
    while ((line = reader.readLine()) != null)
    {
        // do stuff with line
    }
}
Here is a sample:
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class ByteBufferBackedInputStream extends InputStream {
    ByteBuffer buf;

    public ByteBufferBackedInputStream(ByteBuffer buf) {
        this.buf = buf;
    }

    @Override
    public synchronized int read() throws IOException {
        if (!buf.hasRemaining()) {
            return -1;
        }
        // mask to return the byte as an unsigned value in 0..255
        return buf.get() & 0xFF;
    }

    @Override
    public int available() throws IOException {
        return buf.remaining();
    }

    @Override
    public synchronized int read(byte[] bytes, int off, int len) throws IOException {
        if (!buf.hasRemaining()) {
            return -1;
        }
        len = Math.min(len, buf.remaining());
        buf.get(bytes, off, len);
        return len;
    }
}
And you can use it like this:
String text = "this is text"; // It can be Unicode text
ByteBuffer buffer = ByteBuffer.wrap(text.getBytes("UTF-8"));
InputStream is = new ByteBufferBackedInputStream(buffer);
InputStreamReader r = new InputStreamReader(is, "UTF-8");
BufferedReader br = new BufferedReader(r);
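To round this off, here is a self-contained sketch that reads lines out of a direct ByteBuffer through such an adapter. The nested `BufferInputStream` is a compact restatement of the adapter idea so the example compiles alone; its name, the `countLines` helper, and the UTF-8 charset are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class DirectBufferDemo {
    // Minimal InputStream view over whatever remains in the buffer.
    static final class BufferInputStream extends InputStream {
        private final ByteBuffer buf;

        BufferInputStream(ByteBuffer buf) { this.buf = buf; }

        @Override
        public int read() {
            return buf.hasRemaining() ? (buf.get() & 0xFF) : -1;
        }

        @Override
        public int read(byte[] bytes, int off, int len) {
            if (len == 0) return 0;
            if (!buf.hasRemaining()) return -1;
            int n = Math.min(len, buf.remaining());
            buf.get(bytes, off, n);
            return n;
        }
    }

    static int countLines(ByteBuffer data) throws IOException {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new BufferInputStream(data), StandardCharsets.UTF_8))) {
            while (reader.readLine() != null) {
                lines++; // process the line here instead of just counting
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "alpha\nbeta\ngamma".getBytes(StandardCharsets.UTF_8);
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes.length);
        direct.put(bytes);
        direct.flip(); // switch the buffer from writing to reading
        System.out.println(countLines(direct)); // prints 3
    }
}
```

The buffer must be flipped before it is wrapped; otherwise the stream sees no remaining bytes and reports end of input immediately.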