Question

How can I get the number of lines (rows) from an InputStream or from a CsvMapper without looping through and counting them?

Below I have an InputStream created from a CSV file.

InputStream content = (... from a resource ...);
CsvMapper mapper = new CsvMapper();
mapper.enable(CsvParser.Feature.WRAP_AS_ARRAY);
MappingIterator<Object[]> it = mapper
        .reader(Object[].class)
        .readValues(content);

Is it possible to do something like

int totalRows = mapper.getTotalRows();

I would like to use this number in the loop to update progress.

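int currentRow = 0;
while (it.hasNextValue()) {
    Object[] row = it.nextValue(); // advance the iterator, otherwise the loop never ends
    //do stuff here

    updateProgressHere(++currentRow, totalRows);
}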

Obviously, I can loop through and count them once, then loop through again and process them while updating progress. This is inefficient and slow, as some of these InputStreams are huge.

Solution

Unless you know the row count ahead of time, it is not possible without looping. You have to read the file in its entirety to know how many lines it contains, and neither InputStream nor CsvMapper has a means of reading ahead and abstracting that for you (both are stream-oriented interfaces).

None of the interfaces that ObjectReader can operate on supports querying the underlying file size (if it's a file) or the number of bytes read so far.

One possible option is to create your own custom InputStream that also provides methods for grabbing the total size and number of bytes read so far, e.g. if it is reading from a file, it can expose the underlying File.length() and also track the number of bytes read. This may not be entirely accurate, especially if Jackson buffers far ahead, but it could get you something at least.
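
A minimal sketch of such a wrapper follows. ProgressInputStream and its methods are hypothetical names chosen for illustration, not part of Jackson or the JDK, and the total size is assumed to be supplied by the caller (for example from File.length()).

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not a Jackson or JDK class): counts the bytes that
// have been handed to whoever reads from this stream.
class ProgressInputStream extends FilterInputStream {
    private final long totalBytes; // assumed to be known by the caller, e.g. File.length()
    private long bytesRead;

    ProgressInputStream(InputStream in, long totalBytes) {
        super(in);
        this.totalBytes = totalBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            bytesRead++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            bytesRead += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        bytesRead += skipped;
        return skipped;
    }

    // mark/reset would make the counter drift, so this sketch opts out of it.
    @Override
    public boolean markSupported() {
        return false;
    }

    long getBytesRead() {
        return bytesRead;
    }

    long getTotalBytes() {
        return totalBytes;
    }

    // Fraction of the input consumed so far, in the range 0.0 to 1.0.
    double fractionRead() {
        if (totalBytes <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) bytesRead / totalBytes);
    }
}
```

The wrapper would be passed to readValues(...) in place of the raw stream, and the processing loop would report fractionRead() instead of currentRow / totalRows. Because the parser reads in buffered chunks, the reported fraction runs somewhat ahead of the row actually being processed.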

OTHER TIPS

Technically speaking, there are only two ways. Either (as you have seen) loop through and increment a counter, or:

Have the sender transmit the count first and the data after it. The reader can then interpret the first bytes of the stream as the row count before processing the rest. The precondition is, of course, that the sending application knows the size of the data in advance.
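
One way this could look, as a sketch only: the framing below (a 4-byte big-endian row count followed by the CSV bytes) and the class CountPrefixedCsv are assumptions made for illustration, not an established format. It only applies when you control the producing side.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical framing: [4-byte row count][CSV bytes...]
class CountPrefixedCsv {

    // Sender side: write the row count first, then the rows themselves.
    static byte[] write(List<String> rows) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(rows.size());
        for (String row : rows) {
            out.write((row + "\n").getBytes(StandardCharsets.UTF_8));
        }
        out.flush();
        return bytes.toByteArray();
    }

    // Receiver side: consume exactly the 4 header bytes and return the count.
    // DataInputStream does not buffer, so the caller can keep reading the
    // CSV payload from the same InputStream afterwards.
    static int readRowCount(InputStream in) throws IOException {
        return new DataInputStream(in).readInt();
    }
}
```

On the receiving side, readRowCount(content) would be called once before handing content to readValues(...), and the returned value would serve as totalRows in the progress loop.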

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow