Question

I have a huge text file that I need to read. Currently I am reading it like this:

string[] lines = File.ReadAllLines(FileToCopy);

But this stores every line in the lines array, and only afterwards are the lines filtered by a condition in code. That is inefficient: the irrelevant lines are read into the array first and then processed as well. So my question is: can I specify the line number to start reading from? Suppose the last run read 10001 lines; the next run should start from line 10002. How can I achieve this?

Solution 2

Ignore line numbers, they're useless here. Unless every line has the same length, you'd have to read the lines one by one again just to find where line N starts, and that's a huge waste.

Instead, use the position of the file stream. This way, you can skip right there on the second attempt, no need to read the data all over again. After that, you'll just use ReadLine in a loop until you get to the end, and mark the new end position.

Please don't use ReadLines().Skip(). If you have a 10 GB file, it will read all 10 GB, create the corresponding strings, throw them away, and only then read the 100 bytes you actually want. That's just crazy :) Of course, it's better than File.ReadAllLines, but only because ReadLines doesn't need to keep the whole file in memory at once. Other than that, you're still reading every single byte of the file (you have to find out where the lines end).

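Sample code of a method that reads from the last known position:

// Requires: using System.Collections.Generic; using System.IO;
string[] ReadAllLinesFromBookmark(string fileName, ref long lastPosition)
{
    using (var fs = File.OpenRead(fileName))
    {
        // Jump straight to the byte offset where the previous call stopped.
        fs.Position = lastPosition;

        using (var sr = new StreamReader(fs))
        {
            string line = null;

            List<string> lines = new List<string>();

            while ((line = sr.ReadLine()) != null)
            {
                lines.Add(line);
            }

            // StreamReader buffers, so fs.Position is a reliable bookmark
            // only because the loop above has read to the end of the file.
            lastPosition = fs.Position;

            return lines.ToArray();
        }
    }
}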
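As a usage sketch for the method above, the following small program calls it twice on a temporary file, appending lines in between, so that the second call only returns the new lines. The class name BookmarkDemo, the temporary file, and the sample contents are assumptions made for illustration; the method body is reproduced only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class BookmarkDemo
{
    // Seeks to the saved byte offset, reads to the end of the file,
    // and stores the new end offset back into lastPosition.
    public static string[] ReadAllLinesFromBookmark(string fileName, ref long lastPosition)
    {
        using (var fs = File.OpenRead(fileName))
        {
            fs.Position = lastPosition;
            using (var sr = new StreamReader(fs))
            {
                var lines = new List<string>();
                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    lines.Add(line);
                }
                // Reliable only because the reader has consumed the stream to the end.
                lastPosition = fs.Position;
                return lines.ToArray();
            }
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName(); // hypothetical scratch file
        try
        {
            File.WriteAllText(path, "a\nb\n");
            long bookmark = 0; // 0 means "start of file" on the first run

            string[] first = ReadAllLinesFromBookmark(path, ref bookmark);
            Console.WriteLine(string.Join(",", first));  // a,b

            File.AppendAllText(path, "c\nd\n");

            // The second call starts at the saved offset and skips nothing by reading.
            string[] second = ReadAllLinesFromBookmark(path, ref bookmark);
            Console.WriteLine(string.Join(",", second)); // c,d
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

To resume across program runs, the bookmark value would have to be persisted somewhere (a small side file, for instance). Note also that if another process is still writing the file, the last line returned may be incomplete; this sketch does not handle that case.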

Other tips

Well you don't have to store all those lines - but you definitely have to read them. Unless the lines are of a fixed length (in bytes, not characters) how would you expect to be able to skip to a particular part of the file?

To store only the lines you want in memory though, use:

List<string> lines = File.ReadLines(FileToCopy).Skip(linesToSkip).ToList();

Note that File.ReadLines() was introduced in .NET 4, and reads the lines on-demand with an iterator instead of reading the entire file into memory.

If you only want to process a certain number of lines, you can use Take as well:

List<string> lines = File.ReadLines(FileToCopy)
                         .Skip(linesToSkip)
                         .Take(linesToRead)
                         .ToList();

So for example, linesToSkip=10000 and linesToRead=1000 would give you lines 10001-11000.
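A minimal sketch of the Skip/Take approach on a small temporary file, to make the window arithmetic concrete. The helper name ReadWindow, the class name, and the sample file contents are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LineWindowDemo
{
    // Returns linesToRead lines after skipping the first linesToSkip lines.
    // The skipped lines are still read from disk; they are just not kept.
    public static List<string> ReadWindow(string path, int linesToSkip, int linesToRead)
    {
        return File.ReadLines(path)
                   .Skip(linesToSkip)
                   .Take(linesToRead)
                   .ToList();
    }

    public static void Main()
    {
        string path = Path.GetTempFileName(); // hypothetical scratch file
        try
        {
            // Write "line 1" .. "line 10".
            File.WriteAllLines(path, Enumerable.Range(1, 10).Select(i => "line " + i));

            // Skip 3, take 2: yields the 4th and 5th lines.
            foreach (string line in ReadWindow(path, 3, 2))
            {
                Console.WriteLine(line);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Because Take stops the enumeration once it has enough lines, the rest of the file after the window is not read.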

Well, you do have line numbers, in the form of the array index. Keep a note of the last array index you processed and start reading from the next index.

Use the FileStream.Position property to get the current position in the file, and set it again later to resume from there.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow