Question

Was wondering if anyone had any favourite methods or useful libraries for processing a tab-delimited text file? The file will have, on average, 30,000-50,000 rows. I just need to read through each row and throw it into a database. However, I'd need to temporarily store all the data; the reason being that if the table holding the data gets to more than 1,000,000 rows, I'll need to create a new table and put the data in there. The code will be run in a Windows service, so I'm not worried about processing time.

Was thinking about just doing a standard while (sr.ReadLine()) loop, roughly the sketch below ... any suggestions?
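A minimal sketch of what I mean (the file path is made up, and the database step is left as a stub):

    using System.IO;

    class Importer
    {
        static void Main()
        {
            // Placeholder path; the real service would read this from config.
            string path = @"C:\data\import.txt";

            using (StreamReader sr = new StreamReader(path))
            {
                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    // Tab-delimited: split each row into its fields.
                    string[] fields = line.Split('\t');

                    // TODO: buffer 'fields' and write them to the database,
                    // rolling over to a new table past the 1,000,000-row mark.
                }
            }
        }
    }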

Cheers,

Sean.


Solution

This library is very flexible and fast. I never get tired of recommending it. It defaults to ',' as the delimiter, but you can change it to '\t' easily.
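The link the answer pointed to is lost here; judging by the description (fast, flexible, comma as the default delimiter), it may be the LumenWorks Fast CSV Reader, so the sketch below assumes that library's CsvReader API. Treat the library choice as an assumption:

    using System.IO;
    using LumenWorks.Framework.IO.Csv; // assumed library; the original link is lost

    class CsvImport
    {
        static void Main()
        {
            // hasHeaders = true; pass '\t' to override the default ',' delimiter.
            using (CsvReader csv = new CsvReader(
                       new StreamReader(@"C:\data\import.txt"), true, '\t'))
            {
                int fieldCount = csv.FieldCount;
                while (csv.ReadNextRecord())
                {
                    for (int i = 0; i < fieldCount; i++)
                    {
                        string field = csv[i]; // fields by index (or by header name)
                        // hand the row off to the database layer here
                    }
                }
            }
        }
    }

If it is that library, CsvReader also implements IDataReader, so it can be handed straight to SqlBulkCopy for the actual database load.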

OTHER TIPS

I suspect "throwing it into a database" will take at least 1 order of magnitude longer than reading a line into a buffer, so you could pre-scan the data just to count the number of rows (without parsing them). Then make your database decisions. Then re-read the data doing the real work. With luck, the OS will have cached the file so it reads even quicker.
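A minimal sketch of that two-pass idea (the path is a placeholder; the one-million-row cap comes from the question):

    using System;
    using System.IO;

    class TwoPassImport
    {
        const int MaxRowsPerTable = 1000000;

        static void Main()
        {
            string path = @"C:\data\import.txt";

            // Pass 1: count the rows without parsing them.
            int rowCount = 0;
            using (StreamReader sr = new StreamReader(path))
            {
                while (sr.ReadLine() != null)
                    rowCount++;
            }

            // Decide how many tables are needed before touching the database.
            int tablesNeeded = (rowCount + MaxRowsPerTable - 1) / MaxRowsPerTable;
            Console.WriteLine("Need {0} table(s) for {1} rows.", tablesNeeded, rowCount);

            // Pass 2: re-read and do the real parsing and inserting; the OS
            // file cache usually makes this second read much faster.
            using (StreamReader sr = new StreamReader(path))
            {
                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    string[] fields = line.Split('\t');
                    // insert 'fields' into the appropriate table here
                }
            }
        }
    }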

Licensed under: CC-BY-SA with attribution