Question

I'm working on a .NET web service that will be processing a text file with a relatively long, multilevel record format. Each record in the file represents a different entity, and each record contains multiple sub-types. (The same record format is currently being processed by a COBOL job, if that gives you a better picture of what we're looking at.) I've created a class structure (a DATA DIVISION, if you will) to hold the input data.

My question is, what best practices have you found for processing large, complex fixed-width files in .NET? My general approach will be to read the entire line into a string and then parse the data from the string into the classes I've created. But I'm not sure whether I'll get better results working with the characters in the string as an array, or with the string itself. I guess that's the specific question, string vs. char[], but I would appreciate any other pointers anyone has.
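For illustration, the string-based version of that approach would look roughly like this (the field names, offsets, and file name below are invented):

```csharp
// Purely illustrative sketch: record layout and file name are made up.
using (var reader = new System.IO.StreamReader("input.dat"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // Pull each field straight out of the line by position.
        string recordType    = line.Substring(0, 2);
        string accountNumber = line.Substring(2, 10).Trim();
        string name          = line.Substring(12, 30).Trim();
        // ... copy the values into the corresponding class
    }
}
```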

Thanks.


Solution

I would build classes that match the data in the rows, using attributes for types, lengths, etc. Then use the Microsoft.VisualBasic.FileIO.TextFieldParser class to read the file, with some generic code that configures the parser from the class's attributes, reads the data, and creates an instance of the class (all using reflection).

I use this for reading CSVs and it's fast, flexible, extensible, generic, and easy to maintain. I also have attributes that allow me to add generic validation to each field as it's being read.

I'd share my code, but it's the IP of the firm I work for.
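A minimal sketch of the same idea (not the author's actual code; the attribute, record class, and reader names below are invented, and the record layout is purely illustrative) might look like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Microsoft.VisualBasic.FileIO;

// Attribute describing a field's position in the record and its width.
[AttributeUsage(AttributeTargets.Property)]
public sealed class FixedWidthFieldAttribute : Attribute
{
    public int Order { get; }
    public int Length { get; }
    public FixedWidthFieldAttribute(int order, int length)
    {
        Order = order;
        Length = length;
    }
}

// Example record class; the layout is hypothetical.
public class CustomerRecord
{
    [FixedWidthField(0, 10)] public string AccountNumber { get; set; }
    [FixedWidthField(1, 30)] public string Name { get; set; }
    [FixedWidthField(2, 8)]  public string OpenDate { get; set; }
}

public static class FixedWidthReader
{
    // Configures a TextFieldParser from the attributes on T, then reads each
    // record and populates an instance of T via reflection.
    public static IEnumerable<T> Read<T>(string path) where T : new()
    {
        var props = typeof(T).GetProperties()
            .Select(p => new { Prop = p, Attr = p.GetCustomAttribute<FixedWidthFieldAttribute>() })
            .Where(x => x.Attr != null)
            .OrderBy(x => x.Attr.Order)
            .ToArray();

        using (var parser = new TextFieldParser(path))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(props.Select(x => x.Attr.Length).ToArray());

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                var record = new T();
                for (int i = 0; i < props.Length && i < fields.Length; i++)
                {
                    // All properties are strings here; real code would also convert
                    // types and run per-field validation attributes at this point.
                    props[i].Prop.SetValue(record, fields[i].Trim());
                }
                yield return record;
            }
        }
    }
}
```

Usage is then one call per record type, e.g. `foreach (var rec in FixedWidthReader.Read<CustomerRecord>("input.dat")) { ... }`, and supporting a new record layout only means writing a new attributed class.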

Licensed under: CC-BY-SA with attribution