Question

I have a structured log file. Each line is delimited by commas so that it looks like a table. I have to read from this file and transfer the content to a database table. I want to treat the first 9 commas as delimiters and the rest as just part of the tenth column's content. So, there can be more than 9 commas in a line, but the tenth and following ones shouldn't be treated as delimiters. I know how to do it by iterating through the characters and checking for commas, but I don't want it to be inefficient; maybe there's a better way? What would be the most accurate way to accomplish this? Either C# or Delphi is OK. For the database I'll probably use Oracle, but SQL Server is also an option.

Solution

The String.Split(char[], int) method overload achieves what you require. For example:

string line = "a,b,c,d,e,f,g,h,i,j,k,l,m,n";
string[] fields = line.Split(new char[] { ',' }, 10);

The fields array will contain ten elements. The first nine elements correspond to the first nine comma-delimited substrings of line, while the last element corresponds to the remaining substring (everything after the ninth ',' occurrence).

From MSDN (with count being the second argument passed to the Split(char[], int) method):

If there are more than count substrings in this instance, the first count minus 1 substrings are returned in the first count minus 1 elements of the return value, and the remaining characters in this instance are returned in the last element of the return value.

Thus, executing:

for (int i = 0; i < fields.Length; ++i)
    Console.WriteLine("fields[{0}]: \"{1}\"", i, fields[i]);

would output:

fields[0]: "a"
fields[1]: "b"
fields[2]: "c"
fields[3]: "d"
fields[4]: "e"
fields[5]: "f"
fields[6]: "g"
fields[7]: "h"
fields[8]: "i"
fields[9]: "j,k,l,m,n"

OTHER TIPS

Some databases have ETL (Extract, Transform, Load) facilities that allow importing external data very quickly and efficiently once set up. How flexible they are depends on the database, and you didn't say which one you'll end up with. To identify your columns in Delphi I'd use regular expressions; they do exactly what you would do by hand (iterate the string and match characters), but a good regex library can be fast and flexible. What you should avoid is reading one character per file read: read n lines into a memory buffer and process them there, as in the sketch below.
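
If you end up on SQL Server, a minimal C# sketch of that buffered-read-plus-bulk-load idea could look like the following. The table name LogEntries, the column names Col1..Col10 and the connection string are placeholders of mine, not something from your setup; Oracle has a comparable path via SQL*Loader or ODP.NET's OracleBulkCopy.

using System.Data;
using System.Data.SqlClient;
using System.IO;

var table = new DataTable();
for (int i = 1; i <= 10; ++i)
    table.Columns.Add("Col" + i, typeof(string));

// StreamReader reads the file in buffered chunks, so we never touch the disk
// one character at a time.
using (var reader = new StreamReader("input.log"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // At most 10 fields; commas after the ninth stay inside the last field.
        table.Rows.Add(line.Split(new[] { ',' }, 10));
    }
}

using (var connection = new SqlConnection("<your connection string>"))
using (var bulkCopy = new SqlBulkCopy(connection))
{
    connection.Open();
    bulkCopy.DestinationTableName = "LogEntries";
    bulkCopy.WriteToServer(table);  // one bulk round trip instead of row-by-row INSERTs
}

For very large logs you would flush the DataTable to WriteToServer in batches instead of holding every row in memory.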

Try the FileHelpers library and its DatabaseStorage class.
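
As a rough illustration of the FileHelpers side (my own sketch, not taken from the library's docs, so double-check the attribute names against the version you use), the record class for this format could look something like this. Note that a plain delimited record won't keep extra commas inside the last column unless that column is quoted in the source, so verify that behaviour before relying on it:

using FileHelpers;

[DelimitedRecord(",")]
public class LogRecord
{
    public string Col1;
    public string Col2;
    public string Col3;
    public string Col4;
    public string Col5;
    public string Col6;
    public string Col7;
    public string Col8;
    public string Col9;

    // Only safe if the source actually quotes the last column.
    [FieldQuoted('"', QuoteMode.OptionalForBoth)]
    public string Col10;
}

// Reading the whole file into typed records:
// var engine = new FileHelperEngine<LogRecord>();
// LogRecord[] records = engine.ReadFile("input.log");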

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow