Question

Does anyone know how to split this file with a regex?

1 TESTAAA      SERNUM    A DESCRIPTION
2 TESTBBB      ANOTHR    ANOTHER DESCRIPTION
3 TESTXXX      BLAHBL

The length of each column:

{id} {firsttext} {serialhere} {description}
 4    22          6            30+

I'm planning to do it with a regex to store all my values in a string[] like this.

        using (StreamReader sr = new StreamReader("c:\\file.txt"))
        {
            string line = string.Empty;
            string[] source = null;
            while ((line = sr.ReadLine()) != null)
            {
                source = Regex.Split(line, @"(.{4})(.{22})(.{6})(.+)", RegexOptions.Singleline);
            }

        }

But I have 2 problems.

  1. The split creates 6 elements, with source[0] = "" and source[5] = "", when as you can see I have only 4 elements (columns) per line.
  2. In the case of the 3rd line, where the 4th column is empty: if the line has trailing blank spaces, the split creates a position for that column, but if there are no blank spaces the column is missing.

So what would be the best pattern to split this with a regex? Any other solution would be appreciated too. I want to split by fixed width. Thanks.


Solution

Using a regular expression seems like overkill, when you already know exactly where to get the data. Use the Substring method to get the parts of the string:

string[] source = new string[]{
  line.Substring(0, 4),
  line.Substring(4, 22),
  line.Substring(26, 6),
  line.Substring(32)
};
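
One caveat with the snippet above: Substring throws an ArgumentOutOfRangeException when the line is shorter than the requested range, which could happen on lines like the third one if trailing spaces were trimmed. Below is a minimal sketch of a guarded variant. The names `SafeFixedWidth`, `Slice` and `SplitLine` are hypothetical helpers made up for this illustration, and the widths 4, 22, 6 are taken from the question.

```csharp
using System;

static class SafeFixedWidth
{
    // Returns up to `length` characters starting at `start`,
    // or an empty string when the line ends before `start`.
    public static string Slice(string line, int start, int length)
    {
        if (start >= line.Length) return string.Empty;
        return line.Substring(start, Math.Min(length, line.Length - start));
    }

    // Splits one line into the four columns from the question
    // (widths 4, 22, 6, then the rest of the line).
    public static string[] SplitLine(string line)
    {
        return new string[]
        {
            Slice(line, 0, 4),
            Slice(line, 4, 22),
            Slice(line, 26, 6),
            Slice(line, 32, int.MaxValue)
        };
    }
}
```

With this, a line without a description still yields four elements, the last one being an empty string. The values keep their padding, so calling Trim() on each one is probably what you want before using them.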

Edit:

To make it more configurable, you can use column widths from an array:

int[] cols = new int[] { 4, 22, 6 };

string[] source = new string[cols.Length + 1];
int ofs = 0;
for (int i = 0; i < cols.Length; i++) {
  source[i] = line.Substring(ofs, cols[i]);
  ofs += cols[i];
}
source[cols.Length] = line.Substring(ofs);
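
To connect this with the reading loop from the question (which overwrites source on every iteration), here is a rough sketch that collects one string[] per line. `FixedWidthReader` and `ReadAll` are hypothetical names chosen for the example. As an extra assumption, short lines are padded with spaces via PadRight so that Substring never runs past the end of the line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class FixedWidthReader
{
    // Reads every line from `reader` and splits it by the given column widths;
    // whatever follows the last width becomes one final column.
    public static List<string[]> ReadAll(TextReader reader, int[] widths)
    {
        int total = 0;
        foreach (int w in widths) total += w;

        var rows = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Pad short lines so Substring never runs past the end.
            line = line.PadRight(total);
            var row = new string[widths.Length + 1];
            int ofs = 0;
            for (int i = 0; i < widths.Length; i++)
            {
                row[i] = line.Substring(ofs, widths[i]);
                ofs += widths[i];
            }
            row[widths.Length] = line.Substring(ofs);
            rows.Add(row);
        }
        return rows;
    }
}
```

Since StreamReader derives from TextReader, the StreamReader from the question can be passed in directly, e.g. FixedWidthReader.ReadAll(sr, new int[] { 4, 22, 6 }).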

Other suggestions

It's easier to just use the Substring method if you have fixed-length columns, e.g.

string id = line.Substring(0, 4);
string firsttext = line.Substring(4, 22);
string serial = line.Substring(26, 6);
string description = line.Substring(32);

If you really want to use regular expressions, you can use the one below. Please note that it will only work if the data in the first 3 columns doesn't contain spaces. Also, I assumed the first column holds only digits and the next two only word characters (letters, digits, underscore).

String input = "2 TESTBBB      ANOTHR    ANOTHER DESCRIPTION";
Match match = Regex.Match(input, @"^(\d*)\s*(\w*)\s*(\w*)\s*(.*)$");
if (match.Success)
{
    string id = match.Groups[1].Value;
    string firsttext = match.Groups[2].Value;
    string serial = match.Groups[3].Value;
    string description = match.Groups[4].Value;
}
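
If the columns may contain spaces, a width-based pattern is another option. The sketch below reuses the pattern from the question with Regex.Match instead of Regex.Split. Regex.Split also returns the captured groups, and because the pattern consumes the whole line, the pieces before and after the match are empty strings, which explains the six elements. The last group is changed from (.+) to (.*) so a line with no description still matches. `FixedWidthRegex` and `Parse` are hypothetical names for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class FixedWidthRegex
{
    // Widths 4, 22, 6 come from the question; (.*) lets the description be empty.
    private static readonly Regex Columns =
        new Regex(@"^(.{4})(.{22})(.{6})(.*)$", RegexOptions.Singleline);

    // Returns the four columns, or null if the line is shorter than 32 characters.
    public static string[] Parse(string line)
    {
        Match m = Columns.Match(line);
        if (!m.Success) return null;
        return new string[]
        {
            m.Groups[1].Value,
            m.Groups[2].Value,
            m.Groups[3].Value,
            m.Groups[4].Value
        };
    }
}
```

The captured values still include their padding, so a Trim() on each is likely needed. Lines shorter than 32 characters do not match at all here, so the caller has to decide how to treat a null result.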
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow