Question

Does anyone know how to split this file with a regex?

1 TESTAAA      SERNUM    A DESCRIPTION
2 TESTBBB      ANOTHR    ANOTHER DESCRIPTION
3 TESTXXX      BLAHBL

The length of each column:

{id} {firsttext} {serialhere} {description}
 4    22          6            30+

I'm planning to do it with a regex to store all my values in a string[] like this.

        using (StreamReader sr = new StreamReader("c:\\file.txt"))
        {
            string line = string.Empty;
            string[] source = null;
            while ((line = sr.ReadLine()) != null)
            {
                source = Regex.Split(line, @"(.{4})(.{22})(.{6})(.+)", RegexOptions.Singleline);
            }

        }

But I have 2 problems.

  1. The split creates 6 elements, with source[0] = "" and source[5] = "", when as you can see I have only 4 elements (columns) per line.
  2. In the case of the 3rd line, where the 4th column is empty: if the line has trailing blank spaces the split creates a position for that column, but if there are no blank spaces the column is missing.

So what would be the best pattern to split this with a regex? Any other solution would be appreciated too. I want to split by fixed width. Thanks.

Solution

Using a regular expression seems like overkill, when you already know exactly where to get the data. Use the Substring method to get the parts of the string:

string[] source = new string[]{
  line.Substring(0, 4),
  line.Substring(4, 22),
  line.Substring(26, 6),
  line.Substring(32)
};

Edit:

To make it more configurable, you can use column widths from an array:

int[] cols = new int[] { 4, 22, 6 };

string[] source = new string[cols.Length + 1];
int ofs = 0;
for (int i = 0; i < cols.Length; i++) {
  source[i] = line.Substring(ofs, cols[i]);
  ofs += cols[i];
}
source[cols.Length] = line.Substring(ofs);
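
One caveat with the Substring calls above: they throw ArgumentOutOfRangeException when a line is shorter than the column layout expects, which is what happens with the third sample line if it has no trailing spaces. Here is a minimal sketch of a length-safe variant. `FixedWidth.Split` and `Slice` are hypothetical names, and it assumes you want missing columns returned as empty strings and padding trimmed off each field; neither assumption comes from the question.

```csharp
using System;

public static class FixedWidth
{
    // Hypothetical helper: slices a line into fixed-width columns plus one
    // trailing column that holds whatever text remains.
    public static string[] Split(string line, int[] widths)
    {
        string[] fields = new string[widths.Length + 1];
        int offset = 0;
        for (int i = 0; i < widths.Length; i++)
        {
            fields[i] = Slice(line, offset, widths[i]);
            offset += widths[i];
        }
        // Remainder of the line, or "" when the line ends early.
        fields[widths.Length] = offset < line.Length
            ? line.Substring(offset).Trim()
            : string.Empty;
        return fields;
    }

    // Like Substring, but clamps to the end of the line instead of throwing.
    // Trimming the padding is an assumption; drop Trim() to keep raw fields.
    private static string Slice(string line, int start, int length)
    {
        if (start >= line.Length) return string.Empty;
        int available = Math.Min(length, line.Length - start);
        return line.Substring(start, available).Trim();
    }
}
```

Inside the read loop this would be called as `source = FixedWidth.Split(line, new[] { 4, 22, 6 });`, so every line yields exactly four elements regardless of its length.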

Other tips

It's easier to just use the Substring method if you have fixed-length columns, e.g.

string id = line.Substring(0, 4);
string firsttext = line.Substring(4, 22);
string serial = line.Substring(26, 6);
string description = line.Substring(32);

If you really want to use regular expressions, you can use the one below. Please note that it only works if the data in the first three columns contains no spaces. It also assumes the first column is digits and the next two consist of word characters (letters, digits, underscore).

String input = "2 TESTBBB      ANOTHR    ANOTHER DESCRIPTION";
Match match = Regex.Match(input, @"^(\d*)\s*(\w*)\s*(\w*)\s*(.*)$");
if (match.Success) // Groups.Count is always 5 for this pattern, even when the match fails
{
    string id = match.Groups[1].Value;
    string firsttext = match.Groups[2].Value;
    string serial = match.Groups[3].Value;
    string description = match.Groups[4].Value;
}
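
On the original fixed-width pattern: Regex.Split includes the text of capturing groups in its result, and the empty strings at positions 0 and 5 are the (empty) text before and after the single match, which is why six elements come back. Using Regex.Match and reading the groups avoids that. The sketch below keeps the question's widths but uses bounded quantifiers so that short lines still match; `FixedWidthRegex.Parse` is a hypothetical name, and trimming the padding is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixedWidthRegex
{
    // Widths 4, 22, 6, then the rest. Quantifiers like {0,22} instead of {22}
    // let a line that ends early still match, with empty trailing groups.
    private static readonly Regex Row =
        new Regex(@"^(.{1,4})(.{0,22})(.{0,6})(.*)$");

    // Hypothetical helper: returns the four columns, or null if the line
    // does not match at all (for example, an empty line).
    public static string[] Parse(string line)
    {
        Match m = Row.Match(line);
        if (!m.Success) return null;
        return new[]
        {
            m.Groups[1].Value.Trim(),
            m.Groups[2].Value.Trim(),
            m.Groups[3].Value.Trim(),
            m.Groups[4].Value.Trim()
        };
    }
}
```

Because each group is greedy, it takes its full width whenever the line is long enough, so well-formed lines are cut at the same offsets as the Substring version.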
License: CC-BY-SA with attribution
Not affiliated with StackOverflow