Domanda

I have a bunch of files in the format following like this:

20130201:14:58:47 I search: xx ('ID'= (xxxxxxxx) )
20130201:14:58:56 I request: search | For ID | Search
20130201:14:58:56 I search: xx ('ID'= (xxxxxxx) )

Is there something in C# like there is in python called a restkey? I want to grab the first three items (Date time, I (called an action), search/request) and insert each of those into their own column in an SQL table then in the 4th column put the rest of the line.

In python this was quite easy but I couldn't deal with all the hoops I had to jump through to get it to insert into my sql table. So I moved to C# where connection to SSMS is easier.

È stato utile?

Soluzione

Although String.Split() is probably the nice and simple way around, I prefer using Regex for this kind of parsing. In this case a pattern like this:

(?<DateTime>\d{8}\:\d{2}\:\d{2}\:\d{2})\s(?<Action>\w)\s(?<SearchOrRequest>search|request)\:\s(?<RestOfTheLine>.*)

Gives you everything you need, nicely grouped into "DateTime", "Action", "SearchOrRequest" and "RestOfLine" match groups.

var pattern = "(?<DateTime>\d{8}\:\d{2}\:\d{2}\:\d{2})\s(?<Action>\w)\s(?<SearchOrRequest>search|request)\:\s(?<RestOfTheLine>.*)";
var regex = new Regex(pattern);
var match = regex.Match(inputString);

var theDate = match.Groups["DateTime"].Value;
var theAction = match.Groups["Action"].Value;
var theChoice = match.Groups["SearchOrRequest"].Value;
var theRest = match.Groups["RestOfTheLine"].Value;

Altri suggerimenti

Using string.Split method

string myString = "20130201:14:58:47 I search: xx (\'ID\'= (xxxxxxxx) )"

string[] strarr = myString.split(' ');

string theLetterIVariableThing = strarr[1];

string iddate = strarr[0];

StringBuilder sb = new StringBuilder();

for (int i = 1; i < strarr.Length; i++)
{
    sb.Append(strarr[i]);
    sb.Append(" ");
}

string trailingText = sb.ToString();

string id = iddate.split(':')[0];

sb.Clear();

for (int i = 1; i < 4; i++)
{
    sb.Append(iddate.split(':'))[i];
}

string date = sb.ToString();

I think this will work, but it's probably the long way around.

You can do it using the .NET function String.Split().

Assuming your date string is of fixed-length, this should work:

        //string inputStr = "20130201:14:58:47 I search: xx ('ID'= (xxxxxxxx) )";
        //string inputStr = "20130201:14:58:56 I request: search | For ID | Search";
        string inputStr = "20130201:14:58:56 I search: xx ('ID'= (xxxxxxx) )";

        string dateStr = inputStr.Substring(0, 17);
        string[] splitStr = inputStr.Substring(18).Split(new char[] { ':' });

        string actionStr = splitStr[0].Substring(0, splitStr[0].IndexOf(' '));
        string userStr = splitStr[0].Substring(2);
        string restStr = splitStr[1].TrimStart();

        // print out what we parsed
        Console.WriteLine(inputStr);
        Console.WriteLine(dateStr);
        Console.WriteLine(actionStr);
        Console.WriteLine(userStr);
        Console.WriteLine(restStr);

Output:

20130201:14:58:56 I search: xx ('ID'= (xxxxxxx) )
20130201:14:58:56
I
search
xx ('ID'= (xxxxxxx) )

I tried a slightly different approach. I created a console program that can convert these files into fully qualified csv files. Then you can import into sql using ssms very easily.

static void Main(string[] args)
    {
        if (args.Length == 2)
        {
            using (StreamWriter sw = new StreamWriter(args[1]))
            {
                using (StreamReader sr = new StreamReader(args[0]))
                {
                    String line;

                    while ((line = sr.ReadLine()) != null)
                    {
                        int index = 0;
                        int oldIndex = 0;
                        string dateTime = null;
                        string action = null;
                        string task = null;
                        string details = null;

                        index = line.IndexOf(' ', oldIndex);
                        dateTime = line.Substring(oldIndex, index - oldIndex);
                        oldIndex = index + 1;

                        index = line.IndexOf(' ', oldIndex);
                        action = line.Substring(oldIndex, index - oldIndex);
                        oldIndex = index + 1;

                        index = line.IndexOf(':', oldIndex);
                        task = line.Substring(oldIndex, index - oldIndex);
                        oldIndex = index + 1;

                        details = line.Substring(oldIndex + 1);

                        sw.WriteLine("\"{0}\",\"{1}\",\"{2}\",\"{3}\"", dateTime, action, task, details);
                    }
                }

            }
        }
        else
        {
            Console.WriteLine("Usage: program <input> <output>");
        }
    }

This is a case where a regular expression is probably the right thing to use.

var testVectors = new[]
{
    "20130201:14:58:47 I search: xx ('ID'= (xxxxxxxx) )",
    "20130201:14:58:56 I request: search | For ID | Search",
    "20130201:14:58:56 I search: xx ('ID'= (xxxxxxx) )"
};

var expression = @"^(?<TimeStamp>[0-9]{8}(:[0-9]{2}){3}) (?<Action>[^ ]+) (?<Type>search|request): (?<Rest>.*)$";

var regex = new Regex(expression);

foreach (var testVector in testVectors)
{
    var match = regex.Match(testVector);

    Console.WriteLine(match.Groups["Timestamp"]);
    Console.WriteLine(match.Groups["Action"]);
    Console.WriteLine(match.Groups["Type"]);
    Console.WriteLine(match.Groups["Rest"]);
}

The expression used makes some assumptions - what you called action is a sequence of characters not containing any spaces and only search and request are valid values for what I call type. But it should be easy to adopt the expression if any of the assumptions does not hold.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top