Question

I'm using some suboptimal code that I wrote myself.

I have the following code:

            string fmtLine = "";
            string[] splitedFmtLine;
            int counterFMTlines = 0;

            foreach (string fmtF in fmtFiles)
            {
                using (StreamReader sr = new StreamReader(fmtF))
                {
                    while ((fmtLine = sr.ReadLine()) != null)
                    {
                        Console.WriteLine(counterFMTlines++);
                        foreach (L3Message message in rez)
                        {
                            splitedFmtLine = Regex.Split(fmtLine, "\t");

                            if (message.Time == splitedFmtLine[0])
                            {
                                message.ScramblingCode = splitedFmtLine[7];      
                            }
                        }
                    }
                }
            }

I tested this code with an empty list and only one file (tab-delimited, 280,000 lines), and even then it took ages (about 1 minute) to go through all 280,000 lines of the file. With the list empty, execution skipped the inner foreach loop over my list of objects entirely.

I cannot understand why it took so long.

For comparison, I filled my list of objects (a tree hierarchy) from a different text file (the source file) that is bigger than this tab-delimited one (tab-delimited: 16 MB, source file: 36 MB), and that took only about a second versus 1 minute here.


Solution

Apart from the problem of writing to the console, you also have an O(m*n) runtime, where n is the number of lines in the file and m is the number of messages. This is bad if m or n is big. You can reduce this to roughly O(m + n) (one pass to build the dictionary, one pass over the file) by using a Dictionary instead and eliminating the inner loop.

You can put your messages in a Dictionary, using the Time as a key. In the loop you only have to ask the dictionary for the messages at a specific time:

        string fmtLine = "";
        string[] splitedFmtLine;
        int counterFMTlines = 0;

        var messageTimes = new Dictionary<string, LinkedList<L3Message>>();
        foreach (L3Message message in rez)
        {
            LinkedList<L3Message> list=null;
            messageTimes.TryGetValue(message.Time, out list);

            list = list ?? new LinkedList<L3Message>();

            list.AddLast(message);
            messageTimes[message.Time] = list;
        }

        foreach (string fmtF in fmtFiles)
        {
            using (StreamReader sr = new StreamReader(fmtF))
            {
                while ((fmtLine = sr.ReadLine()) != null)
                {
                    //Console.WriteLine(counterFMTlines++);
                    splitedFmtLine = fmtLine.Split('\t');

                    LinkedList<L3Message> messageList = null;
                    messageTimes.TryGetValue(splitedFmtLine[0], out messageList);

                    if(messageList != null)
                    {
                        foreach (var message in messageList)
                        {
                            message.ScramblingCode = splitedFmtLine[7];                                
                        }
                        messageTimes.Remove(splitedFmtLine[0]); //see comments
                    }

                    if(messageTimes.Count==0) break; //see comments
                }
            }
            if(messageTimes.Count==0) break; //see comments
        } 

This should be much faster, because each line now costs one dictionary lookup instead of a pass over every message.

Edit: I changed the example so that it supports cases where there is more than one message for one time.

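Edit2: I added an optimization based on the fact that message time and ScramblingCode always correlate (see comments): once a time has been matched, its entry is removed from the dictionary, and reading stops as soon as the dictionary is empty.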
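
For comparison, the grouping step can also be written with LINQ's ToLookup, which builds the time-to-messages mapping in a single call. This is only an alternative sketch, not the code from the answer above: the minimal L3Message class, the LookupSketch name, and the sample data are assumptions made up for illustration. Because a lookup is read-only, this variant leaves out the remove-and-break optimization, and it skips short lines instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the asker's class; only the two members used here are assumed.
public class L3Message
{
    public string Time { get; set; }
    public string ScramblingCode { get; set; }
}

public static class LookupSketch
{
    // Copies column 7 of each tab-delimited line to every message whose Time equals column 0.
    public static void Apply(IEnumerable<string> lines, IEnumerable<L3Message> messages)
    {
        // Group the messages by Time once; each later access is a hash lookup.
        ILookup<string, L3Message> byTime = messages.ToLookup(m => m.Time);

        foreach (string line in lines)
        {
            string[] fields = line.Split('\t');
            if (fields.Length < 8) continue; // assumption: skip short lines rather than throw

            // A missing key yields an empty sequence, so no null check is needed.
            foreach (L3Message message in byTime[fields[0]])
            {
                message.ScramblingCode = fields[7];
            }
        }
    }

    public static void Main()
    {
        // Made-up sample data: two messages share a time, one has no matching line.
        var rez = new List<L3Message>
        {
            new L3Message { Time = "12:00:01" },
            new L3Message { Time = "12:00:01" },
            new L3Message { Time = "12:00:02" }
        };
        var lines = new[] { "12:00:01\ta\tb\tc\td\te\tf\t511" };

        Apply(lines, rez);

        foreach (L3Message m in rez)
        {
            Console.WriteLine(m.Time + " -> " + (m.ScramblingCode ?? "(unset)"));
        }
    }
}
```

If stopping early matters for your files, the Dictionary version above is the better fit, since entries can be removed from it as they are matched.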

Other tips

You are writing to the console 280,000 times, which is very slow. Remove the console output. Also, use string.Split('\t'), which is much faster than this particular regex call.
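
As a rough illustration of both tips, the sketch below splits a line with string.Split and, in case some progress output is still wanted, reports only every 10,000th line instead of every line. The sample line, the ProgressInterval value, and the helper names are assumptions invented for this example; they are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitSketch
{
    // Hypothetical interval: report progress once per 10,000 lines.
    public const int ProgressInterval = 10000;

    // Splits a tab-delimited line without involving the regex engine.
    public static string[] SplitFields(string line)
    {
        return line.Split('\t');
    }

    // Returns true only for line numbers that should produce console output.
    public static bool ShouldReport(int lineNumber)
    {
        return lineNumber % ProgressInterval == 0;
    }

    public static void Main()
    {
        // Made-up sample line with eight tab-separated fields.
        string line = "12:00:01\ta\tb\tc\td\te\tf\t511";

        string[] viaSplit = SplitFields(line);
        string[] viaRegex = Regex.Split(line, "\t");

        // For a plain tab separator both calls produce the same number of fields.
        Console.WriteLine(viaSplit.Length == viaRegex.Length);
        Console.WriteLine(viaSplit[7]);

        // Simulated read loop: only 3 of the 30,000 iterations write to the console.
        for (int i = 1; i <= 30000; i++)
        {
            if (ShouldReport(i))
            {
                Console.WriteLine("processed " + i + " lines");
            }
        }
    }
}
```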

License: CC-BY-SA with attribution
Not affiliated with StackOverflow