Question

The statement below takes around 6 seconds to produce the output when the SecurityInfoMasterList has around 11,000 items and listClassiNode has around 750 items.

Is there any other way of doing this to achieve the same result but with better performance?

List<SecurityInfo> listSecurityInfo = SecurityInfoMasterList.Where(c => 
    listClassiNode.Any(d => 
        c.SX == d.Exch && c.Instrument == d.Instrument)).ToList();

I tried using a for loop instead, but didn't see much improvement.

Updated:

listClassiNode is a List<ClassificationNode>.

[Serializable]
public class SecurityInfo
{
    public string SecurityID { get; set; }
    public int SecurityTypeID { get; set; }
    public string Code { get; set; }
    public string SecurityName { get; set; }
    public int DB { get; set; }
    public string ExchangeName { get; set; }
    public DateTime FirstDate { get; set; }
    public int StatusCode { get; set; }
    public long GICS { get; set; }
    public string ICB { get; set; }
    public string Sector { get; set; }
    public string IndustryGroup { get; set; }
    public string Industry { get; set; }
    public string Instrument { get; set; }
    public string TypeDescription { get; set; }
    public string SX { get; set; }
    public string GUID { get; set; }
}



[Serializable()]
    public class ClassificationNode
    {
        public string Exch { get; set; }
        public string Instrument { get; set; }
        public string Prefix { get; set; }
        public string Name { get; set; }
        public string Level { get; set; }
    }

Alan

Solution

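You could convert your listClassiNode into a HashSet, so that each lookup is O(1) rather than O(n). The original query scans listClassiNode once for every item in SecurityInfoMasterList, which is up to 11,000 x 750 comparisons; with a HashSet it is one lookup per item.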

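// Build the set of "Exch_Instrument" keys once; HashSet discards duplicates by itself.
var hash = new HashSet<string>(
    listClassiNode.Select(t => 
        string.Format("{0}_{1}", t.Exch, t.Instrument)));

// Keep only the securities whose combined key is in the set.
List<SecurityInfo> listSecurityInfo = SecurityInfoMasterList.Where(c => 
    hash.Contains(string.Format("{0}_{1}", c.SX, c.Instrument)))
        .ToList();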

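The above is a little clumsy, with string.Format creating a concatenated key to use for the HashSet. It can also produce false matches if Exch or Instrument may contain an underscore: ("A_B", "C") and ("A", "B_C") both become "A_B_C". Hopefully, the nature of your data is such that this won't be a problem. Anyway, you get the idea, I hope.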
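If the concatenated key is a concern, one possible variation is to key the HashSet on a value tuple instead of a formatted string. This is only a sketch, not part of the original answer: it assumes C# 7 or later (on older .NET Framework versions, the System.ValueTuple package), the two classes are trimmed down to the fields used for matching, and SecurityFilter and FilterByNodes are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copies of the question's classes (only the fields used for matching).
public class SecurityInfo
{
    public string SX { get; set; }
    public string Instrument { get; set; }
}

public class ClassificationNode
{
    public string Exch { get; set; }
    public string Instrument { get; set; }
}

// SecurityFilter and FilterByNodes are hypothetical names, not from the original answer.
public static class SecurityFilter
{
    public static List<SecurityInfo> FilterByNodes(
        IEnumerable<SecurityInfo> securities,
        IEnumerable<ClassificationNode> nodes)
    {
        // Value tuples compare component by component, so ("A_B", "C") and
        // ("A", "B_C") remain different keys; no separator character is needed.
        var keys = new HashSet<(string, string)>(
            nodes.Select(n => (n.Exch, n.Instrument)));

        // One hash lookup per security instead of a scan over all nodes.
        return securities
            .Where(s => keys.Contains((s.SX, s.Instrument)))
            .ToList();
    }
}
```

With the lists from the question, this would be called as SecurityFilter.FilterByNodes(SecurityInfoMasterList, listClassiNode). The comparison is still case-sensitive and ordinal, the same as the == used in the original query.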

OTHER TIPS

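You can try PLINQ (AsParallel) and see if it helps. Note that AsParallel is a method and needs parentheses, and that the result order is not guaranteed unless you also call AsOrdered():

List<SecurityInfo> listSecurityInfo = SecurityInfoMasterList.AsParallel().Where(c => 
    listClassiNode.Any(d => 
        c.SX == d.Exch && c.Instrument == d.Instrument)).ToList();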

Using your classes, the program below takes about 4 to 5 seconds to run in DEBUG mode, with 12,000 x 12,000 items instead of 11,000 x 750:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var listSecurityInfo = new List<SecurityInfo>();
        var listClassiNode = new List<ClassificationNode>();

        InitList(listSecurityInfo, listClassiNode);

        // Time only the query itself, not the list setup.
        var sw = System.Diagnostics.Stopwatch.StartNew();
        var matched = listSecurityInfo.Where(c => listClassiNode.Any(d => c.SX == d.Exch && c.Instrument == d.Instrument)).ToList();
        sw.Stop();

        Console.WriteLine("took: " + sw.ElapsedMilliseconds + "ms matched: " + matched.Count);
        Console.Read();
    }

    // Fills both lists with 12,000 items whose keys are 4,000-character strings
    // made of a single random character.
    private static void InitList(List<SecurityInfo> listSecurityInfo, List<ClassificationNode> listClassiNode)
    {
        var rnd = new Random();

        for (int i = 0; i < 12000; ++i)
        {
            listSecurityInfo.Add(new SecurityInfo
            {
                SX = new string(Convert.ToChar(rnd.Next(40, 125)), 4000),
                Instrument = new string(Convert.ToChar(rnd.Next(40, 125)), 4000)
            });
        }

        for (int i = 0; i < 12000; ++i)
        {
            listClassiNode.Add(new ClassificationNode
            {
                Exch = new string(Convert.ToChar(rnd.Next(40, 125)), 4000),
                Instrument = new string(Convert.ToChar(rnd.Next(40, 125)), 4000)
            });
        }
    }
}
Licensed under: CC-BY-SA with attribution