Question

I need to select the distinct rows from the text file shown below, ignoring the first column.

Text file:

123| one| two| three
124| one| two| four
125| one |two| three

The output should look like this:

123| one| two| three
124| one| two| four

OR

124| one| two| four
125| one |two| three

This is the code I am using to try to solve the problem:

var readfile = File.ReadAllLines(" text file location ");
        var spiltfile = (from f in readfile
                    let line = f.Split('|')
                    let y = line.Skip(1)
                    select (from str in y
                            select str).FirstOrDefault()).Distinct()

Thanks


Solution

The inconsistent spacing in the question doesn't help (especially around |two|, which is spaced differently from the rest, implying that the fields need trimming), but here are some custom LINQ methods that do the job. I've used the anonymous type purely as a simple way of flattening out the inconsistent spacing: each field is trimmed, and the anonymous type's value-based equality then serves as the comparison key. (I could also have rebuilt a string, but it seemed unnecessary.)

Note that without the odd spacing, this can be simply:

var qry = ReadLines("foo.txt")
        .DistinctBy(line => line.Substring(line.IndexOf('|')));

Full code:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
static class Program
{
    static void Main()
    {
        var qry = (from line in ReadLines("foo.txt")
                   let parts = line.Split('|')
                   select new
                   {
                       Line = line,
                       Key = new
                       {
                           A = parts[1].Trim(),
                           B = parts[2].Trim(),
                           C = parts[3].Trim()
                       }
                   }).DistinctBy(row => row.Key)
                  .Select(row => row.Line);

        foreach (var line in qry)
        {
            Console.WriteLine(line);
        }
    }
    static IEnumerable<TSource> DistinctBy<TSource, TValue>(
        this IEnumerable<TSource> source,
        Func<TSource, TValue> selector)
    {
        var found = new HashSet<TValue>();
        foreach (var item in source)
        {
            if (found.Add(selector(item))) yield return item;
        }
    }
    static IEnumerable<string> ReadLines(string path)
    {
        using (var reader = File.OpenText(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}
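
As a variation on the full code above: newer frameworks ship their own Enumerable.DistinctBy, so the hand-written helper may not be needed. The following is only a sketch, assuming .NET 6 or later; the names DistinctRows and ByFieldsAfterId are made up for illustration, and the key is rebuilt as a string (the alternative mentioned above) instead of an anonymous type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DistinctRows
{
    // Hypothetical helper: keeps one line per combination of trimmed
    // fields after the leading id column.
    public static IEnumerable<string> ByFieldsAfterId(IEnumerable<string> lines)
    {
        // Enumerable.DistinctBy is built in from .NET 6 onwards (assumption: that runtime is used).
        return lines.DistinctBy(line =>
            string.Join("|", line.Split('|').Skip(1).Select(part => part.Trim())));
    }

    static void Main()
    {
        // Sample rows from the question; for a real file,
        // pass System.IO.File.ReadLines(path) instead.
        var lines = new[]
        {
            "123| one| two| three",
            "124| one| two| four",
            "125| one |two| three"
        };

        foreach (var line in ByFieldsAfterId(lines))
        {
            Console.WriteLine(line);
        }
    }
}
```

Unlike the version with parts[1] to parts[3], this key uses every field after the first, so it does not throw on lines with fewer than four fields.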

OTHER TIPS

Try this; it does what you want:

    static void Main(string[] args)
    {
        string[] readfile = System.IO.File.ReadAllLines(@"D:\1.txt");
        var strList = readfile.Select(x => x.Split('|')).ToList();

        IEnumerable<string[]> noduplicates = strList.Distinct(new StringComparer());

        foreach (var res in noduplicates)
            Console.WriteLine(res[0] + "|" + res[1] + "|" + res[2] + "|" + res[3]);
    }

Then implement IEqualityComparer<string[]> like this:

class StringComparer : IEqualityComparer<string[]>
{
    // Rows are equal when fields 1-3 match after trimming; field 0 (the id) is ignored.
    public bool Equals(string[] x, string[] y)
    {
        if (Object.ReferenceEquals(x, y)) return true;

        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x[1].Trim() == y[1].Trim() && x[2].Trim() == y[2].Trim() && x[3].Trim() == y[3].Trim();
    }

    // Hashes the same trimmed fields that Equals compares.
    public int GetHashCode(string[] data)
    {
        if (Object.ReferenceEquals(data, null)) return 0;

        int hash1 = data[1] == null ? 0 : data[1].Trim().GetHashCode();
        int hash2 = data[2] == null ? 0 : data[2].Trim().GetHashCode();
        int hash3 = data[3] == null ? 0 : data[3].Trim().GetHashCode();

        // Multiply-and-add combination; unchecked lets the arithmetic wrap on overflow.
        unchecked
        {
            return ((17 * 31 + hash1) * 31 + hash2) * 31 + hash3;
        }
    }
}

This gives the output you expect.
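
The comparer above is tied to exactly columns 1 through 3. If the number of columns can vary, one possible generalisation is to compare every field after the first. This is only a sketch: RowComparer and Demo are hypothetical names, and System.HashCode is assumed to be available (.NET Core 2.1 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two rows are equal when every field after the
// first (the id) matches once surrounding whitespace is trimmed.
sealed class RowComparer : IEqualityComparer<string[]>
{
    // The trimmed fields that take part in the comparison.
    private static IEnumerable<string> Key(string[] row)
    {
        return row.Skip(1).Select(field => field.Trim());
    }

    public bool Equals(string[] x, string[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return Key(x).SequenceEqual(Key(y));
    }

    public int GetHashCode(string[] row)
    {
        if (row is null) return 0;

        // Feed the same trimmed fields that Equals compares into the hash.
        var hash = new HashCode();
        foreach (var field in Key(row)) hash.Add(field);
        return hash.ToHashCode();
    }
}

static class Demo
{
    static void Main()
    {
        // Sample rows from the question, split on the pipe character.
        var rows = new[]
        {
            "123| one| two| three",
            "124| one| two| four",
            "125| one |two| three"
        }.Select(line => line.Split('|'));

        foreach (var row in rows.Distinct(new RowComparer()))
            Console.WriteLine(string.Join("|", row));
    }
}
```

Here string.Join rebuilds each surviving row from its original, untrimmed fields, so the printed lines keep the spacing of the input.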

Licensed under: CC-BY-SA with attribution