Remove duplicates from a List<T> in C#

https://stackoverflow.com/questions/47752

09-06-2019

Question

Does anyone have a quick method for de-duplicating a generic List in C#?

Solution

Perhaps you should consider using a HashSet.

From the MSDN link:

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        HashSet<int> evenNumbers = new HashSet<int>();
        HashSet<int> oddNumbers = new HashSet<int>();

        for (int i = 0; i < 5; i++)
        {
            // Populate numbers with just even numbers.
            evenNumbers.Add(i * 2);

            // Populate oddNumbers with just odd numbers.
            oddNumbers.Add((i * 2) + 1);
        }

        Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
        DisplaySet(evenNumbers);

        Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
        DisplaySet(oddNumbers);

        // Create a new HashSet populated with even numbers.
        HashSet<int> numbers = new HashSet<int>(evenNumbers);
        Console.WriteLine("numbers UnionWith oddNumbers...");
        numbers.UnionWith(oddNumbers);

        Console.Write("numbers contains {0} elements: ", numbers.Count);
        DisplaySet(numbers);
    }

    private static void DisplaySet(HashSet<int> set)
    {
        Console.Write("{");
        foreach (int i in set)
        {
            Console.Write(" {0}", i);
        }
        Console.WriteLine(" }");
    }
}

/* This example produces output similar to the following:
 * evenNumbers contains 5 elements: { 0 2 4 6 8 }
 * oddNumbers contains 5 elements: { 1 3 5 7 9 }
 * numbers UnionWith oddNumbers...
 * numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
 */

Other tips

If you're using .NET 3.5 or later, you can use LINQ.

List<T> withDupes = LoadSomeData();
List<T> noDupes = withDupes.Distinct().ToList();

How about:

var noDupes = list.Distinct().ToList();

in .NET 3.5?
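As a small illustration of the Distinct() approach, here is a minimal sketch. The class and method names (DistinctSketch, RemoveDuplicates) are made up for this example and do not come from the answers. In practice Distinct() yields each value at its first occurrence, although the documentation describes the result as unordered, so treat the ordering as an implementation detail.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctSketch
{
    // Returns a new list without duplicates; the input list is left untouched.
    // Equality is decided by the default equality comparer for T.
    public static List<T> RemoveDuplicates<T>(List<T> withDupes)
    {
        return withDupes.Distinct().ToList();
    }
}
```

Calling DistinctSketch.RemoveDuplicates(new List<int> { 3, 1, 3, 2, 1 }) gives a list holding 3, 1 and 2.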

Simply initialize a HashSet with a List of the same type:

var noDupes = new HashSet<T>(withDupes);

Or, if you want a List returned:

var noDupsList = new HashSet<T>(withDupes).ToList();

Sort it, then compare neighbouring items pairwise, since the duplicates will clump together.

Something like this:

list.Sort();
Int32 index = list.Count - 1;
while (index > 0)
{
    if (list[index] == list[index - 1])
    {
        if (index < list.Count - 1)
            (list[index], list[list.Count - 1]) = (list[list.Count - 1], list[index]);
        list.RemoveAt(list.Count - 1);
        index--;
    }
    else
        index--;
}

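Notes:

Comparison is done from back to front, to avoid having to re-sort the list after each removal.
This example now uses C# value tuples to do the swapping; substitute appropriate code if you can't use them.
The end result is no longer sorted.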
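For compilers without value tuples, the swap can be written with a temporary variable. The following is a sketch of the same sort-and-trim idea under that assumption; the names SortDedupSketch and SortAndRemoveDuplicates are illustrative, and it uses EqualityComparer<T>.Default instead of == so that it compiles for any T. List<T>.Sort() still needs T to be comparable at run time.

```csharp
using System;
using System.Collections.Generic;

public static class SortDedupSketch
{
    // Sorts the list, then removes each duplicate by moving it to the end
    // and trimming it off. The surviving items are NOT left in sorted order.
    public static void SortAndRemoveDuplicates<T>(List<T> list)
    {
        list.Sort();
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        int index = list.Count - 1;
        while (index > 0)
        {
            if (comparer.Equals(list[index], list[index - 1]))
            {
                int lastIndex = list.Count - 1;
                if (index < lastIndex)
                {
                    // Classic three-step swap in place of the value-tuple swap.
                    T temp = list[index];
                    list[index] = list[lastIndex];
                    list[lastIndex] = temp;
                }
                // Removing the last element never shifts other elements.
                list.RemoveAt(lastIndex);
            }
            index--;
        }
    }
}
```

Removing only from the end is the point of the swap: RemoveAt on the last index does not have to move the rest of the list.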

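It worked for me. Simply use

// liIDs is an existing List<Type>; redeclaring it on this line would not compile.
liIDs = liIDs.Distinct().ToList<Type>();

Replace "Type" with your desired type, e.g. int.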

I like to use this command:

List<Store> myStoreList = Service.GetStoreListbyProvince(provinceId)
                                                 .GroupBy(s => s.City)
                                                 .Select(grp => grp.FirstOrDefault())
                                                 .OrderBy(s => s.City)
                                                 .ToList();

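My list has the following fields: Id, StoreName, City, PostalCode. I wanted to show the list of cities in a dropdown, but the list contained duplicate values. Solution: group by city, then pick the first entry of each group.

I hope it helps :)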
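To make the GroupBy idea concrete, here is a self-contained sketch. The Store class below is a hypothetical stand-in with only the fields mentioned above, and StoreSketch.OnePerCity is an illustrative name. It uses First() rather than FirstOrDefault(), since a group produced by GroupBy always has at least one element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Store type used in the answer.
public class Store
{
    public int Id { get; set; }
    public string StoreName { get; set; }
    public string City { get; set; }
}

public static class StoreSketch
{
    // Keeps the first store seen for each city, then orders the result by city.
    public static List<Store> OnePerCity(IEnumerable<Store> stores)
    {
        return stores
            .GroupBy(s => s.City)
            .Select(grp => grp.First()) // a group is never empty
            .OrderBy(s => s.City)
            .ToList();
    }
}
```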

As kronoz said, in .NET 3.5 you can use Distinct().

In .NET 2 you could fake it:

public IEnumerable<T> DedupCollection<T> (IEnumerable<T> input) 
{
    var passedValues = new HashSet<T>();

    // Relatively simple dupe check alg used as example
    foreach(T item in input)
        if(passedValues.Add(item)) // True if item is new
            yield return item;
}

This can be used to dedupe any collection, and it returns the values in their original order.

It is normally much quicker to filter a collection (as both Distinct() and this sample do) than to remove items from it.

An extension method might be a decent way to go, something like this:

public static List<T> Deduplicate<T>(this List<T> listToDeduplicate)
{
    return listToDeduplicate.Distinct().ToList();
}

Then call it like this, for example:

List<int> myFilteredList = unfilteredList.Deduplicate();

In Java (I assume C# is more or less identical):

list = new ArrayList<T>(new HashSet<T>(list));

If you really want to mutate the original list:

List<T> noDupes = new ArrayList<T>(new HashSet<T>(list));
list.clear();
list.addAll(noDupes);

To preserve order, simply replace HashSet with LinkedHashSet.
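C# is close to the Java above but not identical: List<T> offers Clear() and AddRange() rather than clear() and addAll(), and as far as I know the base class library has no LinkedHashSet. One possible way to get the same in-place, order-preserving effect is to combine List<T>.RemoveAll with a HashSet<T>. This is a sketch with illustrative names (InPlaceSketch, DeduplicateInPlace), not a translation of the Java code.

```csharp
using System;
using System.Collections.Generic;

public static class InPlaceSketch
{
    // Mutates the list in place, keeping the first occurrence of each value
    // and preserving the relative order of the survivors.
    public static void DeduplicateInPlace<T>(List<T> list)
    {
        var seen = new HashSet<T>();
        // seen.Add returns false when the value was already encountered,
        // so the predicate is true exactly for the later duplicates.
        list.RemoveAll(item => !seen.Add(item));
    }
}
```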

Use Linq's Union method.

Note: This solution requires no knowledge of Linq, aside from the fact that it exists.

Code

Begin by adding the following to the top of your class file:

using System.Linq;

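Now you can use the following to remove duplicates from an object called obj1:

obj1 = obj1.Union(obj1).ToList();

Note: Rename obj1 to the name of your object.

How it works

The Union command lists one of each entry of two source objects. Since obj1 is both source objects, this reduces obj1 to one of each entry.
The ToList() call returns a new List. This is necessary because Linq commands like Union return the result as an IEnumerable instead of modifying the original List or returning a new List.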
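A minimal sketch of the Union trick wrapped in a method, with made-up names (UnionSketch, DedupeWithUnion). Union yields each element the first time it is seen while enumerating the first and then the second sequence, so a list unioned with itself comes back with one copy of each value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionSketch
{
    // Unions the list with itself and materializes the result as a new List<T>.
    public static List<T> DedupeWithUnion<T>(List<T> obj1)
    {
        return obj1.Union(obj1).ToList();
    }
}
```

This produces the same result as obj1.Distinct().ToList(), which states the intent more directly.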

If you don't care about the order, you can just shove the items into a HashSet. If you do want to maintain the order, you can do something like this:

var unique = new List<T>();
var hs = new HashSet<T>();
foreach (T t in list)
    if (hs.Add(t))
        unique.Add(t);

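Or the Linq way:

var hs = new HashSet<T>();
// Where keeps each first occurrence; All() would stop at the first duplicate.
var unique = list.Where(x => hs.Add(x)).ToList();

Edit: The HashSet method is O(N) time and O(N) space, while sorting and then making unique (as suggested by @lassevk and others) is O(N*lgN) time and O(1) space, so it is not as clear to me as it was at first glance that the sorting way is inferior (my apologies for the temporary down vote...)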

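Here is an extension method for removing adjacent duplicates in place. Call Sort() first and pass in the same IComparer. This should be more efficient than Lasse V. Karlsen's version, which calls RemoveAt repeatedly (resulting in multiple block memory moves).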

public static void RemoveAdjacentDuplicates<T>(this List<T> List, IComparer<T> Comparer)
{
    int NumUnique = 0;
    for (int i = 0; i < List.Count; i++)
        if ((i == 0) || (Comparer.Compare(List[NumUnique - 1], List[i]) != 0))
            List[NumUnique++] = List[i];
    List.RemoveRange(NumUnique, List.Count - NumUnique);
}
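A usage sketch for the method above, assuming a list of strings that should be treated as case-insensitive. The extension method is repeated (with camel-cased names) only so that the block compiles on its own; AdjacentSketch and SortAndDedupe are illustrative names.

```csharp
using System;
using System.Collections.Generic;

public static class AdjacentSketch
{
    // Same logic as the extension method above, repeated for self-containment.
    public static void RemoveAdjacentDuplicates<T>(this List<T> list, IComparer<T> comparer)
    {
        int numUnique = 0;
        for (int i = 0; i < list.Count; i++)
            if (i == 0 || comparer.Compare(list[numUnique - 1], list[i]) != 0)
                list[numUnique++] = list[i];
        list.RemoveRange(numUnique, list.Count - numUnique);
    }

    // Sorts and de-duplicates with one shared comparer, as the answer recommends.
    public static void SortAndDedupe(List<string> words)
    {
        IComparer<string> comparer = StringComparer.OrdinalIgnoreCase;
        words.Sort(comparer);
        words.RemoveAdjacentDuplicates(comparer);
    }
}
```

Because List<T>.Sort is not a stable sort, which casing of an equal pair survives is not specified.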

A helper method (without Linq):

public static List<T> Distinct<T>(this List<T> list)
{
    // The List<T> constructor copies the set, so no Linq ToList() is needed.
    return new List<T>(new HashSet<T>(list));
}

After installing the MoreLINQ package via NuGet, you can easily make a list of objects distinct by a property:

IEnumerable<Catalogue> distinctCatalogues = catalogues.DistinctBy(c => c.CatalogueCode);
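On .NET 6 or later, System.Linq has its own Enumerable.DistinctBy, so the same call can work without MoreLINQ. The sketch below assumes that runtime; the Catalogue class is a hypothetical stand-in with an invented Title property, and DistinctBySketch.DistinctByCode is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Catalogue type used above.
public class Catalogue
{
    public string CatalogueCode { get; set; }
    public string Title { get; set; }
}

public static class DistinctBySketch
{
    // Keeps one catalogue per CatalogueCode using the built-in DistinctBy (.NET 6+).
    public static List<Catalogue> DistinctByCode(IEnumerable<Catalogue> catalogues)
    {
        return catalogues.DistinctBy(c => c.CatalogueCode).ToList();
    }
}
```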

This takes the distinct elements (dropping the duplicates) and converts them back to a list:

List<type> myNoneDuplicateValue = listValueWithDuplicate.Distinct().ToList();

It might be easier to simply make sure that duplicates are never added to the list:

if (items.IndexOf(new_item) < 0)
    items.Add(new_item);
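IndexOf scans the whole list on every insert. If that becomes a bottleneck, one option is to keep a HashSet<T> next to the list for the membership check. The wrapper below is a sketch of that idea; UniqueList<T> and TryAdd are hypothetical names, not an existing framework type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: a list that silently refuses duplicates.
public class UniqueList<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly HashSet<T> _seen = new HashSet<T>();

    // Read-only view of the items in insertion order.
    public IReadOnlyList<T> Items
    {
        get { return _items; }
    }

    // Adds the item only if it has not been added before.
    // Returns true when the item was added, false for a duplicate.
    public bool TryAdd(T item)
    {
        if (!_seen.Add(item))
            return false;
        _items.Add(item);
        return true;
    }
}
```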

Another way in .NET 2.0:

    static void Main(string[] args)
    {
        List<string> alpha = new List<string>();

        for(char a = 'a'; a <= 'd'; a++)
        {
            alpha.Add(a.ToString());
            alpha.Add(a.ToString());
        }

        Console.WriteLine("Data :");
        alpha.ForEach(delegate(string t) { Console.WriteLine(t); });

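        // Walk backwards so that removing an item never disturbs the unvisited indexes
        // (modifying the list inside ForEach can throw on newer runtimes).
        for (int i = alpha.Count - 1; i >= 0; i--)
        {
            string v = alpha[i];
            if (alpha.FindAll(delegate(string t) { return t == v; }).Count > 1)
                alpha.RemoveAt(i);
        }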

        Console.WriteLine("Unique Result :");
        alpha.ForEach(delegate(string t) { Console.WriteLine(t);});
        Console.ReadKey();
    }

There are many ways to solve the duplicates issue in a List; below is one of them:

List<Container> containerList = LoadContainer(); // Assume it has duplicates
List<Container> filteredList = new List<Container>();
foreach (var container in containerList)
{
    // Assume 'UniqueId' is the property of the Container class used to detect duplicates.
    // Search the filtered list, not the source list, which always contains the item.
    Container duplicateContainer = filteredList.Find(delegate(Container checkContainer)
    { return checkContainer.UniqueId == container.UniqueId; });

    if (duplicateContainer == null) // Add the object only when it is not in the new list yet
    {
        filteredList.Add(container);
    }
}

Cheers, Ravi Ganesan

Here is a simple solution that doesn't require any hard-to-read LINQ or any prior sorting of the list:

   private static void CheckForDuplicateItems(List<string> items)
    {
        if (items == null ||
            items.Count == 0)
            return;

        for (int outerIndex = 0; outerIndex < items.Count; outerIndex++)
        {
            for (int innerIndex = 0; innerIndex < items.Count; innerIndex++)
            {
                if (innerIndex == outerIndex) continue;
                if (items[outerIndex].Equals(items[innerIndex]))
                {
                    // Duplicate Found
                }
            }
        }
    }

David J.'s answer is a good method: no need for extra objects, sorting, etc. It can be improved on, however:

for (int innerIndex = items.Count - 1; innerIndex > outerIndex ; innerIndex--)

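So the outer loop goes top to bottom over the entire list, but the inner loop goes bottom up "until the outer loop position is reached".

The outer loop makes sure the entire list is processed; the inner loop finds the actual duplicates, which can only occur in the part the outer loop has not processed yet.

Or, if you don't want to go bottom up in the inner loop, you could have the inner loop start at outerIndex + 1.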
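Putting the two loops together, a sketch that actually removes the duplicates (the earlier snippet only detects them) might look like this. NestedLoopSketch and RemoveDuplicateItems are illustrative names. The approach needs no extra collection but is still O(n²) comparisons.

```csharp
using System;
using System.Collections.Generic;

public static class NestedLoopSketch
{
    // Removes later duplicates in place, keeping the first occurrence of each string.
    public static void RemoveDuplicateItems(List<string> items)
    {
        if (items == null)
            return;

        for (int outerIndex = 0; outerIndex < items.Count; outerIndex++)
        {
            // Walk the tail backwards so RemoveAt never shifts an index
            // that the inner loop has not visited yet.
            for (int innerIndex = items.Count - 1; innerIndex > outerIndex; innerIndex--)
            {
                if (string.Equals(items[outerIndex], items[innerIndex]))
                    items.RemoveAt(innerIndex);
            }
        }
    }
}
```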

You can use Union:

obj2 = obj1.Union(obj1).ToList();

Suppose you have two classes, Product and Customer, and you want to remove duplicate items from lists of them:

public class Product
{
    public int Id { get; set; }
    public string ProductName { get; set; }

}

public class Customer
{
    public int Id { get; set; }
    public string CustomerName { get; set; }

}

You need to define a generic class in the form below:

using System.Collections.Generic;
using System.Reflection;

public class ItemEqualityComparer<T> : IEqualityComparer<T> where T : class
{
    private readonly PropertyInfo _propertyInfo;

    public ItemEqualityComparer(string keyItem)
    {
        _propertyInfo = typeof(T).GetProperty(keyItem, BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
    }

    public bool Equals(T x, T y)
    {
        var xValue = _propertyInfo?.GetValue(x, null);
        var yValue = _propertyInfo?.GetValue(y, null);
        return xValue != null && yValue != null && xValue.Equals(yValue);
    }

    public int GetHashCode(T obj)
    {
        var propertyValue = _propertyInfo.GetValue(obj, null);
        return propertyValue == null ? 0 : propertyValue.GetHashCode();
    }
}

Then you can remove the duplicate items from your lists:

var products = new List<Product>
            {
                new Product{ProductName = "product 1" ,Id = 1,},
                new Product{ProductName = "product 2" ,Id = 2,},
                new Product{ProductName = "product 2" ,Id = 4,},
                new Product{ProductName = "product 2" ,Id = 4,},
            };
var productList = products.Distinct(new ItemEqualityComparer<Product>(nameof(Product.Id))).ToList();

var customers = new List<Customer>
            {
                new Customer{CustomerName = "Customer 1" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
            };
var customerList = customers.Distinct(new ItemEqualityComparer<Customer>(nameof(Customer.Id))).ToList();

This code removes duplicate items by Id. To remove duplicates by another property, change the nameof(YourClass.DuplicateProperty) argument, for example to nameof(Customer.CustomerName), which removes duplicates by the CustomerName property.
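If reflection is not wanted, a similar comparer can take a key-selector delegate instead of a property name, which lets the compiler check the property. This is an alternative sketch, not the answer's method; KeyEqualityComparer<T, TKey> is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: two items are equal when their selected keys are equal.
public sealed class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer = EqualityComparer<TKey>.Default;

    public KeyEqualityComparer(Func<T, TKey> keySelector)
    {
        if (keySelector == null)
            throw new ArgumentNullException(nameof(keySelector));
        _keySelector = keySelector;
    }

    public bool Equals(T x, T y)
    {
        // Two nulls are equal; a null never equals a non-null item.
        if (x == null || y == null)
            return x == null && y == null;
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        if (obj == null)
            return 0;
        TKey key = _keySelector(obj);
        return key == null ? 0 : _keyComparer.GetHashCode(key);
    }
}
```

With the Product class above, usage would be products.Distinct(new KeyEqualityComparer<Product, int>(p => p.Id)).ToList().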

  public static void RemoveDuplicates<T>(IList<T> list )
  {
     if (list == null)
     {
        return;
     }
     int i = 1;
     while(i<list.Count)
     {
        int j = 0;
        bool remove = false;
        while (j < i && !remove)
        {
           if (list[i].Equals(list[j]))
           {
              remove = true;
           }
           j++;
        }
        if (remove)
        {
           list.RemoveAt(i);
        }
        else
        {
           i++;
        }
     }  
  }

A simple, intuitive implementation:

public static List<PointF> RemoveDuplicates(List<PointF> listPoints)
{
    List<PointF> result = new List<PointF>();

    for (int i = 0; i < listPoints.Count; i++)
    {
        if (!result.Contains(listPoints[i]))
            result.Add(listPoints[i]);
    }

    return result;
}

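All the answers copy lists, create a new list, use slow functions, or are just painfully slow.

To my understanding, this is the fastest and cheapest method I know of (it is also backed by a very experienced programmer who specializes in real-time physics optimization).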

// Duplicates will be noticed after a sort O(nLogn)
list.Sort();

// Store the current and last items. Current item declaration is not really needed, and probably optimized by the compiler, but in case it's not...
int lastItem = -1;
int currItem = -1;

int size = list.Count;

// Store the index pointing to the last item we want to keep in the list
int last = size - 1;

// Travel the items from last to first O(n)
for (int i = last; i >= 0; --i)
{
    currItem = list[i];

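    // If this item is the same as the previously visited one, we don't want it.
    // The i < size - 1 guard keeps the -1 sentinel from matching a real -1 on the first pass.
    if (i < size - 1 && currItem == lastItem)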
    {
        // Overwrite last in current place. It is a swap but we don't need the last
       list[i] = list[last];

        // Reduce the last index, we don't want that one anymore
        last--;
    }

    // A new item, we store it and continue
    else
        lastItem = currItem;
}

// We now have an unsorted list with the duplicates at the end.

// Remove the last items just once
list.RemoveRange(last + 1, size - last - 1);

// Sort again O(n logn)
list.Sort();

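The final cost is:

nlogn + n + nlogn = n + 2nlogn = O(n log n), which is pretty nice.

Note about RemoveRange: since we cannot set the count of the list directly and we avoid the Remove functions, I don't know the exact speed of this operation, but I guess it is the fastest way.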

License: CC-BY-SA with attribution

Not affiliated with StackOverflow