Getting distinct values from a LINQ to XML query

https://stackoverflow.com/questions/18308598

24-06-2022

Question

I am reading a CSV file and converting it to XML, and this works well.

The XML looks like this:

<Root>
  <Player index="0">
    <Rank>
      <Level> A</Level>
      <Class>   1</Class>
    </Rank>
    <TopScores>
      <Lives>      1.0</Lives>
      <Kills>      0.0</Kills>
      <Time> 9:59:55</Time>
    </TopScores>
  </Player>
  <Player index="1">
    <Rank>
      <Level> A</Level>
      <Class>   2</Class>
    </Rank>
    <TopScores>
      <Lives>      1.2</Lives>
      <Kills>      0.0</Kills>
      <Time> 9:59:59</Time>
    </TopScores>
  </Player>
  <Player index="2">
    <Rank>
      <Level> A</Level>
      <Class>   3</Class>
    </Rank>
    <TopScores>
      <Lives>      1.2</Lives>
      <Kills>      0.0</Kills>
      <Time>10: 0: 3</Time>
    </TopScores>
  </Player>
  <Player index="3">
    <Rank>
      <Level> A</Level>
      <Class>   1</Class>
    </Rank>
    <TopScores>
      <Lives>      0.9</Lives>
      <Kills>      0.0</Kills>
      <Time>10: 0: 8</Time>
    </TopScores>
  </Player>
</Root>

As you can see, there is a duplicate Rank (Level A, Class 1): the players at index 0 and index 3 share it.

I can find the duplicated Rank values with the following query:

//The query below will return the duplicates so we can report the identities
//and the count of the number of duplicates.
var duplicates = playerData.Descendants("Rank")
                  .GroupBy(c => c.ToString())
                  .Where(g => g.Count() > 1)
                  .Select(g => g.First().Value);
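
The query above yields only the concatenated text of each duplicated Rank. A minimal sketch of one way to report both the identity and the count, assuming `playerData` is an `XDocument` and that it is acceptable to key on the trimmed Level and Class text instead of `ToString()`. The class name `DuplicateRanks`, the method `Find`, and the shortened sample XML are made up for illustration:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DuplicateRanks
{
    // Sample data shaped like the XML above, reduced to the Rank part.
    public const string SampleXml = @"<Root>
  <Player index='0'><Rank><Level> A</Level><Class>   1</Class></Rank></Player>
  <Player index='1'><Rank><Level> A</Level><Class>   2</Class></Rank></Player>
  <Player index='2'><Rank><Level> A</Level><Class>   3</Class></Rank></Player>
  <Player index='3'><Rank><Level> A</Level><Class>   1</Class></Rank></Player>
</Root>";

    // Groups Rank elements by trimmed Level and Class, then keeps only
    // the groups that occur more than once, together with their size.
    public static (string Level, string Class, int Count)[] Find(XDocument doc) =>
        doc.Descendants("Rank")
           .GroupBy(r => (Level: r.Element("Level").Value.Trim(),
                          Class: r.Element("Class").Value.Trim()))
           .Where(g => g.Count() > 1)
           .Select(g => (Level: g.Key.Level, Class: g.Key.Class, Count: g.Count()))
           .ToArray();

    public static void Main()
    {
        XDocument playerData = XDocument.Parse(SampleXml);
        foreach (var d in Find(playerData))
            Console.WriteLine($"Level {d.Level}, Class {d.Class}: {d.Count} occurrences");
    }
}
```

For the sample data this reports a single duplicate, Level A, Class 1, with two occurrences.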

I would like to get the complete Player data for all the 'Rank' values but, where there are duplicate 'Rank' values, I would like to get the last occurrence. So for the XML data above, the query should return only three sets of Player data, and where there is a duplicate, it should be the one with the latest time or index.

To do this I have tried:

var uPlays = playerData.Descendants("Player").Descendants("Rank")
                       .GroupBy(z => z.ToString())
                       .Select(b => b.Ancestors().First());

This gives me the Player data for each unique Rank value but, of course, it gives me the first instance encountered. I thought I could just change .First() to .Last(), but that just gives me the very first value repeated three times.

Solution

I would try something like this, which groups the Player elements by their Rank and takes the last Player in each group:

var uPlays = playerData.Root.Elements("Player")
                      .GroupBy(c => c.Element("Rank").ToString())
                      .Select(g => g.Last());

Or, using your code, take the parent of the last Rank in each group:

var uPlays = playerData.Descendants("Player")
                       .Descendants("Rank")
                       .GroupBy(z => z.ToString())
                       .Select(b => b.Last().Parent);
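
To see both queries side by side, here is a small runnable sketch. It assumes `playerData` is an `XDocument`. The class and method names are invented for illustration, and the sample XML is a shortened copy of the question's data with TopScores reduced to Time. It also includes the question's attempt with `.First()` swapped for `.Last()`: as far as I can tell, `Ancestors()` on a group walks upward from each Rank (Player first, then Root), so the last ancestor in the sequence is always Root, not the last Player.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LastPlayerPerRank
{
    // Shortened sample data; indexes 0 and 3 share the same Rank.
    public const string SampleXml = @"<Root>
  <Player index='0'><Rank><Level> A</Level><Class>   1</Class></Rank><TopScores><Time> 9:59:55</Time></TopScores></Player>
  <Player index='1'><Rank><Level> A</Level><Class>   2</Class></Rank><TopScores><Time> 9:59:59</Time></TopScores></Player>
  <Player index='2'><Rank><Level> A</Level><Class>   3</Class></Rank><TopScores><Time>10: 0: 3</Time></TopScores></Player>
  <Player index='3'><Rank><Level> A</Level><Class>   1</Class></Rank><TopScores><Time>10: 0: 8</Time></TopScores></Player>
</Root>";

    // First query: group Players by their Rank markup, keep the last Player of each group.
    public static List<string> LastIndexPerRank(XDocument doc) =>
        doc.Root.Elements("Player")
           .GroupBy(p => p.Element("Rank").ToString())
           .Select(g => (string)g.Last().Attribute("index"))
           .ToList();

    // Second query: group Ranks, take the last Rank of each group, step up to its Player.
    public static List<string> LastIndexViaParent(XDocument doc) =>
        doc.Descendants("Rank")
           .GroupBy(r => r.ToString())
           .Select(g => (string)g.Last().Parent.Attribute("index"))
           .ToList();

    // The question's attempt with .Last(): returns the name of the element it selects.
    public static List<string> AncestorsLastNames(XDocument doc) =>
        doc.Descendants("Rank")
           .GroupBy(r => r.ToString())
           .Select(g => g.Ancestors().Last().Name.LocalName)
           .ToList();

    public static void Main()
    {
        XDocument playerData = XDocument.Parse(SampleXml);
        Console.WriteLine(string.Join(", ", LastIndexPerRank(playerData)));   // 3, 1, 2
        Console.WriteLine(string.Join(", ", LastIndexViaParent(playerData))); // 3, 1, 2
        Console.WriteLine(string.Join(", ", AncestorsLastNames(playerData))); // Root, Root, Root
    }
}
```

The results come back in the order in which each Rank first appears, so the duplicated Rank is represented by the player at index 3, followed by indexes 1 and 2. Note that `ToString()` is only a reliable grouping key while the whitespace inside the Rank elements is identical across players.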

Licensed under: CC-BY-SA with attribution
