How do I write a LINQ query to find the fields in a dataset that are present in every record of a set?
25-02-2021
Question
I have an XML dataset with 10K records, each containing a set of fields.
I'd like to know which fields must be nullable and which can be declared non-null in the database schema that matches the dataset.
Does LINQ offer a way to compute one big intersection across all the records?
Example:
<set>
<item>
<a/>
<foo />
<b/>
<c/>
</item>
<item>
<a/>
<foo />
<b/>
<c/>
</item>
<item>
<a/>
<b/>
</item>
<item>
<a/>
<foo />
<b/>
</item>
</set>
Prototype:
string[] CommonFieldNames(XElement[] elements)
{
// ...
}
Desired Result:
{ "a", "b" }
Solution
In the code below, selectValue ends up holding the names of the fields present in every item, i.e. your non-null columns.
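// Requires: using System.Linq; using System.Xml.Linq;
XDocument doc = XDocument.Parse("<set><item><a/><foo /><b/><c/></item><item><a/><foo /><b/><c/></item></set>");
// For each <item>, collect the names of its immediate child elements.
// Elements() is used instead of Descendants() so nested elements are not counted as fields.
var items =
    doc.Descendants("item")
       .Select(x => x.Elements().Select(y => y.Name.LocalName).ToList())
       .ToList();
// Start from the first item's fields and intersect with each item in turn.
// Note: items[0] throws if the document contains no <item> elements.
var selectValue = items[0];
foreach (var item in items)
{
    selectValue = selectValue.Intersect(item).ToList();
}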
Other tips
You could use a GroupBy and compare the group size to the total number of elements:
XDocument doc = XDocument.Parse("<set><item><a/><foo /><b/><c/></item><item><a/><foo /><b/><c/></item><item><a/><b/></item><item><a/><foo /><b/></item></set>");
// Materialize the items once so the count is not recomputed for every group.
var items = doc.Element("set").Elements("item").ToList();
var commonElementNames = items
    .SelectMany(x => x.Elements())          // Get all immediate children
    .GroupBy(x => x.Name)                   // Group by element name
    .Where(g => g.Count() == items.Count)   // Keep only the names that occur once per item
    .Select(g => g.Key.LocalName);          // Select just the element names
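Neither snippet fills in the CommonFieldNames prototype from the question, so here is a minimal sketch of one way it might look. It swaps the chained Intersect calls for HashSet<T>.IntersectWith, which avoids building a new list per record. The class name FieldAnalysis and the helper FieldNames are hypothetical, returning an empty array for empty input is an assumption, and the result is sorted only to make the output order deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FieldAnalysis
{
    // Hypothetical helper: local names of an element's immediate children.
    private static IEnumerable<string> FieldNames(XElement element) =>
        element.Elements().Select(child => child.Name.LocalName);

    // Returns the names of the fields that occur in every element of the input.
    public static string[] CommonFieldNames(XElement[] elements)
    {
        // Assumption: with no records there are no common fields.
        if (elements.Length == 0)
            return Array.Empty<string>();

        // Seed the set with the first record's fields, then shrink it record by record.
        var common = new HashSet<string>(FieldNames(elements[0]));
        foreach (XElement element in elements.Skip(1))
            common.IntersectWith(FieldNames(element));

        // HashSet<T> does not promise an enumeration order, so sort for stable output.
        return common.OrderBy(name => name, StringComparer.Ordinal).ToArray();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<set><item><a/><foo /><b/><c/></item><item><a/><foo /><b/><c/></item>" +
            "<item><a/><b/></item><item><a/><foo /><b/></item></set>");

        string[] common = CommonFieldNames(doc.Root.Elements("item").ToArray());
        Console.WriteLine(string.Join(", ", common)); // prints "a, b"
    }
}
```

Unlike the GroupBy approach, this is not thrown off by a field name that appears more than once inside a single item, because each record contributes a set of names rather than a count.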
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow