Using svick's comment, I ended up combining XmlReader with LINQ to XML. Once the reader reached the correct element and I had checked that its attribute had the correct ID, I loaded that element with XElement.Load.
Iteration through XML?
16-07-2023
Question
I have a 6GB XML file and I'm using XmlReader to loop through it. The file's huge, but there's nothing I can do about that. I would normally use LINQ to XML, but the file is too large to load into an XDocument: I get an OutOfMemoryException.
I'm using XmlReader to loop through the whole file and extract what I need. I'm including a sample XML file.
Essentially, this is what I do:
- Find the tag CONTAINER. If found, retrieve its attribute "id".
- If "id" begins with LOCAL_, then this is a container I want to read.
- Loop the reader until I find the tag FAMILY with the value CELL_FDD.
- When found, keep calling myReader.Read() until I find the tag IMPORTANT_VALUE.
- Once found, read the value of IMPORTANT_VALUE.
- I'm done with this container, so continue looping until I find the next CONTAINER (that's where the break comes in).
This is the simplified version of how I've been reading the file and finding the relevant values.
while (myReader.Read())
{
    if ((myReader.Name == "CONTAINER"))
    {
        if (myReader.HasAttributes)
        {
            string Attribute = myReader.GetAttribute("id");
            if (Attribute.IndexOf("LOCAL_") >= 0)
            {
                while (myReader.Read())
                {
                    if (myReader.Name == "FAMILY")
                    {
                        myReader.Read();//read value
                        string Family = myReader.Value;
                        if (Family == "CELL_FDD")
                        {
                            while (myReader.Read())
                            {
                                if ((myReader.Name == "IMPORTANT_VALUE"))
                                {
                                    myReader.Read();
                                    string Counter = myReader.Value;
                                    Console.WriteLine(Attribute + " (found: " + Counter + ")");
                                    break;
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
And this is the XML:
<es:esFD xmlns:es="File.xsd">
  <vs:vsFD xmlns:vs="OTHER_FILE.xsd">
    <CONTAINER id="LOCAL_CONTAINER1">
      <ATTRIBUTES>
        <FAMILY>CELL_FDD</FAMILY>
        <CELL_FDD>
          <VAL1>1.1.2.3</VAL1>
          <VAL2>JSMITH</VAL2>
          <VAL3>320</VAL3>
          <IMPORTANT_VALUE>VERY</IMPORTANT_VALUE>
          <VAL4>320</VAL4>
        </CELL_FDD>
        <FAMILY>BLAH</FAMILY>
        <BLAH>
          <VAL1>1.4.43.3</VAL1>
          <VAL2>NA</VAL2>
          <VAL3>349</VAL3>
          <IMPORTANT_VALUE>NA</IMPORTANT_VALUE>
          <VAL4>43</VAL4>
          <VAL5>00</VAL5>
          <VAL6>12</VAL6>
        </BLAH>
      </ATTRIBUTES>
    </CONTAINER>
    <CONTAINER id="FOREIGN_ELEMENT1">
      <ATTRIBUTES>
        <FAMILY>CELL_FDD</FAMILY>
        <CELL_FDD>
          <VAL1>1.1.2.3</VAL1>
          <VAL2>JSMITH</VAL2>
          <VAL3>320</VAL3>
          <IMPORTANT_VALUE>VERY</IMPORTANT_VALUE>
          <VAL4>320</VAL4>
        </CELL_FDD>
        <FAMILY>BLAH</FAMILY>
        <BLAH>
          <VAL1>1.4.43.3</VAL1>
          <VAL2>NA</VAL2>
          <VAL3>349</VAL3>
          <IMPORTANT_VALUE>NA</IMPORTANT_VALUE>
          <VAL4>43</VAL4>
          <VAL5>00</VAL5>
          <VAL6>12</VAL6>
        </BLAH>
      </ATTRIBUTES>
    </CONTAINER>
  </vs:vsFD>
</es:esFD>
How can I break out of the innermost loop so that execution returns to the outermost loop?
Solution 3
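The approach described for this question, streaming with XmlReader and handing each matching CONTAINER to LINQ to XML through XElement.Load, might look roughly like the sketch below. The names ContainerScanner and FindImportantValues are hypothetical, and the sketch assumes that IMPORTANT_VALUE sits inside a sibling element that follows the matching FAMILY element, as in the sample XML. Only one CONTAINER is held in memory at a time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class ContainerScanner
{
    // Hypothetical helper: yields (id, IMPORTANT_VALUE) for every CONTAINER
    // whose id starts with "LOCAL_" and that has a FAMILY of CELL_FDD.
    public static IEnumerable<(string Id, string Value)> FindImportantValues(XmlReader reader)
    {
        while (reader.Read())
        {
            // Only react to <CONTAINER ...> start tags.
            if (reader.NodeType != XmlNodeType.Element || reader.Name != "CONTAINER")
                continue;

            string id = reader.GetAttribute("id");
            if (id == null || !id.StartsWith("LOCAL_", StringComparison.Ordinal))
                continue;

            // Materialize just this one element. Disposing the subtree reader
            // leaves the outer reader at the end of this CONTAINER.
            XElement container;
            using (XmlReader subtree = reader.ReadSubtree())
            {
                container = XElement.Load(subtree);
            }

            // Assumption: the value lives in an element after the FAMILY marker.
            XElement family = container.Descendants("FAMILY")
                                       .FirstOrDefault(f => f.Value == "CELL_FDD");
            XElement important = family?.ElementsAfterSelf()
                                        .Descendants("IMPORTANT_VALUE")
                                        .FirstOrDefault();
            if (important != null)
                yield return (id, important.Value);
        }
    }
}
```

A caller would create the reader over the large file (for example with XmlReader.Create and a path of its own) and iterate the results with foreach. Because each container is processed by ordinary LINQ queries, no nested loops or break statements are needed.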
Other tips
Using separate methods should make it easier to control your loops:
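while (myReader.Read())
{
    // Match only the start tag; Name is also "CONTAINER" on the end tag.
    if (myReader.NodeType == XmlNodeType.Element && myReader.Name == "CONTAINER")
    {
        ProcessContainerElement(myReader);
    }
}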
In the ProcessContainerElement method, you can return when you determine that you need to start looking for the next CONTAINER element.
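private void ProcessContainerElement(XmlReader myReader)
{
    // Ignore calls made on a </CONTAINER> end tag or on an empty <CONTAINER/>.
    if (myReader.NodeType != XmlNodeType.Element || myReader.IsEmptyElement)
    {
        return;
    }
    string Attribute = myReader.GetAttribute("id");
    // Add the "LOCAL_" and FAMILY checks from the question here as needed.
    while (myReader.Read())
    {
        // Reaching the end tag means this container held no match.
        if (myReader.NodeType == XmlNodeType.EndElement && myReader.Name == "CONTAINER")
        {
            return;
        }
        if (myReader.NodeType == XmlNodeType.Element && myReader.Name == "IMPORTANT_VALUE")
        {
            myReader.Read();
            string Counter = myReader.Value;
            Console.WriteLine(Attribute + " (found: " + Counter + ")");
            return;
        }
    }
}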
You can read with XmlReader and load each CONTAINER node into its own XmlDocument.
Something like this (not tested):
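bool notFound = false;
notFound |= !reader.ReadToDescendant("root");
notFound |= !reader.ReadToDescendant("CONTAINER");
if (notFound)
    throw new Exception("Unable to find \"/root/CONTAINER\"");
do
{
    XmlDocument doc = new XmlDocument();
    // ReadSubtree leaves the outer reader at the end of this CONTAINER, so
    // ReadToNextSibling cannot skip an adjacent CONTAINER, which can happen
    // after ReadOuterXml when there is no whitespace between elements.
    using (XmlReader subtree = reader.ReadSubtree())
    {
        doc.Load(subtree);
    }
    XmlNode container = doc.DocumentElement;
    // do your work with container
}
while (reader.ReadToNextSibling("CONTAINER"));
reader.Close();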