Question

I have an XML file that I am supposed to parse using RapidXML and C++. The file is a phylogenetic tree. Each clade node has a taxonomy node with 1-3 child nodes that each hold a value. The child nodes can be the scientific name, common name, or rank. My question is: since the set of child nodes varies from one taxonomy node to the next (for example, one may have both a scientific name and a common name, and another only the scientific name), how would I access the value of each child node? For example, I wrote this code:

for (xml_node<> * clade_node = root_node->first_node("clade"); clade_node; clade_node = clade_node->next_sibling())
    {
        xml_node<> * taxonomy_node = clade_node->first_node("taxonomy");

        xml_node<> * sciName_node = taxonomy_node->first_node("scientific_name");
        xml_node<> * comName_node = taxonomy_node->next_sibling("common_name");
        xml_node<> * rank_node = taxonomy_node->next_sibling("rank");

        string sciName = sciName_node->value();
        string comName = comName_node->value();
        string rank = rank_node->value();

    }

But I get a thread error of EXC_BAD_ACCESS at the line string comName = comName_node->value(); and in this method of the RapidXML header:

Ch *value() const
{
    return m_value ? m_value : nullstr();
}

Here is a piece of the file I am parsing:

<phylogeny rooted="true" rerootable="false">
  <clade>
    <clade>
      <taxonomy>
        <scientific_name>Neomura</scientific_name>
      </taxonomy>
    </clade>
    <clade>
      <taxonomy>
        <id provider="uniprot">2</id>
        <scientific_name>Bacteria</scientific_name>
        <rank>superkingdom</rank>
      </taxonomy>
    </clade>
  </clade>
</phylogeny>

Thanks for any help!

Solution

If some of the nodes are optional, the library returns a null pointer when it cannot find one of them (RapidXML's first_node and next_sibling return 0 when no matching node exists). You need to check that the returned pointers are not NULL before you dereference them to call value(). These lines from your code skip that check:

string sciName = sciName_node->value();  // crash if scientific_name not present
string comName = comName_node->value();
string rank = rank_node->value();
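
A minimal sketch of that null check, not a definitive implementation. The helper child_value_or is hypothetical (it is not part of RapidXML), and StubNode is a stand-in that exists only so the sketch compiles without the RapidXML header. Because the helper is a template, it should also accept a rapidxml::xml_node<> pointer, which exposes the same first_node and value members. Note that it looks for the optional elements among the children of taxonomy with first_node, whereas the code in the question calls next_sibling on the taxonomy node, which searches taxonomy's siblings instead.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Stand-in for rapidxml::xml_node<> (an assumption made so this sketch
// compiles without the RapidXML header). Like the real class, first_node(name)
// yields a null pointer when no child has that name.
struct StubNode {
    const char* tag;
    const char* text;
    const StubNode* child;  // first child, or null
    const StubNode* next;   // next sibling, or null

    const char* value() const { return text; }
    const StubNode* first_node(const char* wanted) const {
        for (const StubNode* n = child; n; n = n->next)
            if (std::strcmp(n->tag, wanted) == 0) return n;
        return nullptr;
    }
};

// Hypothetical helper, not part of RapidXML: text of the child element `name`,
// or `fallback` when `parent` is null or has no such child.
template <typename Node>
std::string child_value_or(const Node* parent, const char* name,
                           const std::string& fallback = "")
{
    if (!parent) return fallback;
    const Node* child = parent->first_node(name);
    return child ? std::string(child->value()) : fallback;
}

// Usage on data shaped like the first <taxonomy> in the question, which has
// a scientific_name but neither common_name nor rank.
inline void demo_optional_children()
{
    const StubNode sci = {"scientific_name", "Neomura", nullptr, nullptr};
    const StubNode taxonomy = {"taxonomy", "", &sci, nullptr};

    assert(child_value_or(&taxonomy, "scientific_name") == "Neomura");
    assert(child_value_or(&taxonomy, "common_name", "n/a") == "n/a");
    assert(child_value_or(&taxonomy, "rank").empty());
}
```

Inside the loop from the question this would read, for example, string comName = child_value_or(taxonomy_node, "common_name"); and because the helper also tolerates a null parent, a clade without a taxonomy child no longer crashes either.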

I also think your use of names in the calls for nodes/siblings is a bit brittle. It might be better to call first_node/next_sibling without a name, check the returned node for NULL, and only then look at its name and perform name-dependent logic.

This makes you less dependent on the order of data in the XML, which might change down the line.
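
One way this name-independent traversal could look, as a sketch under stated assumptions. The Taxon struct and read_taxonomy function are hypothetical names introduced for illustration, and StubNode is a stand-in so the sketch compiles without the RapidXML header. The template only relies on first_node(), next_sibling(), name() and value(), which rapidxml::xml_node<> also provides, so it should accept a real taxonomy node as well.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Stand-in for rapidxml::xml_node<> (an assumption made so this sketch
// compiles without the RapidXML header); it mimics only the members used below.
struct StubNode {
    const char* tag;
    const char* text;
    const StubNode* child;  // first child, or null
    const StubNode* next;   // next sibling, or null

    const char* name() const { return tag; }
    const char* value() const { return text; }
    const StubNode* first_node() const { return child; }
    const StubNode* next_sibling() const { return next; }
};

// Hypothetical record for the three optional fields; a missing element
// leaves the corresponding string empty.
struct Taxon {
    std::string scientific_name;
    std::string common_name;
    std::string rank;
};

// Visit every child of <taxonomy> in document order and dispatch on its name.
template <typename Node>
Taxon read_taxonomy(const Node* taxonomy)
{
    Taxon t;
    if (!taxonomy) return t;  // this clade has no <taxonomy> child
    for (const Node* n = taxonomy->first_node(); n; n = n->next_sibling()) {
        const char* tag = n->name();
        if (std::strcmp(tag, "scientific_name") == 0)  t.scientific_name = n->value();
        else if (std::strcmp(tag, "common_name") == 0) t.common_name = n->value();
        else if (std::strcmp(tag, "rank") == 0)        t.rank = n->value();
        // Any other element, such as <id>, is skipped.
    }
    return t;
}

// Usage on data shaped like the second <taxonomy> in the question.
inline void demo_read_taxonomy()
{
    const StubNode rank = {"rank", "superkingdom", nullptr, nullptr};
    const StubNode sci = {"scientific_name", "Bacteria", nullptr, &rank};
    const StubNode id = {"id", "2", nullptr, &sci};
    const StubNode taxonomy = {"taxonomy", "", &id, nullptr};

    const Taxon t = read_taxonomy(&taxonomy);
    assert(t.scientific_name == "Bacteria");
    assert(t.common_name.empty());
    assert(t.rank == "superkingdom");
}
```

Because the loop never asks for a node by position, reordering the elements or adding new ones such as id does not change the result.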

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow