Question

I am trying to retrieve just two small pieces of information from each component in this XML, the id (choice_xxxxxx) and the description (True/False):

<choices>
  <component>
    <id>choice_6WFRvQrXAN7</id>
    <description lang="und">True</description>
  </component>
  <component>
    <id>choice_5xBWm5NQRkA</id>
    <description lang="und">False</description>
  </component>
</choices>

But when I parse this with Nokogiri, the result comes back looking like this:

=> #(Element:0x3feeeb4301b0 {
  name = "choices",
  namespace = #(Namespace:0x3feeeb6f1fac {
    href = "http://website.myxml.xsd"
    }),
  children = [
    #(Text "\n\t\t\t\t"),
    #(Element:0x3feeea7e36b4 {
      name = "component",
      namespace = #(Namespace:0x3feeeb6f1fac {
        href = "http://website.myxml.xsd"
        }),
      children = [
        #(Text "\n\t\t\t\t\t"),
        #(Element:0x3feeea7e624c {
          name = "id",
          namespace = #(Namespace:0x3feeeb6f1fac {
            href = "http://website.myxml.xsd"
            }),
          children = [ #(Text "choice_6WFRvQrXAN7")]
          }),
        #(Text "\n\t\t\t\t\t"),
        #(Element:0x3feeea7e4a50 {
          name = "description",
          namespace = #(Namespace:0x3feeeb6f1fac {
            href = "http://website.myxml.xsd"
            }),
          attributes = [
            #(Attr:0x3feeea7e86c8 { name = "lang", value = "und" })],
          children = [ #(Text "True")]
          }),
        #(Text "\n\t\t\t\t")]
      }),
    #(Text "\n\t\t\t\t"),
    #(Element:0x3feeea7ea34c {
      name = "component",
      namespace = #(Namespace:0x3feeeb6f1fac {
        href = "http://website.myxml.xsd"
        }),
      children = [
        #(Text "\n\t\t\t\t\t"),
        #(Element:0x3feeea7e8f88 {
          name = "id",
          namespace = #(Namespace:0x3feeeb6f1fac {
            href = "http://website.myxml.xsd"
            }),
          children = [ #(Text "choice_5xBWm5NQRkA")]
          }),
        #(Text "\n\t\t\t\t\t"),
        #(Element:0x3feeea7eb738 {
          name = "description",
          namespace = #(Namespace:0x3feeeb6f1fac {
            href = "http://website.myxml.xsd"
            }),
          attributes = [
            #(Attr:0x3feeea7eb3f0 { name = "lang", value = "und" })],
          children = [ #(Text "False")]
          }),
        #(Text "\n\t\t\t\t")]
      }),
    #(Text "\n\t\t\t")]
  })

What's the most elegant way to extract those two values from this Nokogiri::XML::NodeSet?


Solution

Assuming you're storing the parsed output in a variable, which I'll call @xml here, you can map over the component nodes:

@xml.css('component').map do |node|
  {
    id: node.children.css('id').first.text,
    description: node.children.css('description').first.text
  }
end

That should give you an array of hashes, one per component, each with an :id and a :description key.

Licensed under: CC-BY-SA with attribution