Question

Using encoding/xml.Decoder I'm attempting to manually parse an XML file loaded from http://www.khronos.org/files/collada_schema_1_4

For test purposes, I'm just iterating over the document printing out whatever token type is encountered:

func Test (r io.Reader) {
    var t xml.Token
    var pa *xml.Attr
    var a xml.Attr
    var co xml.Comment
    var cd xml.CharData
    var se xml.StartElement
    var pi xml.ProcInst
    var ee xml.EndElement
    var is bool
    var err error
    var xd = xml.NewDecoder(r)
    for i := 0; i < 24; i++ {
        if t, err = xd.Token(); (err == nil) && (t != nil) {
            if a, is = t.(xml.Attr); is { print("ATTR\t"); println(a.Name.Local) }
            if pa, is = t.(*xml.Attr); is { print("*ATTR\t"); println(pa) }
            if co, is = t.(xml.Comment); is { print("COMNT\t"); println(co) }
            if cd, is = t.(xml.CharData); is { print("CDATA\t"); println(cd) }
            if pi, is = t.(xml.ProcInst); is { print("PROCI\t"); println(pi.Target) }
            if se, is = t.(xml.StartElement); is { print("START\t"); println(se.Name.Local) }
            if ee, is = t.(xml.EndElement); is { print("END\t\t"); println(ee.Name.Local) }
        }
    }
}

Now here's the output:

PROCI   xml
CDATA   [1/64]0xf84004e050
START   schema
CDATA   [2/129]0xf84004d090
COMNT   [29/129]0xf84004d090
CDATA   [2/129]0xf84004d090
START   annotation
CDATA   [3/129]0xf84004d090
START   documentation
CDATA   [641/1039]0xf840061000
END     documentation
CDATA   [2/1039]0xf840061000
END     annotation
CDATA   [2/1039]0xf840061000
COMNT   [37/1039]0xf840061000
CDATA   [2/1039]0xf840061000
START   import
END     import
CDATA   [2/1039]0xf840061000
COMNT   [14/1039]0xf840061000
CDATA   [2/1039]0xf840061000
START   element
CDATA   [3/1039]0xf840061000
START   annotation

Notice no ATTR or *ATTR lines are output even though by the last (24th) line many attributes have been passed both in the root xs:schema element as well as in xs:import and xs:element elements.

This is in Go 1.0.3 64-bit under Windows 7 64-bit. Am I doing something wrong or should I file a Go package bug report?

[Side note: when doing a normal xml.Unmarshal into properly prepared structs, known-named-and-mapped attributes are captured and mapped by the xml package just fine. But I also need to collect "unknown" attributes in the root element (to collect namespace information for this use-case, the use-case being http://github.com/metaleap/go-xsd ), hence my attempts to use Decoder.Token().]


Solution

You are not doing anything wrong in the sense of hitting a bug: this behavior is expected. The attributes are parsed, but they are never returned as an xml.Token, because attributes simply aren't tokens. See: http://golang.org/pkg/encoding/xml/#Token

The attributes are accessible through the Attr field (a []xml.Attr) of the StartElement token. See: http://golang.org/pkg/encoding/xml/#StartElement

(( Some general hints:

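a) Do not use the builtin print or println. They exist for bootstrapping and debugging, write to standard error, and are not guaranteed to stay in the language. Use the fmt package instead. This is also why your CDATA and COMNT lines show something like [2/129]0xf84004d090: println prints a byte slice as [len/cap]address, not as text.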

b) The a, ok := t.(SomeType) idiom is called "comma ok", because the boolean is normally named "ok", not "is". Please stick to these conventions.

c) Idiomatic would be something like

switch t := t.(type) {
  case xml.StartElement: ...
  case xml.EndElement: ...
}

instead of your list of "if a, is = t.(xml.Attr) ..." checks.

d) All the up-front declarations such as "var se xml.StartElement" are clutter. Declare the variable where you need it:

if se, ok := t.(xml.StartElement); ok { ... }

This would make your code much more readable. ))

Licensed under: CC-BY-SA with attribution