Question

I've been doing a lot of XML processing in C# lately, and after coming back to C# from a long stint of JavaScript coding I'm really missing some of the nice shortcuts of JS.

I've got a sizable XML document made up of lots of elements, child elements, etc., representing accommodation, flights and attraction tickets for an online booking.

So far I've just been using XPath to walk the document and pull out the info I need, and I've now started moving these queries into helper functions. For example, we might want to change how we process a booking if it has a flight from a third party. That's easy to check with XPath: we just check the value of an element. So I have a helper function that does just that and returns a bool, which keeps the XPath in one place. I've only got about a dozen of these, and up until now I've been treating the whole thing as a booking. But we've just done some work that was all about the flight element of the booking, and the last four helpers I've created are very flight-related, which got me wondering whether I'm doing this right.

I've avoided converting the whole document into objects because its size would make that very painful, and so far it hasn't been needed; it still isn't, really. Creating all those objects would be a huge pain, and I've worked on similar projects that took this route, where it hurt to debug or even to get your head around initially. We aren't using all of the document: the processing we do barely touches 10% of it, so deserialising everything seemed over-engineered. Deserialising it in JS would be a breeze, but C# makes it so long-winded. I know I could use XSD.exe to take away some of the pain, but I find it a mess to use without a good schema (which, of course, does not exist).

But it got me thinking: should we ALWAYS create a huge collection of objects from our XML, or is the quick way I've done it still acceptable? I'm happy with it, but I'm trying to think through whether objects would be a better way. If we did convert it all, at least it would be available in case we needed it in future projects.

I'm aware there may be some comments on the speed of XPathing everything, but it's fast enough for what we want currently, so that's not an issue here.

Any thoughts on whether I'm OK to continue XPathing?

Solution

If you have an XML schema (XSD), then I'd probably always prefer the deserialize to object approach - it's just easier and cleaner to work with nice CLR objects.
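For what it's worth, deserializing to objects does not have to mean modelling the whole document. A minimal sketch, assuming hand-written classes rather than XSD.exe output: by default XmlSerializer skips elements it has no mapping for, so classes covering only the parts you use can be enough. The Booking, Flights, Flight and Provider names below are invented for illustration and will not match the real booking document.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical model: only the elements the processing needs are mapped.
[XmlRoot("Booking")]
public class Booking
{
    // Maps <Flights><Flight>...</Flight></Flights>; other children of <Booking> are ignored.
    [XmlArray("Flights")]
    [XmlArrayItem("Flight")]
    public List<Flight> Flights { get; set; } = new List<Flight>();
}

public class Flight
{
    // Assumed element name; any unmapped siblings inside <Flight> are skipped.
    [XmlElement("Provider")]
    public string Provider { get; set; }
}

public static class BookingReader
{
    // Deserializes the mapped subset of the document from an XML string.
    public static Booking Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Booking));
        using (var reader = new StringReader(xml))
        {
            return (Booking)serializer.Deserialize(reader);
        }
    }
}
```

The trade-off is that a misspelled element name in an attribute fails silently (the property just stays empty), which is where a real schema would have helped.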

If you don't have an XML schema and can't get one from the source/provider of your data, then the decision isn't quite as clear. As you say, XSD.EXE can take some of the pain out of the equation, but a schema it infers from your XML is typically neither perfect nor very pretty.

Tough one - if you feel comfortable with using XPath to navigate your XML, I'd stick to it. If you have XML that you need to parse very often, maybe creating an XSD might be a good idea in the end.

OTHER TIPS

Yes, if you need only 10% of the XML, XPath seems fine here, since deserializing the whole document would probably be overkill.
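As a rough sketch of the helper approach described in the question, the flight-related checks could live together in one static class so each XPath expression appears in exactly one place. The element names and XPath expressions here are assumptions made up for the example, not taken from the real document.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helpers grouped by the part of the booking they inspect.
public static class FlightHelpers
{
    // True when at least one <Flight> has a <Provider> child equal to "ThirdParty".
    public static bool HasThirdPartyFlight(XDocument booking)
    {
        return booking.XPathSelectElement("/Booking/Flights/Flight[Provider='ThirdParty']") != null;
    }

    // Number of <Flight> elements; XPath count() is returned as a double.
    public static int FlightCount(XDocument booking)
    {
        return (int)(double)booking.XPathEvaluate("count(/Booking/Flights/Flight)");
    }
}
```

Grouping the helpers per area (flights, accommodation, tickets) keeps the lightweight approach while giving some of the structure a full object model would provide.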

C# 4.0 and the DLR introduce the dynamic keyword, which makes it possible to explore an XML structure dynamically, given a suitable wrapper type.
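The framework does not ship a dynamic XML type, so something has to do the wrapping. One possible sketch, assuming a small hand-rolled class on top of DynamicObject and XElement (DynamicXml is a made-up name, and this version ignores attributes, namespaces and repeated elements):

```csharp
using System;
using System.Dynamic;
using System.Xml.Linq;

// Hypothetical wrapper: resolves member access to the first child element of that name.
public class DynamicXml : DynamicObject
{
    private readonly XElement _element;

    public DynamicXml(XElement element)
    {
        _element = element;
    }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        XElement child = _element.Element(binder.Name);
        if (child == null)
        {
            // Returning false makes the runtime binder throw for unknown names.
            result = null;
            return false;
        }

        // Elements with children stay wrapped; leaf elements come back as their text.
        result = child.HasElements ? (object)new DynamicXml(child) : child.Value;
        return true;
    }
}
```

With a wrapper like this held in a dynamic variable, an expression such as booking.Flights.Flight.Provider would be resolved at run time, closer to the JS style the question misses, at the cost of compile-time checking.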

Licensed under: CC-BY-SA with attribution