Question

I have an ASP.NET MVC website. This website has a sitemap that looks like the following:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.mysite.com/contact</loc>
    <lastmod>2013-06-04</lastmod>
    <changefreq>never</changefreq>
  </url>
  <url>
    <loc>http://www.mysite.com/contact-us</loc>
    <lastmod>2013-06-04</lastmod>
    <changefreq>never</changefreq>
  </url>
  <url>
    <loc>http://www.mysite.com/about/books</loc>
    <lastmod>2013-06-18</lastmod>
    <changefreq>monthly</changefreq>
  </url>
  <url>
    <loc>http://www.mysite.com/about/blog</loc>
    <lastmod>2012-05-02</lastmod>
    <changefreq>never</changefreq>
  </url>
  <url>
    <loc>http://www.mysite.com/about/blog/post-1</loc>
    <lastmod>2012-05-02</lastmod>
    <changefreq>never</changefreq>
  </url>
  <url>
    <loc>http://www.mysite.com/about/blog/post-2</loc>
    <lastmod>2012-02-15</lastmod>
    <changefreq>never</changefreq>
  </url>
</urlset>

I'm trying to figure out how to query this sitemap with LINQ to XML in C#. I'm trying to write a query that returns only the blog post entries. The blog post entries are the ones whose <loc> element value starts with http://www.mysite.com/about/blog/. Currently, I can load the sitemap, but I can't figure out how to filter down to just the blog posts and then sort by the lastmod value. This is what I have so far:

XDocument sitemap = XDocument.Load(Server.MapPath("/resources/sitemap.xml"));
IEnumerable<XElement> blogs = from post in sitemap.Descendants("url")
                              select post;

How do I filter down to just my blog posts? My query for even just the urls doesn't seem to be working.


Solution

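Your XML document declares a default namespace, so you have to use that namespace in your query too. Without it, Descendants("url") looks for elements in no namespace and matches nothing: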

var ns = XNamespace.Get("http://www.sitemaps.org/schemas/sitemap/0.9");

IEnumerable<XElement> blogs = from post in sitemap.Root.Elements(ns + "url")
                              where ((string)post.Element(ns + "loc") ?? string.Empty).StartsWith("http://www.mysite.com/about/blog/")
                              select post;

I used ((string)post.Element(ns + "loc") ?? string.Empty) so that no exception is thrown when a <url> has no <loc> element: casting a missing element to string yields null, and calling StartsWith on null would throw a NullReferenceException. If you're sure that every <url> contains a <loc>, you can replace it with just ((string)post.Element(ns + "loc")).
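
The question also asks about sorting by lastmod, which the query above does not do. A minimal sketch of one way to add it follows, assuming every <lastmod> is a date-only value in yyyy-MM-dd form as in the sample sitemap. The class name SitemapQueries and the helpers BlogPostsNewestFirst and ParseLastMod are hypothetical names introduced for illustration, not part of any library. Entries with a missing or unparseable <lastmod> sort last.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class, introduced only for this sketch.
public static class SitemapQueries
{
    // Default namespace declared on <urlset>; child elements inherit it.
    private static readonly XNamespace Ns =
        "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Returns the <url> elements whose <loc> starts with the given prefix,
    // ordered by <lastmod>, newest first.
    public static List<XElement> BlogPostsNewestFirst(XDocument sitemap, string prefix)
    {
        return sitemap.Root
            .Elements(Ns + "url")
            .Where(u => ((string)u.Element(Ns + "loc") ?? string.Empty)
                .StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .OrderByDescending(u => ParseLastMod(u))
            .ToList();
    }

    // Assumption: <lastmod> is a date-only value such as 2013-06-04.
    // A missing or malformed value falls back to DateTime.MinValue.
    private static DateTime ParseLastMod(XElement url)
    {
        DateTime parsed;
        var raw = (string)url.Element(Ns + "lastmod");
        return DateTime.TryParseExact(raw, "yyyy-MM-dd",
                   CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed)
            ? parsed
            : DateTime.MinValue;
    }
}
```

Calling SitemapQueries.BlogPostsNewestFirst(sitemap, "http://www.mysite.com/about/blog/") on the sitemap from the question should return the post-1 entry followed by the post-2 entry. The trailing slash in the prefix is what excludes the /about/blog page itself. If your sitemap stores full timestamps in <lastmod>, the parsing format would need to change accordingly.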

Licensed under: CC-BY-SA with attribution