LINQ to XML with XPath

For a while now I have been convinced that LINQ is great. Okay, it has its limitations, but over the past couple of weeks those limitations have started to feel like challenges :) .

Since I assume that you already know pure LINQ to XML, extension methods and so on (contact me if I am wrong, I will be glad to help), I will not dive too deep into those. Instead, I will show some nice examples of LINQ to XML with XPath. The examples are built around an XML file with book information, shown below.

<?xml version="1.0" encoding="utf-8" ?>
<books>
<book id="1">
<title>Book 01</title>
<author>Bert Loedeman</author>
<price>$ 19.99</price>
</book>
<book id="2">
<title>Book 02</title>
<author>Bert Loedeman</author>
<price>€ 24.99</price>
</book>
<book id="3">
<title>Book 03</title>
<author>Bert Loedeman</author>
<price>€ 10.00</price>
</book>
<book id="4">
<title>Book 04</title>
<author>Some author</author>
<price>$ 19.99</price>
</book>
</books>

What I am going to do is show how to reach my goals using pure LINQ to XML, and how to reach the same goals using LINQ combined with XPath. First of all, to be able to use XPath in combination with LINQ, you need a using directive for the System.Xml.XPath namespace. It brings a handful of XPath-related extension methods into scope (XPathSelectElements, XPathSelectElement, XPathEvaluate and CreateNavigator), which are available on LINQ to XML nodes such as XDocument and XElement.
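The listings in this post use a doc variable without declaring it. Here is a minimal sketch of one way to obtain it, assuming the book data is parsed from an inline string rather than loaded from a file (XDocument.Load with a path of your own would work as well). The class name BookSample and the method LoadBooks are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings the XPath extension methods into scope

public static class BookSample
{
    // The book data from above, inlined so the sketch is self-contained.
    // Single quotes around attribute values keep the C# string readable.
    public const string Xml = @"<books>
  <book id='1'><title>Book 01</title><author>Bert Loedeman</author><price>$ 19.99</price></book>
  <book id='2'><title>Book 02</title><author>Bert Loedeman</author><price>€ 24.99</price></book>
  <book id='3'><title>Book 03</title><author>Bert Loedeman</author><price>€ 10.00</price></book>
  <book id='4'><title>Book 04</title><author>Some author</author><price>$ 19.99</price></book>
</books>";

    // Hypothetical helper: returns the document the later listings call "doc".
    public static XDocument LoadBooks()
    {
        return XDocument.Parse(Xml);
    }

    public static void Main()
    {
        XDocument doc = LoadBooks();

        // Without the using directive for System.Xml.XPath this call does not compile.
        Console.WriteLine(doc.XPathSelectElements("//book").Count());
    }
}
```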

Let’s start with the simplest LINQ query: loading all books from our XML file and writing their titles to the console.

var books =
    from book in doc.Descendants("book")
    select book;

Console.WriteLine();
foreach(var book in books)
{
    XElement element = book.Element("title");
    Console.WriteLine(element == null ? "Not found" : element.Value);
}

This query can easily be altered to use XPath, as the next listing shows.

// Needs namespace System.Xml.XPath.
var books =
    from book in doc.XPathSelectElements("//book")
    select book;

Console.WriteLine();
foreach (var book in books)
{
    XElement element = book.XPathSelectElement("title");
    Console.WriteLine(element == null ? "Not found" : element.Value);
}

Although this example does not yet show any added value of XPath, it is nice to know it can be used at all. Let’s move on to a slightly more complex situation and select only the book with ID 4.

var bookWithIdFour =
    from book in doc.Descendants("book")
    where (int)book.Attribute("id") == 4
    select book;

Console.WriteLine();
foreach (var book in bookWithIdFour)
{
    XElement element = book.Element("title");
    Console.WriteLine(element == null ? "Not found" : element.Value);
}

As you can see, we add a where clause to our query. Totally legitimate, but have a look at the possibilities XPath offers:

var bookWithIdFour =
    from book in doc.XPathSelectElements("//book")
    where (bool)book.XPathEvaluate("@id=4")
    select book;

Console.WriteLine();
foreach (var book in bookWithIdFour)
{
    XElement element = book.XPathSelectElement("title");
    Console.WriteLine(element == null ? "Not found" : element.Value);
}

var bookWithIdFour1 =
    from book in doc.XPathSelectElements("//book[@id=4]")
    select book;

Console.WriteLine();
foreach (var book in bookWithIdFour1)
{
    XElement element = book.XPathSelectElement("title");
    Console.WriteLine(element == null ? "Not found" : element.Value);
}

var bookWithIdFour2 =
    from book in doc.XPathSelectElements("//book")
    // This where clause fails at run time: XPathSelectElement only works
    // with XElement results, and "@id" selects an XAttribute!
    where (int)book.XPathSelectElement("@id") == 4
    select book;

Console.WriteLine();
foreach (var book in bookWithIdFour2)
{
    XElement element = book.XPathSelectElement("title");
    Console.WriteLine(element == null ? "Not found" : element.Value);
}

To start with, you can use XPath exactly the way you would use pure LINQ to XML; nothing has to change. There is one good reason, though, to prefer the second method: consider a situation where you cannot be sure that the user searching for your books enters an ID at all. The original LINQ to XML query has to be written twice to deal with this situation (once with and once without the where clause), whereas the second XPath example is just a string: leave out the [@id=...] part when no ID is given. That makes LINQ a little nicer to use ;)
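The optional ID filter could be sketched as follows. This is only an illustration under my own assumptions: the names BookSearch, BuildBookPath and FindTitles are hypothetical, and the ID is taken as an int? so that no free-form user text is ever concatenated into the XPath expression.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class BookSearch
{
    // Hypothetical helper: returns "//book" when no ID is given,
    // and "//book[@id=N]" when one is.
    public static string BuildBookPath(int? id)
    {
        return id.HasValue
            ? "//book[@id=" + id.Value.ToString(CultureInfo.InvariantCulture) + "]"
            : "//book";
    }

    // One query serves both cases; only the XPath string differs.
    public static IEnumerable<string> FindTitles(XDocument doc, int? id)
    {
        return from book in doc.XPathSelectElements(BuildBookPath(id))
               select (string)book.Element("title") ?? "Not found";
    }

    public static void Main()
    {
        // Shortened sample data, just enough to exercise both paths.
        XDocument doc = XDocument.Parse(
            "<books>" +
            "<book id='1'><title>Book 01</title></book>" +
            "<book id='4'><title>Book 04</title></book>" +
            "</books>");

        foreach (string title in FindTitles(doc, 4))
            Console.WriteLine(title);   // filtered by ID

        foreach (string title in FindTitles(doc, null))
            Console.WriteLine(title);   // no ID entered: all books
    }
}
```

Further optional filters (author, price) could be appended as extra predicates in the same way, which is where building the expression as a string starts to pay off.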

When I put this blog post together, I expected the third XPath example to work as well. Unfortunately, it does not: XPathSelectElement and XPathSelectElements only work with XElement results, so an expression that selects an attribute makes them fail at run time. Of the XPath extension methods, only XPathEvaluate can deal with other result types. If you still want to use XPath this way, consider the XPathEvaluate function, as shown below:

var bookWithIdFour2 =
    from book in doc.XPathSelectElements("//book")
    where (bool)book.XPathEvaluate("@id=4")
    select book;

Console.WriteLine();
foreach (var book in bookWithIdFour2)
{
    XElement element = book.XPathSelectElement("title");
    Console.WriteLine(element == null ? "Not found" : element.Value);
}
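XPathEvaluate can also hand back the attribute itself instead of a boolean. For an expression that selects nodes, its result can be cast to IEnumerable and then narrowed with Cast. The sketch below assumes that behaviour; the names AttributeLookup and ReadId are hypothetical.

```csharp
using System;
using System.Collections;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class AttributeLookup
{
    // Hypothetical helper: reads the id attribute through XPath and
    // returns it as an int, or null when the attribute is missing.
    public static int? ReadId(XElement book)
    {
        // "@id" selects nodes, so the result is an enumerable of the matches.
        var attributes = ((IEnumerable)book.XPathEvaluate("@id")).Cast<XAttribute>();

        // The explicit XAttribute-to-int? conversion yields null for a null attribute.
        return (int?)attributes.FirstOrDefault();
    }

    public static void Main()
    {
        XElement book = XElement.Parse("<book id='4'><title>Book 04</title></book>");
        Console.WriteLine(ReadId(book));
    }
}
```

With such a helper, the failing where clause from the third example could be written as where AttributeLookup.ReadId(book) == 4.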

Considering everything shown above, I already like XPath, especially for its flexibility and ease of use. There is more to XPath with LINQ, however, which makes it even more interesting: you can use the XPathNavigator class with LINQ to XML too. To show you how it works, I create the same query, selecting the title of the book with ID 4, this time using the XPathNavigator.

var booksNavigator =
    from book in doc.XPathSelectElements("//book")
    where (bool)book.XPathEvaluate("@id=4")
    select book.CreateNavigator();

Console.WriteLine();
foreach (var book in booksNavigator)
{
    XPathNodeIterator iterator = book.SelectChildren("title", "");
    while (iterator.MoveNext())
    {
        Console.WriteLine(iterator.Current == null 
                ? "Not found" 
                : iterator.Current.Value);
    }
}

Just remember the CreateNavigator() extension method on XElement and you can use the full wealth of XPathNavigator members, such as Select, SelectChildren, ValueAs… (typed values), and so on.
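As a small illustration of the typed values mentioned above, here is a sketch that reads the id attribute as an int and the title as a string through a navigator. The class and method names (NavigatorSample, ReadIdViaNavigator, ReadTitle) are hypothetical; SelectSingleNode, Value and ValueAsInt are XPathNavigator members.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class NavigatorSample
{
    // Hypothetical helper: reads the id attribute as a typed int.
    public static int ReadIdViaNavigator(XElement book)
    {
        XPathNavigator navigator = book.CreateNavigator();

        // Unlike XPathSelectElement, a navigator can be positioned on an attribute.
        XPathNavigator idNode = navigator.SelectSingleNode("@id");
        if (idNode == null)
            throw new InvalidOperationException("The book has no id attribute.");

        return idNode.ValueAsInt;
    }

    // Hypothetical helper: reads the title text, mirroring the "Not found" fallback.
    public static string ReadTitle(XElement book)
    {
        XPathNavigator titleNode = book.CreateNavigator().SelectSingleNode("title");
        return titleNode == null ? "Not found" : titleNode.Value;
    }

    public static void Main()
    {
        XElement book = XElement.Parse("<book id='4'><title>Book 04</title></book>");
        Console.WriteLine(ReadIdViaNavigator(book));
        Console.WriteLine(ReadTitle(book));
    }
}
```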

To conclude: I am sure XPath is a very valuable addition to pure LINQ to XML. Without it, I would have had a hard time implementing really difficult scenarios such as dynamic queries, which XPath handles nicely. I hope I have made you enthusiastic as well :) ! Feel free to post any comments you like (on-topic please ;) ).
