
    Introduction to LINQ, Part 2: LINQ to XML



    Bringing Queries into Action

    The same result, however, can be achieved with a query, because the XElement constructors that take content accept any object, including a sequence of objects:

    public XElement(XElement other);
    public XElement(XName name);
    public XElement(XStreamingElement other);
    public XElement(XName name, object content);
    public XElement(XName name, params object[] content);
    

    Thus, you can rewrite the earlier code as:

    IEnumerable<Winner> winners = UCL.GetWinners();
    XElement root = new XElement("winners",
                        from w in winners
                        select new XElement("winner",
                           new XElement("Name", w.Name),
                           new XElement("Country", w.Country),
                           new XElement("Year", w.Year)));
    

    Saving and Loading XML Data

    XElement has several overloads for saving, which take a string representing a file name, a TextWriter object, or an XmlWriter object.

    public void Save(string fileName);
    public void Save(TextWriter textWriter);
    public void Save(XmlWriter writer);
    public void Save(string fileName, SaveOptions options);
    public void Save(TextWriter textWriter, SaveOptions options);
    

    Saving an XML tree to a file is as simple as:

    root.Save("winners.xml");

    If root is the XElement built in the previous code sample, the winners.xml file will contain the tree of UEFA Champions League winners listed above.
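
    The TextWriter overloads can be tried without touching the file system. The following is a minimal sketch, assuming a small illustrative tree rather than the article's full data set; SaveDemo and SaveToString are hypothetical names introduced only for this example. It writes the same tree twice, once with the default indentation and once with SaveOptions.DisableFormatting.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class SaveDemo
{
    // Serializes the tree to a string through the Save(TextWriter, SaveOptions) overload.
    public static string SaveToString(XElement root, SaveOptions options)
    {
        using (var writer = new StringWriter())
        {
            root.Save(writer, options);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Illustrative sample tree (an assumption, not the article's data).
        var root = new XElement("winners",
            new XElement("winner",
                new XElement("Name", "Barcelona")));

        // Default options: elements are indented, one per line.
        Console.WriteLine(SaveToString(root, SaveOptions.None));

        // DisableFormatting: no whitespace is inserted between elements.
        Console.WriteLine(SaveToString(root, SaveOptions.DisableFormatting));
    }
}
```

    The unformatted variant is more compact, while the default output is easier to read when the file is inspected by hand.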

    The opposite operation, loading an XML tree, is done either with the static method Parse(), which takes a string containing the XML data, or with one of the overloads of the static method Load(), which take a string representing a file name, a TextReader object, or an XmlReader object.
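
    Parse() is convenient when the XML is already in memory. The sketch below assumes a short, hand-written XML string with two sample winners (illustrative data, not the contents of winners.xml); ParseDemo and ParseSample are hypothetical names used only here.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ParseDemo
{
    // Builds an XML tree from a string instead of a file.
    public static XElement ParseSample()
    {
        // Illustrative sample data (an assumption for this sketch).
        string xml =
            "<winners>" +
            "<winner><Name>Barcelona</Name><Country>Spain</Country><Year>2006</Year></winner>" +
            "<winner><Name>Liverpool</Name><Country>England</Country><Year>2005</Year></winner>" +
            "</winners>";

        return XElement.Parse(xml);
    }

    public static void Main()
    {
        XElement root = ParseSample();

        foreach (XElement e in root.Elements("winner"))
        {
            Console.WriteLine("{0} {1}",
                (int)e.Element("Year"), (string)e.Element("Name"));
        }
    }
}
```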

    Creating a sequence of Winners based on the data read from a file "winners.xml" containing the UCL winners can be done like this:

    var result = from e in XElement.Load("winners.xml").Elements("winner")
                 select new Winner
                    {
                       Name = (string)e.Element("Name"),
                       Country = (string)e.Element("Country"),
                       Year = (int)e.Element("Year")
                    };
    
    foreach (Winner w in result)
    {
       Console.WriteLine("{0} {1}, {2}",
          w.Year, w.Name, w.Country);
    }
    

    XElement.Load() creates an XML tree whose root is an XElement. The Elements() method returns a sequence of the child XElement objects with the specified name ("winner"). This sequence is the source of the query, which projects a sequence of Winner objects. The Element() method returns the first child element with the specified name ("Name", "Country", or "Year" in the example), and the explicit casts to string and int convert that element's value to the target type.
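
    One detail worth knowing when reading real files: Element() returns null when the requested child does not exist, and a cast to int on a null element throws. A hedged way to cope with this is to cast to a nullable type instead. The sketch below assumes two hand-written <winner> fragments, one of them missing its Year; MissingElementDemo and ReadYear are hypothetical names for illustration.

```csharp
using System;
using System.Xml.Linq;

public static class MissingElementDemo
{
    // Returns the year, or null when the <Year> child is absent.
    public static int? ReadYear(XElement winner)
    {
        // The explicit conversion to int? accepts a null element and yields null.
        return (int?)winner.Element("Year");
    }

    public static void Main()
    {
        // Illustrative fragments (assumptions for this sketch).
        var complete = XElement.Parse("<winner><Name>Juventus</Name><Year>1996</Year></winner>");
        var incomplete = XElement.Parse("<winner><Name>Juventus</Name></winner>");

        Console.WriteLine(ReadYear(complete).HasValue);          // True
        Console.WriteLine(ReadYear(incomplete).HasValue);        // False
        Console.WriteLine(incomplete.Element("Year") == null);   // True
    }
}
```

    The cast to string behaves similarly: it returns null for a missing element rather than throwing.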

    More Examples

    If you want to print only the names of the winners, you can run the following query:

    // extracts only the names of the winners from the file
    var result = from e in XElement.Load("winners.xml").Elements("winner")
                 select (string)e.Element("Name");
    
    foreach (var w in result)
    {
        Console.WriteLine(w);
    }
    

    The output in this case would be:

    Barcelona
    Liverpool
    FC Porto
    AC Milan
    Real Madrid
    Bayern Munchen
    Real Madrid
    Manchester Utd.
    Real Madrid
    Borussia Dortmund
    Juventus
    AFC Ajax
    AC Milan
    Olympique de Marseille
    

    Of course, you might be interested in the distinct names only, in which case you can apply the Distinct operator to the sequence returned by the query. The query below also sorts the names alphabetically with orderby:

    // extracts the names of the winners from the file, sorted alphabetically
    var result = from e in XElement.Load("winners.xml").Elements("winner")
                 orderby (string)e.Element("Name")
                 select (string)e.Element("Name");
    
    // creates a sequence of distinct names
    var result2 = result.Distinct();
    
    foreach (var w in result2)
    {
       Console.WriteLine(w);
    }
    

    The new output in this case is:

    AC Milan
    AFC Ajax
    Barcelona
    Bayern Munchen
    Borussia Dortmund
    FC Porto
    Juventus
    Liverpool
    Manchester Utd.
    Olympique de Marseille
    Real Madrid
    
    Note: Remember (from the first article) that, as long as you do not iterate over the result, the source is not iterated. Thus, in the absence of the foreach statement, the query would not be executed.
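
    Deferred execution can be observed directly on an in-memory tree. In the following sketch (sample data and the names DeferredExecutionDemo and Run are assumptions for illustration), an element is added after the query is defined but before it is iterated, and it still appears in the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DeferredExecutionDemo
{
    public static List<string> Run()
    {
        // Illustrative tree with a single winner.
        var root = new XElement("winners",
            new XElement("winner", new XElement("Name", "Barcelona")));

        // Defining the query does not walk the tree yet.
        IEnumerable<string> names = from e in root.Elements("winner")
                                    select (string)e.Element("Name");

        // Added after the query was defined, but before it is iterated.
        root.Add(new XElement("winner", new XElement("Name", "Liverpool")));

        // The tree is traversed only now, so both names are returned.
        return names.ToList();
    }

    public static void Main()
    {
        foreach (string name in Run())
        {
            Console.WriteLine(name);
        }
    }
}
```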

    Performance

    As I wrote on my blog, I ran queries on a fairly large XML file (about 100MB), extracting various sets of data. Each query completed in approximately 3.5 seconds. This led me to two conclusions:

    • LINQ performs very well: extracting 90% of the data from a 100MB file, in three separate runs, took less than 12 seconds. This is equivalent to extracting data from a 300MB file, which I find very swift.
    • There wasn't a noticeable difference between extracting 0.5MB and 50MB.

    About the Author

    Marius Bancila is a Microsoft MVP for VC++. He works as a software developer for a Norwegian-based company. He is mainly focused on building desktop applications with MFC and VC#. He keeps a blog at www.mariusbancila.ro/blog, focused on Windows programming. He is the co-founder of codexpert.ro, a community for Romanian C++/VC++ programmers.

    Downloads

  • LinqToXML.zip
