Implementing a Left Join with LINQ


Oddly enough, LINQ doesn’t define keywords for cross join, left join, or right join. The LINQ grammar gives you join and group join. Joins can be equijoins or non-equijoins: an equijoin uses the join keyword, and a non-equijoin is expressed with a where clause. Even so, LINQ supports left, right, and cross joins (with a little nudge).
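As a rough illustration of that nudge, the sketch below writes a cross join and a non-equijoin over small in-memory arrays. The data and the names (sizes, colors, limits, values) are made up for this example and are not part of the article's Northwind code.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Linq

Module JoinSketches
    Sub Main()
        ' Hypothetical sample data, not from the article.
        Dim sizes = New String() {"S", "M"}
        Dim colors = New String() {"Red", "Blue"}

        ' Cross join: two From clauses with no correlation pair every
        ' element of the first sequence with every element of the second.
        Dim crossJoin = From size In sizes _
                        From color In colors _
                        Select New With {.Size = size, .Color = color}

        For Each item In crossJoin
            Console.WriteLine("{0} {1}", item.Size, item.Color)
        Next

        ' Non-equijoin: the correlation is an inequality in a Where clause,
        ' because Join...On accepts only Equals comparisons.
        Dim limits = New Integer() {10, 20}
        Dim values = New Integer() {5, 15, 25}

        Dim nonEqui = From limit In limits _
                      From value In values _
                      Where value < limit _
                      Select New With {.Limit = limit, .Value = value}

        For Each pair In nonEqui
            Console.WriteLine("{0} < {1}", pair.Value, pair.Limit)
        Next
    End Sub
End Module
```

The cross join yields four size and color pairs; the non-equijoin keeps only the pairs where the value is below the limit.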

The two common joins are the inner join (or just join in LINQ) and the left join. Suppose you have two collections of data. Call one the master or left collection and the other the detail or right collection. A left join returns every element of the left collection, paired with the elements of the right collection that have a correlated value in the left collection; left elements with no match are still returned, with nothing on the right side. Usually, the correlation is a key or some kind of unique identifier. Using another analogy, if the left collection is the parent and the right is the child, a left join is all parents but only children with parents. (A right join returns all children, orphans included, but no childless parents. Gotta love these computer analogies.)

In this article, I will demonstrate the group join because that’s how you get to a left join. You also will see some LINQ to SQL code that is pretty straightforward. My last article, “Search and Replace with Regular Expressions,” and my upcoming book, LINQ Unleashed: for C#, cover LINQ to SQL in detail, so I won’t repeat that explanation here.

Defining a Group Join

A group join in LINQ is a join that has an into clause. The parent information is joined to groups of the child information. That is, the child information is coalesced into a collection and the child collection’s parent information occurs only once. (The difference between a join—really an inner join—and a group join is that inner joins repeat the parent information for each child.)
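To make the difference concrete, here is a minimal sketch that runs an inner join and a group join over the same in-memory data. OrderRow and DetailRow are hypothetical stand-ins for the Northwind types, and the sample values are invented.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Linq

Module GroupJoinSketch
    ' Hypothetical stand-ins for the article's order and detail rows.
    Class OrderRow
        Public OrderID As Integer
        Public CustomerID As String
    End Class

    Class DetailRow
        Public OrderID As Integer
        Public ProductID As Integer
    End Class

    Sub Main()
        Dim orders = New OrderRow() { _
            New OrderRow With {.OrderID = 1, .CustomerID = "ALFKI"}, _
            New OrderRow With {.OrderID = 2, .CustomerID = "BONAP"}}
        Dim details = New DetailRow() { _
            New DetailRow With {.OrderID = 1, .ProductID = 11}, _
            New DetailRow With {.OrderID = 1, .ProductID = 42}}

        ' Inner join: one result per matching pair, so order 1 appears
        ' twice and order 2 (no details) does not appear at all.
        Dim innerJoin = From ord In orders _
                        Join det In details On ord.OrderID Equals det.OrderID _
                        Select New With {.OrderID = ord.OrderID, .ProductID = det.ProductID}

        Console.WriteLine("Inner join rows: {0}", innerJoin.Count())

        ' Group join: one result per order, each carrying its own
        ' (possibly empty) collection of details.
        Dim groupJoin = From ord In orders _
                        Group Join det In details On ord.OrderID Equals det.OrderID _
                        Into children = Group _
                        Select New With {.OrderID = ord.OrderID, .Details = children}

        For Each parent In groupJoin
            Console.WriteLine("Order {0} has {1} detail(s)", parent.OrderID, parent.Details.Count())
        Next
    End Sub
End Module
```

With this data the inner join produces two rows, both for order 1, while the group join produces one row per order, and order 2 carries an empty Details collection.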

The fragment in Listing 1 assumes you have a collection of orders and a collection of order details. (You do. The final listing demonstrates how to get this data from the Northwind Traders database using LINQ to SQL.) The code demonstrates a group join followed by a loop that displays each parent and a nested loop that displays the children of each parent.

Listing 1: A group join on the Northwind Traders Orders and Order Details tables.

Dim groupJoin = (From order In orders _
                 Group Join detail In details On _
                 order.OrderID Equals detail.OrderID _
                 Into child = Group _
                 Select New With { _
                 .CustomerID = order.CustomerID, _
                 .OrderID = order.OrderID, _
                 .OrderDate = order.OrderDate, _
                 .Details = child}).Take(5)

Dim line As String = New String("-"c, 40)
For Each ord In groupJoin
   ' Print the parent (order) once...
   Console.WriteLine("{0} on {1}", ord.OrderID, _
      ord.OrderDate)
   ' ...then each of its children (order details).
   For Each det In ord.Details
      Console.WriteLine("Product ID: {0}", det.ProductID)
      Console.WriteLine("Unit Price: {0}", det.UnitPrice)
      Console.WriteLine("Quantity:   {0}", det.Quantity)
      Console.WriteLine("Discount:   {0}", det.Discount)
   Next
   ' Separate one order from the next.
   Console.WriteLine(line)
Next


The LINQ query starts with the implicitly typed variable groupJoin. (Any legal name will do here.) The clause From order In orders defines the range variable order on the collection orders. The range variable is like the iteration variable in a For Each loop. The clause Group Join detail In details defines the child range variable detail on the details sequence. The On...Equals clause describes the correlation in the equijoin, and Into child = Group coalesces the matching child data into a group. The last part, Take(5), works like the TOP keyword in SQL. Take is an extension method that operates on sequences (which is what a LINQ query returns).

The result of the LINQ query as defined in Listing 1 is that you have a new object (called a projection) composed of CustomerID, OrderID, and OrderDate, with a child sequence property, Details. Details is a property of the projection (the new anonymous type created with Select New With). The last part of the listing displays the outer data and then the grouped detail data.

Converting a Group Join to a Left Join

A group join is essentially an in-memory master-detail relationship. A left join flattens out the data from the detail sequence and puts it on par with the master data. That is, where the group join has a nested detail property with its own properties, the left join makes the properties of the master and detail information siblings.

The difference is that with a left join the right sequence may not have any data for a given left element. You have to allow for nulls, or LINQ would throw a null reference exception when it tried to access non-existent elements of the right sequence (Order Details in this example). You can convert a group join into a left join by adding an additional From clause and range variable on the Group and calling the DefaultIfEmpty method on the group variable. DefaultIfEmpty returns the group unchanged when it has elements and a one-element sequence holding the default value (Nothing for a reference type) when it is empty, so every order yields at least one row. The revised fragment in Listing 2 demonstrates. All of the code is provided in Listing 3.

Listing 2: A left join uses an additional From clause and range variable after the Group and invokes the DefaultIfEmpty method to handle missing children.

Dim leftJoin = (From order In orders _
   Group Join detail In details On _
   order.OrderID Equals detail.OrderID _
   Into children = Group _
   From child In children.DefaultIfEmpty _
   Select New With { _
      .CustomerID = order.CustomerID, _
      .OrderID    = order.OrderID, _
      .OrderDate  = order.OrderDate, _
      .ProductID  = child.ProductID, _
      .UnitPrice  = child.UnitPrice, _
      .Quantity   = child.Quantity, _
      .Discount   = child.Discount}).Take(5)
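When a query of this shape runs over in-memory collections rather than through LINQ to SQL, child is Nothing for every order without details, so reading child.ProductID directly would fail. The sketch below shows one way to allow for that by projecting the detail value as a Nullable(Of Integer). OrderRow, DetailRow, and the sample data are hypothetical and stand in for the Northwind types.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Linq

Module LeftJoinSketch
    ' Hypothetical stand-ins for the article's order and detail rows.
    Class OrderRow
        Public OrderID As Integer
    End Class

    Class DetailRow
        Public OrderID As Integer
        Public ProductID As Integer
    End Class

    Sub Main()
        Dim orders = New OrderRow() { _
            New OrderRow With {.OrderID = 1}, _
            New OrderRow With {.OrderID = 2}}
        Dim details = New DetailRow() { _
            New DetailRow With {.OrderID = 1, .ProductID = 11}}

        ' DefaultIfEmpty supplies Nothing for order 2, which has no details.
        ' The If operator guards the member access and yields an empty
        ' Nullable(Of Integer) instead of dereferencing Nothing.
        Dim leftJoin = From ord In orders _
                       Group Join det In details On ord.OrderID Equals det.OrderID _
                       Into children = Group _
                       From child In children.DefaultIfEmpty() _
                       Select New With { _
                          .OrderID = ord.OrderID, _
                          .ProductID = If(child Is Nothing, _
                                          CType(Nothing, Nullable(Of Integer)), _
                                          child.ProductID)}

        For Each row In leftJoin
            Console.WriteLine("Order {0}, product {1}", row.OrderID, _
                If(row.ProductID.HasValue, row.ProductID.Value.ToString(), "(none)"))
        Next
    End Sub
End Module
```

Order 1 prints with product 11 and order 2 prints with "(none)". Note that the CType to Nullable(Of Integer) matters: writing a bare Nothing in the If operator would produce 0 rather than an empty value, because Nothing converts to the default of Integer.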

Notice that the projection in Listing 2 defines elements from Orders and Order Details as siblings in the new projected type. Here is the complete listing and some additional code for looking at the object state.
