How many software tasks DON’T involve reading through data? Answer: very few.

Developers work all the time with database data, XML data, DataSets, collections, lists, and arrays-all with different syntax and functionality for each one. Developers who write T-SQL code often covet set-based language statements when they work against other types of data. Additionally, developers who have coded against strongly-typed custom collections wish they could write SQL database queries with IntelliSense. Language Integrated Query (LINQ), a set of extensions to the .NET Framework for the next version of Visual Studio codename “Orcas”, brings the promise of integrated and uniform query capabilities to increase developer productivity when working with different types of data. While LINQ is a large topic worthy of books, this edition of The Baker’s Dozen will provide a crash course to learn what’s under the LINQ hood.

So what Exactly Is LINQ?

Odds are high that just about every .NET developer has at least heard of Language Integrated Query (LINQ). But since this is an intro, crash-course-style article on LINQ, I’ll assume very little.

Lambda expressions are a “kinder, gentler” form of anonymous methods.

LINQ is a set of extensions to the .NET Framework to query different types of data using a common language. The next release of Visual Studio (codenamed “Orcas”) will feature the following LINQ capabilities within .NET:

  • Querying Microsoft SQL Server data using a strongly-typed DataContext (LINQ to SQL)
  • Querying in-memory .NET objects (LINQ to Objects)
  • Querying .NET DataSets (LINQ to DataSets)
  • Querying XML Data (LINQ to XML)

LINQ offers something for just about everyone. If you’re writing database stored procedures and don’t need to change your approach, you may still find value in LINQ to DataSets or LINQ to Objects.

This article will present a number of code samples to demonstrate LINQ functionality. In addition, I’ll also cover new language enhancements in Visual Studio codename “Orcas” that work in conjunction with LINQ to allow developers to work with data more productively than ever before. As of this writing, LINQ has gone from the Community Technology Preview (CTP) phase to Beta 1, so odds are high that functionality will change. But the code samples should give you a good foundation for learning LINQ.

What’s on the Menu…

In creating the tips for this article, I wanted to provide basic exposure to as many areas of LINQ as possible. As I often do with intro articles and presentations, I try to build the types of examples that I previously sought when I learned a particular technology. So here are the 13 tips, in hopes of making you more aware of what’s under the LINQ hood, and what you can use it for.

  • Installing the beta and getting started
  • First steps for using LINQ to SQL by creating a strongly-typed DataContext
  • Writing LINQ to SQL queries
  • Standard LINQ querying syntax and operators, such as sorting, string searching, date handling, and SELECT DISTINCT
  • More querying syntax and operators, such as UNION and EXISTS, as well as SKIP/TAKE for paging and ANY/ALL condition handling
  • Using an extension method to convert a LINQ result set to an ADO.NET DataTable
  • Converting a LINQ result set to an IEnumerable or to a List
  • LINQ to XML
  • LINQ to DataSets
  • Lambda Expressions
  • LINQ to Objects
  • Language features for C# 3.0
  • Language features for VB 9.0

Additionally, at the end of the article I’ll list several valuable LINQ online references. Let’s get started with LINQ!

No ADO.NET Entity Framework (yet)

Microsoft’s data access strategy in .NET consists of two major components: one is LINQ, which this article covers. The other is the ADO.NET Entity Framework, a significant upgrade to ADO.NET that allows developers to create data models of the back-end database.

As I write this article (late April 2007), Microsoft has announced that the ADO.NET Entity Framework (which I’ll call Entity Framework throughout the remainder of this article) will not ship with Visual Studio codename “Orcas”, but will ship a few months later. The Entity Framework promises a richer development experience for creating and mapping entity data models to relational databases. You can read more about the Entity Framework here: http://blogs.msdn.com/data/archive/2007/04/28/microsoft-s-data-access-strategy.aspx

Given that Microsoft will not release the Entity Framework until after Visual Studio codename “Orcas” ships, I’ve decided to focus on areas of LINQ that are scheduled to be part of the Visual Studio codename “Orcas” release. In a future article I’ll cover the Entity Framework and Microsoft’s data access strategy.

Tip 1: Getting Started

I’m using Beta 1 for Visual Studio codename “Orcas”, which Microsoft released in late April 2007. For the remainder of this article I’ll simply refer to it as Orcas Beta 1. You can download Orcas Beta 1 from this URL: http://msdn2.microsoft.com/en-us/vstudio/aa700831.aspx.

XML MVP Don Demsak has a great quote: “An anonymous type is just a projection, like a shadow in a box. You can’t take a shadow out of the box, but you can take a picture of the shadow.”

I recommend that you install Orcas Beta 1 on a test machine (or on a Virtual PC). I recommend against installing Orcas Beta 1 on a machine where you are doing any type of production work with Visual Studio 2005 (or any prior version of .NET, or any other development tool). Some developers are running both Orcas Beta 1 as well as Visual Studio 2005 on the same machine. While it can and does work, doing so increases the odds that you may need to reapply patches or service packs, or perform other maintenance tasks later when new beta versions/service packs become available.

At the end of the article I’ll list a number of blogs and resources for LINQ. The most important link is the home site for the LINQ project: http://msdn2.microsoft.com/en-us/vbasic/aa904594.aspx.

Tip 2: Creating a DataContext for LINQ to SQL

The first example will use LINQ to SQL so that you can build a database query from Orcas Beta 1. I’ll show you how to write a very simple query against the AdventureWorks database in SQL Server 2005. In particular, the query will return a list of all Purchase Orders greater than $25,000 with a link into the Vendor table to show the Vendor Name.

Before you can do that, you need to address some housekeeping details. First, you need to provide the Orcas Beta 1 environment with some information so that you can write the database query in a strongly-typed fashion and take advantage of IntelliSense. In LINQ terms, you need to create a strongly-typed DataContext against the database. You have two options to do this: one involves the new utility SqlMetal that comes with the SDK for Orcas Beta 1, and the second is to use the new Linq to SQL design tool.

SqlMetal automatically generates entity classes for all tables from a given database. You can use these entity classes in Orcas Beta 1 to create a DataContext for writing LINQ queries.

You can generate the entity classes for the AdventureWorks database by typing the following at the Command Prompt (you’ll need to specify the full path for the location of SqlMetal):

SqlMetal.exe 
/server:localhost 
/database:AdventureWorks
/code:AdventureWorks.cs

Since the database will surely change during the development project, you can call SqlMetal every time you build your application in the Pre-build Event Command Line project option (Figure 1).

Figure 1: Specifying SQLMETAL.EXE in the .NET project Pre-build Event Command Line

The entity class that SqlMetal generates may be very large, and unnecessary if you simply need to write a few queries. So the second option is to create your own LINQ to SQL data context for only those tables you need, as follows:

  • Add a new item to the project
  • Select Linq to SQL File in the list of Installed Templates
  • Orcas Beta 1 will then display the Object Relational Design area where you can drag the necessary tables from Server Explorer for the database (Figure 2).
Figure 2: Creating a LINQ to SQL DataContext

The next tip will cover the actual code for the query.

Tip 3: Writing Some LINQ to SQL Queries

Now that you’ve created the DataContext using SqlMetal, you can create your first test LINQ project. You’ll want to add the AdventureWorks.cs entity class file (that SqlMetal generated) to your project (by selecting “Add Existing Item” in Solution Explorer). You’ll also need to add a .NET reference to System.Data.Linq. Table 1 lists the core LINQ DLLs.

The query in Listing 1 retrieves all of the purchase orders with a subtotal greater than 25,000. The query is a SQL-like join between the Vendor table and PO header table in the AdventureWorks database that uses a basic WHERE clause. The query in Listing 2 performs a sum of Freight and Subtotal for each Vendor.

Note that the second query does not contain a JOIN statement. Because a database parent/child relationship exists between the Vendor and PO Header table, you can use parent-child syntax (Vendor.Purchasing_PurchaseOrderHeader) to reference child aggregations.

Of course, you can do more than just retrieve data. Listing 3 shows two examples of inserting and updating data using LINQ. You can utilize the result object (an anonymous type, which I’ll cover in more detail later) to set values, and then post back to the database context object using SubmitChanges.

Tip 4: Standard Querying Statements and Operators

Sometimes the best way to learn syntax is to see a number of different examples. Listing 4 demonstrates some fairly common database query requirements, such as sorting, string searching, date handling, and an equivalent of SELECT DISTINCT.

In the last edition of “The Baker’s Dozen” you saw that WCF provides a uniform programming model for working with different communication protocols. The same concept applies to LINQ: to provide a standard means of querying different types of data.

If you want to see the actual SQL code that LINQ to SQL generates, you can set a breakpoint over the anonymous type result and check the value after the statement executes.

Tip 5: More Querying Syntax and Operators

But wait, there’s more! Listing 5 contains more code examples for common LINQ to SQL queries, such as UNION operations and Exists. LINQ to SQL also contains some additional nice features that Listing 5 covers, such as Skip and Take functions and ANY/ALL operators. You can use Skip and Take to retrieve a range of rows within a result set (helpful for paging result sets) and ANY/ALL operators to return a result if a condition is true for any/all rows in a set of child record (e.g. find the vendors with ANY orders greater than 50,000, or find the vendors where ALL orders are greater than 50,000).

Tip 6: Converting an Anonymous Type Result to a Data Table

Many who have written blogs on LINQ have discussed the fact that the result set, a new anonymous type object in .NET, cannot be used directly outside the method that creates it. Fellow MVP Don Demsak has a great quote about anonymous types: “it is just a projection…think of it as a shadow in a box-you can't take a shadow out of the box but you can take a picture of the shadow”.

In order to work effectively with anonymous types in an application, especially a distributed one, you’ll need the means to convert the anonymous type result set into something more usable. This tip will show how to convert an anonymous type result to an ADO.NET DataTable, and the next tip (Tip 7) will show how to populate a custom List.

The early CTP versions of Visual Studio codename “Orcas” contained a function called ToDataTable() that allowed you to easily convert the result of a LINQ query to an ADO.NET DataTable. Unfortunately, Microsoft removed that function from the March 2007 CTP, as well as Orcas Beta 1. However, Listing 6 contains code for an extension method (ToADOTable) that can serve as a functional counterpart for ToDataTable. You can call it immediately after executing a query, as follows:

// Take the anonymous type result "NewResult"    
// and convert it to a DataTable    
    
DataTable dtResults = 
   NewResult.ToADOTable( rec => 
          new object[] {NewResult});

Tip 7: Converting an Anonymous Type to an IEnumerable or to a List

Tip 6 demonstrated how to direct the results of a LINQ query to an ADO.NET DataTable. At conferences, user group meetings, and other community events, it seems that half the developers I meet use DataTables and DataSets, while the other half uses custom classes. Since the previous tip covered DataTables, I’ll now show how to convert the results of a LINQ query to a custom class.

The beauty of LINQ is that it offers something for just about everyone. Even if you continue to opt for stored procedures instead of LINQ to SQL, LINQ still provides better capabilities for working with the result sets.

Listing 2 showed how to query for a list of vendors and the freight and order subtotals from each vendor’s purchase orders. Suppose you want to pump the results into a collection of the following class:

public class VendorResults
{
    public VendorResults(string Name, 
          decimal TotFreight, decimal TotSubTot)
    {
        this.Name = Name;
        this.TotFreight = TotFreight;
        this.TotSubTot = TotSubTot;
    }
    public string Name { get; set; }
    public decimal TotFreight { get; set; }
    public decimal TotSubTot { get; set; }
}

(Note that the class properties above don’t contain any private fields. C# 3.0 provides a new capability called automatic properties that simplifies property creation. Tip 12 covers this in detail.)

Next I’ll modify the query from Listing 2 to return an IEnumerable of type VendorResults. Here I’ve created a method of type IEnumerable<VendorResults> to return the results:

public IEnumerable<VendorResults>
                      GetVendOrders()
{
   IEnumerable<VendorResults> Orders = 
       from Vendor in db.Purchasing_Vendor
       orderby Vendor.Name
    
       select new VendorResults (Vendor.Name,
   Vendor.Purchasing_PurchaseOrderHeader.Sum(o =>
                                  o.Freight),
   Vendor.Purchasing_PurchaseOrderHeader.Sum(o =>
                                  o.SubTotal));
   return Orders;
}

I can then call the method like follows:

IEnumerable<VendorResults> oVendorResults = 
 (IEnumerable<VendorResults>)this.GetVendOrders();

Finally, if I want to pump the results into a List class, I simply need to iterate through the items in IEnumerable:

List<VendorResults> oListResults = 
       new List<VendorResults>();
    
foreach (VendorResults oRec in oVendorResults)
   oListResults.Add(oRec);

Tip 8: LINQ to XML

Listing 7 shows basic examples for creating and querying XML data. You can create XML file contents easily using the XElement and Xattribute objects, and create a queryable object using XDocument. This illustrates the point about LINQ that those who don’t plan on using other aspects of LINQ (such as LINQ to SQL) can still utilize other LINQ functionality.

Tip 9: LINQ to DataSets

While ADO.NET has always had the ability to filter the contents of a DataTable with the Select method, the biggest querying limitation of the DataTable was the inability to perform a JOIN (and any related set-based operation). You could simulate a JOIN with Relation objects and Parent/Child syntax, but it still paled in comparison to the standard JOIN syntax found in SQL implementations. Fortunately, we can use the same LINQ syntax to query DataTables.

With LINQ to DataSets you can query an enumeration of DataRows of untyped or typed DataSets. You can also perform SELECT DISTINCT, UNION, JOIN, and GROUP operations.

Note two things in the code below. First, the code references the extension method AsEnumerable, from the System.Data.EntityClient namespace. This allows you to utilize LINQ syntax to query the DataTable. Second, the code also uses the Field<datatype> syntax to convert each DataColumn (which .NET stores as an object) to the appropriate datatype.

DataTable dtEmps = new DataTable();
dtEmps.Columns.Add("FirstName",typeof(String));
dtEmps.Columns.Add("LastName",typeof(String));
dtEmps.Columns.Add("Salary",typeof(Decimal));
    
dtEmps.Rows.Add("Kevin","Goff",90000);
dtEmps.Rows.Add("Steve","Goff",80000);
dtEmps.Rows.Add("Gwen","Goff",95000);
// Query for name = Kevin or Salary > 90Kvar     
// Query =     
   from rowData in dtEmps.AsEnumerable() 
      where rowData.Field<string>("FirstName") ==
                                     "Kevin" 
       || rowData.Field<decimal>("Salary") > 90000
 select rowData;

If you use strongly-typed DataSets (which I recommend over untyped DataSets), you can reference columns directly:

var Query = 
   from rowData in dtEmps.AsEnumerable() 
      where rowData.FirstName == "Kevin" 
       || rowData.Salary > 90000
 select rowData;

Finally, as I mentioned, you can apply JOIN and GROUP syntax to ADO.NET DataTables. Suppose you have two strongly-typed DataTables, one that stores a list of Employees (dtEmployees, storing EmpPK and Name), and a second that stores daily labor information (dtLabor, storing multiple rows by EmpPK for WorkDate and WorkHours). You want to join the two tables to produce a summary result of hours for each employee for the month of February 2007:

var LaborSum = 
   from Hours in oData.dtLabor 
      where Hours.WorkDate.Year == 2007 &&
            Hours.WorkDate.Month == 2 
      group Hours by Hours.EmpPK into g
      select new {EmpPK=g.Key, 
            SumHours = 
            g.Sum(o => o.WorkHours)} into HourSum
     orderby HourSum.SumHours descending
     join Master in oData.dtEmployees on
       HourSum.EmpPK equals Master.EmpPK
     select new {Master.Name, HourSum.SumHours};

Tip 10: Lambda Expressions

LINQ has something for almost everyone. If you store data in custom collections, you can use LINQ capabilities to query your collections.

Automatic properties are like getting clothes for Christmas - they might not seem very exciting, but you’ll find them useful throughout the year.

In past articles on .NET Generics in Visual Studio 2005, I’ve shown how to sort and filter custom collections using anonymous methods inside the FindAll and Sort methods of the List class. Suppose you’ve created a simple Customer class, with properties for Customer ID and Name, Location ID, and Amount Due. The following two code snippets show how you’d filter an existing collection of customers for Locations 1 and 2, where the Amount Due is greater than 15K, and then sort on Amount Due descending within Location:

// VS2005 syntax    
List<CustomerRec> oFilteredCustomers =
     oCustomerRecs.FindAll(
       (
          delegate(CustomerRec oRec)
          { 
             return ((oRec.LocationID == 1 
                    || oRec.LocationID == 2)
           && oRec.AmountDue > 15000);
          }
        ));
// VS2005 syntax    
oFilteredCustomers.Sort
  (delegate(CustomerRec oRec1, CustomerRec oRec2)
   {
    return oRec1.LocationID == oRec2.LocationID ?
      oRec2.AmountDue.CompareTo(oRec1.AmountDue) :
     oRec1.LocationID.CompareTo(oRec2.LocationID);
  });

In Orcas Beta 1, there’s good news, and then there’s great good news. First, the good news: the new lambda expression capability in Orcas Beta 1 allows you to rewrite the above snippets for better readability by eliminating the need to explicitly specify the delegate parameter. Think of lambda expressions as a simplified form of anonymous methods. In the code below, all you need to do is specify the parameter for FindAll (I’ve called it oRec, which .NET automatically knows is of type CustomerRec), and then use the token set => (which .NET uses to separate the parameter from the actual expression):

// Lambda expression, more concise    
List<CustomerRec> oFilteredCustomers =
     oCustomerRecs.FindAll(
        oRec =>
          // Note that oRec is automatically
          // of type "CustomerRec"
               (oRec.LocationID == 1 ||
                oRec.LocationID == 2)  &&
                oRec.AmountDue > 15000);

In the example for sorting (which receives two parameters), you only need to separate the two comparison objects (oRec1, oRec2) by commas, include the token =>, and then write the expression. Much simpler than anonymous methods! I’ll talk more about lambda expressions in Tip 12.

// Lambda expression, more concise    
oFilteredCustomers.Sort  ( (oRec1, oRec2) =>
   oRec1.LocationID == oRec2.LocationID ?
   oRec2.AmountDue.CompareTo(oRec1.AmountDue) :
   oRec1.LocationID.CompareTo(oRec2.LocationID));

So that’s the good news. Now for the great news - you can also apply LINQ query syntax to object collections (see the next tip):

Tip 11: LINQ to Objects

Continuing from Tip 10, you can use the same LINQ syntax from previous tips to query a collection. This emphasizes one of the key points of LINQ: using similar (not always identical, but similar) syntax to query all sorts of different objects:

// Query a customer collection    
var NewResult =
     from oRecs in oCustomerRecs where
           ((oRecs.LocationID==1 || 
             oRecs.LocationID==2)
           && oRecs.AmountDue > 15000)
     orderby oRecs.LocationID ascending,
             oRecs.AmountDue descending
     select oRecs;
    
foreach (CustomerRec oRec in NewResult)
   MessageBox.Show(oRec.LocationID.ToString() +
          ", " + oRec.AmountDue.ToString());

Tip 12: Language Features for C# 3.0

The C# 3.0 language in Orcas Beta 1 contains several new features to increase developer productivity. Here’s a quick run-through of the new language capabilities that you can investigate further:

Make sure to add a .NET reference to System.Data.Linq (or whichever LINQ namespace is appropriate for your project). For example, you’ll need System.Data.EntityClient in order to use Linq to DataSets.

Automatic properties allow you to create a property without having to declare a private field. Prior to C# 3.0, you’d normally write the following code to create properties for FirstName and LastName:

public class Employee  // Prior to C# 3.0
{
   private string _FirstName;
   public string FirstName
   {
        get { return _FirstName; }
        set { _FirstName = value; }
   } 
   private string _LastName;
   public string LastName
   {
        get { return _LastName; }
        set { _LastName = value; }
   } 
} 

C# 3.0 allows you to create the same class and properties, but with less code! (Important: you must declare both a get and set when you use anonymous properties. You will not be able to use anonymous properties if you only declare a get accessor).

public class Employee  // Now in C# 3.0
{
   public string FirstName { get; set; }
   public string LastName { get; set; }
} 

C# 3.0 contains object and collection initializers so that you can set initial properties for the class or members of a list, without the need for a constructor.

Employee oEmployee = 
   new Employee{FirstName = "Kevin", 
                LastName = "Goff"};
    
List<Employee> oEmployees = new    
               List<Employee>() { 
   new Employee{FirstName = "Kevin", 
                LastName = "Goff"},
   new Employee{FirstName = "Steve", 
                LastName = "Goff"}}; 

Implicitly-typed variables allow you to declare a variable without explicitly setting a type. The new var keyword represents an anonymous type, a type inference, as follows:

var Name = "Kevin S. Goff";
var Sum = 0;
    
// Implicitly-typed array    
var IntArry = new [] {10,20,30,40}; 
foreach(var intValue in IntArry)
   Sum += intvalue;

Next, C# 3.0 provides extension method capability. Extension methods can be a little difficult to initially understand. Think of the way extender classes work in .NET today: they (extender classes) allow you to append (extend) properties and capabilities to an existing class, as opposed to creating a subclass and recompiling. Extension methods allow you to add static methods to an existing .NET type without even touching the original type. They are beneficial when it is difficult or not possible to add instance methods to an existing class.

Listing 6 (from Tip 6) showed an example of an extension method (ToADOTable) that you can use for any object that implements IEnumerable. The method becomes available in IntelliSense for any corresponding type. Extension methods are a powerful capability for extending .NET Framework functionality. Note the “this IEnumerable<T> varlist” syntax in the first parameter-this essentially means that any object in the current namespace that implements IEnumerable can use the method ToADOTable.

public static DataTable ToADOTable<T>(
          this IEnumerable<T> varlist, 
           CreateRowDelegate<T> fn)

Finally, lambda expressions are a “kinder, gentler” form of anonymous methods. C# 2.0 introduced anonymous methods to allow developers to insert code blocks in-line, in place of a delegate. While anonymous methods are powerful, their syntax tends to be a little complex (see the example back in Tip 9 for supplying an anonymous method for the List class Sort method).

Lambda expressions perform the same functionality with more concise code, by using a simple => token set to separate the parameter list from the expression, as opposed to specifying the delegate statement and parameter type. The compiler actually translates the lambda expression into delegate-based code. As you saw back in Tip 9, you can specify multiple parameters in the lambda expression simply by delimiting them with a comma (parm1, parm2 =>), and then using them accordingly in the expression itself.

// VS2005 syntax    
oCustomerRecs.FindAll(
  ( delegate(CustomerRec oRec)
     { 
       return (oRec.LocationID == 1);
     }
));
// Lambda expression, more concise    
oCustomerRecs.FindAll( 
             oRec => oRec.LocationID == 1 );

The code in the lambda expression can include both C# 3.0 code as well as LINQ query expression code.

Tip 13: Language Features for VB 9.0

VB developers will be happy to hear that all of the above C# 3.0 language features (automatic properties, object and collection initializers, implicitly-typed variables, extension methods, and lambda expressions) are also available in VB 9 that comes with Orcas Beta 1. In addition, VB 9.0 also supports nullable types and relaxed delegates. The latter is helpful in instances where method signatures do not need to match, such as sender and argument declarations for a button click.

Resources, Resources, Resources

Do you ever feel like you’re on information overload, and yet you simultaneously worry that there are more resources out there of which you’re not aware? Every few weeks I take the time to add to and organize my list of blogs, sites, and articles. You won’t have a problem finding resources on LINQ and Orcas Beta 1. These persons/companies are writing very valuable blogs on LINQ and Orcas Beta 1:

http://www.thedatafarm.com/blog/

http://oakleafblog.blogspot.com/

http://codebetter.com/

http://www.linqdev.com/PublicPortal/

http://weblogs.asp.net/scottgu/

Additionally, Microsoft has a site with download links for a large number of samples and demo code:

http://msdn2.microsoft.com/en-us/vbasic/bb330936.aspx

http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=123&;SiteID=1

Finally, though I didn’t cover them in this article, there are two other LINQ-related projects worth mentioning. One is PLINQ, a special implementation of LINQ designed to run on multiple processors. The other is BLINQ, an ASP.NET project to generate data-driven Web sites. Just use Google to search for more on these topics.

Closing Thoughts

Have you ever submitted something (an article, a paper, some code, etc.) and thought of some good ideas AFTER the fact? Well, I’m the king of thinking of things afterwards. Fortunately, that’s the type of thing that makes blogs valuable. Check my blog (www.TheBakersDozen.net) for follow-up tips and notes on Baker’s Dozen articles…and maybe a few additional treats!