The Baker’s Dozen: 13 Productivity Tips for ADO.NET 2.0

By Kevin Goff
Published in: CODE Magazine: 2006 - January/February
Publication Date: December 31, 2005
Last updated: February 19, 2019

Published in:

Filed under:

This installment of “The Baker’s Dozen” presents a variety of tips and techniques to become productive with data handling techniques using ADO.NET 2.0 in Visual Studio 2005.

ADO.NET 2.0 is faster than the first version of ADO.NET; in some instances, significantly faster. While many view ADO.NET 2.0 as more evolutionary than revolutionary, it provides many functions to give developers greater control over data access and data manipulation. It also leverages the new database capabilities in SQL Server 2005. In addition, ADO.NET 2.0 simplifies the creation of multiple-database solutions.

Beginning with the End in Mind

The ability to effectively work with data is essential in a line-of-business application. ADO.NET has a rich object model, but takes time and experience to master, especially when coming from other development environments.

ADO.NET 2.0 introduces a new provider factory class that provides a much easier way to support multiple databases by loading the desired data provider for a specific connection string.

While ADO.NET in Visual Studio .NET 2003 offered tremendous power, ADO.NET 2.0 in Visual Studio 2005 provides even more capabilities, along with major performance gains when inserting, filling, and merging large amounts of data. Here’s a quick rundown of what’s in store for this Baker’s Dozen installment.

New capabilities for the DataTable that previously were only available for the DataSet
Baker’s Dozen Spotlight: maximizing the use of typed DataSets
The new base class Provider model and how to write generic code to leverage it
Performance and optimization enhancements in ADO.NET 2.0
Baker’s Dozen Potpourri: miscellaneous data-munging
Data relations and look-ups using typed DataSets
Asynchronous command execution
New capabilities in the LOAD method to create a DataSet without a DataAdapter
More control over the rowstate
The new SqlBulkCopy class
Multiple active result sets (MARS)
Connection pooling
Connection statistics

In addition to the Spotlight and Potpourri sections, I’ve added a new section to this series-the Baker’s Dozen commentary-which appears at the end.

Tip 1: The DataTable Breaks Free

In Visual Studio .NET 2003, several methods were only available for the DataSet. One example is writing (and reading) XML data: a developer who wanted to write the contents of a data table to an XML file had to create a DataSet, add the data table to the DataSet, and use the DataSet’s WriteXml method.

Developers who wanted to utilize those methods for a single data table had to create a DataSet and add the data table just to gain access to these methods. ADO.NET 2.0 now provides several methods for the data table. Table 1 lists these methods.

In addition, ADO.NET 2.0 allows a developer to serialize a data table in remoting scenarios. The new RemotingFormat property (covered in detail in Tip 4) applies to the data table.

Finally, ADO.NET 2.0 allows developers to create a new data table from any data view. The ToTable method also contains an overload for the developer to create a subset list of columns in the new table. And if that isn’t enough, hold onto your hat-there’s more! ToTable contains an additional overload to filter distinct rows. (Previously, you had to write additional code to accomplish this.) Now a developer simply must specify an array of columns to filter on unique values.

I’m very pleased with the data table enhancements in ADO.NET 2.0. There are some enhancements that I hope Microsoft will consider for the next version. Even with the new independence of the data table, there are still capabilities that require a DataSet. For instance, you can only establish a data relation between two data tables in the same DataSet. Situations where master codes exist in multiple transaction tables requires some additional coding when relations need to be established.

Tip 2: Getting the Most Out of Typed Datasets

A typed DataSet is a subclass of the standard ADO.NET DataSet. It exposes the tables, rows, columns, etc., as strongly-typed properties. Visual Studio 2005 can check these strongly-typed names at compile time, as opposed to runtime. In this sense, strongly-typed DataSets are self-documenting. IntelliSense will display the names in the typed DataSet object hierarchy-exposing column names as properties in the IDE can be very helpful in a large database application. It isn’t necessary to perform the boxing/unboxing that is required when using untyped DataSets.

Typed DataSets simplify handling null values. A strongly-typed data row contains two methods to check if a column value is null, and to set a column value to null.

When used properly, typed DataSets are especially beneficial in a multi-developer environment as well as applications that make heavy use of result sets and reporting. Developers can define typed DataSets as separate project DLLs and then set references to them from other projects.

Listing 1 contains sample code to demonstrate the basic use of typed DataSets. In addition, developers can subclass a typed DataSet class to add validation code or other methods. Listing 2 includes a brief example that extends typed DataSets through an interface and a subclassed definition.

Recently I developed a reporting solution involving a stored procedure that returned ten tables. The scenario represented a one-many-many relationship with a large number of columns that I didn’t want to retype into the VS.NET typed DataSet designer. I wanted to take the structure of the stored procedure results, turn it into an XML schema, and use it to create a typed DataSet for the project (Listing 3).

A complaint about typed DataSets is the default naming conventions in the typed DataSet class for DataTable/DataRow objects and methods/events. The generated names may not be consistent with preferred naming conventions. Fortunately, developers can use Typed Dataset Annotations to solve some of the common naming issues. Annotations allow developers to modify the names of elements in the typed DataSet without modifying the actual schema.

Annotations also allow you to specify a nullValue annotation-this instructs the typed DataSet class how to react when a column is DbNull. You can optionally tell the typed DataSet to return an empty string or a default replacement value. Shawn Wildermuth demonstrates typed DataSet annotations in an excellent online article (see the Recommended Reading sidebar).

Tip 3: Supporting Multiple Databases with Provider Factories

In ADO.NET 1.0, supporting multiple databases meant coding against interfaces for the ADO.NET objects (IdbConnection, IdbDataAdapter, IdbCommand, etc.), as well as writing a case statement to determine which connection object to use. ADO.NET 2.0 introduces a new provider factory class, which provides a much easier way to support multiple databases by loading the desired data provider for a specific connection string.

For line-of-business, data-driven applications, the DataSet approach provides many benefits. Tasks such as two-way data-binding, RowFilter/Sorting, XML integration, setting up relationships between data objects, preserving rowstate information, merging changes, and complex reporting operations are much easier with DataSets.

A developer can code against the new classes DbProviderFactory and DbProviderFactories, and pass the desired namespace at runtime. Table 2 displays the methods that a new DbProviderFactory object exposes. Note that the names of the providers must appear in the machine configuration file. Listing 4 demonstrates the use of provider factories.

Tip 4: Performance Enhancements in ADO.NET 2.0

Microsoft has made two significant enhancements to ADO.NET that result in better performance. First, they rewrote the indexing engine for better performance. As a result, many operations will execute more quickly, especially as the size of a DataSet grows. Any increase in performance will vary based on many application environment factors, but many should see increases by a factor of two or greater. One test published on MSDN that inserted a million rows into a DataTable took 30 minutes in Visual Studio .NET 2003 and 45 seconds in Visual Studio 2005! Lesser amounts of data will result in lower orders of magnitude; but in general, insert, update, delete, fill, and merge operations will execute more quickly in Visual Studio 2005.

Second, many developers know that passing and returning DataSets across physical boundaries carries substantial overhead and incurs performance penalties. ADO.NET 2.0 supports true binary serialization in remoting environments by providing a new RemotingFormat property, which a developer can set to SerializationFormat.Binary.

Depending on the size of the DataSet, the binary serialization may yield a result as much as six times smaller than XML serialization, resulting in faster transfer speeds and smaller required bandwidth resources. Note that this enhancement only works in remoting environments because Web services (by definition) pass XML.

Tip 5: Baker’s Dozen Potpourri: Data Munging

Roll up your sleeves! It’s time for some elbow grease. Listing 5 demonstrates some miscellaneous tasks that developers new to ADO.NET often need to accomplish:

Filtering rows that contain NULL values
Filtering on columns in a parent/child relationship
Defining primary keys and performing lookups
Adding a new row
Creating a filter and looping through the results
Handling null data with a strongly-typed datarow

Tip 6: Data Relations and Look-Ups with Typed Datasets

Listing 1 references a typed DataSet with two tables. If the developer defines a relation between the two tables in the typed DataSet, the XML Schema Definition Tool will add methods to navigate through the data without the need to specify the name of the DataRelation.

In the specific example of a master table called DtOrderHeader and a child table called DtOrderDetail, the XML Schema Definition Tool automatically creates a GetDtOrderDetailRows method (similar to GetChildRows) to the strongly typed DataRow class for DtOrderHeader, and a GetDtOrderHeaderRow method (similar to GetParentRow) to the strongly typed DataRow class for DtOrderDetail. This demonstrates yet another advantage of strongly typed DataSets-the XML Schema Definition Tool has done some of the work for the developer.

Strongly-typed DataSets are self-documenting. IntelliSense will display the names in the typed DataSet object hierarchy. They are especially beneficial in a multi-developer environment as well as applications that make heavy use of result sets and reporting.

Additionally, if a typed data table contains a primary key, the class exposes a Find method associated with the column name. A developer can perform a Find against a table with a primary key of OrderID by using the method FindByOrderID. If the primary key is a concatenation of two columns (CustomerID and OrderID), the developer would use FindByCustomerIDOrderID. (As mentioned in Tip #2, developers can use Typed Dataset Annotations for alternate naming conventions.

Tip 7: Asynchronous Command Execution

ADO.NET 2.0 introduces a new asynchronous API that allows a program to continue while specific database commands execute. Previously, a developer could simulate asynchronous processing using asynchronous delegates or the ThreadPool class. However, ADO.NET 2.0 provides true asynchronous functionality.

Three methods in ADO.NET 2.0 provide asynchronous capabilities: ExecuteNonQuery, ExecuteReader, and ExecuteXMLReader. ADO.NET provides Begin and End prefix/suffix references to these methods.

 IAsyncResult oAR = command.BeginExecuteReader();
    
// other processing
    
 while(oAR.IsCompleted==false) {
  // do something while waiting
}
    
SqlDataReader r = command.EndExecuteReader(oAR);

The developer also needs to know when the operation is complete. ADO.NET 2.0 provides three models to detect when an asynchronous command has completed:

Callback model: the developer specifies a function to be executed when the command has executed. The begin methods contain an overload for a delegate parameter.
Synchronization objects (wait model): the IasyncResult object listed above contains a WaitHandle property
Polling: the IasyncResult object also contains an IsCompleted property that subsequent code can evaluate for completion. You can write subsequent code and test for whether IsCompleted is true.

Important note: You must add async=true to the connection string for asynchronous execution to work. If you don’t plan to use asynchronous commands, Microsoft recommends setting this to false.

Tip 8: New Data LOAD Options

In ADO.NET 1.0, populating a DataSet from a back-end database meant either you must add the Fill method of the Data Adapter, or create a DataReader and loop through it. ADO.NET 2.0 provides greater capabilities for creating a data table when all that exists is a data reader.

SqlDataReader oReader =
oMyCommand.ExecuteReader();
    
    
DataTable DtTable = new DataTable();
DtTable.Load(oReader,LoadOption.OverwriteRow);
    
// note that additional LoadOption enumerations
// exist, such OverwriteChanges, PreserveChanges

ADO.NET 2.0 also provides a new DataTableReader class, which a developer can use to stream data from a DataSet/DataTable using the new GetDataReader method.

Tip 9: More Control Over RowState

ADO.NET 2.0 provides two new methods to provide greater control over a DataRow’s RowState: SetAdded and SetModified. These methods set the RowState of a DataRow, which was a read-only property in ADO.NET 1.0.

// Developer can use these to set the RowState
// of an unchanged row
    
oMyDataRow.SetAdded();
oMyDataRow.SetModified();

Tip 10: SqlBulkCopy

ADO.NET 2.0 provides the ability to bulk copy the contents of a DataReader, DataTable, array of DataRows, or a DataSet to a SQL Server database. Listing 6 demonstrates an example of this welcome new capability to transfer data from one source to another.

Tip 11: Multiple Active Result Sets (MARS)

SQL Server 2005 supports Multiple Active Result Sets (MARS), which permits multiple DataReaders on a single connection. MARS has the potential to optimize database connections and performance, as multiple stored procedures or SQL pass-through queries can be executed asynchronously. Listing 7 demonstrates how to use MARS.

Tip 12: Connection Pooling

ADO.NET 2.0 improves connection pooling through two new methods: ClearPool and ClearPools. Both the SQL Server and Oracle providers support these functions. ClearPool will clear a specific pool while ClearPools will clear all connection pools.

Tip 13: Connection Statistics

A developer can retrieve Connection Statistics from the SqlConnection class by setting the connection object’s StatisticsEnabled property to true and calling the RetrieveStatistics method (Listing 8). Table 3 lists the statistics.

In Addition…

Coming up with thirteen tips is always difficult, especially when a particular technology has more than thirteen new capabilities! ADO.NET 2.0 also contains the following enhancements:

Support for the new SQL DataTypes (user-defined types, the XML DataType, and large character support)
Query notifications to refresh cached data when the original server data changes
Promotable transactions
Namespace-qualified tables
Batch updates
New Execute methods for the DataReader

The Baker’s Dozen Commentary

An old debate has resurfaced in the form of the argument about using DataSets versus custom entities, passing XML across boundaries versus passing custom objects, etc. Here are my twin copper Lincolns:

I’m immediately skeptical of any argument that begins with “never use DataSets” or “always use DataSets.” To paraphrase the old expression, the form of an application follows the functionality. I’ll freely admit that I come down more on the DataSet side, primarily because the applications I’ve built have the requirements that are often addressed by the built-in capabilities of ADO.NET. Had the applications been different, taking a more hardcore OOP approach to creating entities might have been preferable.

But for the ubiquitous line-of-business, data-driven applications, the DataSet approach provides many benefits. Tasks such as two-way data-binding, RowFilter/Sorting, XML integration, setting up relationships between data objects, preserving rowstate information, merging changes, and complex reporting operations are much easier with DataSets.

You may have to write additional code-sometimes a substantial amount of code-in the absence of DataSets. Those advocating this approach should ask themselves-is it worthwhile? And with the performance enhancements in ADO.NET 2.0, will custom code to accomplish these tasks execute as quickly as native ADO.NET 2.0 capabilities? In my opinion, the ability to effectively finish an application more quickly using native capabilities in .NET is a fundamental argument that developers should seriously evaluate when opting for approaches that exclude DataSets.

Closing Thoughts

Over the last several months, I’ve received many positive e-mails and comments on the Baker’s Dozen series. I’m reminded of the famous quote, “No man is an island.” I want to thank folks like Bonnie Berent, Cathi Gero, and Mike Antonovich, who have provided valuable input and guidance over the years. My editors (Erik Ruthruff and Melanie Spiller) deserve a medal for dealing with me. I also want to give a special thanks to Rod Paddock and John Petersen who have given me incredible support and direction. You’re all first class folks.

Article Source Code

You can find the entire source code on my Web site at www.commongroundsolutions.net.

Listing 1: Sample code to demonstrate use of typed DataSets

// Assumed a typed DataSet, DsOrder
// It consists of DtOrderHeader and DtOrderDetail
      
DsOrder oDsOrder = new DsOrder();
      
oDsOrder = SomeProcessToRetrieveAnOrder();
      
DsOrder.DtOrderHeaderDataTable oDtHeader = new
DsOrder.DtOrderHeaderDataTable();
DsOrder.DtOrderDetailDataTable oDtDetail = new
DsOrder.DtOrderDetailDataTable();
      
// Add an item
DsOrder.DtOrderDetailRow oDetailRow =
new DsOrder.DtOrderDetailRow();
oDetailRow.ItemFK = 100; // Column names show in Intellisense
oDetailRow.QtyOrder = 5;
oDetailRow.UnitPrice = 5.50;
      
// Add < TableName > Row is automatically part of the typed DataSet
oDsOrder.DtOrderDetail.AddDtOrderDetailRow(oDetailRow);

Listing 2: Using an interface and subclassing from a typed DataSet

public interface IDataSet
{
   void PopulateFromXml(string cXml);
}
      
public class XmlConverter
{
   public void ImportXml(DataSet DsData, string cXml)
   {
      StringReader sr = new StringReader(cXml);
      DsData.ReadXml(sr,XmlReadMode.InferSchema);
      DsData.AcceptChanges();
   }
}
// In the subclassed DataSet
      
public class DsOrder : DsTypedDsOrder, IDataSet
{
   BaseDataClass.XmlConverter Converter = new
      BaseDataClass.XmlConverter();
   
   public void PopulateFromXml(string cXml)
   {
      this.Converter.ImportXml(this,cXml);
   }
}

Listing 3: Test code for creating a schema from a stored procedure

// Test program: returns a stored proc result set
// into a temp DataSet, then writes out schema
SqlConnection oMyConn=new SqlConnection(cMyConnString);
SqlDataAdapter oDA=new SqlDataAdapter(”MyProc”, oMyConn);
      
// Temporary DataSet to hold our results
DataSet DsReturn = new DataSet();
      
// Because we haven’t (yet) introduced the typed DataSet
// into our application, we may provide the actual tablenames
// that we’ll want to use in our typed DataSet definition
// I use this with Crystal reports when providing a schema,
// so that I can specify the tablenames that I want
      
oDA.TableMappings.Add("Table","MyRealTypedTable1");
oDA.TableMappings.Add("Table1","MyRealTypedTable2");
oDA.TableMappintgs.Add("Table2","MyRealTypedTable3");
      
      
oDA.SelectCommand.CommandType = CommandType.StoredProcedure;
SqlParameter oParm = new SqlParameter("@nMyPKParm,nMyPKParm));
oDA.Fill( DsReturn);
DsReturn.WriteXmlSchema("c:\\MySchema.XSD"); // Just schema
DsReturn.WriteXml
("c:\\MySchema.XML",XmlWriteMode.WriteSchema); // All data
      
// Now you can "add existing item" and point to the XSD file,
// instead of typing all the columns in manually….and then
// you'll be on your way to creating a typed DataSet

Listing 4: Provider factories

// Each data provider to be used must have a provider string
// in the machine.config file…for example:
 
// < DbProviderFactories >
// < add name="SqlClient Provider"
// S invariant="System.Data.SqlClient"
// support="FF"
// description=".Net Framework Data Provider for SqlServer"
// type="System.Data.SqlClient.SqlClientFactory",
// System.Data,
// Version=2.0.0.0,
// Culture=neutral,
// PublicKeyToken="SomeString" / >
// </ DbProviderFactories >
      
// Note the invariant attribute: this is the name used to
// load the desired/appropriate factory
      
//
using System.Data.Common;
      
DbProviderFactory oFactory = new
DbProviderFactories.GetFactory("System.Data.SqlClient");
DbConnection oConn = oFactory.CreateConnection();
DbCommand oCommand = oFactory.CreateCommand();
oCommand.CommandText = "SELECT * FROM <sometable> where <cond>";
      
DbParameter oParm = oCommand.CreateParameter();
oParm.ParameterName = "@MyParameter";
oParm.Value = 12345; // note: use oParm.DbType to view data types
oCommand.Parameters.Add(oParm);
      
oConn.Open();
IdataReader oReader = oCommand.ExecuteReader();
      
// execute your code against the reader, then close
oReader.Close();
oConn.Close();
      
      
// Note: if you wanted to fill a DataSet instead of a
// data reader, you'd do the following:
      
// DbDataAdapter oDa = oFactory.CreateDataAdapter();
// oDa.SelectCommand = "my command / my stored proc name";
// DataSet DsReturn = new DataSet();
// oDa.Fill(DsReturn);

Listing 5: Miscellaneous data munging tasks

// 1) Filtering on columns that contain a NULL value
DataView Dv = new DataView(MyDataTable);
Dv.RowFilter = "ISNULL(Amount,-9999)=-9999";
      
// 2) You can also filter on columns defined in a parent/child
// relationship
MyDataSet.Relations.Add("RelName",DtAccounts.AcctID,
         DtOrders.AcctID,false);
System.Text.StringBuilder sbRowFilter= new
System.Text.StringBuilder();
      
string cAcctName = "National";
sbRowFilter.AppendFormat("Parent(RelName) like
'%{0}%'",cAcctName);
      
// 3) Defining a primary key (typed DataSet) and performing a
// lookup
      
DtMaterial.PrimaryKey = new DataColumn[] { DtMaterial.MaterialPK};
DsMaterial.DtMaterialRow Dr = DtMaterial.Rows.Find(12345);
If(Dr!=null)
    // IntelliSense can be used for column names…Dr.Name
      
// 4) Defining a multi-column primary key
DtOrders.PrimaryKey = new DataColumn[] { DtOrders.OrderID,
   DtOrders.ProductID};
DsMaterial.DtMaterialRow Dr = DtMaterial.Rows.FindS(
new object[] {cOrderID, cProductID});
      
// 5) Adding a new row to a typed DataSet
DSMaterials Ds = new DSMaterials();
Ds.DtMaterial.AddDtMaterialRow(111,"new row",DateTime.Today);
// note the method "AddRow", part of the typed DataTable
      
// 6) Looping through a filtered DataView
      
DataView Dv = new DataView(Ds.DtMaterial);
Dv.RowFilter = "MaterialType = 1 OR Amount > 500";
Dv.Sort = "Amount";
foreach(DataRowView drv in Dv) {
   string cName = Convert.ToString(drv["MaterialName"]);
   DateTime dt = Convert.ToDateTime(drv["MaterialDate"]);
}
      
// 7) Working with null values in a strongly-typed DataRow
if (MyRow.IsAddressNull()==false)
MyRow.SetAddressNull()

Listing 6: Bulk data copy in ADO.NET 2.0

DataSet DsData = new DataSet();
// Get the data from some other task…it might be another process,
// or a task that retrieves data from another device, etc.
DsData = oMyClass.PopulateData();
      
SqlConnection oMyConn=new SqlConnection(cMyConnString);
oMyConn.Open();
SqlBulkCopy oBCP =
new SqlBulkCopy(oMyConn, SqlBulkCopyOptions.Default);
oBCP.DestinationTableName = "ExitingTableInSQL";
oBCP.WriteToServer(DsData);
oMyConn.Close();

Listing 7: Multiple active result sets

SqlConnection oMyConn=new SqlConnection(cMyConnString);
oMyConn.Open();
SqlCommand cmd1 = new SqlCommand("select * from Table1",oMyConn);
SqlCommand cmd2 = new SqlCommand("select * from Table2",oMyConn);
      
SqlDataReader rReader1 = cmd1.ExecuteReader();
SqlDataReader rReader2 = cmd2.ExecuteReader();
oMyConn.Close();

Listing 8: Retrieving connection statistics

      
SqlConnection oMyConn=new SqlConnection(cMyConnString);
oMyConn.Open();
oMyConn.StatisticsEnabled = true;
SqlCommand oCmd = new SqlCommand("SELECT * FROM Authors",oMyConn);
SqlDataReader oReader = oCmd.ExecuteReader();
Hashtable hStats = (Hashtable)oMyConn.RetrieveStatistics();
IDictionaryEnumerator eStats = hStats.GetEnumerator();
String cResults = String.Empty;
While(eStats.MoveNext())
   cResults += eStats.Key.ToString() + " "
+ eStats.Value.ToString() + "\n";
MessageBox.Show(cResults);
oMyConn.Close();

Kevin S. Goff, a Microsoft MVP award recipient, is the founder and principal consultant of Common Ground Solutions, a consulting group that provides custom Web and desktop software solutions in .NET, VFP, SQL Server, and Crystal Reports. Kevin has been building software applications for seventeen years. He has received several awards from the U.S. Department of Agriculture for systems automation. He has also received special citations from Fortune 500 Companies for solutions that yielded six-figure returns on investment. He has worked in such industries as insurance, accounting, public health, real estate, publishing, advertising, manufacturing, finance, consumer packaged goods, and trade promotion. In addition, Kevin provides many forms of custom training.

Table 1: New methods for the DataTable.

Method
ReadXml
WriteXml
ReadXmlSchema
WriteXmlSchema
Merge
Load
GetDataReader

Table 2: New methods for DbProviderFactory.

Method
CreateConnectionStringBuilder
CreateConnection
CreateCommand
CreateParameter
CreateCommandBuilder
CreateDataAdapter
CreateDataSourceEnumerator

Table 3: Available connection statistics.

Connection
NetworkServerTime
BytesReceived
UnpreparedExecs
SumResultSets
SelectCount
PreparedExecs
ConnectionTime
ExecutionTime
Prepares
BuffersSent
SelectRows
ServerRoundTrips
CursorOpens
Transactions
BytesSent
BuffersReceived
IduRows
IduCount