What are the drawbacks of using a method which calls a delegate for every row in SqlDataReader?

https://stackoverflow.com/questions/4910333

29-10-2019
|

Question

When I find a new idea, I always stick with it, and am unable to see any weak sides of it. Bad things happen when I start to use the new idea in a large project, and discover some moths later that the idea was extremely bad and I shouldn't use it in any project.

That's why, having a new idea and being ready to use it in a new large project, I need your opinion on it, especially negative one.

For a long time, I was bored to type again and again or copy-paste the following blocks in projects where database must be accessed directly:

string connectionString = Settings.RetrieveConnectionString(Database.MainSqlDatabase);
using (SqlConnection sqlConnection = new SqlConnection(connectionString))
{
    sqlConnection.Open();

    using (SqlCommand getProductQuantities = new SqlCommand("select ProductId, AvailableQuantity from Shop.Product where ShopId = @shopId", sqlConnection))
    {
        getProductQuantities.Parameters.AddWithValue("@shopId", this.Shop.Id);
        using (SqlDataReader dataReader = getProductQuantities.ExecuteReader())
        {
            while (dataReader.Read())
            {
                yield return new Tuple<int, int>((int)dataReader["ProductId"], Convert.ToInt32(dataReader["AvailableQuantity"]));
            }
        }
    }
}

So I've done a small class which allows to write something like that to do the same thing as above:

IEnumerable<Tuple<int, int>> quantities = DataAccess<Tuple<int, int>>.ReadManyRows(
    "select ProductId, AvailableQuantity from Shop.Product where ShopId = @shopId",
    new Dictionary<string, object> { { "@shopId", this.Shop.Id } },
    new DataAccess<string>.Yield(
        dataReader =>
        {
            return new Tuple<int, int>(
                (int)dataReader["ProductId"],
                Convert.ToInt32(dataReader["AvailableQuantity"]);
        }));

The second approach is:

Shorter to write,
Easier to read (at least for me; some people may say that actually, it's much less readable),
Harder to make errors (for example in first case, I often forget to open the connection before using it, or I forget while block, etc.),
Faster with the help of Intellisense,
Much more condensed, especially for simple requests.

Example:

IEnumerable<string> productNames = DataAccess<string>.ReadManyRows(
    "select distinct ProductName from Shop.Product",
    new DataAccess<string>.Yield(dataReader => { return (string)dataReader["ProductName"]; }));

After implementing such thing with simple ExecuteNonQuery, ExecuteScalar and ReadManyRows and a generic DataAccess<T>.ReadManyRows in a small project, I was happy to see that the code is much shorter and easier to maintain.

I found only two drawbacks:

Some modifications in requirements will require heavy code changes. For example, if there is a need to add transactions, it will be very easy to do with ordinary SqlCommand approach. If my approach is used instead, it will require to rewrite the whole project to use SqlCommands and transactions.
Slight modifications on command level will require to move from my approach to standard SqlCommands. For example, when querying one row only, either DataAccess class must be extended to include this case, or the code must use directly SqlCommand with ExecuteReader(CommandBehavior.SingleRow) instead.
There might be a small performance loss (I don't have precise metrics yet).

What are the other weak points of this approach, especially for DataAccess<T>.ReadManyRows?

Solution

What you're trying to accomplish is nice, I actually like this kind of syntax and I think it's pretty flexible. However I believe you need to design you APIs better.

The code is readable and nearly beautiful but it is hard to understand, primarily due to lots of generics that don't make much sense unless you know exactly what each type means. I'd use generic type inference wherever possible to eliminate some of them. To do so, consider using generic methods instead of generic types.

Some syntax suggestions (I don't have compiler now so they are basically ideas):

Use anonymous types instead of dictionaries

It's trivial to write a helper that converts anonymous type to a dictionary but I think it improves the notation greatly and you wouldn't need to write new Dictionary<string, object>.

Use Tuple.Create

This static method was created to avoid specifying types explicitly.

Create a strong-typed wrapper around DataReader

This would remove those ugly conversions all around the place—and actually, do you really need access to DataReader in that lambda?

I will illustrate this by code for your examples.
Kudos to David Harkness for chaining idea.

var tuples = new DataAccess ("select ProductId, AvailableQuantity from Shop.Product where ShopId = @shopId")
    .With (new { shopId = this.Shop.Id }) // map parameters to values
    .ReadMany (row =>
         Tuple.Create (row.Value<int> ("ProductId"), row.Value<int> ("AvailableQuantity"))); 

var strings = new DataAccess ("select distinct ProductName from Shop.Product")
    .ReadMany (row => row.Value<string> ("ProductName"));

I can also see it being extended for handling single row selection:

var productName = new DataAccess ("select ProductName from Shop.Product where ProductId = @productId")
    .With (new { productId = this.SelectedProductId }) // whatever
    .ReadOne (row => row.Value<string> ("ProductName"));

This is a rough draft for Row class:

class Row {
    DataReader reader;

    public Row (DataReader reader)
    {
        this.reader = reader;
    }

    public T Value<T> (string column)
    {
        return (T) Convert.ChangeType (reader [column], typeof (T));
    }
}

It is to be instantiated inside ReadOne and ReadMany calls and provides convenient (and limited) access to underlying DataReader for selector lambdas.

OTHER TIPS

My thoughts: you're embedding SQL in code, in a string (as opposed to using LINQ, which is at least syntax-checked, which helps provided you keep your DBML or EDMX mapping file synched with your database structure). Embedding SQL in non-syntax checked code in this way could easily lead to unmaintainable code, where you (or someone else) later changes the database structure or the embedded SQL string in a way that breaks the application. Embedding SQL in strings is particularly prone to producing hard-to-find bugs, because code with logical errors will still compile correctly; this makes it more likely that developers who are less familiar with the code base will obtain a false sense of security that changes they've made have not had any adverse or unintended effects.

Your abstraction approach looks sound. Any performance hit due to extra method calls will be trivial, and developer time is far more expensive than CPU time. When you need to add transactions or single-row selects, you can expand your library classes. You're making good use of Don't Repeat Yourself here.

The Spring Framework for Java makes heavy use of these types of template classes and helpers such as JdbcTemplate and HibernateTemplate to remove the need for developers to write boilerplate code. The idea is to write and test it well once and reuse it many times over.

First of all, never apologize for not copy-pasting code.

Your abstraction looks fine, however what troubles me a little more is the fact that the first example you gave keeps the SqlConnection open longer than it needs to be.

Using an IEnumerable<T> is great, since you defer execution to when and if it is consumed. However, as long as you have not reached the end of your enumeration, the connection stays open.

The implementation of your method could consume the entire enumeration via a ToList(), and then return the list instead. You could even still support the deferred execution by implementing a small custom enumerator.

But I need to put a caveat around this, verify that there isn't any old code doing some magic while enumerating.

The delegate does make it a bit harder to read/understand immediately for someone new to this approach but it should be easy enough to pick up eventually.

From a maintainability point of view it may be harder to understand the stack trace when errors eventually creep in. Primarily if an error occurs in this section:

new DataAccess<string>.Yield(
    dataReader =>
    {
        return new Tuple<int, int>(
            (int)dataReader["ProductId"],
            Convert.ToInt32(dataReader["AvailableQuantity"]);
    }));

The use of yield puts some limitations on where you can try/catch (Why can't yield return appear inside a try block with a catch?) but that is an issue in the previous approach as well and may not be relevant in your scenario.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow