Question

I spend a lot of time querying a database and then building collections of objects from the query. For performance I tend to use a DataReader and the code looks something like:

while(rdr.Read()){
    var myObj = new myObj();

    myObj.Id = Int32.Parse(rdr["Id"].ToString());

    //more populating of myObj from rdr

    myObj.Created = (DateTime)rdr["Created"];
}

For objects like DateTime I simply cast the rdr value to the required class, but this can't be done for value types like int, hence the (IMHO) laborious ToString() followed by Int32.Parse(...)

Of course there is an alternative:

myObj.Id = rdr.GetInt32(rdr.GetOrdinal("Id"));

which looks cleaner and doesn't involve a call to ToString().

A colleague and I were discussing this today - he suggests that accessing rdr twice in the above code might be less efficient than doing it my old skool way - could anyone confirm or deny this and suggest which of the above is the best way of doing this sort of thing? I would especially welcome answers from @JonSkeet ;-)

Solution

I doubt there will be a very appreciable performance difference, but you can avoid the name lookup on every row simply by lifting it out of the loop. This is probably the best you'll be able to achieve:

int idIdx = rdr.GetOrdinal("Id");
int createdIdx = rdr.GetOrdinal("Created");

while(rdr.Read())
{
    var myObj = new myObj();

    myObj.Id = rdr.GetFieldValue<int>(idIdx);

    //more populating of myObj from rdr

    myObj.Created = rdr.GetFieldValue<DateTime>(createdIdx);
}

OTHER TIPS

I usually introduce a RecordSet class for this purpose:

using System;
using System.Collections.Generic;
using System.Data;

public class MyObjRecordSet
{
    private readonly IDataReader InnerDataReader;
    private readonly int OrdinalId;
    private readonly int OrdinalCreated;

    public MyObjRecordSet(IDataReader dataReader)
    {
        this.InnerDataReader = dataReader;
        this.OrdinalId = dataReader.GetOrdinal("Id");
        this.OrdinalCreated = dataReader.GetOrdinal("Created");
    }

    public int Id
    {
        get
        {
            return this.InnerDataReader.GetInt32(this.OrdinalId);
        }
    }

    public DateTime Created
    {
        get
        {
            return this.InnerDataReader.GetDateTime(this.OrdinalCreated);
        }
    }

    public MyObj ToObject()
    {
        return new MyObj
        {
            Id = this.Id,
            Created = this.Created
        };
    }

    public static IEnumerable<MyObj> ReadAll(IDataReader dataReader)
    {
        MyObjRecordSet recordSet = new MyObjRecordSet(dataReader);
        while (dataReader.Read())
        {
            yield return recordSet.ToObject();
        }
    }
}

Usage example:

List<MyObj> myObjects = MyObjRecordSet.ReadAll(rdr).ToList();

This makes the most sense to a reader. As for whether it's the most "efficient": you're literally calling two functions instead of one, which is not going to be as significant as casting and then calling a function. Ideally you should go with the option that is more readable, as long as it doesn't hurt your performance.

var ordinal = rdr.GetOrdinal("Id");
var id = rdr.GetInt32(ordinal);
myObj.Id = id;

Actually there are differences in performance in how you use SqlDataReader, but they are somewhere else. Namely, the ExecuteReader method accepts CommandBehavior.SequentialAccess:

Provides a way for the DataReader to handle rows that contain columns with large binary values. Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. You can then use the GetBytes or GetChars method to specify a byte location to start the read operation, and a limited buffer size for the data being returned. When you specify SequentialAccess, you are required to read from the columns in the order they are returned, although you are not required to read each column. Once you have read past a location in the returned stream of data, data at or before that location can no longer be read from the DataReader. When using the OleDbDataReader, you can reread the current column value until reading past it. When using the SqlDataReader, you can read a column value only once.
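
For illustration, here is a minimal sketch of what SequentialAccess looks like in practice, assuming an open SqlConnection named connection; the table and column names are made up for the example:

// Stream a large binary column instead of materialising the whole row.
// Requires System.Data and System.Data.SqlClient; names are illustrative.
using (var cmd = new SqlCommand("SELECT Id, Payload FROM Blobs", connection))
using (var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
{
    while (reader.Read())
    {
        int id = reader.GetInt32(0); // columns must be read in order

        var buffer = new byte[8192];
        long offset = 0;
        long bytesRead;
        while ((bytesRead = reader.GetBytes(1, offset, buffer, 0, buffer.Length)) > 0)
        {
            // process buffer[0..bytesRead), e.g. copy it to a stream
            offset += bytesRead;
        }
    }
}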

If you do not use large binary values then it makes very little difference. Getting a string and parsing it is suboptimal, true; it is better to get the value with rdr.GetSqlInt32(column) rather than GetInt32() because of NULL. But the difference should not be noticeable in most applications, unless your app is truly doing nothing else but read huge datasets. Most apps do not behave that way. Focusing on optimising the database call itself (i.e. making the query execute fast) will reap far greater benefits 99.9999% of the time.
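
To illustrate the NULL point, a rough sketch using GetSqlInt32 (this assumes rdr is a SqlDataReader; SqlInt32 lives in System.Data.SqlTypes):

int idIdx = rdr.GetOrdinal("Id");

// GetInt32 throws if the column is NULL; GetSqlInt32 returns a SqlInt32
// whose IsNull flag can be tested before taking the value.
SqlInt32 sqlId = rdr.GetSqlInt32(idIdx);
if (!sqlId.IsNull)
{
    myObj.Id = sqlId.Value;
}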

For objects like DateTime I simply cast the rdr value to the required class, but this can't be done for value types like int

This isn't true: DateTime is also a value type and both of the following work in the same way, provided the field is of the expected type and is not null:

myObj.Id = (int) rdr["Id"];
myObj.Created = (DateTime)rdr["Created"];

If it's not working for you, perhaps the field you're reading is NULL? Or not of the required type, in which case you need to cast twice. E.g. for a SQL NUMERIC field, you might need:

myObj.Id = (int) (decimal) rdr["Id"];
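
If NULLs are a possibility, one way to guard the cast is to check against DBNull first; a minimal sketch (what to do in the NULL case is up to you):

object raw = rdr["Id"];
if (raw != DBNull.Value)
{
    myObj.Id = (int)(decimal)raw; // NUMERIC comes back boxed as decimal
}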

Licensed under: CC-BY-SA with attribution