Question

I'm working on a project that takes a very complex entity from a third-party framework and converts it into the native object that defines that entity. This entity has several one-to-many relationships, and a few of those have one-to-many relationships of their own. I've often been told that nested loops are the absolute worst as far as efficiency is concerned, but I can't avoid them here, because I'm working with a base object that isn't going to change just to please me. The upside is that this conversion only happens once per entity.

My concern is that I always assumed casting was supposed to be fairly instantaneous. There are no async calls involved in this conversion, and it's set up to be a batch job, so I don't have to keep up with any UI updates or keep users happy. For all intents and purposes, it's fine for the UI if the cast takes a while, but is there some other technical reason a cast should run fairly efficiently? Does the CLR expect casts to complete within a specific time frame?

If it matters at all, I'm doing explicit casts in C#.

Solution

As far as I know, there is no technical reason to be concerned about this. But there is a readability reason.

In all languages I'm familiar with, the two biggest differences between a "cast" and a function that converts an X to a Y are:

  1. In many contexts the cast will be implicit or invisible. An operation that you're willing to let the compiler insert for you usually should be a simple, fast, boring technical detail that isn't worth forcing the programmer to think about. Since the conversion you're describing may take a while, it's important that you not write code which accidentally does the conversion more times than it's supposed to, and that means it's worth making this conversion as explicit as any other method call.

  2. Even if the cast is "explicit", intuitively a cast implies that the X and the Y are "the same" object or value, just represented differently. The conversion you're describing takes an X, fetches a bunch of As and Bs and Cs related to that X and combines all of those into a Y object. To me, that means the X and the Y are not the same object in any meaningful sense.

In other words, the conversion you're doing is not what we'd normally call a "cast". So implement it as a regular function or method. Not for performance reasons, but for not-confusing-people reasons.
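As a rough C# sketch of that advice: `FrameworkEntity` and `NativeEntity` below are hypothetical stand-ins, since the question doesn't name the real types, and the single `RelatedItems` list stands in for the nested one-to-many relationships. The named factory method keeps the potentially slow conversion visible at each call site, while the commented-out operator shows the cast-style alternative being argued against.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the third-party entity and one child collection.
public sealed class FrameworkEntity
{
    public string Name { get; set; } = "";
    public List<string> RelatedItems { get; } = new List<string>();
}

// Hypothetical native object built from the entity.
public sealed class NativeEntity
{
    public string Name { get; }
    public IReadOnlyList<string> Items { get; }

    private NativeEntity(string name, IReadOnlyList<string> items)
    {
        Name = name;
        Items = items;
    }

    // A named method makes the (possibly expensive) conversion explicit
    // wherever it is called.
    public static NativeEntity FromFramework(FrameworkEntity source) =>
        new NativeEntity(source.Name, source.RelatedItems.ToList());

    // The cast-style alternative: callers would write (NativeEntity)entity,
    // which reads like a cheap reinterpretation rather than real work.
    // public static explicit operator NativeEntity(FrameworkEntity source) =>
    //     FromFramework(source);
}
```

A call site then reads `NativeEntity.FromFramework(entity)`, which nobody will mistake for a free operation.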


Incidentally, there's nothing evil about nested for loops. They are a code smell, since most nested for loops indicate a place where smaller methods, a better-shaped data structure, a different looping construct like map/filter/reduce, or some other refactoring would be appropriate, but there are still plenty of times when nested for loops are simply the only readable way to get the job done.
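To make the comparison concrete, here is a small C# sketch using a hypothetical `Node` type (not from the question) with two levels of one-to-many children. Both methods visit exactly the same items, so the choice between them is about readability, not speed.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical one-to-many shape: a node with children, each with children.
public sealed class Node
{
    public string Label { get; set; } = "";
    public List<Node> Children { get; } = new List<Node>();
}

public static class Flattening
{
    // Nested loops: perfectly readable for a two-level relationship.
    public static List<string> WithLoops(Node root)
    {
        var labels = new List<string>();
        foreach (var child in root.Children)
        {
            foreach (var grandchild in child.Children)
            {
                labels.Add(grandchild.Label);
            }
        }
        return labels;
    }

    // The same traversal expressed with LINQ's SelectMany.
    public static List<string> WithLinq(Node root) =>
        root.Children
            .SelectMany(child => child.Children)
            .Select(grandchild => grandchild.Label)
            .ToList();
}
```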

OTHER TIPS

Your code will likely take time proportional to the number of objects that it touches. If these objects are required by the user of the code then this time is unavoidable. It's easy to say "nested loops are evil", but if you have 10 arrays each containing 10 arrays, then you have 100 items that need to be converted.

You can create a model object that just stores the third-party framework data unchanged and has accessor methods that go through the framework data and access what is needed when, and only if, it's needed. Say you get data about all students in all classes in a school; that's hundreds of students. If all a caller wants is the name of the 10th student in the 21st class, an accessor method could return this reasonably quickly by traversing the third-party data.
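A minimal C# sketch of such an accessor-based model, under the assumption that the framework data can be viewed as one list of raw student names per class. `SchoolView` and that shape are hypothetical, and `Trim()` merely stands in for whatever per-value conversion is really needed.

```csharp
using System.Collections.Generic;

// Hypothetical model that keeps the third-party data exactly as received.
public sealed class SchoolView
{
    // Stand-in for the framework data: one list of raw student names per class.
    private readonly IReadOnlyList<IReadOnlyList<string>> _rawClasses;

    public SchoolView(IReadOnlyList<IReadOnlyList<string>> rawClasses)
    {
        _rawClasses = rawClasses;
    }

    // Converts only the single value that was asked for (zero-based indices).
    public string GetStudentName(int classIndex, int studentIndex) =>
        _rawClasses[classIndex][studentIndex].Trim();
}
```

Nothing is converted up front; the cost is paid per access, which suits callers that only ever read a small part of the data.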

Another method would be a data model that keeps some data unconverted and converts as needed. For example, you could in your data model have an array of 30 classes, with each class being either proper model data or original third party data. When you access a class for the first time, and only then, you convert the third party data. So asking for the 10th student in the 21st class and the 8th student in class 23 converts all students of 2 classes (but not all students of all classes), and asking for all the other students in the same two classes will be quick.
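One way to sketch this convert-on-first-access idea in C# is with `Lazy<T>`. The `LazySchool` type, the list-of-names data shape, and the `ConversionCount` property are all hypothetical; the counter exists only to make the laziness observable, and the sketch assumes single-threaded use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: each class is converted the first time it is accessed.
public sealed class LazySchool
{
    private readonly List<Lazy<IReadOnlyList<string>>> _classes;

    // Number of classes converted so far (for illustration only).
    public int ConversionCount { get; private set; }

    public LazySchool(IEnumerable<IReadOnlyList<string>> rawClasses)
    {
        // Wrap each raw class in a Lazy; nothing is converted yet.
        _classes = rawClasses
            .Select(raw => new Lazy<IReadOnlyList<string>>(() => ConvertClass(raw)))
            .ToList();
    }

    // Stand-in for the expensive conversion of one class's students.
    private IReadOnlyList<string> ConvertClass(IReadOnlyList<string> raw)
    {
        ConversionCount++;
        return raw.Select(name => name.Trim()).ToList();
    }

    // The first access to a class converts all of its students;
    // later accesses to the same class reuse the cached result.
    public string GetStudentName(int classIndex, int studentIndex) =>
        _classes[classIndex].Value[studentIndex];
}
```

Asking for students from two different classes triggers two conversions, no matter how many students are then read from those classes.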

In a batch job, none of that matters much.

Casting itself has no technical requirement that it complete within a certain amount of time, although some APIs do have timeouts built in or assumed.

However, when dealing with users, there is only a certain amount of patience that they will allow. Faster casting will mean less time the user has to wait for a process to complete.

In my experience, a cast that is fast enough that the user doesn't notice it doesn't need to be optimized. If you are casting thousands of times per operation, then you may need to look at efficiency. If the cast only occurs a few times per operation, then you are starting to get into the micro-optimization that gnat is referring to.

Licensed under: CC-BY-SA with attribution