Question

EDIT 2009-Nov-04

OK, so it's been a little while since I first posted this question. It seems to me that many of the initial responders failed to really get what I was saying--a common response was some variation on "What you're saying doesn't make any sense"--and so I've made some handy diagrams to really illustrate my point.

When we speak of numbers, we are generally referring to points on what grade school children learn is called the Number Line:

[Diagram: the number line]

Now, when we learn arithmetic, our minds learn to perform a very interesting transformation of this concept. Evaluating the expression 1 + 0.5, for example, if we simply applied our "number line thinking", would require us to somehow make sense of this:

[Diagram: adding two points on the number line]

It's difficult to really illustrate that, because it's difficult to think about that: "adding" two points. This is where a lot of responders struggled with the idea of adding dates (or simply dismissed it as absurd), because they were thinking of dates as points.

However, the expression 1 + 0.5 does make sense to us, because when we think of it, we're really imagining this:

[Diagram: adding a number (point) and a magnitude (vector)]

That is, the number (or point) 1, plus the vector 0.5, resulting in point 1.5.

Alternately, we may be imagining this:

[Diagram: adding two vectors]

That is, the vector 1, plus the vector 0.5, resulting in the vector 1.5.

In other words, when dealing with numbers, we treat points and vectors interchangeably. But what about dates? Dates are, after all, basically numbers. If you don't believe me, compare this line to the number line above:

[Diagram: a timeline]

Notice the correspondence between the timeline and the number line? This was my point: if we perform the transformation above with numbers, we ought to be able to do it with dates as well. So, applying "timeline thinking", the expression 0001-Jan-02 00:00:00 + 0001-Jan-01 12:00:00 doesn't make a lot of sense, as plenty of responders pointed out:

[Diagram: adding two points on a timeline]

But, if we do the same conceptual transformation in our head that we perform every time we add or subtract numbers, we can easily "rethink" the above as this:

[Diagram: adding a point in time and a time vector]

So clearly, the difference between a DateTime and a TimeSpan is the same difference that exists between a point and a vector. What I think caused a lot of people to respond negatively to my suggestion is that it just feels so unnatural to think of dates as magnitudes in this way. But I don't buy the argument that there's no obvious reference point to use as zero. There is an obvious reference point, and I'll give you a hint where it is: about 2010 years ago.
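
To put it in .NET terms: the type system enforces this distinction directly. Here is a minimal sketch using the standard DateTime and TimeSpan operators; reading the second operand above as a vector of 1 day 12 hours is just my own interpretation of the diagram:

    using System;

    DateTime point = new DateTime(1, 1, 2);        // 0001-Jan-02 00:00:00
    TimeSpan vector = new TimeSpan(1, 12, 0, 0);   // 1 day, 12 hours

    DateTime moved = point + vector;               // point + vector = point
    TimeSpan between = moved - point;              // point - point = vector

    // DateTime nonsense = point + moved;          // compile error: no '+'
    //                                             // operator takes two DateTimes

    Console.WriteLine(moved);    // 0001-01-03 12:00:00
    Console.WriteLine(between);  // 1.12:00:00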

Don't get me wrong: I'm not questioning the usefulness of drawing a conceptual divide between the notion of a DateTime and a TimeSpan. Really, my question all along should have been (as ChrisW indirectly suggested), why do we treat numbers and vectors interchangeably when dealing with regular numeric types? (Or: why do we have just one int type, instead of int and intspan?) There's a big difference, and yet we don't ever really think about it until sometime in junior high or high school, when we begin geometry. And then it's treated as this new mathematical concept, when in reality it's something we've been utilizing ever since we learned to add numbers by counting with our fingers.

In the end, the best answer came from Strilanc, who pointed out that the use of DateTime and TimeSpan is really an implementation of an affine space, which has the convenient property of not needing a reference point to treat as the origin. So thanks, Strilanc. I'm giving the accepted answer to ChrisW, however, for being the first one to bring up the concept of vectors and points, which really got to the crux of the matter.
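
For the curious, here is a minimal sketch of what such an affine pair of types looks like in C#. The Point and Vector names are hypothetical, purely for illustration; in .NET the corresponding pair is DateTime and TimeSpan:

    // A 1-D affine space: Point is absolute, Vector is relative.
    readonly struct Vector
    {
        public readonly double Value;
        public Vector(double value) => Value = value;

        // vector + vector = vector
        public static Vector operator +(Vector a, Vector b) => new Vector(a.Value + b.Value);
    }

    readonly struct Point
    {
        public readonly double Coordinate;  // meaningful only relative to some origin
        public Point(double coordinate) => Coordinate = coordinate;

        // point + vector = point
        public static Point operator +(Point p, Vector v) => new Point(p.Coordinate + v.Value);

        // point - point = vector (and this is origin-independent)
        public static Vector operator -(Point a, Point b) => new Vector(a.Coordinate - b.Coordinate);

        // Deliberately no operator +(Point, Point): adding two points is
        // meaningless, so the compiler rejects it, just as it rejects
        // DateTime + DateTime.
    }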


ORIGINAL QUESTION (for posterity)

I am certainly no programming jack of all trades, but I know both PHP and .NET have a TimeSpan class in addition to a DateTime class (or structure in .NET), and I am guessing this is the case in a variety of other languages and frameworks as well (though I am writing this primarily with reference to the .NET structures). This might seem a strange question, but isn't TimeSpan redundant?

In case you think the answer is obvious ("A DateTime is an absolute point in time, while a TimeSpan is a range of time -- simple as that!"), consider this: an integer can be conceptualized as either an absolute value (the point on the number line) or a distance between values--and we don't need two separate data types for these different conceptualizations. I can still write 5 + 6 without any ambiguity as to what I mean.

As long as there is a consistent zero-point reference, it seems to me there should be no reason why one would need a TimeSpan object to perform arithmetic operations on DateTime objects, or to get the distance between them.
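
As it happens, .NET does represent a DateTime this way internally: the standard Ticks property is a count of 100-nanosecond intervals since a fixed zero point, 0001-Jan-01 00:00:00. A quick illustration, with an arbitrary example date:

    using System;

    DateTime dt = new DateTime(1900, 1, 11);

    // Internally, a DateTime is just a number: ticks since the fixed zero.
    long ticks = dt.Ticks;
    DateTime roundTrip = new DateTime(ticks);

    Console.WriteLine(ticks);
    Console.WriteLine(roundTrip == dt);  // True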

What am I missing? Why can't the unique methods and properties of the TimeSpan structure simply be folded into DateTime?

(Disclaimer: It isn't like I'm passionate about this or anything; I'm fine using DateTime and TimeSpan objects as they're intended all the time. I'm just asking a question.)

EDIT: Okay, over-simplified example to illustrate my point:

Consider the equation 10 - 5 = 5. One could read this as "Start at 10 (value), move 5 to the left (span), and you end up at 5 (value)."

Suppose, just to make things easy, we let January 1 1900 be point zero and we define TimeSpan objects in terms of days only.

Then 10 - 5 = 5 could be understood, in DateTime terms, as January 11 1900 - January 6 1900 = January 6 1900. This is fine, because January 11 is just "10" by our definition and January 6 is "5". The fact that we are viewing the 10 as a value, the first 5 as a span, and the last 5 as a value again is merely for our own conceptual benefit. My point is just this: that the only difference is in how you think of the number, not in what it actually is. This is why we don't have separate structures for, say, integer values and integer spans -- a plain old integer covers all our bases.
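
Here is that example as a sketch in C# (the 1900 zero point is just our convention above, not anything .NET itself cares about):

    using System;

    DateTime zero  = new DateTime(1900, 1, 1);   // our agreed "point zero"
    DateTime jan11 = new DateTime(1900, 1, 11);  // "10" by our definition
    DateTime jan6  = new DateTime(1900, 1, 6);   // "5"

    TimeSpan span = jan11 - jan6;                // 10 - 5, read as point - point = span
    Console.WriteLine(span.TotalDays);           // 5

    DateTime backToValue = zero + span;          // the span "5" as a value is...
    Console.WriteLine(backToValue);              // 1900-01-06, i.e. "5" again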

Am I making any sense?


Solution

consider this: an integer can be conceptualized as either an absolute value (the point on the number line) or a distance between values

By your logic, it isn't TimeSpan that's unnecessary: rather, it's DateTime that's unnecessary, and could be replaced by TimeSpan (duration since zero).

Plus, there's the fact that integers have an obvious zero, whereas dates don't; and an obvious zero is necessary if you want to replace "place on the number line" with "distance/span from the zero/origin".


Edit:

A point (location on a plane) isn't the same as a vector.

They seem similar ...

  • A vector (distance from origin) can represent a point
  • A point (relative to the origin) can represent a vector

... however the value of the vector that's required to represent a given point will change if the origin changes.

It always makes sense to add two (relative) vectors; but, it makes no sense to add two points, except by converting those points to vectors and then adding the vectors.

The sum of two vectors is unaffected by a change in the origin, but the sum of two points would be affected by a change in the origin if you summed them by converting them to vectors and adding the vectors (because changing the origin would affect the values of those vectors).

[Replace 'point' with DateTime and 'vector' with TimeSpan in the argument above.]
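
Doing that replacement literally in C#, with the standard operators (the particular dates and origins are arbitrary examples):

    using System;

    DateTime p1 = new DateTime(2009, 1, 10);
    DateTime p2 = new DateTime(2009, 1, 20);

    DateTime originA = new DateTime(2009, 1, 1);
    DateTime originB = new DateTime(2000, 1, 1);

    // "Adding two points" only works by first converting them to vectors,
    // and the result depends on which origin you picked:
    TimeSpan sumViaA = (p1 - originA) + (p2 - originA);  // 9 + 19 = 28 days
    TimeSpan sumViaB = (p1 - originB) + (p2 - originB);  // thousands of days

    // The vector between two points, by contrast, is origin-independent:
    TimeSpan diff = p2 - p1;                             // always 10 days

    Console.WriteLine($"{sumViaA.TotalDays}, {sumViaB.TotalDays}, {diff.TotalDays}");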

I think there is a genuine difference between absolute and relative values. I don't know why that difference isn't more apparent in arithmetic, i.e. why 'numbers' are used seemingly interchangeably to represent both absolute and relative values.

OTHER TIPS

A Date does not behave like an integer. I can't recall the classification of algebras, but consider this:

Date + Span = Date
Date - Date = Span  
Date + Date = undefined

Span + Span = Span
Span - Span = Span

For any given year,

10 feb + 10 days = 20 feb
20 feb - 20 jan  = 31 days
20 jan + 20 feb  = ???

That last computation could be interpreted as meaningful when we consider a Date as Days-since-StartDate. But the value would be as arbitrary as the choice of the StartDate.
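
For what it's worth, the .NET operator overloads encode exactly this algebra, and the missing combination is a compile-time error (the dates below are arbitrary examples):

    using System;

    DateTime feb10 = new DateTime(2020, 2, 10);
    DateTime jan20 = new DateTime(2020, 1, 20);

    DateTime feb20 = feb10 + TimeSpan.FromDays(10);  // Date + Span = Date
    TimeSpan days  = feb20 - jan20;                  // Date - Date = Span
    TimeSpan more  = days + TimeSpan.FromDays(1);    // Span + Span = Span

    // DateTime bad = feb10 + jan20;                 // Date + Date: compile error,
    //                                               // no such operator exists

    Console.WriteLine(days.TotalDays);               // 31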

(Speaking as a mathematician) It's because arithmetic operations on a "date" aren't closed or well defined, necessitating an additional structure.

For example, January 1, 2000 - December 1, 1999 = ... ? We know there's 31 days between them, but if this were interpreted as a date, then the answer is Epoch (i.e., zero) + 31 days. This is not a valid "date" anymore.

Similarly, not all arithmetic operations on integers are well defined (1 / 2 has no answer in the integers; integer math returns zero here, but 0 * 2 = 0, not 1 as you would expect). This necessitates an additional structure that we call fractions.
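
The integer case is easy to see in code (a trivial illustration in C#):

    using System;

    int half = 1 / 2;              // integer division truncates: 0
    Console.WriteLine(half * 2);   // 0, not 1: the operation isn't invertible
    Console.WriteLine(1.0 / 2.0);  // 0.5, once we move to the richer structure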

Just because you can define an operation doesn't mean you should. For example, one of the reasons division by zero is undefined is because defining it would require sacrificing some very useful properties of arithmetic (eg. associativity, etc).

The distinction between a timespan and a date comes down to addition. It makes sense to add two timespans, but it doesn't make sense to add two dates unless you have an arbitrary reference date. By not allowing addition of dates, you abstract away that arbitrary reference date. I don't know what date '0' is in .NET, and I've never needed to know. Isn't that nice?

Adding two dates is almost always a bug (seriously, try to think of where this makes sense outside of numerology). By introducing timespans (creating an Affine Space) you eliminate a whole class of bugs.

One reason is that splitting the types prevents a class of bugs where you think you have a relative time but really have an absolute time, and vice versa. For example, addition of two absolute times can be flagged as a compiler error if the two types are separate.

Also, IntelliSense (and discovery for newbies) works better when the number of members is smaller-- by splitting methods between the two types, working with each gets easier.

Asked the other way round: what would the benefit of weakening the type system in that regard be?

It's all a question of cost vs. benefit, and TimeSpan has the great benefit of reducing bugs due to illogical date/time calculations by forbidding such actions. The DateTime/TimeSpan split exists for very much the same reason that a strict type-checking system exists in the first place: to make semantic errors in the code produce compile-time messages that notify programmers of errors in their code.

Conversely, there's the cost of having TimeSpan: zilch.

Now consider dropping TimeSpan. What would we gain?

To answer your question directly: "isn't TimeSpan redundant?" Absolutely not: it reduces bugs. It certainly has for me.

Think about it conceptually. If I tell you that I'm having a party 7 days from now, is "7 days" in that sentence a date? Could I just say my party is on "7 days"? Of course not, because 7 days isn't a date. One of the key ideas of object-oriented programming is to represent concepts like this in the system as types. It's true that we could represent everything as an integer (and in fact, many people have and do), but in object-oriented programming we have the notion of types of items, with their behaviors and properties, and in that sense it makes sense to have an object that expresses this.

I think you could make the opposite argument that DateTime is redundant, and we should only have TimeSpan :)

Seriously, all dates really are just time spans. They are all relative to some starting point. Technically, there is no "year zero" in the Christian calendar (since you can't really have a "zeroth year of our lord"), but if we assign 12:00 A.M. January 1, 0001 B.C. as the "zero point", then every date that comes after (or before) can be thought of as relative to that date. So, 12:00 A.M. on September 19, 2009 would have a TimeSpan of 734033 days.
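
You can read a date this way directly in .NET, whose own zero point is DateTime.MinValue (0001-Jan-01 00:00:00), so the day count it prints is about a year smaller than the B.C.-based figure above:

    using System;

    DateTime date = new DateTime(2009, 9, 19);

    // Every DateTime can be read as a span from the fixed zero point:
    TimeSpan sinceZero = date - DateTime.MinValue;
    Console.WriteLine(sinceZero.TotalDays);  // the "date as a TimeSpan" view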

So, mathematically, DateTime and TimeSpan are redundant. But when we write code, we are attempting to communicate much more than just abstract mathematical constructs. Any given DateTime instance may in fact just be a time span relative to some arbitrary zero point, but to most people reading your code, it will imply a particular point on the calendar. Similarly, a TimeSpan implies the gap between two points on the calendar.

In this case, Microsoft has chosen to be clear rather than parsimonious. I can't say I disagree with the decision.

There are a lot of complications in dates, for example:

  • leap years
  • leap seconds
  • the 1582 change to the Gregorian calendar
  • the fact that there is no year zero
  • differences in the lengths of months

Treating Dates and TimeSpans as different things means that these kinds of issues are much less likely to confuse you in practice.
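
A concrete illustration: a calendar year or month is not a fixed TimeSpan, which is why DateTime offers calendar-aware methods like the standard AddYears and AddMonths rather than leaving you to add raw spans (the dates below are arbitrary examples):

    using System;

    DateTime start = new DateTime(2019, 3, 1);

    // Calendar-aware "one year later":
    Console.WriteLine(start.AddYears(1));               // 2020-03-01

    // A fixed 365-day span is not the same thing, because 2020 is a leap year:
    Console.WriteLine(start + TimeSpan.FromDays(365));  // 2020-02-29

    // Month lengths differ too; AddMonths clamps to the last valid day:
    Console.WriteLine(new DateTime(2020, 1, 31).AddMonths(1));  // 2020-02-29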

It's syntactic sugar, no more or less.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow