Question

I understand why function composition is important. It allows building large and complex functions from small and simple ones.

val f: A => B = ...
val g: B => C = ...

val h = f andThen g // compose f and g

This composition conforms to the identity and associativity laws.

Associativity is useful because it allows grouping f1 andThen f2 andThen f3 andThen f4 ... in any order. Now I wonder why identity is useful.

def f[T](t: T): T = t // the identity function
val g: A => B = ...   // just any function
g andThen f[B] == f[A] andThen g

So, my question is: where and why is this identity function useful?

The solution

Identity is useful whenever an interface gives you more control than you actually need. For example, Either has no method that collapses an Either[Double, Float] into a plain Double. Let's suppose you have

val e: Either[Double, Float] = Right(1.0f)

and you want to collapse it to a Double. How do you do it? There is a handy fold method, but no dedicated method to convert the right side to the left side's type. So you write

e.fold(identity, _.toDouble)

and you've got what you want.
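
For completeness, here is the same snippet with the result type spelled out; identity simply stands in for the d => d lambda you would otherwise have to write:

val e: Either[Double, Float] = Right(1.0f)
val d: Double = e.fold(identity, _.toDouble) // the left side needs no conversion
// d == 1.0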

Other tips

Normally, you optimize formulas or code by factoring the common part out. That is, the DRY principle says that a ? v + x : v contains redundancy -- you type the v expression twice. You rewrite the code into the optimal form (a ? x : 0) + v; the common part has been factored out. You are tempted to do the same with

if (!initialized) initialize(callback_with_very_long_name) else callback_with_very_long_name

It initializes something if necessary and calls the callback either way. This looks pretty similar to the math formula above. You can easily factor a common term out of a math/logic expression, but how do you go about factoring out a function application? If you understand mathematics or Haskell, you should see that

a ? x + v : v = v + (a ? x : 0)

looks very much like

a ? f value : value = ???

You add x in one case but not in the other. You apply a function to the value in one case but not in the other. You optimize the former into (a ? x : 0) + v because 0 is the additive identity (adding it changes nothing) and v is the common factor, present regardless of whether x is added. In the case of function application (or non-application), the callback is the common factor, and we want to factor it out. What function should we apply to it in the other branch so that the value does not change? The identity function!

(a ? f : identity) value

is what we are looking for. We have gotten rid of the redundancy. Our original example then becomes

(initialized ? identity : initialize) (callback_with_very_long_name)

Please note that it now fits on a single line. I presented it this way to highlight that the question "why do we need a thing that does nothing" is not Scala-specific.
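
In Scala syntax, the same refactoring might look roughly like the sketch below (initialized, initialize and callbackWithVeryLongName are hypothetical stand-ins for the pseudocode names above):

var initialized = false
val callbackWithVeryLongName: () => Unit = () => println("callback invoked")

// initialize performs the one-time setup and returns the callback unchanged
def initialize(cb: () => Unit): () => Unit = { initialized = true; cb }

// the branch that needs no initialization is exactly the identity function
val step: (() => Unit) => (() => Unit) =
  if (initialized) identity[() => Unit] else initialize

step(callbackWithVeryLongName)() // the callback runs either way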

The identity function is, well, the identity element for function composition, like 1 for multiplication or 0 for addition. We need it like we need 1 or 0. For instance, suppose we are writing a (higher-order) function that takes a list of functions and returns their composition: it is then natural to return the identity function for an empty list. This is just like we return 0 for an empty list if we are writing a function that takes a list of integers and returns their sum.
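
A minimal sketch of such a combinator (composeAll is a made-up name, not a standard library function):

def composeAll[A](fs: List[A => A]): A => A =
  fs.foldLeft(identity[A] _)(_ andThen _)

composeAll(List.empty[Int => Int])(42)           // 42 -- composing nothing is identity
composeAll(List[Int => Int](_ + 1, _ * 2))(10)   // 22 -- (10 + 1) * 2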

Eventually there will come a time when you've refactored your code so that you either accept a higher-order function, or you accept some monad M[A => B] from which you do not want a value of B but instead want back the same thing you put in. To keep your code fluid and responsive to the changing needs of your project (or to reduce DRY violations), you'll undoubtedly want identity.
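
A hedged sketch of the second case: an optional transformation stored in a monad such as Option naturally falls back to identity when nothing is configured.

val configuredTransform: Option[Int => Int] = None // no transformation configured
val transform: Int => Int = configuredTransform.getOrElse(identity[Int])
transform(5) // 5 -- with nothing configured we get back what we put in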

On a more theoretical side (and since I can guess where this question has arisen from), one fundamental use of the identity function is that it allows you to define the notion of isomorphism, which often turns out to be a much more useful definition of "sameness" than ordinary equality.

An isomorphism tells you that "going there and back again is the same as staying here". In set-like structures, that definition corresponds to a bijective function -- but not everything is set-like. So, in general, we can say: a function (morphism) f is an isomorphism if there is a g (its inverse) such that f . g == id and g . f == id. Note that this crucially depends on having id: we cannot, in general, assume that things have "elements" we can refer to, as is usually done when introducing bijective functions.
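
A small Scala sketch of that definition (Iso and holdsAt are hypothetical names, not standard library classes):

// an isomorphism between A and B is a pair of functions whose two
// compositions are the identity functions on A and B
case class Iso[A, B](to: A => B, from: B => A) {
  def holdsAt(a: A, b: B): Boolean =
    (to andThen from)(a) == identity(a) && (from andThen to)(b) == identity(b)
}

case class Wrapped(value: Int)
val iso = Iso[Int, Wrapped](Wrapped(_), _.value)
iso.holdsAt(42, Wrapped(7)) // true: going there and back is staying here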

Here's a non-set-based example: consider a directed graph. Say there are edges A -> B and B -> A. Since paths can be (associatively!) concatenated, we have the paths A -> B -> A and B -> A -> B. But they are just "the same thing" as the loops A -> A and B -> B (or "staying at one vertex")! We can now also say that those are the identity paths for A and B. No bijection or "forall x in A ..." is involved at all.
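
A simplistic Scala encoding of that picture, ignoring endpoint checking, only to show that the empty path is the identity for (associative) path concatenation:

type Path = List[String] // a path is a list of edge labels
val aToB: Path = List("A->B")
val bToA: Path = List("B->A")
val idA: Path = Nil // the identity path: "staying at A"
(idA ++ aToB ++ bToA) == (aToB ++ bToA) // true: the identity path changes nothing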

All these structures can also be described (and are used) with categories in programming (Scala, Haskell); for example, pipes form a category, and thus need to have identity pipes.


And, as an aside, here's another practical use: id as the base value for a fold:

doAll = foldr (.) id [(+1), (*2), (3-)]

This is the short way of combining a number of endofunctions.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow