Question

I'm an intermediate Haskell programmer with tons of experience in strict FP and non-FP languages. Most of my Haskell code analyzes moderately large datasets (10^6..10^9 things), so laziness is always lurking. I have a reasonably good understanding of thunks, WHNF, pattern matching, and sharing, and I've been able to fix leaks with bang patterns and seq, but this profile-and-pray approach feels sordid and wrong.

I want to know how experienced Haskell programmers approach laziness at design time. I'm not asking about easy items like Data.ByteString.Lazy or foldl'; rather, I want to know how you think about the lower-level lazy machinery that causes runtime memory problems and tricky debugging.

How do you think about thunks, pattern matching, and sharing during design time?

What design patterns and idioms do you use to avoid leaks?

How did you learn these patterns and idioms, and do you have some good refs?

How do you avoid premature optimization of non-leaking non-problems?

(Amended 2014-05-15 for time budgeting):

Do you budget substantial project time for finding and fixing memory problems?

Or do your design skills typically circumvent memory problems, so that you get the expected memory consumption early in the development cycle?


Solution

I think most of the trouble with "strictness leaks" happens because people don't have a good conceptual model. Haskellers without a good conceptual model tend to have and propagate the superstition that stricter is better. Perhaps this intuition comes from toying with small examples and tight loops, but it is incorrect. It's just as important to be lazy at the right times as to be strict at the right times.

There are two camps of data types, usually referred to as "data" and "codata". It is essential to respect the patterns of each one.

  • Operations which produce "data" (Int, ByteString, ...) must be forced close to where they occur. If I add a number to an accumulator, I am careful to make sure that it will be forced before I add another one. A good understanding of laziness is very important here, especially its conditional nature (i.e. strictness propositions don't take the form "X gets evaluated" but rather "when Y is evaluated, so is X").
  • Operations which produce and consume "codata" (lists most of the time, trees, most other recursive types) must do so incrementally. A codata -> codata transformation should usually produce some information for each bit of information it consumes (modulo skipping, as in filter). Another important discipline for codata is to use it linearly whenever possible -- i.e. use the tail of a list exactly once; use each branch of a tree exactly once. This ensures that the GC can collect pieces as they are consumed. A sketch of both disciplines follows this list.
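
Here is a minimal sketch of both disciplines (the function names are illustrative, not from any library):

{-# LANGUAGE BangPatterns #-}

-- Data discipline: force the accumulator at every step, so that when
-- the next cons is consumed, the previous addition has already happened.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (acc + x) xs

-- Codata discipline: emit one cons per cons consumed, and use the tail
-- exactly once, so the GC can reclaim each cell as it goes by.
doubles :: [Int] -> [Int]
doubles []     = []
doubles (x:xs) = 2 * x : doubles xs

sumStrict runs in constant space on a long finite list, and doubles works even on infinite input because it never demands more than one cons ahead of its consumer.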

Special care is needed when you have codata that contains data. E.g. iterate (+1) 0 !! 1000 will end up building a size-1000 thunk before evaluating it. You need to think about conditional strictness again -- the way to prevent this case is to ensure that when a cons of the list is consumed, the addition of its element occurs. iterate violates this, so we need a better version.

iterate' :: (a -> a) -> a -> [a]
iterate' f x = x : (x `seq` iterate' f (f x))
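
With this definition, demanding a cell's tail forces the element in front of it first, so iterate' (+1) 0 !! 1000 keeps at most one pending addition alive instead of a chain of 1000. (GHC's Data.List has since gained an iterate' with the same intent, in base 4.11 and later.)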

As you start composing things, of course it gets harder to tell when bad cases happen. In general it is hard to make efficient data structures / functions that work equally well on data and codata, and it's important to keep in mind which is which (even in a polymorphic setting where it's not guaranteed, you should have one in mind and try to respect it).
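
A small illustration of that tension, as a sketch: the two canonical folds each respect one camp and betray the other.

import Data.List (foldl')

-- foldl' treats its list as data: it forces an accumulator per element,
-- runs in constant space on finite lists, and never returns on [1..].
total :: [Int] -> Int
total = foldl' (+) 0

-- foldr treats its list as codata: it can stop early, so it works on
-- infinite input, but it would build thunks if used for accumulation.
anyBig :: [Int] -> Bool
anyBig = foldr (\x rest -> x > 10 || rest) False   -- anyBig [1..] == True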

Sharing is tricky, and I think I approach it mostly on a case-by-case basis. Because it's tricky, I try to keep it localized, choosing not to expose large data structures to module users in general. This can usually be done by exposing combinators for generating the thing in question, and then producing and consuming it all in one go (the codensity transformation on monads is an example of this).
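
As an illustration of why I keep sharing localized (a sketch; the names are made up): holding two folds over one shared list pins its whole spine, while producing and consuming in one go keeps only the accumulators live.

{-# LANGUAGE BangPatterns #-}

-- Shared: sum and length both reference xs, so the spine built while
-- summing must stay in memory until length has walked it too.
meanShared :: [Double] -> Double
meanShared xs = sum xs / fromIntegral (length xs)

-- One pass: each cell can be generated and collected as it is consumed.
meanOnePass :: [Double] -> Double
meanOnePass = go 0 (0 :: Int)
  where
    go !s !n []     = s / fromIntegral n
    go !s !n (x:xs) = go (s + x) (n + 1) xs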

My design goal is to get every function to be respectful of the data / codata patterns of my types. I can usually hit it (though sometimes it requires some heavy thought -- it has become natural over the years), and I seldom have leak problems when I do. But I don't claim that it's easy -- it requires experience with the canonical libraries and patterns of the language. These decisions are not made in isolation, and everything has to be right at once for it to work well. One poorly tuned instrument can ruin the whole concert (which is why "optimization by random perturbation" almost never works for these kinds of issues).

Apfelmus's Space Invariants article is helpful for developing your space/thunk intuition further. Also see Edward Kmett's comment on the original answer.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow