In scrum, how do you give an estimate for a backlog item that is primarily research?

https://softwareengineering.stackexchange.com/questions/225113

02-10-2020
|

Question

A few sprints ago I was assigned a task that was primarily research. I had to figure out how to get our product to interoperate with a very complex black box that we did not develop.

I couldn't think of a way to estimate this work. Even if I got the ball rolling and knew the immediate problem I faced, I could not get a sense of how many other problems I'd have to solve after that. I could never tell if I was almost done or far from it. How am I supposed to estimate a backlog item like this?

I want to elaborate the nature of this assignment. I knew what calls I had to make to interoperate with the black box. That was the easy part. But the API took a very, very complex object as a parameter. Calling the API would throw an error and it was not easy to figure out what that error was trying to tell me. The black box wouldn't tell me all the problems wrong with my request, it would just tell me the first problem it found. This made it very difficult to know how much work I had left.

Solution

if you can't estimate - and in this scenario it sounds like there really is no way to know in advance how long any particular part of the process will take - then the next-best option is to time-box the effort: how much time you're willing to spend on it, whether you get anywhere or not

once you get into it, you may have a better idea of how to estimate the remaining effort

OTHER TIPS

Incredibly simple: It's always 3 points.

(or pick a different constant). This is the approach we took at our last job and I think it worked out relatively well.

The idea is that you have a backlog full of actual stories and one of them happens to be an integration story with a device you know nothing about. If someone asked me to size that story(actual integration) without any more knowledge, I would give a very conservative estimate. If I did similar integrations in the past, I might pick a story points on high end. If I never did similar work and don't even know where to begin, I would just pick a million points.

So then obviously product owner comes back and says, we can't have a million points, so let's create another story about research. The goal of the research story IS NOT to code the integration. It IS NOT about solving every possible problem you might face. The primary and only goal of the research story is to give better point estimate for the actual work story. Don't get lost trying to figure out how to use every single API call and come up with actual integration design/code. Instead, approach it from a high level to understand the complexity that's ahead of you.

At the end of the research story, your #1 deliverable (along with possible research notes, prototypes... etc) is a point estimate for the integration and possibly break down of integration story into several smaller parts. Keep in mind that each of those still has an unknown component given enough research, you should be able to control for that unknown part (i.e. you read about usage of some API and it looks like its going to be 5 points, well, pick 8 just to be safe).

What we found is that the developer assigned the research task always runs out of steam at almost a constant number of hours (obviously there is some fluctuation, but it averages out at the end). After that you could spend twice as many hours, but your integration story estimate won't get any better.

That's why they call them estimates.

An accurate estimate for purely research-related work is improbable. That's the nature of research. But I think it's worthwhile to point out that sometimes you think you can't estimate it, when in actual fact, you can. Scientists know this; most research work is repetitive and mundane... it's the theorizing and changes in direction that cause variations in the required time to occur.

So the way you come up with an estimate for research-related work is the same way you would come up with an estimate for ordinary development: break the task into smaller pieces, and evaluate each piece separately. Research work is no different in that regard from ordinary, non-research development; you quantify the work, and then give it your best guess.

For example, you might not be able to estimate the task "Determine how this black box works, and how to interface our software product with it." But you may be able to come up with independent estimates for:

Evaluate and understand the black box's API
Evaluate the behaviors of the black box
Determine what commands will execute the behaviors we want
Analyze, identify and mitigate possible safety concerns
Map commands from our software to commands in the black box, and evaluate the returned data.

And so forth.

Some estimates may depend on the completion of previous steps. For example, estimating step #3 may not be possible until step #2 is completed.

Two ways I'm using:

The deliberate discovery approach: start with two stories (in reverse order):

how much time you'll need to figure the system out
how much time you'll need to estimate how much time you need to figure the system out - keep that one very short on purpose

The key is that any of those is limited by the value of the feature. It doesn't make sense to spend a year figuring out a system in order to deliver feature X, if in half that time you could deliver feature Y and Y has more business value. Your PO should be able to give you a number: 'if you need more than two months to figure it out, it's not worth it'. Refine the estimate for the first story when you complete the second one.

The discover as you go approach:

how to get our product to interoperate with a very complex black box that we did not develop.

This is way to big a story. What is the first thing/feature/aspect you're going to work on (which one is the most valuable?). It might be as simple as 'log in the black box'. This is a simple story, the integration points shouldn't be that bad, and is reasonably easy to estimate with minimal research. Estimate that. Put any other story at it's business value boundary and refine as you learn more.

Either approach is problematic regarding long term planning. When you get started, you just can't tell when you'll have all features done. That's ok - you'll just start by having reliable predictions for tomorrow, then next week, then next month, then 6 months as you build expertise.

Estimates are not single numbers. An estimate is a range that has a high end, a low end and a confidence. With the initial question of "how long will it take you" an estimate is not "it will take me 14 hours" but rather "it will take me between 10 and 18 hours".

As you refine the process, you can get a tighter and tighter range on the estimate

_{(source: construx.com)}

Note that this shows the variability of the estimate, not that "it converges at the average". An estimate may start out at "it will take me 10 to 18 hours" and then be "it will take me 14 to 18 hours" and then "it will take me 16 to 18 hours" as you get further along in the task.

One of the most often cited books in estimation is Software Estimation: Demystifying the Black Art by Steve McConnell.

Start out with a very wide estimate that has the best case and the worst case. Then shortly into the research, do another estimate, and then another and another. Don't make a commitment as to when it will be done with one number and then try to make it.

There are a number of different techniques for generating an estimate. Use historical data from past estimates. Try to get an idea of how big it is, the complexity and work from that (the list goes on, there's an entire book written about this).

While many research tasks can be open-ended, this one has a clear(ish) requirement: "how to get our product to inter-operate with a ... black box".

We are good at estimating jobs that are similar / related to jobs we have done before. Since you haven't done it before, you can only break the job into chunks and give each chunk a time-box. Such a job can go on forever, at the end of each time-box progress and tasks remaining should be reevaluated. In extreme cases this is when a go / no-go decision should be made.

When I work on such jobs, I define a 5 step process

(1) Initial planning. Make an initial guess at (2) what needs to be done to get started, and define (3) a bunch of iterations / stages, each followed by (4) a review and (5) re-plan step. For example:
Getting started tasks. In this case it might be "get the black box specs", "study the specs", "obtain a black box", "obtain sample software", "establish support link".
Iteration 1. Maybe "Test basic operations. Setup basic debugging environment. Review. Re-plan."
Iteration 2 ... and so on, with at least 3 but no more than 5 or maybe 7 iterations before a big progress review.

How long will each iteration take? Time box them - such jobs will go on forever if you don't. Only you can say if the boxes should be an hour, a day, a week, ... .

Also, be flexible: as you learn, the plan may have to change. Don't punish yourself. This kind of job is like debugging - you have to teach yourself. But also like debugging, it can help to get input from others. And like debugging, you cannot do it while multi-tasking.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange