Question

We have a working proof of concept that implements the sunny day scenarios, but we need to start handling the exception paths. We are contractors and need to put together detailed estimates to discuss this next phase with the client before they renew the contract.

Details

We have spent 10-20 developer months implementing the proof of concept library in JavaScript. It touches browser APIs and third-party libraries, and connects to our client's proprietary server software for bi-directional real-time communication. Currently this library runs behind an HTML+JavaScript UI written by our client. In the near future they will also be releasing this library as an SDK.

Up till now, we have mostly focused on sunny day scenarios to put the proof of concept together. Now that that's finished, we need to bring the library to maturity and support rainy day scenarios in the different areas of the library including:

  • bad data provided through the API by SDK users
  • error paths in our code
  • legal error responses from the browser API and third-party libraries
  • network issues (jitter, bandwidth limits, temporary disconnection, etc.)

Question: What process can we follow to generate a reasonable task list of rainy day scenarios that we don't yet support but ought to? In other words: how can I figure out what's missing between (1) my proof of concept application and (2) a hypothetical mature, robust application that implements the same features?

I am also interested in retrospective-type suggestions about what we could have done while implementing the proof of concept to make it easier to itemize this work.

What I would do in a vacuum

Without any better advice, I will be doing a static analysis of the code I've written and of the specs for the third-party libraries and browser APIs, identifying possible error cases. Starting with the whole, I would divide it into components, subcomponents, etc. When I've reached enough granularity, I would perform the static analysis or analyze the spec and itemize the ways it could fail (legal error output, reasonable exception cases, etc.). Then I would aggregate those failure modes for each subcomponent, then each component, and back up to the whole, and I would have a list of rainy day scenarios that I need to support.

Issues with this:

  1. How can I avoid missing something major?
  2. Is there a way to avoid analyzing code and spec line by line?
  3. We'll need to prioritize these rainy day scenarios - how do we divide the list into more important and less important? (likelihood that it would occur?)
  4. How do we aggregate the list of particular error cases for particular methods/APIs into something high-level enough for the project leads to make decisions about what/how much/when? (This is closely related to the last question about prioritization.)

Update: I've done an FMEA (Failure Mode and Effects Analysis), which was helpful in analyzing the system boundaries - rainy day scenarios related to networking, consumers of the public API, and (to a lesser degree) errors at the third-party API layer. The FMEA process suggests using a diverse team to generate the list of possible failures, which would help with my issue #1 (how to avoid missing something major). It also deals with #3 (how to prioritize) by giving a priority metric based on the severity of a failure, its likelihood of occurring, and the likelihood that it will be detected. It didn't help me (at this point) to come up with a rainy day scenario list for the code we've written, and it didn't address my issues #2 and #4.
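
For reference, that priority metric is the standard FMEA Risk Priority Number: severity × occurrence × detection. A minimal sketch in JavaScript, with made-up failure modes and ratings, might look like this:

    // Hypothetical failure modes, each rated 1-10 on the three FMEA dimensions.
    // detection: 1 = almost certainly caught before release, 10 = likely to slip through.
    const failureModes = [
      { name: "WebSocket drops mid-session",          severity: 8, occurrence: 6, detection: 4 },
      { name: "SDK user passes malformed config",     severity: 5, occurrence: 7, detection: 3 },
      { name: "Third-party lib returns legal error",  severity: 6, occurrence: 3, detection: 5 },
    ];

    // Risk Priority Number: higher means "handle this rainy day scenario first".
    const prioritized = failureModes
      .map(fm => ({ ...fm, rpn: fm.severity * fm.occurrence * fm.detection }))
      .sort((a, b) => b.rpn - a.rpn);

    prioritized.forEach(fm => console.log(`${fm.rpn}\t${fm.name}`));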

Solution

I would usually advise against developing a program like this. Taking a prototype and "selling it" as something that will work well once all the paths are fixed usually devolves into an ROI debate over every little error. One thing you're going to hear in the near future is "How many users will even see that? No, I'm not going to pay to fix that."

However, my Wayback Machine is currently in the shop, so onward we go. First, never say things like "the product is done except...". Again, you spent 15 months developing the product. Now everything you want to do from this point on is going to be met with "is it worth it?". If instead you say things like "it needs some polish" or "it just needs a few corners rounded", you will be in a much better place. It's easier to explain that the house is built but still needs paint than it is to look at a house that is built and try to explain that it's not built yet.

Understand that what you build using this prototype-first method is a crappy product. It works, you just said it did. It just doesn't work well, and that's the fight you're going to have going forward. As a contractor, if you're not careful you might even work yourself into a place where you're eating billable time.

Consider a house again. You have one built. You pay. Now you take a tour, and it looks good, not awesome, but good. Later, as you start taking a closer look, you notice one of the walls is shorter than it should be. You bring it up, but the builder says, "We know, we said there are things that are not done yet. It will be another $X." How would you respond?

I know it seems like I am nitpicking, but it's very important to frame the situation correctly to avoid as many issues as you can.

Now, on to the technical. You have two goals. First, the total cost of the "paint job" MUST be less than the total cost of development. Second, you need to make sure the library works.

So, implement a reference project. Use something like https://github.com/lord/slate and document how to use the lib. Make sure that each method has a documented positive and negative response: foo(1) does this, but calling foo("string") throws an error. As you build your reference project you can identify and fix problems in your lib without, in many cases, having the cost show up directly against that project. This lets you avoid the "paint job must cost less" issue, while still giving you a place to bill should you need to.
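
As a sketch of what "a positive and negative response for each method" might look like in the reference docs, here is the hypothetical foo from the paragraph above (not anything from the real library):

    // Hypothetical SDK method as it might appear in the reference docs.
    // Positive case: foo(1) returns a handle object.
    // Negative case: foo("string") throws a TypeError the caller can handle.
    function foo(channelId) {
      if (!Number.isInteger(channelId)) {
        throw new TypeError(`foo() expects an integer channel id, got ${typeof channelId}`);
      }
      return { channelId, status: "open" };
    }

    // Documented usage, exactly as it would read in the reference project:
    console.log(foo(1));   // => { channelId: 1, status: "open" }
    try {
      foo("string");       // the documented failure path
    } catch (err) {
      console.error(err.message);
    }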

Most people would be more willing to pay for an example project and documentation. Most are willing to accept that good documentation has a cost. Most will even understand that during the documentation process, errors are discovered and fixed. So you get a new bucket to bill from AND a way to find the issues in your lib that are most critical.

Accept that it won't find everything. But that's the position you've put the project in: you're now playing an ROI game sooner than you should be. This will find the most critical issues.

As to estimating your time, I would say double your project time. You spent 15 months on the lib, so spend 30 on the reference project and documentation. Is it perfect? No, but it's easy to understand, and it gives you a place to start.

Other tips

How can I figure out what's missing between (1) my proof of concept application and (2) a hypothetical mature, robust application that implements the same features?

Locating the product

If the hypothetical mature application already exists, it means that we are competing against someone else. It's good for us to know what our place in the market is, and the actual state of the market too.

Usually, we aim to fill a gap that nobody else is filling. This is our differential value. It could be anything: for example usability, performance, compatibility, scalability, distribution, missing features, etc. Anything that actually matters to the potential consumer¹.

Locating ourselves in the market leads us to identify our immediate competitors and compare. The community's opinion regarding the competitors will interest us because there's a lot of valuable feedback there. We also get a rough notion of what the community is asking for or what's missing.

Stressing the product

With or without competitors, we'll have to stress the product, as much and as long as possible. At this point, we can look at the video game industry, where open/closed betas are quite common. During an open or closed beta, a limited number of players are invited to play and stress the game for a limited period of time. Players are encouraged to "hack" the game and share their thoughts. Trust me when I say that some players are capable of pushing the limits far beyond the developers' expectations.

Betas generate invaluable feedback for both developers and stakeholders. They won't only tell us what's missing, they will also tell us what we should do: what works and what doesn't. Maybe what we consider to be a key feature is insultingly ignored and trivialized.

QA

Is there a way to avoid analyzing code and spec line by line?

I would expect automated tests and static analysis all the time. The static analysis might seem irrelevant, but it's an objective "measure of quality" we can take as a reference. Essentially, it tells us in what condition our product is released. However, it cannot tell us how "good", "bad", "wrong", or "right" the product is. In this specific case, it can't tell us how "developer-friendly" the SDK is.
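
A minimal sketch of such an objective baseline for a JavaScript library, assuming ESLint as the static analysis tool (the rule choices are only illustrative):

    // .eslintrc.js - an illustrative baseline, assuming ESLint as the static analysis tool.
    module.exports = {
      env: { browser: true, es2021: true },
      extends: "eslint:recommended",
      rules: {
        eqeqeq: "error",              // loose comparisons often hide bad-data bugs
        "no-unused-vars": "error",    // dead code is a common prototype leftover
        "consistent-return": "warn",  // error paths should return or throw consistently
      },
    };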

We should consider setting up a "beta tester" team with our best players (developers) for manual testing and analysis.

Analysis

You are doing a good job so far. However, don't get fixated on it to the degree of not taking the next step: moving the project forward. At this point, you might be interested in checking out some organizational anti-patterns, for instance analysis paralysis or the bicycle shed.

how to aggregate the list of particular error cases for particular methods/APIs into something high level enough for the project leads to make decisions about what/how much/when? (This is closely related to the last question about prioritization)

Prioritize issues

It would be naive of me to say how you should do this because, IMHO, it pretty much depends on your technical vision and the business strategy. Back to the competitors: whatever makes your SDK different from the competitors makes it valuable, and I would consider these differences to be priorities². I also think that the consumers' feedback will provide you with the inputs necessary to make decisions on this subject.


¹: The competitor's weakness can be my strength.

²: I would not let my weakness be my competitor's strength.

Throw it away!

Not the answer you were looking for? This may sound like strange advice, but proofs of concept and prototypes are designed to be thrown away. If you implement the final system by improving the prototype gradually, you will have something very similar in structure to the prototype. Not good! I wouldn't limit my final version of the program to what can reasonably be evolved from a prototype. I have been on projects where the prototype was made into the final code, and the results are, well, quite horrible.

Don't, however, immediately go and delete everything. Perhaps your prototype has a good thing or two that you could make into reusable libraries. My prototypes have certainly had something good in them, which I have always rescued by adding proper handling of corner cases. Parts of your prototype are good, parts are bad; your job is to determine which are which and make the good parts into reusable libraries.

When designing the real system after the proof of concept has demonstrated that it's feasible, beware of the second-system effect. If your mindset is "this time I'm going to do everything well and correctly", you are going to end up with tons of complexity in the code that will result in nothing good. Just build a good-enough product using the experience obtained with the prototype, and avoid the pitfalls you fell into when implementing the prototype.

This might be an awful and arduous task, if you weren't designing things with an eye towards extensibility. If you were, though, this shouldn't be too bad.

Your first task is to determine the boundaries of your code - what is publicly consumable and changeable, what is publicly visible but not publicly modifiable, and what is entirely internal to your layer.

From there, you've got two primary concerns:

  • Ensuring that what's private is actually private. In a language that permits you to do so, that means that private fields are actually marked private; in JS/ES, you will have to rely on naming conventions (e.g., you might decide that all "private" variables are prefixed with "__" and have getters and/or setters, where appropriate, that are not prefixed that way). See the sketch after this list.
  • Ensuring that all use cases are accounted for one way or another. You want to ensure that people can access what they do need from outside your layer.
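
A minimal sketch of that boundary, assuming a plain ES module and the "__" convention described above (all names are illustrative):

    // connection.js - illustrative module boundary using the "__" naming convention.
    export class Connection {
      constructor(url) {
        this.__url = url;      // "private" by convention: consumers must not touch it
        this.__socket = null;  // entirely internal to this layer
      }

      // Publicly visible, but not publicly modifiable (no setter).
      get url() {
        return this.__url;
      }

      // Part of the publicly consumable surface.
      open() {
        this.__socket = new WebSocket(this.__url); // browser API hidden behind the boundary
      }
    }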

You probably want to write explicit tests around those two things.


Once you've got that taken care of, you probably want to try implementing a few things as a consumer of your layer. This will help ensure that your idioms work well in other ecosystems, and will assist in finding anything that's not accessible but should be.

In parallel to all of this, if you don't already have tests, you should be adding them. At a minimum you want unit tests, but ideally you've already got those - however, you sound like you may not have many that cover failure cases. Now is a good time to add those: what happens if you pass in an incorrectly formatted date? A negative number? A fractional number? A string that contains some SQL?
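
A sketch of what those failure-case tests might look like, assuming a Jest-style runner and a hypothetical scheduleCall() method (not a real method of your library):

    // Hypothetical failure-case unit tests (Jest-style); scheduleCall() is made up.
    const { scheduleCall } = require("./call-scheduler");

    describe("scheduleCall() rainy day inputs", () => {
      test("rejects an incorrectly formatted date", () => {
        expect(() => scheduleCall("31/02/2020")).toThrow(RangeError);
      });

      test("rejects negative and fractional durations", () => {
        expect(() => scheduleCall("2020-02-28", -5)).toThrow(RangeError);
        expect(() => scheduleCall("2020-02-28", 1.5)).toThrow(RangeError);
      });

      test("treats SQL-looking strings as plain data, not code", () => {
        const call = scheduleCall("2020-02-28", 30, "Robert'); DROP TABLE calls;--");
        expect(call.title).toBe("Robert'); DROP TABLE calls;--");
      });
    });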

This is also a very appropriate time to start adding integration tests as well. Those integration tests would be a good place to try to account for networking issues.
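
For the network issues, one option is an integration test that drops the connection on purpose and asserts that the library recovers; a sketch assuming a hypothetical test client that emits a "reconnected" event:

    // Hypothetical integration test: force a disconnection and expect the lib to recover.
    test("client recovers from a temporary disconnection", async () => {
      const client = await connectToTestServer();   // made-up helper for the test harness
      const reconnected = new Promise(resolve =>
        client.on("reconnected", resolve)           // made-up event name
      );

      client.__socket.close();                      // simulate the network dropping out
      await reconnected;                            // the test times out if it never recovers

      expect(client.state).toBe("connected");
    });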


So, how do you present this to your client? That will depend on your relationship with them and what development norms you currently have. If you've been working in sprints, I'd urge you to commit to one sprint of exploration, whose output will be a more detailed list of which areas need work and which areas are currently looking good.

If the client expects a detailed writeup of what will be done by what date... that may be a problem, since you need to start doing some of the work in order to estimate that. All hope isn't lost, though - you can still estimate a few dates for "preliminary analysis", "writing failing tests", "fixing failing cases", and maybe a round of acceptance testing. How long those should be will depend on the size and complexity of your application, but remember that you always want to estimate higher than you think you'll actually need. If they push back, see if you can come to an agreement - maybe "OK, I'll allocate one week (instead of the two months requested) for doing this, but only for this one subsystem. After that's done, we'll revisit the other systems with a revised estimate, allowing us to use what we've learned while working on that one subsystem." This gives you some leeway - you may find some good shortcuts to the bigger effort, and you may find that the two months initially requested was inflated somewhat. On the other hand, if you run up against concrete problems that mean you definitely would have needed the full two months, you can provide the client with a concrete writeup of what that blocker was (rather than a generic "well, I'm pretty sure it'll be hard").

Licensed under: CC-BY-SA with attribution