Is it time for our development organization to let go of Scrum?

https://softwareengineering.stackexchange.com/questions/376097

07-02-2021

Question

I'm part of a development organization which has traditionally been using Scrum principles. We have 3-week sprints, at the end of which we produce a software artefact which our customers install on premise, and at their own discretion. Most customers have chosen to skip every odd sprint, and thus only install every 6 weeks.

Now we have a new, large customer with several branches that use, or will soon use, our software. Each branch has slightly different requirements and a tight schedule to production.

What we have been seeing for the past few months:

Our development team spends less than 50% of its time on backlog tasks
Daily business is heavily influenced by ad-hoc support tickets (which can concern defects, database cleanups, user training, or simple cause analysis)
Every week we build service packs for already delivered releases. A defect fix is committed to master, then cherry-picked to one or several older branches.
The time required for the above support tickets and service packs will increase over the next 2 years, after which it will hopefully decline again
Our customers tend not to give us quick feedback; the feedback loop sometimes lags by several months

-- Edit reg. branches --

We do develop on origin/master. But say we're currently developing release 5.0.12, and have a bug fix or feature which the customer urgently wants on 5.0.10. In this case we cherry-pick the patches to 5.0.10.1 and 5.0.11.1. There are no customer-specific branches; everyone gets the same software.

-- End of edit --

I'm pretty sure Scrum is not a good fit anymore for this kind of work. I'm not proficient with DevOps, but my understanding is that it is heavily built upon rapid and frequent deployment. If my understanding is correct, that also wouldn't fit.

Are there any best practices for dealing with this? Should we:

Split support and development, and keep Scrum for development? (a dedicated support team is planned)
Switch to Kanban (but keep ad-hoc support and development mixed)?

Solution

No. Scrum is designed to fight exactly the issues you are experiencing.

You could switch to a reactive approach, fixing bugs as quickly as possible as they are reported, but this comes at the cost of predictable completion dates and often the stability of your software.

I would say you should push back, ensure all requests go in the backlog and are prioritised. Increase the amount of time spent testing, ensure that what you release is exactly what was asked for and has no bugs. You are better off doing a single monthly release with zero defects, than 4 buggy weekly releases.

However, I would also say, 3 weeks is too long for a sprint. Move to 1 week sprints.

Stop cherry-picking! Merge bug fixes into branches. Try to move towards a single codebase, not away from it.

Hire more people. Whatever methodology you choose, there are only so many hours in the day. If you have more work, you need more people.

Re: Sprint length

With any sprint system the turnaround time is at minimum three sprints, i.e. the customer sees the result of sprint 1, has a play, and suggests changes, which go on the backlog during sprint 2 and are maybe completed in sprint 3.

In this specific case the customers don't always install every sprint, and don't get back promptly, so they would be looking at installing after 6 weeks, reporting problems in, say, weeks 9-11 for implementation by week 15, and then installing the fix in week 21. That's almost a 6-month cycle.

With week-long sprints, a customer can install in week 6, get their requests in during weeks 9-11, and install the fixes at their next 6-week cycle in week 12. A much faster cycle.

Going shorter than a single-week sprint isn't practical in my experience, just in terms of the cadence of business and a working week. 'Done by next week' is acceptable. 'Done in two weeks, because we are working on your other stuff atm' is an understandable delay. 'Done in 4 weeks, because Scrum' is unacceptable.

Other tips

I am not sure you have put your finger on exactly what the problem is, especially since taking on the new large customer. Are you starting to miss sprint goals or deadlines? Are you seeing quality suffer? Are you seeing throughput or velocity drop? Are you losing prospective customers to the competition because you are no longer working as much on your backlog tasks (which I assume are related to your roadmap)? What has changed, other than that the new customer has shifted your work mix towards more maintenance and bug-fix work, and that you are concerned your lead time to fix issues might go up in the near future?

Having a large number of customers who are giving you feedback is a good thing! And not being able to - at least temporarily - work on your product roadmap is not a bad thing. Perhaps, this is the time to put your house in order, learn about and implement DevOps, get your test automation in place, reskill some of your team members.

Here, Kanban can help you. You don't have to give up Scrum; you can use Kanban on top of it (Scrumban, as many people call it) to help you improve and resolve issues. Kanban can help in several ways:

It can help visualize your process - and work - on a Kanban board in more detail than is perhaps currently being done, to help you highlight which steps in your dev and deployment process might be the bottlenecks.

It will start giving you data on your current performance, to help you benchmark important metrics such as lead time and throughput and to help you set up improvement goals. More about Kanban metrics here.

If you are worried about increased lead time due to too much work, Kanban recommends WIP (work-in-progress) limits, which you can set at each stage of your workflow to encourage your team members to finish what they already have on their plate before taking up the next work item. Limiting WIP will help you improve throughput and reduce cycle time.

Kanban also gives you the ability to visualize and manage different types of work - customer work, backlog or roadmap work, technical debt, etc. - using different swimlanes on a Kanban board. That way, both the team as well as your stakeholders, get a full picture of just how much work there is across these different categories, that the team can take up when they have spare capacity. Kanban's "class of service" categorization of work into Expedited, Standard, Fixed Date and Intangible, helps everyone also understand what might be the cost of delay associated with each of the work items the team is expected to do, and perhaps discard stuff that has low or no cost of delay.

Given the complexity of your current situation, you could implement WIP limits both at column-level and for specific swim-lanes.
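To make the WIP-limit idea concrete, here is a minimal Python sketch of a board that refuses to pull a card when a column-level or swimlane-level limit would be exceeded. The class names, lane names, and limit values are all invented for illustration; any real Kanban tool will enforce or at least display such limits for you.

```python
from dataclasses import dataclass, field


class WipLimitExceeded(Exception):
    """Raised when pulling a card would exceed a WIP limit."""


@dataclass(eq=False)  # compare cards by identity, not by field values
class Card:
    title: str
    lane: str             # hypothetical lane names: "support", "roadmap", ...
    column: str = "todo"


@dataclass
class Board:
    column_limits: dict   # column name -> max cards in that column
    lane_limits: dict     # (column, lane) -> max cards of that lane in that column
    cards: list = field(default_factory=list)

    def count(self, column, lane=None, exclude=None):
        """Count cards in a column, optionally restricted to one lane."""
        return sum(
            1
            for c in self.cards
            if c is not exclude
            and c.column == column
            and (lane is None or c.lane == lane)
        )

    def pull(self, card, column):
        """Move a card into a column unless a limit would be exceeded."""
        col_limit = self.column_limits.get(column)
        if col_limit is not None and self.count(column, exclude=card) >= col_limit:
            raise WipLimitExceeded(f"column {column!r} is at its limit of {col_limit}")
        lane_limit = self.lane_limits.get((column, card.lane))
        if (
            lane_limit is not None
            and self.count(column, card.lane, exclude=card) >= lane_limit
        ):
            raise WipLimitExceeded(
                f"lane {card.lane!r} in column {column!r} is at its limit of {lane_limit}"
            )
        card.column = column
        if not any(c is card for c in self.cards):
            self.cards.append(card)


# Assumed limits: at most 3 cards in "doing", of which at most 1 support ticket.
board = Board(column_limits={"doing": 3}, lane_limits={("doing", "support"): 1})
first_ticket = Card("Database cleanup", "support")
second_ticket = Card("User training", "support")
board.pull(first_ticket, "doing")
try:
    board.pull(second_ticket, "doing")
except WipLimitExceeded as exc:
    print(exc)  # the second support ticket has to wait
```

The point of the lane limit is that support work cannot silently crowd out roadmap work: the second ticket stays in "todo" until the first one is finished.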

Before you do any of this, you of course need a discussion (perhaps several) with your dev teams and your stakeholders to understand what and where the problems are, and what you need to do to resolve them. These discussions should help you come up with explicit policies, something else the Kanban Method recommends, for handling backlog tasks vs. customer feedback: perhaps changing your sprints to every 2 weeks and making a customer release only every 3 sprints, allocating capacity to each type of work, taking a regular look at your overall backlog and prioritizing the top 5 or 10 items to be taken up next, etc.

There is no question that Kanban can help you analyze and improve your current dev management practices. You don't need to necessarily give up Scrum for that. If you'd like to learn more about Scrumban, you can look up here.

HTH - all the best!

There are a few things in your question which make me think that this is more than a process problem. There are already some great answers about work item management.

I agree that measuring lead/cycle times is the best way to analyse support efficiency, but make sure that you measure these times from when the customer opened the ticket until it's closed in the customer's environment. Vanity development metrics mean nothing; all they'll do is make dev look good at the expense of the customer.
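As a rough illustration of the difference between the two measurements, here is a small Python sketch. The ticket timestamps are invented, and the three-field ticket shape (opened by customer, fix merged by dev, closed in the customer's environment) is an assumption about what your ticket system can export.

```python
from datetime import datetime
from statistics import median

# Invented example data:
# (opened by customer, fix merged by dev, closed in customer environment)
TICKETS = [
    (datetime(2021, 1, 4), datetime(2021, 1, 6), datetime(2021, 2, 15)),
    (datetime(2021, 1, 11), datetime(2021, 1, 12), datetime(2021, 2, 15)),
    (datetime(2021, 1, 18), datetime(2021, 1, 25), datetime(2021, 3, 29)),
]


def days_between(start, end):
    """Elapsed time between two timestamps, in days."""
    return (end - start).total_seconds() / 86400


def median_lead_times(tickets):
    """Return (median dev-side days, median customer-side days)."""
    dev_days = [days_between(opened, merged) for opened, merged, _ in tickets]
    customer_days = [days_between(opened, closed) for opened, _, closed in tickets]
    return median(dev_days), median(customer_days)


dev_median, customer_median = median_lead_times(TICKETS)
print(f"dev-side median: {dev_median:.0f} days")
print(f"customer-side median: {customer_median:.0f} days")
```

With this made-up data the dev-side number looks great (2 days) while the customer actually waited 42 days, which is the gap a vanity metric hides.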

I agree that sprints are the best way to structure the work and allow you to pivot. I'm not completely sold on the 1-week sprint suggestion; that depends on how much effort is required to deploy at the end (you may want to look into automating deployments through something like Octopus).

However, the big point I want to raise, which no one else has, concerns your comment about multiple branches.

From what you describe, you have multiple branches of your code base, and any fix then has to be merged between them. You're moving in the wrong direction here. What happens if you need a new flavour of your software? What happens if you take on a new client? Your source control will grow and grow and become less and less manageable.

You need to move towards trunk based development (or at least short lived feature branches). You can use feature toggles/roles/configuration to customise your software for each client/project.

Doing this will:

Reduce bug fixing/feature implementation overhead
Allow you to consolidate your testing down to a single branch (making testing more efficient)
Allow you to build a proper CI pipeline where EVERY build is regression tested
Allow you to develop a proper deployment process without quirks of individual branches
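For the feature-toggle part, a minimal sketch in Python might look like the following. The feature names, client names, and the in-code dictionaries are purely hypothetical; in practice the overrides would more likely live in configuration or a database than in source.

```python
# Defaults shipped to everyone from the single trunk.
DEFAULT_FEATURES = {
    "new_invoice_layout": False,
    "bulk_export": False,
}

# Hypothetical per-client overrides; only the differences are listed.
CLIENT_OVERRIDES = {
    "branch_north": {"bulk_export": True},
    "branch_south": {"new_invoice_layout": True},
}


def is_enabled(feature, client):
    """Return whether a feature is on for a client, falling back to the default."""
    if feature not in DEFAULT_FEATURES:
        raise KeyError(f"unknown feature: {feature}")
    return CLIENT_OVERRIDES.get(client, {}).get(feature, DEFAULT_FEATURES[feature])


def render_invoice(client):
    """One code path in one codebase; behaviour differs by configuration only."""
    if is_enabled("new_invoice_layout", client):
        return "invoice (new layout)"
    return "invoice (classic layout)"


print(render_invoice("branch_north"))
print(render_invoice("branch_south"))
```

Every client runs the same build, so a bug fix lands once and regression tests can exercise both toggle states on the same branch.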

The easier it is to push code out, the easier it is to develop features, and the easier it is to deploy fixes. This means that the support work your team is currently facing becomes easier. If you don't know what's going on, deploy more logging. If you've found a bug which needs to be resolved today, don't DLL-patch it; just deploy a new version.

One final point. You're concerned about the amount of time going into support work, and you should be: it's a lot, and every interruption costs you guys time. Define the criteria for work to be brought into the sprint and, when something meets them, push an equivalent work item out. My suggestion would be to make your "P1 criteria" the same as your 24-hour callout criteria. After all, if it's not important enough for an analyst to get out of bed at 2am, it's probably not important enough to sacrifice the most important work on your backlog.
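The "bring one in, push one out" rule could be sketched like this in Python. The item fields and the reading of "equivalent" as "at least as many story points, taken from the least important planned items" are my own assumptions, not a standard rule.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    points: int
    priority: int  # lower number = more important (assumed convention)


def admit_interrupt(sprint, ticket, is_p1):
    """Return (new_sprint, pushed_out).

    Non-P1 tickets never enter the sprint; the caller puts them on the backlog.
    A P1 ticket displaces the least important planned items until at least
    as many story points have been freed as the ticket needs.
    """
    if not is_p1:
        return list(sprint), []
    remaining = sorted(sprint, key=lambda item: item.priority)
    pushed_out = []
    freed = 0
    while remaining and freed < ticket.points:
        dropped = remaining.pop()  # last element = least important
        pushed_out.append(dropped)
        freed += dropped.points
    return remaining + [ticket], pushed_out


sprint = [Item("Feature A", 5, 1), Item("Feature B", 3, 2), Item("Cleanup C", 2, 3)]
outage = Item("P1: production outage", 4, 0)
new_sprint, pushed = admit_interrupt(sprint, outage, is_p1=True)
print([item.title for item in new_sprint])
print([item.title for item in pushed])
```

Here the outage displaces "Cleanup C" and "Feature B", which makes the cost of the interruption visible to the business instead of being absorbed silently by the team.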

When measuring, record the story points (SPs) of the work which you planned in and which was completed. The gap between planned and completed represents the interrupting support work. Your concern should be "how much of the planned work did we deliver?" Support issues you didn't drop everything to resolve can be planned in, but prioritise them against your backlog items.
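As a worked example of that measurement, here is a tiny Python sketch. The sprint numbers are invented, and treating the whole gap as interrupt work is the simplification described above, not an exact accounting.

```python
def sprint_summary(planned_points, completed_planned_points):
    """Summarise how much of the planned work was delivered in one sprint."""
    if planned_points <= 0:
        raise ValueError("planned_points must be positive")
    if not 0 <= completed_planned_points <= planned_points:
        raise ValueError("completed points must be between 0 and planned points")
    return {
        "delivered_ratio": completed_planned_points / planned_points,
        # The gap is attributed to interrupting support work.
        "gap_points": planned_points - completed_planned_points,
    }


# Invented history: (planned SPs, planned SPs actually completed) per sprint.
history = [(40, 26), (40, 22)]
for planned, completed in history:
    summary = sprint_summary(planned, completed)
    print(f"delivered {summary['delivered_ratio']:.0%}, gap {summary['gap_points']} SPs")
```

Tracking this per sprint gives you a trend line you can show the business when agreeing the P1 criteria.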

TLDR

Measure how long it takes between a customer raising an issue and them closing it
Move towards a trunk based development model
Consider automating deployments
Agree P1 criteria with the business, that's what interrupts your sprint

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange