A few years ago, a major release got delayed by several weeks because defects kept surfacing once the code reached the staging environment. We "thought" we had decent QA practices in place, but apparently something was wrong. The team leads, managers, and director got together, discussed what we were going to do about the problem, and decided to focus on the quality of features rather than the quantity over the next several releases. The final outcome of the meeting was the following:
  • Teams would no longer expand iterations as needed (which had been happening); instead, we would stick to a fixed time frame.
  • More time would be allocated at the end of each iteration for testing purposes.
  • A formal set of criteria would be written up for code reviewers to follow. It included things like running through all test cases, testing corner case scenarios, running profilers, etc.
While I completely agreed with the first change, I was at odds with the other two. Those two items were stated as a formal change in our process, but at the same time the tech leads were told they had the authority to allow each team to decide its own internal process. Rather than bring up the contradiction, I focused on the "allow the team to decide its own" part.

Here is a short overview of some of the problems I saw on my team that ultimately contributed to the lack of quality:
  • Stories were not considered Accepted until after the iteration was complete. Acceptance effectively meant the team had finished the iteration and was moving on to the next one. In other words, there was no clear definition of what Accepted meant.
  • Developers tended to work on an entire story alone, calling for a code review and checking it in all at once.
  • QA Analysts would prioritize writing test cases over testing stories that were finished and pushed to integration.
  • There was a designated code reviewer, who was responsible not only for making sure the code followed conventions and spotting potential issues, but also for running through all of the test cases with the developer and testing the functionality before giving the green light to check it in. When defects were found post-iteration, more focus was put on the code reviewer for not finding the issue than on QA or the original author.
  • The team allocated a block of time at the end of the iteration to perform QA. So in a 3-week iteration, we would typically block off 3 or 4 days just for testing.

Acceptance and Definition of "Done"

As the technical lead, I wanted to gather as much data as I could for our retrospective meetings. After deciding that team velocity was a good place to start, I sat down to compute it. I quickly realized I couldn't: there was no way to know whether the team had earned the points for a particular story, because there was no definition of done. If a story isn't done, we don't get the points. We were missing this, along with any kind of story acceptance during the iteration.

I felt that part of the definition of done should be that the story was accepted by our QA Analyst during the iteration. Rather than define it myself, I wanted the team to be involved so there would be buy-in from them. When I brought this up in our next retrospective, developing the definition became an action item so we could start tracking our velocity. The definition eventually included a bullet point saying that QA had to accept the story. The QA Analyst was suddenly given this enormous power, and the team was more supportive in doing whatever was needed to get the QA Analyst to accept the story. When QA accepted a story, they would update its state in our process management software so everyone could see what was accepted and what wasn't.
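
To make this concrete, here's a minimal sketch of the velocity calculation once stories carry an explicit accepted state. The Story record, point values, and story names are hypothetical, not our actual tooling; the key point is that without an accepted state there's nothing to sum:

```java
import java.util.List;

// Hypothetical Story record for illustration: a story only earns its
// points once QA has accepted it during the iteration.
record Story(String name, int points, boolean acceptedByQa) {}

public class Velocity {
    static int velocity(List<Story> iteration) {
        return iteration.stream()
                .filter(Story::acceptedByQa) // only accepted stories count
                .mapToInt(Story::points)
                .sum();
    }

    public static void main(String[] args) {
        List<Story> iteration = List.of(
                new Story("Export report", 5, true),
                new Story("Audit log", 3, false)); // finished, but never accepted
        System.out.println("Velocity: " + velocity(iteration)); // prints: Velocity: 5
    }
}
```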

Stop "Shot Gunning" the Iteration

A single story was usually divided up into the layers of the application: a DB/DAL task, a service layer task, and a UI task. The team's natural tendency was for each person to pick up a story, start with one of the outer layers, and work their way to the other side. This had some unintended side effects. For starters, there's very little collaboration within the team when each member is in their own silo working on a story. Stories also take longer to complete, and because only one developer is working on each, there's a rush of completed stories near the end. I like to think of this as the "shot gun" approach because at the start of the iteration, each of the six developers shoots out of the gate on their own story. What we found works better is to have as few in-progress stories as possible. Instead of one developer, two developers work on a story: they decide upon the interfaces between the layers, check those into source control, and then simultaneously fill in the implementations (see the sketch below). We found that stories got done faster, the team started to gel better, and stories became available for QA at a steadier pace. We basically discovered the obvious: if you start testing your stories sooner, you'll usually have fewer defects later.
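
To illustrate, here's a hypothetical sketch of what the pair might check in first; the OrderService name and its methods are invented for this example, not from our codebase. Once the contract is in source control, one developer implements it on top of the DAL while the other builds the UI against a stub:

```java
// The agreed-upon contract between the service layer and the UI.
// The pair checks this in before writing any implementation.
public interface OrderService {
    Order findOrder(long orderId);
    long placeOrder(Order order);
}

// A simple data carrier shared by both layers.
record Order(long id, String status) {}

// Developer B builds the UI against a stub like this while Developer A
// implements the real service on top of the DAL.
class StubOrderService implements OrderService {
    public Order findOrder(long orderId) { return new Order(orderId, "PENDING"); }
    public long placeOrder(Order order)  { return 42L; } // canned id for UI work
}
```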

QA Priorities

In a given iteration, the QA Analyst would start writing the test cases for each story in priority order. Several days in, the team would finish a user story and push the code into our integration environment for QA to test. But since the test cases for the other stories weren't complete yet, the pushed stories would just sit there while the team member(s) moved on to the next story. The result was the development team moving ahead in the process and leaving the QA Analyst behind, with no chance to catch up until the last few days of the iteration. By the end, several stories hadn't been looked at by QA at all, and the test cases for a story or two were pushed out to the next iteration. I encouraged our QA Analyst to immediately stop working on test cases as soon as a story was available to be tested. I also encouraged the developers to stop working on new stories if QA found defects during the iteration. This kept the team from getting ahead of the QA Analyst; however, it also made it clear that we either needed another QA resource or needed to allow a developer to write test cases.
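
The rule of thumb we landed on boils down to a simple priority check. This sketch is purely illustrative (the states and task strings are hypothetical), but it captures the policy:

```java
import java.util.List;

// Hypothetical board states for illustration.
enum StoryState { NEEDS_TEST_CASES, READY_TO_TEST, DEFECT_OPEN, IN_PROGRESS }

public class NextTask {
    // Testing a pushed story always beats writing more test cases.
    static String nextQaTask(List<StoryState> board) {
        return board.contains(StoryState.READY_TO_TEST)
                ? "Test the pushed story"   // don't let stories sit in integration
                : "Write test cases";
    }

    // Fixing an open defect always beats starting a new story.
    static String nextDevTask(List<StoryState> board) {
        return board.contains(StoryState.DEFECT_OPEN)
                ? "Fix the open defect"     // don't run ahead of QA
                : "Start the next story";
    }

    public static void main(String[] args) {
        List<StoryState> board = List.of(StoryState.READY_TO_TEST, StoryState.IN_PROGRESS);
        System.out.println(nextQaTask(board));  // Test the pushed story
        System.out.println(nextDevTask(board)); // Start the next story
    }
}
```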

Improvement?

Based on what I've discussed so far, here's a look at the distribution of stories for each iteration, in chronological order. What I'd like to point out is the improvement from having no stories accepted within an iteration to having stories accepted and completed throughout it.

[Chart: stories per iteration in chronological order; yellow shows the gap between a story being completed and QA accepting it, green shows accepted stories.]

Notice how the green bars become more prevalent as we improve from iteration to iteration. You'll also notice that the yellow areas shrink over time, which indicates that the lag between a story being completed and QA accepting it was reduced.

In a future article, I'll discuss other changes that, in combination with the above, allowed the team to go from about 20 defects post-iteration to 1 or 2.

1 comment:

  1. Great analysis. I like that you were able to show the result over time of implementing those changes.
