Thursday, January 24, 2013

The Evolution of Quality on an Agile Team

A few years ago, a major release was delayed by several weeks because defects kept surfacing when the code went to the staging environment. We "thought" we had decent QA practices in place, but apparently something was wrong. The team leads, managers, and director all got together, discussed what we were going to do about the problem, and decided to focus on the quality of features, rather than the quantity, over the next several releases. The final outcome of the meeting was the following:
  • Teams would no longer extend iterations as needed (which had been happening); we would stick with a fixed time frame.
  • More time would be allocated at the end of each iteration for testing purposes.
  • A formal set of criteria would be written up for code reviewers to follow. It included things like running through all test cases, testing corner case scenarios, running profilers, etc.
While I completely agreed with the first change, I was at odds with the other two. They were stated as a formal change to our process, yet at the same time the tech leads were told they had the authority to let each team decide its own internal process. Rather than bring up the contradiction, I focused on the "let the team decide its own process" part.

Here is a short overview of some of the problems I saw on my team that ultimately contributed to the lack of quality:
  • Stories were not considered Accepted until after the iteration was complete; in effect, "Accepted" simply meant the team had finished the iteration and was moving on to the next one. In other words, there wasn't a clear definition of what Accepted meant.
  • Developers tended to work on an entire story alone, calling for a code review and checking it in all at once.
  • QA Analysts would prioritize writing test cases over testing stories that were finished and pushed to integration.
  • There was a designated code reviewer, who was responsible not only for making sure the code followed conventions and finding potential issues, but also for running through all of the test cases with the developer and testing the functionality before giving the green light to check it in. When defects were found post-iteration, more blame was put on the code reviewer for not finding the issue than on QA or the original code author.
  • The team allocated a block of time at the end of the iteration to perform QA. So in a 3-week iteration, we would typically block off 3 or 4 days just for testing.

Acceptance and Definition of "Done"

As the technical lead, I wanted to gather as much data as I could for our retrospective meetings. After deciding that team velocity was a good place to start, I sat down to compute it and quickly realized that I couldn't: there was no way to know whether the team had earned the points for a particular story, because there was no definition of done. If the story isn't done, we don't get the points. We were missing this, along with any kind of story acceptance during the iteration. I felt that part of the definition of done should be that the story was accepted by our QA Analyst during the iteration. Rather than define it myself, I wanted the team to be involved so there would be buy-in from them. When I brought this up in our next retrospective, it became an action item to develop this definition so we could start tracking our velocity. The definition eventually included a bullet point stating that QA had to accept the story. The QA Analyst was suddenly given this enormous power, and the team became more supportive in doing whatever was needed to get the QA Analyst to accept a story. When QA accepted a story, they would update its state in our process management software so everyone could see what was accepted and what wasn't.
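
As a minimal sketch of how this plays out (hypothetical names, and Python chosen purely for illustration; this wasn't our actual tooling), velocity becomes a simple sum over accepted stories:

    from dataclasses import dataclass

    @dataclass
    class Story:
        title: str
        points: int
        accepted_by_qa: bool = False  # flipped when the QA Analyst accepts the story

    def velocity(stories):
        # A story only earns its points once QA has accepted it.
        return sum(s.points for s in stories if s.accepted_by_qa)

    iteration = [
        Story("story A", 5, accepted_by_qa=True),
        Story("story B", 3),  # completed but never accepted, so it earns nothing
    ]
    print(velocity(iteration))  # prints 5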

Stop "Shot Gunning" the Iteration

A single story was usually divided up into the layers of the application: a DB/DAL task, a service layer task, and a UI task. The natural tendency of the team was for each person to pick up a story, start with one of the outer layers, and work their way to the other side. This has some unintended side effects. For starters, there's very little collaboration within the team when each member is in their own silo working on a story. Stories also take longer to complete, causing a rush of completed stories near the end of the iteration, because only one developer is working on each story. I like to think of this as the "shot gun" approach because at the start of the iteration, each of the six developers shoots out of the gate on their own story. What we found works better is to have as few in-progress stories as possible. Instead of one developer, two developers work on a story. The two developers agree on the interfaces between the layers, check those into source control, and then simultaneously work on filling in the implementations of those interfaces. We found that stories got done faster, the team started to gel better, and stories became available for QA at a steadier pace. We basically discovered the obvious: if you start testing your stories sooner, you'll usually have fewer defects later.
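
Here's a rough illustration of that interface-first approach (hypothetical names, and Python only for brevity; our actual layers weren't Python). The pair's first check-in is just the agreed-upon contract, and then both developers implement against it in parallel:

    from abc import ABC, abstractmethod

    # Checked in first: the contract both developers agree on up front.
    class OrderRepository(ABC):
        @abstractmethod
        def find_status(self, order_id: int) -> str: ...

    # Developer 1 fills in the DB/DAL side of the contract...
    class SqlOrderRepository(OrderRepository):
        def find_status(self, order_id: int) -> str:
            # ...real data access would go here...
            return "shipped"

    # ...while Developer 2 simultaneously builds the service layer against
    # the interface, without waiting for the DAL implementation to land.
    class OrderService:
        def __init__(self, repository: OrderRepository):
            self.repository = repository

        def describe(self, order_id: int) -> str:
            return "Order %d is %s" % (order_id, self.repository.find_status(order_id))

    print(OrderService(SqlOrderRepository()).describe(42))  # Order 42 is shipped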

QA Priorities

In a given iteration, the QA Analyst would start working on the test cases for each story in priority order. Several days into the iteration, the team would finish a user story and push the code into our integration environment for QA to test. However, since the test cases for the other stories weren't complete yet, the pushed stories would just sit there while the team member(s) moved on to the next story. What we ended up with was the development team racing ahead and leaving the QA Analyst behind. The only time they could catch up was during the last few days of the iteration, so we would finish with several stories that QA never got a chance to look at, and test cases for a story or two pushed out to the next iteration. I encouraged our QA Analyst to immediately stop working on test cases as soon as a story was available to be tested. I also encouraged the developers to stop working on new stories if QA found defects during the iteration. This helped keep the team from getting ahead of the QA Analyst; however, it also made it clear that we either needed another QA resource or needed to allow a developer to write test cases.

Improvement?

Based on what I've discussed so far, here's a look at the distribution of stories for each iteration in chronological order. What I'd like to point out is the improvement from having no stories accepted in an iteration to having stories accepted and completed throughout the iteration.

[Chart: story distribution per iteration; green marks stories accepted during the iteration, yellow marks the gap between a story being completed and QA accepting it.]
Notice how the green bars become more prevalent throughout the iteration as we improve from iteration to iteration. You'll also notice that the amount of yellow decreased over time, which indicates that the time between a story being completed and QA accepting it was reduced.

In a future article, I'll discuss other changes that, in combination with the above, allowed the team to go from about 20 defects post-iteration to 1 or 2.

Thursday, January 17, 2013

SCM Branching Patterns for Agile Teams

Many patterns exist for software configuration management, and none of them are a panacea for all of your SCM problems. Things get especially complicated when you have more than one team working on features for your product. To frame the discussion, I'll start with a simple branching strategy that works well for a single team, and then show what can happen when you introduce multiple teams into the mix.

Before discussing the different patterns, I'd like to clarify how to think about branches. Branches become evil when they're abused (and it's easy to abuse them). Each time you branch, you have more to maintain, more to merge, and usually more headaches. So when you decide to branch, you need a good reason to justify those headaches. These justifications can be represented in the form of policies for the branch. A branch policy is a set of statements that define the rules for checking into that branch. If a set of branches share the same policy, they generally should be merged into one branch. There are exceptions to this, such as code isolation, but it holds true for the most part.

A Basic Strategy

A simple branching strategy for a development team could be the following:

  • Main
    • Code builds
    • Only potentially releasable code is checked into this branch.
    • Code passes all unit tests.
  • Release
    • Only regression tested, released code is checked into this branch.
    • Only bug fixes related to the released version should be checked in.

The idea is that each developer works on user stories and only checks code in when the branch policy has been satisfied. Once all user stories in the Main branch are ready to be released, a branch is created from Main. Developers continue to work on the next release out of the Main branch while maintenance fixes are checked into the Release branch. At some point before the next release, the bug fixes in the Release branch are merged back into the Main branch so they'll be present in subsequent releases.
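
As a concrete sketch of that flow, assuming a Git repository with hypothetical branch and version names (the pattern itself applies to any SCM tool):

    # Cut the Release branch once everything in Main is releasable
    git checkout main
    git branch release-2.1

    # A maintenance fix lands on the Release branch
    git checkout release-2.1
    git commit -am "Fix defect found in the 2.1 release"

    # Before the next release, merge the fixes back into Main
    git checkout main
    git merge release-2.1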

Let's discuss what happens when we scale this to 4 development teams. Right away, teams will notice that the build breaks more often. Why? Because instead of 6 or 7 people checking in code, we now have close to 30 people checking in code. As humans, we make mistakes, and now those mistakes happen more often; when they do, they hurt everyone. Another thing you'll start to see is fewer check-ins from developers. The reason is that every time someone checks in code, everyone else has to pull it down and merge. When this happens, you have to retest what you were working on to make sure it still works, which takes more time. You'll also see the opposite: more check-ins, but ones that break the branch policy. A small number of developers will get tired of merging code and will be more prone to checking in code that hasn't been fully tested. What's needed is some sort of isolation among the teams, so that each team is less likely to interfere with another.

Multi-Team Strategy

The pattern I've seen work well is for each team to have its own branch, along with an integration branch (Main) to integrate the features being developed in the team branches.

  • Team
    • Code reviewed
    • Code builds
    • Code passes all unit tests
  • Main
    • Only potentially releasable code is checked into this branch.
    • Must be regression tested
  • Release
    • Only regression tested, released code is checked into this branch.
    • Only bug fixes related to the released version should be checked in.

[Diagram: Team A and Team B branches merging into Main, with the Release branch cut from Main.]

Team A and Team B are both working on user stories. Team A finishes story A.1 and merges that code into the Main branch. Later, Team B finishes story B.1 and, before pushing it to Main, pulls down the latest code from Main (which now contains A.1). Once the Team B branch is in sync with Main, the code is merged into Main. The cycle continues until the end of the iteration. When it's time to cut a release, the Release branch is branched from Main. Bugs found in the release are fixed in the Release branch and merged down into the Main branch, from where they eventually make their way into the team branches.
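
Again assuming Git with hypothetical branch names, Team B's sync-and-merge step looks roughly like this:

    # Team B finishes story B.1 on its own branch
    git checkout team-b
    git commit -am "Complete story B.1"

    # Pull the latest Main (which now contains A.1) into the team branch first
    git merge main
    # ...resolve any conflicts, build, and rerun the tests...

    # Once team-b is in sync with Main, merge the story up
    git checkout main
    git merge team-b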

With this pattern, each team is free to check in and work as they did with the first pattern. In fact, the team branch policy is more relaxed, because all of the QA and integration happens in the Main branch, which allows for more intermediate check-ins into the team branch. As with developer workspaces, the more often you pull down from the Main branch and merge, the easier each merge becomes.