Thursday, January 24, 2013

The Evolution of Quality on an Agile Team

A few years ago, a major release got delayed by several weeks because defects kept coming up when the code went to the staging environment. We "thought" we had decent QA practices in place, but apparently something was wrong. The team leads, managers, and director got together to discuss what we were going to do about the problem and decided to put the focus on the quality of features, rather than the quantity, over the next several releases. The final outcome of the meeting was the following:
  • Teams would no longer expand the iterations as needed (which was something that had been going on); we were going to stick with a fixed time frame.
  • More time would be allocated at the end of each iteration for testing purposes.
  • A formal set of criteria would be written up for code reviewers to follow. It included things like running through all test cases, testing corner case scenarios, running profilers, etc.
While I completely agreed with the first change, I was at odds with the other two. Those two items were stated as a formal change in our process, but at the same time the tech leads were told they had the authority to allow the team to decide its own internal process. Rather than bring up the contradiction, I focused more on the "allow the team to decide its own" part.

Here is a short overview of some of the problems I saw on my team that ultimately contributed to the lack of quality:
  • Stories were not considered Accepted until after the iteration was complete. The definition of acceptance was that the team finished the iteration and was moving on to the next iteration. In other words, there wasn't a clear definition of what Accepted meant.
  • Developers tended to work on an entire story alone, calling for a code review and checking it in all at once.
  • QA Analysts would prioritize writing test cases over testing stories that were finished and pushed to integration.
  • There was a designated code reviewer, who was responsible not only for making sure the code followed conventions and finding potential issues, but also for running through all of the test cases with the developer and testing the functionality before giving the green light on checking it in. When defects were found post iteration, more focus was put on the code reviewer for not finding the issue than on QA or the original code author.
  • The team allocated a block of time at the end of the iteration to perform QA. So in a 3 week iteration, we would typically block off 3 or 4 days for just testing.

Acceptance and Definition of "Done"

As the technical lead, I wanted to gather as much data as I could for our retrospective meetings. After deciding that team velocity was a good place to start, I sat down to compute our velocity. I quickly realized that I couldn't: without a definition of done, there was no way to know whether the team had earned the points for a particular story. If the story isn't done, then we don't get the points. We were missing this, along with any type of story acceptance during the iteration. I felt that part of the definition of done should be that the story needed to be accepted by our QA Analyst during the iteration. Rather than define it myself, I wanted the team to be involved so there would be buy-in from them. When I brought this up in our next retrospective, it became an action item to develop this definition so we could start tracking our velocity. The definition eventually included a bullet point that said QA had to accept the story. The QA Analyst was suddenly given this enormous power, and the team became more supportive in doing whatever was needed to get the QA Analyst to accept the story. When QA accepted a story, they would update its state in our process management software so everyone could see what was accepted and what wasn't.

Stop "Shot Gunning" the Iteration

A single story was usually divided up into the layers of the application. There would be a DB/DAL task, a service layer task, and a UI task. The natural tendency of the team was for each person to pick up a story, start with one of the outer layers, and work their way to the other side. This has some unintended side effects. For starters, there's very little collaboration within the team if each team member is in their own silo working on a story. The stories take longer to complete and cause a rush of completed stories near the end, because only one developer is working on each story. I like to think of this as the "shot gun" approach because at the start of the iteration, each of the 6 developers shoots out of the gate on their own story. What we found works better is to have as few in-progress stories as possible. Instead of one developer, two developers work on a story. The two developers decide upon the interfaces between the layers, check that into source control, and then simultaneously work on filling in the implementation of those interfaces. We found that stories got done faster, the team started to gel better, and stories were made available for QA at a steadier pace. We basically discovered the obvious: if you start testing your stories sooner, you'll usually have fewer defects later.

QA Priorities

In a given iteration, the QA Analyst would start working on the test cases for each story in priority order. Several days into the iteration, the team would finish a user story and push the code into our integration environment for QA to test. However, since the test cases for all of the other stories weren't complete, the stories that were pushed would just sit there and the team member(s) would move on to the next story. What we ended up with was the development team moving on in the process and leaving the QA Analyst behind. The only time they could catch up was during the last few days of the iteration. The result was several stories that QA didn't get a chance to look at, and test cases for a story or two being pushed out to the next iteration. I encouraged our QA Analyst to stop working on test cases as soon as a story was available to be tested. I also encouraged the developers to stop working on new stories if there were defects found by QA during the iteration. This helped keep the team from getting ahead of the QA Analyst; however, it also made it clear that we either needed another QA resource or needed to allow a developer to write test cases.

Improvement?

Based on what I've discussed so far, here's a look at the distribution of stories for each iteration in chronological order. What I'd like to point out here is the improvement from having no stories accepted in an iteration to having stories accepted and completed throughout the iteration.

[Chart: story distribution for each iteration — green = accepted stories, yellow = stories completed but awaiting QA acceptance]

Notice how the green bars become more prevalent throughout the iteration as we improve from iteration to iteration. You'll also notice that the amount of yellow area decreased over time, which indicates that the time between a story being completed and QA accepting it was reduced.

In a future article, I'll discuss other changes that, in combination with the above, allowed the team to go from about 20 defects post-iteration to 1 or 2.

Thursday, January 17, 2013

SCM Branching Patterns for Agile Teams

Many patterns exist for software configuration management, none of which are a panacea for all of your SCM problems. Things get especially complicated when you have more than one team working on features for your product. To start off the discussion, I'll describe a simple branching strategy that works well for a single team, and then show what can happen when you introduce multiple teams into the mix.

Before discussing the different patterns, I'd like to clarify how to think about branches. Branches become evil when they're abused (and it's easy to abuse them). Each time you branch, you have more to maintain, more to merge, and usually more headaches. So when you decide to branch, you have to come up with a good reason to justify the headaches. These justifications can be represented in the form of policies for the branch. A branch policy is a set of statements that define the rules for checking into that branch. If a set of branches have the same policy, then they generally should be merged into one branch. There are exceptions to this, such as code isolation, but it holds true for the most part.

A Basic Strategy

A simple branching strategy for a development team could be the following:

  • Main
    • Code builds
    • Only potentially releasable code is checked into this branch.
    • Code passes all unit tests.
  • Release
    • Only regression tested, released code is checked into this branch.
    • Only bug fixes related to the released version should be checked in.

The idea is that each developer works on user stories and only checks code in when the branch policy has been satisfied. Once all user stories in the Main branch are ready to be released, a branch is created from Main. Developers continue to work on the next release out of the Main branch while maintenance fixes are checked into the Release branch. At some point before the next release, the bug fixes in the Release branch are merged back into the Main branch so they'll be present in subsequent releases.

Let's discuss what happens when we scale this to 4 development teams. Right away, teams will notice that the build breaks more often. Why? Because instead of 6 or 7 people checking in code, we now have close to 30 people checking in code. As humans, we make mistakes, and now those mistakes are going to happen more often; when they do, they hurt everyone. Another thing you'll start to see is fewer check-ins from developers. The reason is that every time someone checks in code, everyone else has to pull it down and merge. When this happens, you have to retest what you were working on to make sure it still works, which takes more time. You'll also see the opposite: more check-ins, but ones that break the branch policy. A small number of developers will get tired of merging code and will be more prone to check in code that hasn't been fully tested. What's needed is some sort of isolation among the teams so that each team is less likely to interfere with another.

Multi-Team Strategy

The pattern that I've seen work well is for each team to have its own branch, plus an integration branch (Main) where the features being developed in the team branches come together.

  • Team
    • Code reviewed
    • Code builds
    • Code passes all unit tests
  • Main
    • Only potentially releasable code is checked into this branch.
    • Must be regression tested
  • Release
    • Only regression tested, released code is checked into this branch.
    • Only bug fixes related to the released version should be checked in.



Team A and Team B are both working on user stories. Team A finishes story A.1 and merges that code into the Main branch. Later, Team B finishes story B.1 and, before pushing it to Main, pulls down the latest code from Main (which contains A.1). Once the Team B branch is in sync with Main, the code is merged into Main. The cycle keeps going until the end of the iteration. When it's time to cut a release, the Release branch is branched from Main. Bug fixes found in the release are fixed in the Release branch and merged down into the Main branch, which will eventually make its way into the team branches.

With this pattern, each team is free to check in and work as they did with the first pattern. In fact, the branch policy is more relaxed because all of the QA and integration is done in the Main branch, which allows for more intermediate check-ins into the team branch. As with developer workspaces, the more often you pull down from the Main branch and merge, the easier it becomes.

Tuesday, October 30, 2012

Data Triggers with Knockout


Knockout and MVVM

Coming from a Windows application development background, I found the MVVM pattern to be one of the better UI design patterns. The basic idea is to separate your business logic (the 'M'), your view (the 'V'), and your view logic (the 'VM'). In this pattern, the view model works without ever having to reference the view (your buttons and widgets). For example, instead of taking the EmployeeName field from your domain model and setting a label to that value, your view model would simply set a public property called EmployeeName. The view would then see that change and update the appropriate label on the UI. This cleanly separates your presentation layer from your domain layer. In order for this to work, there needs to be some mechanism for watching for changes to the view model and updating the UI. WPF had that mechanism built in, but what about other technologies? Specifically, what about Javascript and web development? Enter Knockout.

Knockout allows you to declare data bindings between javascript objects and HTML elements. This can clean up your javascript code and allow for better separation between domain and view logic. Instead of taking a JSON object received from the server and writing code to copy the domain object's properties into the HTML elements, you declare bindings between the domain object and HTML elements and simply set the properties on your domain object. Your javascript doesn't have to know the layout of the HTML or have any other view knowledge. Please refer to the Knockout site for examples.

Data Triggers

Let's take this a step further and say you have a style called "Error" that changes a text box to indicate that there's something wrong with the data in it. Sticking with MVVM, a naive approach to solve this would be to have a property on your view model called TextBoxStyle and bind the style of the text box to that property. This isn't a good approach because your view model would have to have knowledge of the specific style needed, which isn't a concern of the view model. Another approach would be to write some custom code that watches for the view model to indicate there's an error with that field, and then sets the "Error" style on the text box. This resolves the separation of concerns issue, but leaves us with writing custom code, which defeats the purpose of even using a data binding framework. One of the features that WPF gives you with respect to MVVM is Data Triggers. A Data Trigger is a way to tell the view that you want certain styles to be applied when a data condition is met. Knockout doesn't natively support data triggers, but writing an extension to provide that functionality is straightforward.

The general idea behind my solution is that when a condition is met (trigger fired), we record the previous HTML attribute values; when the condition is no longer met (trigger reset), we restore the original values. First, let me show how one would use the trigger.
<button data-bind="enable: viewModel.canSave" 
  type="button" 
  disabled="disabled">
 <img src="~/Content/normal.gif" style="cursor: pointer;"
  data-bind="trigger: {
   condition: !viewModel.canSave(),
   attr: {
    src: '~/Content/disabled.gif'
   },
   style: {
    cursor: 'default'
   }
  }"/>
</button>


First, you place your normal HTML attributes on the element. In the above example, we want the normal image to be shown. Then, when declaring the trigger, you specify the condition and the new attributes and values that should be set when the condition is met. In this example, we're setting a new disabled image and changing the cursor shown when the user mouses over the image. The script for the extension is shown below.
// trigger
//
ko.bindingHandlers.trigger = {
 update: function (element, valueAccessor, allBindingsAccessor) {
  //
  // First get the latest data that we're bound to and the target html element
  var value = valueAccessor();
  var jElement = $(element);

  // If the condition is met, replace the attributes and styles. Otherwise
  // restore the original values.
  if (value.condition) {
   if (value.attr) {
    for (var prop in value.attr) {
     if (value.attr.hasOwnProperty(prop)) {
      if (!element["_" + prop + "_"])
       element["_" + prop + "_"] = jElement.attr(prop);
      jElement.attr(prop, value.attr[prop]);
     }
    }
   }

   if (value.style) {
    for (var prop in value.style) {
     if (value.style.hasOwnProperty(prop)) {
      if (!element["__" + prop + "_"])
       element["__" + prop + "_"] = jElement.css(prop);
      jElement.css(prop, value.style[prop]);
     }
    }
   }
  }
  else {
   if (value.attr) {
    for (var prop in value.attr) {
     if (value.attr.hasOwnProperty(prop)) {
      jElement.attr(prop, element["_" + prop + "_"]);
     }
    }
   }

   if (value.style) {
    for (var prop in value.style) {
     if (value.style.hasOwnProperty(prop)) {
      jElement.css(prop, element["__" + prop + "_"]);
     }
    }                    
   }
  }
 }
};


Friday, February 24, 2012

Custom Errors in WCF REST Services - Part I

Support for REST in WCF has improved enough to make it a viable option for those wanting to expose a RESTful API to their services. Error reporting in such an API is something that should be thought through from the beginning. How will users be notified when something doesn't work right? Getting a consistent data contract for error messages that originate from your application isn't all that hard with a few custom behaviors. However, getting the same data contract respected by IIS and WCF itself can be a bit of a trick. If you don't take care of all three layers (application, WCF, and IIS), then your REST API responses will be difficult to parse. I'm sure your API consumers wouldn't appreciate this:

HTTP/1.1 405 Method Not Allowed
Allow: GET, PUT, DELETE, POST
Content-Type: text/html
Server: Microsoft-IIS/7.5
X-Powered-By: ASP.NET
Date: Wed, 11 Jan 2012 02:48:21 GMT
Content-Length: 1293

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<title>405 - HTTP verb used to access this page is not allowed.</title>
<style type="text/css">
<!--
body{margin:0;font-size:.7em;font-family:Verdana, Arial, Helvetica, sans-serif;background:#EEEEEE;}
fieldset{padding:0 15px 10px 15px;} 
h1{font-size:2.4em;margin:0;color:#FFF;}
h2{font-size:1.7em;margin:0;color:#CC0000;} 
h3{font-size:1.2em;margin:10px 0 0 0;color:#000000;} 
#header{width:96%;margin:0 0 0 0;padding:6px 2% 6px 2%;font-family:"trebuchet MS", Verdana, sans-serif;color:#FFF;
background-color:#555555;}
#content{margin:0 0 0 2%;position:relative;}
.content-container{background:#FFF;width:96%;margin-top:8px;padding:10px;position:relative;}
-->
</style>
</head>
<body>
<div id="header"><h1>Server Error</h1></div>
<div id="content">
 <div class="content-container"><fieldset>
  <h2>405 - HTTP verb used to access this page is not allowed.</h2>
  <h3>The page you are looking for cannot be displayed because an invalid method (HTTP verb) was used to attempt access.</h3>
 </fieldset></div>
</div>
</body>
</html>

Part I - Application Generated Errors
Part II - WCF Generated Errors
Part III - IIS Generated Errors

Application Generated Errors
Making sure all of your services conform to the same error data contract is not difficult. WCF comes out of the box with the IErrorHandler extension interface, which gives you the opportunity to handle and translate any faults that occur on a service. IErrorHandler has two methods to implement, and the mapping of faults to a custom error data contract can be done there.

One approach that I've seen work well is to have all faults in your application inherit from a base Fault that has useful properties for your API consumers. You can then use the different fault classes to convey any exception that your application might throw.
[DataContract]
public class FaultBase
{
    public FaultBase(string message, string code)
    {
        this.Message = message;
        this.Code = code;
    }

    [DataMember]
    public string Message { get; set; }

    [DataMember]
    public string Code { get; set; }
}

[DataContract]
public class ItemNotFoundFault : FaultBase
{
    public ItemNotFoundFault()
        : base("The resource item was not found.", "ItemNotFound")
    { }
}

Any exception caught by your IErrorHandler implementation can be translated into one of your faults and sent back to the caller. Nothing special here. I'm not going into a lot of detail on this because there are plenty of examples on the web.
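
To make this concrete, here's a rough sketch of what the ProvideFault side of such an IErrorHandler implementation might look like for a REST endpoint. The ItemNotFoundException type, the "InternalError" code, and the status-code mapping are my own illustrative assumptions, not the exact code from the project.
using System;
using System.Net;
using System.ServiceModel.Channels;
using System.ServiceModel.Dispatcher;
using System.ServiceModel.Web;

public class RestErrorHandler : IErrorHandler
{
    // Returning true tells WCF the error has been handled and the session
    // doesn't need to be torn down.
    public bool HandleError(Exception error)
    {
        return true;
    }

    // Translate any exception into one of our fault contracts so every error
    // response has the same shape.
    public void ProvideFault(Exception error, MessageVersion version, ref Message fault)
    {
        // ItemNotFoundException is a hypothetical application exception;
        // map your own exception types here.
        FaultBase body = error is ItemNotFoundException
            ? (FaultBase)new ItemNotFoundFault()
            : new FaultBase(error.Message, "InternalError");

        // Serialize the fault contract as the response body
        // (DataContractSerializer is used by default).
        fault = Message.CreateMessage(version, string.Empty, body);

        // Make sure the web encoder emits XML and the HTTP status code matches.
        fault.Properties.Add(WebBodyFormatMessageProperty.Name,
            new WebBodyFormatMessageProperty(WebContentFormat.Xml));
        fault.Properties.Add(HttpResponseMessageProperty.Name,
            new HttpResponseMessageProperty
            {
                StatusCode = body is ItemNotFoundFault
                    ? HttpStatusCode.NotFound
                    : HttpStatusCode.InternalServerError
            });
    }
}

You'd then attach the handler to each channel dispatcher's ErrorHandlers collection from a service behavior, which is the part most of the examples on the web cover.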

In Part II, I'll discuss how to handle errors generated inside WCF that don't get handled by the IErrorHandler implementation.

Custom Errors in WCF REST Services - Part II

In Part I, I discussed how to handle errors from your WCF services so that you can give your API users a consistent error result.

In Part II, I'll be discussing how to handle a few of those errors generated by WCF that don't get sent to your IErrorHandler implementation.

WCF Generated Errors
Handling the application errors will get you most of the way there. However, there are two cases where WCF will display that blue and white HTML page similar to what was shown above: 405 Method Not Allowed and 404 Endpoint Not Found. These cases arise when WCF can't match the request with a service or a method. Unfortunately, these errors don't go through the IErrorHandler interface, so you have to do something else to replace the output with your error data contract. I was pretty stumped on this one and had to resort to using Reflector, breakpoints, and stack trace examination to figure out what to do. There's a property on the DispatchRuntime called UnhandledDispatchOperation that WCF uses to dispatch requests that don't match an operation in your service. It's the IOperationInvoker instance stored in this property that is responsible for the blue and white HTML versions of the 405 and 404 responses.

The solution is to replace this instance with your own implementation that doesn't return HTML. The tricky part in writing the implementation is trying to determine whether to return a 405 or a 404. I'm guessing that it's possible to use the WCF API to figure out at runtime which case applies to the request. Rather than figure this all out, I cheated and just copied what the original IOperationInvoker instance did.
public class UnhandledOperationInvoker : IOperationInvoker
{
    public object[] AllocateInputs()
    {
        // WCF hands the invoker a single input: the request Message.
        return new object[1];
    }

    public object Invoke(object instance, object[] inputs, out object[] outputs)
    {
        outputs = null;
        bool uriMatch = false;
        Message message = inputs[0] as Message;

        // The channel stack records whether the request URI matched a known
        // endpoint in the message properties; use it to tell a 404 from a 405.
        if (message.Properties.ContainsKey("UriMatched"))
        {
            uriMatch = (bool)message.Properties["UriMatched"];
        }

        if (!uriMatch)
        {
            return new ItemNotFoundMessage();
        }
        else
        {
            return new MethodNotAllowedMessage();
        }
    }

    public IAsyncResult InvokeBegin(object instance, object[] inputs, AsyncCallback callback, object state)
    {
        throw new NotImplementedException();
    }

    public object InvokeEnd(object instance, out object[] outputs, IAsyncResult result)
    {
        throw new NotImplementedException();
    }

    public bool IsSynchronous
    {
        get { return true; }
    }
}


Apparently something in the communication stack figures this out for us and saves this information in the current message's properties. Once you know which case applies to you, just return your standard error response appropriately. In my example, I'm returning an instance of one of my custom classes that inherits from System.ServiceModel.Channels.Message. This way I can control exactly the way the response looks. WARNING! Use the above approach at your own risk. Obviously it will stop working if a new version of .NET starts doing it differently.
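
For reference, here's a minimal sketch of what one of those custom Message classes might look like. The element names and the use of HttpResponseMessageProperty and WebBodyFormatMessageProperty are my assumptions about how you'd shape the response; your own error contract may look different.
using System.Net;
using System.ServiceModel.Channels;
using System.Xml;

public class ItemNotFoundMessage : Message
{
    private readonly MessageHeaders headers = new MessageHeaders(MessageVersion.None);
    private readonly MessageProperties properties = new MessageProperties();

    public ItemNotFoundMessage()
    {
        // Keep the HTTP status code at 404 while we control the body ourselves.
        this.properties.Add(HttpResponseMessageProperty.Name,
            new HttpResponseMessageProperty { StatusCode = HttpStatusCode.NotFound });

        // Tell the web message encoder to write the body as XML.
        this.properties.Add(WebBodyFormatMessageProperty.Name,
            new WebBodyFormatMessageProperty(WebContentFormat.Xml));
    }

    public override MessageHeaders Headers { get { return this.headers; } }
    public override MessageProperties Properties { get { return this.properties; } }
    public override MessageVersion Version { get { return MessageVersion.None; } }

    protected override void OnWriteBodyContents(XmlDictionaryWriter writer)
    {
        // Emit the same error shape the IErrorHandler path produces.
        writer.WriteStartElement("Error");
        writer.WriteElementString("Message", "The resource item was not found.");
        writer.WriteElementString("Code", "ItemNotFound");
        writer.WriteEndElement();
    }
}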

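One detail not shown above is how the replacement invoker actually gets installed. A minimal sketch, assuming you wire it up with a custom endpoint behavior (the UnhandledOperationBehavior name is mine), is to swap the invoker on the DispatchRuntime when the behavior is applied:
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

public class UnhandledOperationBehavior : IEndpointBehavior
{
    public void AddBindingParameters(ServiceEndpoint endpoint, BindingParameterCollection bindingParameters) { }

    public void ApplyClientBehavior(ServiceEndpoint endpoint, ClientRuntime clientRuntime) { }

    public void ApplyDispatchBehavior(ServiceEndpoint endpoint, EndpointDispatcher endpointDispatcher)
    {
        // Replace the invoker WCF falls back to when a request doesn't match
        // any operation on the endpoint.
        endpointDispatcher.DispatchRuntime.UnhandledDispatchOperation.Invoker =
            new UnhandledOperationInvoker();
    }

    public void Validate(ServiceEndpoint endpoint) { }
}

How you attach the behavior, whether from a custom ServiceHostFactory, a behavior extension element in config, or wherever you already configure your WebHttp endpoints, is up to you.
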
In Part III, I'll discuss how to handle errors generated from IIS that don't get handled by WCF at all.

Custom Errors in WCF REST Services - Part III

In Part I, I discussed how to handle errors from your WCF services so that you can give your API users a consistent error result.

In Part II, I discussed how to handle a few of those errors generated by WCF that don't get sent to your IErrorHandler implementation.

Now that we've gotten WCF taken care of, there are still some situations where an end user can get a bunch of HTML instead of a well-formed XML response.

IIS Generated Errors
Now that we've gotten past WCF, there's one more layer that needs to be addressed. IIS has configuration settings that allow you to control the response that's given to a user for certain HTTP status codes. This might work for you, but if you need finer or programmatic control, then you'll need to handle these errors yourself. A good example is the 500 Internal Server Error message. Let's say something in ASP.NET is misconfigured; in this scenario, IIS would give the end user an HTML 500 error. The approach I'm going to show for getting around this is for IIS 7.0. The idea is to write a custom IHttpHandler and tell IIS to route any IIS error to that handler. The handler can then return your custom error data contract to the caller. When IIS routes the request to your handler, it passes the error status code as a query string parameter, so all your handler has to do is parse this out and return the appropriate response.
public class ErrorHttpHandler : IHttpHandler
{
    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        if (context.Request.QueryString.Count == 0)
            return;

        // Per the httpErrors configuration, IIS passes the original status code
        // (e.g. "404;http://server/original/url") in the query string, so take
        // the part before the ';'.
        string strStatusCode = context.Request.QueryString[0].Split(';').FirstOrDefault() ?? "500";

        int statusCode;
        if (!int.TryParse(strStatusCode, out statusCode))
            statusCode = 500;

        string message = "Unhandled server error.";

        switch (statusCode)
        {
            case 400:
                message = "Bad request.";
                break;

            case 404:
                message = "Item not found.";
                break;
        }

        context.Response.StatusCode = statusCode;
        context.Response.Write(string.Format("<Error><Message>{0}</Message></Error>", message));
    }
}

The above implementation is only to get you started. The last step is to configure IIS to forward errors to your handler.

<system.webServer>
    <httpErrors errorMode="Custom">
      <clear/> 
      <error statusCode="404" path="/application/ErrorHandler" responseMode="ExecuteURL"/>
      <error statusCode="400" path="/application/ErrorHandler" responseMode="ExecuteURL"/>
    </httpErrors>
    <handlers>
      <add name="ErrorHandler" path="ErrorHandler" verb="*" type="Your Application.ErrorHttpHandler, FrameworkAssembly"/>
    </handlers>
  </system.webServer>

The worst part of this approach is that you have to explicitly list each HTTP status code you want routed to your handler. It is possible to have a catch-all, but doing so requires you to override the system-level IIS configuration, and I didn't have much success getting that to work on my machine.

Friday, January 20, 2012

White Elephant Planning

In a previous post, I discussed an agile estimation method called the White Elephant method. In that post I shared some concerns I had about it, but also noted that those concerns weren't grounded in any actual experience with the method itself. Now that I've actually been part of a session, I wanted to share some of the things I noticed.

Background
In my environment, we have 5 teams of approximately 7 people. Our product owner wants to have a single backlog for all teams. In order to fill out the backlog for the next release, we held a story estimation workshop that consisted of a representative from each team, plus the product owner. The product owner (along with some help) prepared by creating around 40 stories with descriptions and acceptance criteria. We time-boxed the meeting at 3 hours and had lunch catered so we wouldn't have to stop. We usually went for about 45 minutes at a time with a 15-minute break in between.

The Game
Out of the 3 hours, we probably spent a little over 2 hours actually estimating (we started late, had some issues with lunch, etc). The product owner was hoping to get at least half of the 40 stories estimated. As it turned out, we got them all done! Just based on that alone, I would consider the exercise a complete success.

Moderation
I have to say right off that in order for this to work well, you must have a good moderator. Fortunately, we had someone that had done this before and he did an excellent job. He had a stop watch running on his phone and reset the timer each time a new story was selected. Once the timer reached about 4 minutes, he would let us know and try to push the current player to park his story in the 'Parked' column for later discussion so we could move on. Without this type of moderation, we could have talked on and on and the whole exercise might not have been as successful as it was.

Less Discussion
There were several occasions where a player muttered under his breath, when it was his turn, that he didn't agree with the placement of a previous story but decided not to use his turn to move it because he didn't want to re-open the discussion. This only happened when there was a one-point difference between what the story was currently estimated at and where the player thought it should be. This was interesting to me because not only did we cut down on discussion, but the player immediately recognized the cost of such discussions and decided it wasn't worth it over a one-point difference. I struggle with my own team on this sometimes. I don't like having to remind team members during planning that while the discussion they're having about the story is important, it's not going to make any significant difference in the planned estimate and should be taken offline. I really liked seeing players come to this conclusion on their own with the White Elephant method.

Tuning Out
I'll be the first to admit that it was easy to tune out. When it wasn't my turn, I caught myself several times paying less attention to the discussion of a particular story. Once the story was placed on the board, I really didn't know much about it and therefore wasn't able to contribute to its estimate. It was too easy to just say, "I'll be happy with whatever he thinks it is." Since this was our first workshop, it consisted of all team leads, so most of us had strong opinions about the stories. I can see the tuning-out problem being more widespread on teams with members who just go with the flow. Planning Poker requires each team member to decide on a number, and then possibly defend it if it doesn't match the rest of the team.

Conclusions
I was surprised to find that some of the things I thought were going to be issues were not at all. Anchoring wasn't a problem for us. Players regularly took a turn to move an estimate and would confidently say, "No way this is a 3 because...". Then again, that might be due to the combination of personalities in the room. I also didn't find changing the discussion from story to story to be a problem. I never once felt that it impeded my ability to estimate.

One of the biggest things I took away from this is that most of the positives I saw could easily be applied to Planning Poker: timing story discussions, parking stories that take too long, and visually seeing the stories and where they're placed in relation to each other. I'm definitely going to apply some of those things in my next planning meeting with the team.