Maverick Methodologies are guided by these steps:
* Understand the current state.
* Identify the areas of concern and set goals.
* Consider all solutions known or available at this time.
* Choose the solution that best meets the goals for this situation.
* Act on the choice, and adapt or change the solution as needed.
This methodology for large teams was developed under the following conditions:
1. Software and System requirements are not all known in advance.
2. Customer requires high quality in the first public release.
3. Team members are not all co-located.
4. The system is made of many components spread across various layers or tiers, making integration a concern.
5. The team is made up of over 40 software developers.
The goals are for improvements in these areas:
1. Requirements and Design
2. Integration Testing
3. Build Process
4. Iterations and Increments
5. Off-Site Issues
6. Development Platform and Setup
Interviewing to capture requirements as a one-time effort has been shown to be difficult and rarely accurate.
As recommended by most methodologies, having the individuals who know the requirements (the customer) available to development is considered ideal. If development is not restricted to one site, then consider training individuals to be customer representatives. These representatives are trained by the customer and communicate with the customer often.
Product requirements must be available to development. Co-located teams can write requirements on index cards and tape them to a wall. Distributed teams may choose an electronic medium. Choose a medium that is easily shared and easily updated.
Developer requirements must be considered as well. What are developer requirements? The most obvious is that the product requirements can be implemented in software. Other developer requirements may be that the system is testable or that the system runs on Unix-based systems. These requirements, along with the product requirements, make up what is termed 2D requirements.
Suppose the developers are creating an Object Oriented solution. Objects exist to satisfy requirements in two dimensions. The first dimension is that of the real world entity being modeled. The second models the behavior of the object in a computer system.
There will be methods that meet the requirements of both dimensions. For example, consider an object that models a Car. There may be methods to get or set the color; these have to do with the real-world object. But the Car object exists in the computer world as well, and in the computer world we do things like compare objects. You don't normally ask whether this car is equal to that car, but in a computer system such comparisons exist. Therefore, Maverick Agile Development uses requirements for the real world as well as the computer world, and these are called 2D requirements.
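The two dimensions can be sketched with a small example. This is an illustrative sketch, not part of the methodology's artifacts; the Car class and its VIN-based equality are assumptions chosen to make the point concrete.

```python
class Car:
    """A car modeled with 2D requirements."""

    def __init__(self, vin, color):
        self.vin = vin
        self.color = color

    # First dimension: behavior of the real-world entity being modeled.
    def get_color(self):
        return self.color

    def set_color(self, color):
        self.color = color

    # Second dimension: behavior the object needs only because it
    # exists in a computer system -- you never ask whether one real
    # car "equals" another, but tests and collections will.
    def __eq__(self, other):
        return isinstance(other, Car) and self.vin == other.vin

    def __hash__(self):
        return hash(self.vin)
```

Here two Car objects with the same VIN compare as equal even if their colors differ, which already hints that "equivalence" is a definition the developers must choose.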
In Maverick Quality Assurance (MQA), three requirements of this second dimension are identified.
1. Object Factories. Object factories are needed by MQA to allow for the development of the test harnesses, stubs, and mock objects required for complete unit tests.
2. Object Validators. Object validators are validation facilities that examine an object for conformance with required values. MQA component tests need object validators to check objects returned from a component call.
3. Object Equivalence Verifiers. MQA tests will need to compare two objects to see if they are equivalent. A test that needs this functionality would be a persistence test. Equivalence can be a difficult problem to solve (e.g. the Record Linkage Problem), and there may be a valid need to relax the definition of equivalence.
Factories, Validators, and Equivalence Verifiers may not be part of the model for the real world object. However, since the object exists in the dimension of a computer system these are necessary requirements that must be met.
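The three facilities above can be sketched minimally. The Order class, the field names, and the 0.005 tolerance are hypothetical choices for illustration; the point is that factory, validator, and equivalence verifier are requirements on the object beyond its real-world model.

```python
class Order:
    def __init__(self, order_id, total):
        self.order_id = order_id
        self.total = total

# 1. Object factory: builds fully populated objects so test
#    harnesses, stubs, and mock setups never hand-assemble state.
def make_order(order_id="TEST-1", total=10.0):
    return Order(order_id, total)

# 2. Object validator: examines an object returned from a component
#    call for conformance with required values.
def validate_order(order):
    errors = []
    if not order.order_id:
        errors.append("order_id is required")
    if order.total < 0:
        errors.append("total must be non-negative")
    return errors

# 3. Equivalence verifier: a deliberately relaxed definition of
#    equivalence, e.g. for comparing an object before and after a
#    round trip through persistence (here, totals within half a cent).
def orders_equivalent(a, b):
    return a.order_id == b.order_id and abs(a.total - b.total) < 0.005
```

A persistence test would then create an order with the factory, store and reload it, validate the reloaded object, and check equivalence against the original.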
I was involved in the development of a large software product with a large team using common agile techniques that had not matured for large teams. The development division was doing unit tests (some developers did test first, some code first) and a team of developers under the Quality Assurance division was developing the integration tests.
It wasn't long before the integration developers fell victim to "downstream" issues. The product developers were changing things so quickly that the integration developers were constantly playing "catch-up". Also, the product developers' code had functionality that allowed for unit testing but lacked functionality for integration testing. The integration developers could not get the product developers to add functionality to help with integration tests. Ideas such as making the integration developers into customers, so that they could drive product requirements, were considered.
The solution I propose is to have those with the domain knowledge perform all white-box tasks, rather than an external team. With the requirements continuously being refined and the design emerging, the transfer of this specific knowledge is too expensive. The product developers don't want to slow down to help or train the integration developers. Therefore, product development is responsible for both unit tests and integration tests.
Many unit tests can be reconfigured to be component tests. If the stubs of a unit test are replaced with real components, the test now exercises the integration of multiple "real" components. Because of this, component tests are more of a build/configuration issue than an issue separate from unit tests.
A system is integrated when no mock objects are used in the build and the system passes all of the component tests.
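One way to read "component tests are a build/configuration issue" is that the test body stays the same and only the wiring changes. A minimal sketch, with hypothetical names (StubInventory, RealInventory, place_order are not from the text):

```python
class StubInventory:
    """Stand-in used by the unit-test configuration."""
    def in_stock(self, sku):
        return True

class RealInventory:
    """Real component used by the component-test configuration."""
    def __init__(self, stock):
        self._stock = stock
    def in_stock(self, sku):
        return self._stock.get(sku, 0) > 0

def place_order(inventory, sku):
    # Code under test: depends only on the inventory interface.
    return "accepted" if inventory.in_stock(sku) else "rejected"

def run_order_test(inventory):
    """The same test body, exercised against a stub or a real
    component depending on which object the build wires in."""
    return place_order(inventory, "SKU-1")
```

Run with StubInventory it is a unit test; run with RealInventory it exercises the integration of "real" components, and no test code changed.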
During the development of this large system with many developers, it became apparent that there were issues with the continuous build process.
The goal of the continuous build is to present a version of the product that is always functioning correctly for the features implemented. Another goal of the build process is to eliminate broken builds at the main repository.
This presents a particularly difficult task when the team is large. If a developer updates code from revision control, he may get only part of the files from a large check-in that another developer is currently submitting. Transactions are needed to prevent getting only part of the files of a large commit.
Long-running tests cause developers to save up their changes in order to avoid the wait. Waiting to commit increases the conflicts with other changes and can trap the developer in a refactoring frenzy. Since files change so often, the constraint of developing from the "head" of the revision control system should be relaxed.
I suggest that code is always updated from a good build that is labeled as a specific revision. The objection comes to mind that people will be working on "stale" code, but on a large team with many changes the code goes stale so fast that it doesn't matter when you get it. One reason to update from a labeled revision is that the "head" build may be broken while you have local changes and want to build and run the tests locally before committing. If you update to a broken build, you will have to wait until that build is fixed or roll back to a good revision; worse yet, you may not notice the build was broken when you updated and conclude that your own changes broke it.
Other processes for updating and committing code can be used. In a patch process, the differences in the local files are sent to a build machine; that machine builds the system, and if it compiles and passes the tests, the change is committed to the "real" build machine (the machine that hosts the revision control repository). This process can take a long time to perform. On large systems, setting up test data can take several minutes, and running all of the tests can take a significant amount of time as well. If you are trying to commit a change that another team member is waiting on, it could be a few hours before the changes propagate through the process.
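The patch process amounts to a gate: a change reaches the repository only if every step succeeds on the build machine. A minimal sketch of that gating logic, with the actual apply/build/test commands abstracted as callables (the step names are assumptions for illustration):

```python
def submit_patch(steps):
    """Run the patch-process steps in order; stop at the first
    failure so a broken change never reaches the repository.

    `steps` maps a step name to a callable returning True on
    success, e.g. wrappers around the real patch/compile/test
    commands on the build machine.
    """
    for name in ("apply", "build", "test"):
        if not steps[name]():
            return ("rejected", name)   # report the failing step
    # All gates passed: safe to commit to the real build machine.
    return ("committed", None)
```

In practice each callable would shell out to the actual tools; the sketch only shows why the repository stays clean: nothing is committed until apply, build, and test have all passed.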
The continuous build must be as fast as possible. Distributing the build process can help with a patch process: each feature team (or some topological division of the developers) has a patch build machine. But that alone is not enough. To improve turnaround time, the builds themselves have to change. The ideal is for a build of a large project (including running the tests) to take no longer than 15 minutes. Doing a clean build in the continuous build process takes too long. For large teams, a clean build is done every fourth build or every hour. Interim builds are not clean builds; in the interim, only changes and their dependencies are built.
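The clean-versus-incremental policy above is simple enough to encode directly. A sketch, assuming the build server tracks a build counter and the minutes since the last clean build (both names are hypothetical):

```python
def build_kind(build_number, minutes_since_clean):
    """Decide whether the next continuous build is clean or incremental.

    Policy: a clean build every fourth build or at least every hour;
    otherwise build only the changes and their dependencies.
    """
    if build_number % 4 == 0 or minutes_since_clean >= 60:
        return "clean"
    return "incremental"
```

The hourly fallback matters on quiet days: with few commits, the every-fourth-build rule alone could leave stale intermediate artifacts in place for a long time.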
A goal of the build process is to eliminate broken builds. Before submitting changes to the build process, the developer should get an updated version of the system, build it locally, and run the tests. Usually this local build would not be a clean build. If the build process is fast enough, this step should be eliminated. In Maverick Development you do whatever moves you down the road, and if there is no need for a step you eliminate it. This rests on the idea of a thinking individual who understands the ramifications of decisions, is conscientious, and will not do anything to interfere with another team member's ability to work. So, if the build process is a patch process and it is fast enough, the local build is redundant.
Iterations and Increments
The goal of incremental development is to always provide an improving product. Improvements are not limited to added functionality. Improvements in quality or performance are valid, as is anything else that is a real improvement.
For large teams, iterations are no more than three weeks. Iterations are about planning, refocusing, refactoring, and staying on target. Anything more than three weeks is too long between readjustments. If there is a concern that iteration planning takes too much time or is too expensive to do every three weeks, then the way the planning is done needs to be improved.
There is some overhead in pulling everyone together. If the Maverick Development Model is followed and meetings have goals reflected in their agendas, the meetings will be as efficient as meetings can be. Maverick Development is based on the idea that software is done when it is done, and that there is at least a certain minimal set of tasks that will be performed before the software is done. Since these essential tasks will be done no matter how long the iteration, short iterations do not slow development momentum.
If offshore/offsite teams are used in the development, iterations are two weeks or less. This raises communication to the level needed for redirection. The problem is that offshore teams suffer from poor communication and may get off target without anyone knowing until it is too late. The method is to use an iterative approach in which there is always a working subset of the product until it is developed entirely. With shortened iterations you can use the iteration boundary to quickly see whether the offshore team is working as planned and, if not, make a course correction. Like other agile methods, this depends on the other processes of the methodology being in place. The continuous build is essential for this to work. The offshore team must work from a shared source base, and a patch process will address the build needs of a distributed workforce.
One goal for large development teams is to keep developers up and working and to get new employees setup quickly. While working on this large project I quickly noticed that some developers were using Windows based machines and others were using Linux based machines.
I recommend that the development machines be set up the same: the same OS, the same versions of the SDK, and the same versions of Ant, Bash, etc. It is an unnecessary burden to manage special shell scripts and subtle differences for each platform and configuration.
There is an argument that if the product is going to be delivered on multiple platforms and configurations it is good to have such platforms and configurations spread about development. This is where Maverick Development comes in and says "Why?" One of the most prevalent arguments for having the various machines is to detect any problems with the product running on the different configurations. That is a valid goal, but the solution of having developers with different platforms and configurations is not satisfactory for large teams.
With large teams, differing development platforms typically surface problems with the build process across those platforms, not with the product under development. Build problems become the predominant issue with the different platforms, instead of the intended discovery of issues with the product deployed on various machines.
The configuration of each development build should be the same. The location of items on each machine should be the same. The environment variables and paths should be the same.
If the product is to be deployed on differing systems then the continuous build process should build the product on each varied system. The various platforms that will host the product will be maintained by those in charge of deployment tasks. They will keep the various platforms configured and working. Problems that have to do with code will be brought to the attention of the developers. Problems that have to do with configuration and deployment will be handled by the deployment team. The build will be run continuously on all the platforms and any broken tests will be addressed by development. If a particular platform is becoming problematic then maybe it is good that it was discovered early and you can eliminate that platform from the list of supported systems.
Having an exact setup of the desired deployment environment is not usually possible early on in the development of the product. Bringing the various systems on line is dictated by driving forces such as budget, schedule, hosting facilities, and other important decision factors.
If the configuration is standard then new employees can receive an "image" that can be easily installed. If a developer's machine experiences hardware trouble the machine can be restored very quickly to a usable state. With the same setup on each machine it is easier to do pair development. The difficulties for large teams are enough and we do not need to add unnecessary difficulties with various configurations.
Note that if development is in a language like C++ and there are multiple target platforms, development will need continuous access to those platforms. Languages with compiler directives and the ability to conditionally specify code for compilation (#ifdef comes to mind) require that the tests be run on each target platform in order to catch compile-time errors. The continuous build process becomes more complicated as well. An important aspect of this methodology is that there is no porting team: the developers are responsible for making their code run on every supported platform. This gives many benefits. One is that they will not write the code optimized for their preferred platform. Another is that you do not have two different individuals working on the same code who might interpret the requirements in subtly different ways. When organizing the teams, if continuous pair programming is used, wisely pair people with experience on different platforms.
Using incremental development and iterations for large teams poses specific issues. Working in such a situation, I observed that requirements, integration testing, continuous builds, iteration duration, and machine configuration needed specific recommendations.
Developer requirements must be considered just as customer/product requirements are. Making the system verifiable is a developer requirement. Object factories, object validators, and equivalence verifiers are part of developer requirements. These developer requirements are another dimension of requirements and are referred to as 2D requirements.
All white-box testing should be done by the product developers. This includes component and integration testing.
Continuous builds become more complex with a large number of developers checking in many changes across many files. The "head" of the revision control system may not be "working" at any given instant. Patch processes and labels are used to keep developers from getting a broken build.
Iteration length should be no more than three weeks and if parts of the team are off-site then two weeks is recommended as the maximum duration.
Developer machine configuration becomes an issue with large teams. Each developer should use the same configuration and directory layout. This allows developers to easily pair or use someone else's machine. It also allows an image of the development setup to be used to set up new developers or to restore machines after a crash.