Friday, May 11, 2012

Unit Tests

by Geoffrey Slinker
v1.2 March 24, 2006
v1.1 March 5, 2005
v1.2 April 15, 2005


Abstract

How do you view unit tests? Is a unit test simply a test to verify the code? But what is the purpose of having unit tests? Maybe unit testing is a design method. Maybe a unit test is a programmer test. Maybe unit testing is a diagnostic method. Maybe unit tests are a deliverable in a phase of development. The varied purposes and uses of unit tests lead to confusion during discussion.

Testing Definitions

Definitions are taken from http://www.faqs.org/faqs/software-eng/testing-faq/section-14.html. Emphasis added.
The definitions of integration tests are after Leung and White.
Note that the definitions of unit, component, integration, and integration testing are recursive:
Unit. The smallest compilable component. A unit typically is the work of one programmer (At least in principle). As defined, it does not include any called sub-components (for procedural languages) or communicating components in general.
Unit Testing: in unit testing called components (or communicating components) are replaced with stubs, simulators, or trusted components. Calling components are replaced with drivers or trusted super-components. The unit is tested in isolation.
Component: a unit is a component. The integration of one or more components is a component.
Note: The reason for "one or more" as contrasted to "Two or more" is to allow for components that call themselves recursively.
Component testing: the same as unit testing except that all stubs and simulators are replaced with the real thing.
Two components (actually one or more) are said to be integrated when:
  1. They have been compiled, linked, and loaded together.
  2. They have successfully passed the integration tests at the interface between them.
Thus, components A and B are integrated to create a new, larger, component (A,B). Note that this does not conflict with the idea of incremental integration -- it just means that A is a big component and B, the component added, is a small one.
Integration testing: carrying out integration tests.
Integration tests (After Leung and White) for procedural languages.
This is easily generalized for OO languages by using the equivalent constructs for message passing. In the following, the word "call" is to be understood in the most general sense of a data flow and is not restricted to just formal subroutine calls and returns -- for example, passage of data through global data structures and/or the use of pointers.
Let A and B be two components in which A calls B.
Let Ta be the component level tests of A
Let Tb be the component level tests of B
Tab: The tests in A's suite that cause A to call B.
Tbsa: The tests in B's suite for which it is possible to sensitize A -- the inputs are to A, not B.
Tbsa + Tab == the integration test suite (+ = union).
Note: Sensitize is a technical term. It means inputs that will cause a routine to go down a specified path. The inputs are to A. Not every input to A will cause A to traverse a path in which B is called. Tbsa is the set of tests which do cause A to follow a path in which B is called. The outcome of the test of B may or may not be affected.
There have been variations on these definitions, but the key point is that it is pretty darn formal and there's a goodly hunk of testing theory, especially as concerns integration testing, OO testing, and regression testing, based on them.
As to the difference between integration testing and system testing. System testing specifically goes after behaviors and bugs that are properties of the entire system as distinct from properties attributable to components (unless, of course, the component in question is the entire system). Examples of system testing issues: resource loss bugs, throughput bugs, performance, security, recovery, transaction synchronization bugs (often misnamed "timing bugs").

What Are Your Goals of Unit Testing

The obvious goal of unit testing is to deliver fewer bugs. But what type of bugs? Unit bugs? Maybe integration bugs? Possibly design bugs? Maybe it is not a bug hunt at all; maybe you want to use unit tests to show valid uses of a unit, or maybe you are using unit tests to drive the design process.
But maybe you have a non-obvious goal for your unit tests. If you do, then you must specify that goal when you discuss unit testing or there will be confusion.
It is obvious that unit tests can uncover unit bugs. But can unit tests uncover integration bugs? If a unit test is run in complete isolation it doesn't seem possible. Suppose someone in the system changes a method by removing a parameter. All of your unit tests that call it will fail to compile. (Unit tests can fail in at least two ways: 1. compilation failure; 2. runtime assertion failure.) This is an opportunity to bring the people involved with the integration point together to discuss the changes and their ramifications.
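Here is a minimal sketch of those two failure modes in Java with JUnit 4. The names (TaxCalculator, calculateTax) are made up for illustration only, not taken from any real system.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class TaxCalculatorTest {

        // Hypothetical unit under test, shown inline to keep the sketch self-contained.
        static class TaxCalculator {
            double calculateTax(double amount, double rate) {
                return amount * rate;
            }
        }

        // Failure mode 1: compilation failure. If someone removes the 'rate'
        // parameter from calculateTax, this call no longer compiles and the
        // broken integration point is found as soon as the suite is built.
        @Test
        public void taxIsAmountTimesRate() {
            assertEquals(7.0, new TaxCalculator().calculateTax(100.0, 0.07), 0.001);
        }

        // Failure mode 2: runtime assertion failure. The code compiles, but if
        // the behavior is wrong this assertion trips when the test runs.
        @Test
        public void zeroAmountYieldsZeroTax() {
            assertEquals(0.0, new TaxCalculator().calculateTax(0.0, 0.07), 0.001);
        }
    }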

Are You Ready for Full Unit Testing

Let's repeat the definition of unit tests from above.
Unit Testing: in unit testing called components (or communicating components) are replaced with stubs, simulators, or trusted components. Calling components are replaced with drivers or trusted super-components. The unit is tested in isolation.
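To make the definition concrete, here is a small sketch in Java (JUnit 4) of a unit tested in isolation: the called component is replaced by a stub and the test method plays the role of the driver. All of the names (OrderProcessor, PaymentGateway) are hypothetical.

    import org.junit.Test;
    import static org.junit.Assert.assertTrue;

    public class OrderProcessorTest {

        // The real payment gateway is a called component; for unit testing it
        // is replaced with a stub so the OrderProcessor runs in isolation.
        interface PaymentGateway {
            boolean charge(String account, double amount);
        }

        static class StubPaymentGateway implements PaymentGateway {
            public boolean charge(String account, double amount) {
                return true; // canned answer; no network, no real bank
            }
        }

        // Hypothetical unit under test.
        static class OrderProcessor {
            private final PaymentGateway gateway;
            OrderProcessor(PaymentGateway gateway) { this.gateway = gateway; }
            boolean placeOrder(String account, double total) {
                return gateway.charge(account, total);
            }
        }

        // The test method itself plays the role of the driver (the calling component).
        @Test
        public void orderSucceedsWhenChargeSucceeds() {
            OrderProcessor processor = new OrderProcessor(new StubPaymentGateway());
            assertTrue(processor.placeOrder("ACCT-1", 49.95));
        }
    }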
The development of stubs, simulators, trusted components, drivers, and super-components is not free. If unit testing is introduced midstream in a development process, it can literally reroute the stream. Developers might lose the velocity they currently have on some feature and thus disrupt the flow. Introduction of unit tests midstream must be justified and accepted. To understand how difficult this may be, take all of your current development projects and go to the Planning Committee and tell them that all of the predetermined dates are now invalid and will need to be shifted "X" months into the future. Maybe you should have someone under you deliver the news to the Planning Committee! It's always nice to give a more junior person presentation experience!
Test data generation is an expense that many people do not realize goes with unit testing. Imagine all of the units in your software system. These units "live" at different layers. Because of these layers, the data it takes to drive a high-level unit is not the same as the data it takes to drive a low-level unit. In the running system, high-level data flows down to the lower levels and is mutated, manipulated, extended, and constrained along the way. All of these "versions" have to be statically captured at each level in order to "feed" the unit tests.
Test data that comes from or goes into a database presents its own difficulties. Setting up the database with test data and then tearing it down after the tests have completed are time-consuming tasks. If your unit of code only writes to the database, it will not have the functionality to delete from the database, so the teardown has to be done externally to the unit, as in the sketch below.
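A minimal sketch of that setup/teardown pattern with JUnit 4 and JDBC. The connection URL, the audit_log table, and the AuditLogger class are all assumptions made for illustration; the point is that the DELETE lives in the test fixture, not in the unit under test.

    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;
    import static org.junit.Assert.assertTrue;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class AuditLoggerTest {

        // Hypothetical URL for a dedicated test database whose schema already
        // contains the audit_log table.
        private static final String TEST_DB_URL = "jdbc:hsqldb:mem:unittest";
        private Connection connection;

        // Hypothetical unit under test: it only knows how to INSERT.
        static class AuditLogger {
            private final Connection conn;
            AuditLogger(Connection conn) { this.conn = conn; }
            void record(String message) throws Exception {
                Statement stmt = conn.createStatement();
                stmt.executeUpdate("INSERT INTO audit_log (message) VALUES ('" + message + "')");
                stmt.close();
            }
        }

        @Before
        public void openConnection() throws Exception {
            connection = DriverManager.getConnection(TEST_DB_URL);
        }

        @Test
        public void loggerWritesARow() throws Exception {
            new AuditLogger(connection).record("order placed");
            Statement stmt = connection.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM audit_log");
            assertTrue(rs.next() && rs.getInt(1) == 1);
            stmt.close();
        }

        // The unit under test has no delete capability, so the teardown happens
        // externally to it, in the test fixture.
        @After
        public void tearDownTestData() throws Exception {
            Statement stmt = connection.createStatement();
            stmt.executeUpdate("DELETE FROM audit_log");
            stmt.close();
            connection.close();
        }
    }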

Can You Afford Not to Unit Test

Since unit testing is expensive, some might say it is not worth doing. That statement is too general and not prudent. One should say, "Unit testing is expensive, but it is less expensive than ...". This means you have clear goals and expectations for unit testing. For example, suppose it took you six weeks to finish changes that were needed because of bugs found when the Quality Assurance team tried to use your code. Suppose these bugs were of the simple types, such as boundary conditions, invalid preconditions, and invalid postconditions. Six weeks might not seem that expensive, but let's examine this further. During those six weeks QA was unable to continue testing beyond your code. After they get your code they finally get into another set of bugs that live "below" your code. That developer is now six weeks removed from the development work and will have to get back into the "mindset" and drop whatever he was working on. Also, the precondition bugs you fix will cause exceptions to be caught upstream, which will force code changes by those that call your code, which means those upstream changes will now have to be retested. If the upstream changes actually change the data that flows down to you, it will affect those that you call with that "new" data and may cause changes downstream. Is it sounding expensive not to do unit tests in this imagined scenario? It was supposed to!
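The simple bug types mentioned above are exactly the kind a cheap test catches long before QA does. A tiny Java/JUnit 4 sketch, with a made-up QuantityValidator and an assumed valid range of 1 to 100:

    import org.junit.Test;
    import static org.junit.Assert.assertFalse;
    import static org.junit.Assert.assertTrue;

    public class QuantityValidatorTest {

        // Hypothetical unit: quantities from 1 to 100 inclusive are valid.
        static class QuantityValidator {
            boolean isValid(int quantity) {
                return quantity >= 1 && quantity <= 100;
            }
        }

        // Boundary-condition checks of the kind QA would otherwise find weeks later.
        @Test
        public void boundariesAreHandled() {
            QuantityValidator v = new QuantityValidator();
            assertFalse(v.isValid(0));    // just below the lower bound
            assertTrue(v.isValid(1));     // lower bound
            assertTrue(v.isValid(100));   // upper bound
            assertFalse(v.isValid(101));  // just above the upper bound
        }
    }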

Unit Testing as Clear (White) Box Testing

This is probably the oldest and most well-known role of unit tests. Typically a person other than the author of the code writes tests to verify and validate the code: verify that it does the right thing, and validate that it does it in the right way. Often the right way is not specified and the tests are simply verifiers. One of the driving principles behind clear box testing by another party is that the developer is so close to the code that he cannot see the errors and cannot imagine alternative uses of the code. However, these alternative uses of the code are often met with a complaint from the developer: "Your test is invalid. The code works correctly for the uses that the system will require."

Unit Tests as Usage Examples

Another role that has been filled by unit tests is to provide an example of how to use the unit. Unit tests that show the proper usage and behavior of a unit are a verification activity. These unit tests might better be termed usage examples but are usually lumped under the term unit test. A usage example shows how to call the unit and asserts the expected behavior. Both valid and invalid paths are shown. These usage examples are used to verify the old phrase "works as designed." This type of unit test will qualify many bugs found by the QA team as "works as designed." These tests do not validate the design but verify the current implementation of the design.
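A short Java/JUnit 4 sketch of a usage example covering one valid and one invalid path. The Account class and its rules are invented for illustration.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class AccountUsageExampleTest {

        // Hypothetical unit whose intended usage the tests document.
        static class Account {
            private double balance;
            void deposit(double amount) {
                if (amount <= 0) {
                    throw new IllegalArgumentException("deposit must be positive");
                }
                balance += amount;
            }
            double getBalance() { return balance; }
        }

        // Valid path: how the unit is meant to be called.
        @Test
        public void depositIncreasesBalance() {
            Account account = new Account();
            account.deposit(25.0);
            assertEquals(25.0, account.getBalance(), 0.001);
        }

        // Invalid path: the documented "works as designed" behavior for bad input.
        @Test(expected = IllegalArgumentException.class)
        public void negativeDepositIsRejected() {
            new Account().deposit(-5.0);
        }
    }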

Unit Tests as Diagnostic Tests

A unit test or set of unit tests is often used in the role of a diagnostic tool. These tests are run after changes to the system to see if the verified uses of the system are still valid. If a change to the system causes a diagnostic unit test to fail, it is clear that further investigation is needed. Maybe the change has altered the behavior of the system and the tests need to be updated to reflect this, or maybe the change did not consider side effects and coupling and is flawed.

Unit Tests as Programmer/Developer Tests

More and more often, unit tests fill the role of programmer tests. Programmer test is a term that comes from the eXtreme Programming community. On the C2.com wiki the following definition of programmer tests is given:
(Programmer Tests) XP terminology, not quite synonymous with Unit Test. A Unit Test measures a unit of software, without specifying why. A Programmer Test assists programmers in development, without specifying how. Often a Programmer Test is better than a comment, to help us understand why a particular function is needed, to demonstrate how a function is called and what the expected results are, and to document bugs in previous versions of the program that we want to make sure don't come back. Programmer Tests give us confidence that after we improve one facet of the program (adding a feature, or making it load faster, ...), we haven't made some other facet worse.
Because programmer tests demonstrate usage, they fill some of the role of usage examples. Programmer tests are also used to make sure previous bugs do not come back, which is part of the role of diagnostic tests.
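For example, a programmer test that documents an old bug so it cannot silently return might look like this Java/JUnit 4 sketch. The InvoiceFormatter class and the whitespace bug are hypothetical.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class InvoiceFormatterTest {

        // Hypothetical unit under test.
        static class InvoiceFormatter {
            String format(String customer, double total) {
                return customer.trim() + ": " + total;
            }
        }

        // Programmer test documenting a bug from a previous version: trailing
        // whitespace in the customer name used to leak into the invoice header.
        // Keeping the test around ensures the bug does not come back.
        @Test
        public void trailingWhitespaceDoesNotLeakIntoHeader() {
            InvoiceFormatter formatter = new InvoiceFormatter();
            assertEquals("Acme Corp: 10.0", formatter.format("Acme Corp  ", 10.0));
        }
    }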
These programmer tests rarely stand alone as I have described them. They are used in Test Driven Development.

Unit Tests as a Part of Test Driven Development

Unit tests are used in the Test Driven Development (TDD) methodology. This is not part of testing or part of quality assurance in the traditional sense usually defined along departmental boundaries. This is a design activity that uses unit tests to design (with code) the interfaces, objects, and results of a method call.
"By stating explicitly and objectively what the program is supposed to do, you give yourself a focus for your coding." Extreme Programming Explained, 2nd ed., Beck, p50.
By showing what a program is supposed to do, you have given a usage example.
"For a few years I've been using unit testing frameworks and test-driven development and encouraging others to do the same. The common predictable objections are "Writing unit tests takes too much time," or "How could I write tests first if I don’t know what it does yet?" And then there's the popular excuse: "Unit tests won't catch all the bugs." The sad misconception is that test-driven development is testing, which is understandable given the unfortunate name of the technique." Jeff Patton, StickyMinds Article.
TDD is about testing close to code changes. This provides a type of isolation (unit tests are performed in isolation) related to change. If you make hundreds of changes to your unit, then check it in and run your unit tests, how do you know which change or changes caused the failure? Changes often have strong coupling, which makes it difficult to figure out which change is the problem. If you change one thing in the code and then run your tests, you can diagnose any problem because it will be isolated to that change.
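A minimal test-first sketch in Java/JUnit 4. The shipping rule and the ShippingCalculator class are invented; the point is only the order of events: the tests state the expected behavior before the production code exists, and the implementation grows one small step at a time.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class ShippingCostTest {

        // In TDD these tests are written first; they state explicitly what the
        // unit is supposed to do before any production code exists.
        @Test
        public void ordersOverOneHundredShipFree() {
            assertEquals(0.0, new ShippingCalculator().costFor(150.0), 0.001);
        }

        @Test
        public void smallOrdersPayFlatRate() {
            assertEquals(5.0, new ShippingCalculator().costFor(20.0), 0.001);
        }

        // The implementation is then written (and changed one small step at a
        // time) until the tests pass; any failure is isolated to the last change.
        static class ShippingCalculator {
            double costFor(double orderTotal) {
                return orderTotal > 100.0 ? 0.0 : 5.0;
            }
        }
    }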


Adding Unit Tests to an Existing System

If you are working on existing code and you wish to start unit testing, you are in a difficult spot. If you are going to test your code in isolation, then creating the drivers, stubs, and simulators to use with your unit tests could be overwhelming in an existing system. But as with everything, do some study and analysis and figure out the best approach for your situation. A divide and conquer approach will typically help. Take small bites and chew slowly or you will never eat the entire elephant!
A typical approach is that any new code that is developed will have unit tests supplied with it as well. Some of the difficulty with this approach lies in which layer your unit lives in. If your unit is in a middle tier, then you have to create a driver for your unit. This driver may not be complicated, but it must reflect some state of the real object it proxies for. Also, you will have to create stubs for the lower-level units that you call. Those stubs could be standing in for existing units that have fairly sophisticated behavior, legacy coupling issues, and known but painful side effects.
Any tier of code other than the top tier can get caught in a vicious "downstream" flood. Suppose you have developed a lower-tier unit of code. You have created drivers for your unit which feed it test data that you have generated. Now suppose someone upstream adds three integers to the data object that drives your class. Just to test the lower bound, the upper bound, and one valid value for each of them, you may have to add 27 data objects to your set of generated test data. Therefore, downstream situations must be considered. The sketch after this paragraph shows where the 27 comes from.
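A small Java sketch of that arithmetic: three values (lower bound, one valid value, upper bound) for each of three new integers gives 3 x 3 x 3 = 27 data objects. The OrderData class, the field names, and the ranges are all assumptions made for illustration.

    import java.util.ArrayList;
    import java.util.List;

    public class TestDataGenerator {

        // Hypothetical data object that gained three new integer fields upstream.
        static class OrderData {
            final int quantity, priority, discountCode;
            OrderData(int quantity, int priority, int discountCode) {
                this.quantity = quantity;
                this.priority = priority;
                this.discountCode = discountCode;
            }
        }

        // Lower bound, one valid value, upper bound for each new field (assumed ranges).
        static final int[] QUANTITY = {1, 50, 100};
        static final int[] PRIORITY = {0, 5, 9};
        static final int[] DISCOUNT = {0, 10, 25};

        // 3 values x 3 values x 3 values = 27 data objects just to cover the
        // boundary and one-valid-value cases for three added integers.
        static List<OrderData> generate() {
            List<OrderData> data = new ArrayList<OrderData>();
            for (int q : QUANTITY)
                for (int p : PRIORITY)
                    for (int d : DISCOUNT)
                        data.add(new OrderData(q, p, d));
            return data; // data.size() == 27
        }
    }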
If your unit calls several methods that manipulate the data in a pipeline fashion, it increases the difficulty of creating stubs. For example, if you call method X with object A and it returns a modified object A, which we will call A', and you pass A' into method Y and it returns A'', and you pass A'' into method Z and it returns A''', then each hard-coded stub for X, Y, and Z must behave correctly for all of the variations of A used in your test suite (this is true for an existing system or for a newly developed system).
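To make the pipeline problem concrete, here is a deliberately simplified Java/JUnit 4 sketch where the stubs for X, Y, and Z carry canned answers. Strings stand in for the data object; in a real system each stub would need a correct canned result for every variation of A in the suite.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class PipelineStubTest {

        // Hypothetical hard-coded stubs for methods X, Y, and Z. Each must
        // return the right "next version" of the object (A -> A' -> A'' -> A''')
        // for every variation of A in the test suite, which is why hard-coded
        // stubs get expensive as the test data grows.
        static String stubX(String a) {
            if (a.equals("A")) return "A'";
            throw new IllegalArgumentException("stub X has no canned answer for " + a);
        }
        static String stubY(String aPrime) {
            if (aPrime.equals("A'")) return "A''";
            throw new IllegalArgumentException("stub Y has no canned answer for " + aPrime);
        }
        static String stubZ(String aDoublePrime) {
            if (aDoublePrime.equals("A''")) return "A'''";
            throw new IllegalArgumentException("stub Z has no canned answer for " + aDoublePrime);
        }

        @Test
        public void pipelineProducesFinalForm() {
            assertEquals("A'''", stubZ(stubY(stubX("A"))));
        }
    }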
If you decide to use programmer tests as your definition of unit tests, then you do not have to develop all of the drivers, stubs, and simulators required to test the code in isolation. In an existing system, write some unit tests that state simply and explicitly how the system currently behaves, and then start to make changes (refactoring changes) to the existing code. When adding new units of code to the existing system you may take the TDD approach from that point forward. The choices are yours to make.
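A brief sketch of such a test that simply pins down current behavior before refactoring begins, in Java/JUnit 4. The LegacyPriceCalculator class and its pricing rule are hypothetical.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class LegacyPriceCalculatorTest {

        // Hypothetical legacy unit whose current behavior we pin down before refactoring.
        static class LegacyPriceCalculator {
            double priceFor(int units) {
                return units * 9.99; // existing behavior, quirks and all
            }
        }

        // A programmer test that states how the system behaves today. It is
        // written before any refactoring so that a change which alters the
        // behavior is caught immediately.
        @Test
        public void tenUnitsCostNinetyNineNinety() {
            assertEquals(99.9, new LegacyPriceCalculator().priceFor(10), 0.001);
        }
    }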
But enough doom and gloom. I think it is understood that this is not a trivial task. The issue is going to be deciding how many unit tests are enough.

Issues After You Have Unit Tests

After you have unit tests in place there will be issues that arise because of their existence. What do you do when someone breaks someone else's unit tests? What do you do when QA finds a bug that could have been found if there had been more complete unit tests? What do you do when there is a system change that causes a chain reaction of changes in all of the test data?
These are all issues that easily escalate into the "blame game". That will not help teamwork or morale. Even when it seems justifiable, it is not wise to allow the team that works on the "core" technology to push around the supporting teams. In those situations segregation of code begins to occur. Pretty soon you have the situation where no one wants to work with the "core" team, the "core" team doesn't want to call anyone else's code, and the problems begin to increase.
So, you have to decide how you will constructively handle the opportunities that will arise. For example, someone makes changes that break someone else's unit tests. This opportunity can be viewed as good. It is good in that the "breakage" has been found and has been found quickly. That is a good thing. It is good in that we know which persons are involved in resolving the issue, the breaker and the breakee. That is a good thing. The two involved can get together and figure out the best solution for the business. Best solutions are good things. So, in this example, many good things occurred.
If you do not take advantage of these opportunities it could go something like this. Joe broke Sally's unit tests. Sally goes to Joe and says, "These unit tests represent a usage contract. You have violated the contract. Change your code." Joe says, "Don't be ridiculous. Your code doesn't do what it should and your unit tests are bogus. Update your code to work in the new world!" Sally says, "I don't have time to deal with your inability to work within a team environment. You fix it." Joe says, "ME, ME not working as a team player. It is you! Gosh!" I think you get the picture.

Conclusion

Make it clear to your development team what definition of unit tests is being used. Understand that unit testing is not free, and the expense increases with the amount of test data needed and the management of the databases involved. View failed unit tests as a good thing. The earlier you know a unit of code fails, the cheaper it is to fix.
After you have settled on your definition and use of unit tests, pick up a book on the subject, summarize it, and make it available to the people involved. Clear communication is always a problem (communication is a topic in every development methodology and business guide that I have studied), and getting people to agree on terms and usage will eliminate many wasted hours in meetings.
