TDD - Red-Light-Green-Light:: A critical view

Subject: The concept of red-light-green-light for TDD/BDD style testing has been around since the dawn of time (well, almost). Having written thousands of tests using this approach, I find myself questioning the validity of the principle.

The issue:

False positive or a valid test strategy that can be trusted?

A critical view:

I agree that the red-light-green-light concept has some validity, but who has ever written 2000 tests for a system that goes through a ton of changes due to the organic nature of the application and has not had to change, delete or restructure their existing tests? If your answer is "Yes, I have had situations where I had to refactor my code and it forced me to rewrite, change or delete my existing tests", read on; otherwise press CTRL+ALT+Del :-)

Once a test has been written and has failed (red light), you complete your code and get the green light; the test for that functionality is now in green-light mode. It can never validly return to the original red light for as long as the test exists, even if the test itself is unchanged and only the code it tests is changed so that the test fails. Why, you ask? Because there is no guarantee that whatever triggered the initial red light when you created the test is the same reason the test is failing now, after the code change.
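To make that concrete, here is a minimal sketch in Python. Every name in it (`check_ten_percent_off`, `stub`, `implemented`, `changed`) is made up for illustration and is not from any real project. It shows the same check going red twice, each time for a different reason.

```python
def check_ten_percent_off(apply_discount):
    """The 'test': returns 'green', or a description of why it is red."""
    try:
        actual = apply_discount(200, 10)
    except Exception as exc:
        # Red because the call itself blew up.
        return f"red: {type(exc).__name__}"
    if actual != 180:
        # Red because the behaviour is wrong.
        return f"red: expected 180, got {actual}"
    return "green"


# Stage 1: the first red light. The function is only a stub.
def stub(price, percent):
    raise NotImplementedError


# Stage 2: green light once the code is completed.
def implemented(price, percent):
    return price - price * percent // 100


# Stage 3: a later change treats percent as a fraction. Red again,
# but not for the reason that produced the first red light.
def changed(price, percent):
    return price - price * percent


print(check_ten_percent_off(stub))         # red: NotImplementedError
print(check_ten_percent_off(implemented))  # green
print(check_ten_percent_off(changed))      # red: expected 180, got -1800
```

The first red only proved that the test notices a missing implementation. It says nothing about whether the test would notice the kind of breakage introduced in stage 3; here it happens to, but that was never verified.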

Furthermore, when the same test is changed so that it compiles again after a compile-breaking code change, the green light has once again been invalidated. Why? Because there is no guarantee that the fixed test is in the same green-light state it was in when it first ran successfully.

To make matters worse, if you fix a compile-breaking test without going through the red-light-green-light process again, your test fix is essentially useless and very dangerous, as it now provides you with a false positive at best. Thinking your code has passed all tests and works correctly is far worse than not having any tests at all, at least for the part of the system that the test code covers.
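A rough illustration of such a false positive, again in Python with invented names. Python has no compile step, so a changed return type stands in for the compile-breaking change: assume a hypothetical `sum_csv` used to return an int and now returns a dict, and the test was edited only until it passed again.

```python
def fixed_test(sum_csv):
    """The test as 'fixed' after the code change. It went straight from
    green to green and was never seen to fail."""
    result = sum_csv("3,4")
    assert result  # any non-empty dict is truthy, so this checks almost nothing


def strict_test(sum_csv):
    """What the fix should have asserted."""
    result = sum_csv("3,4")
    assert result["total"] == 7


def sum_csv_correct(text):
    return {"total": sum(int(part) for part in text.split(",")), "errors": []}


def sum_csv_broken(text):
    # Deliberately wrong implementation, used only to probe the tests.
    return {"total": 0, "errors": []}


def passes(test, impl):
    """Run a test against an implementation; True means green."""
    try:
        test(impl)
        return True
    except AssertionError:
        return False


print(passes(fixed_test, sum_csv_correct))  # True
print(passes(fixed_test, sum_csv_broken))   # True, the false positive
print(passes(strict_test, sum_csv_broken))  # False
```

The fixed test is green against both the correct and the broken code, which is exactly the false sense of safety described above.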

What to do?

My recommendation is to delete the affected tests and re-create them from scratch. I have to agree that this is hard to do, and hard to justify if it has a significant impact on project deadlines.

What do you think?

Posted @ Monday, June 14, 2010 4:21 PM

Comments on this entry:

re: TDD - Red-Light-Green_Light:: A critical view
by Ryan at 6/14/2010 4:28 PM

Hopefully your tests are testing a small enough piece of code that when they fail it's very obvious what is broken. I used to write tests with many expectations, but eventually I realized this was leading to a lot of false breaks. (That is, many tests broke at once and the reason for it wasn't obvious.) Now I'm very strict about having one expectation per test and when things fail, it's usually only one test.

re: TDD - Red-Light-Green_Light:: A critical view
by Erik at 6/17/2010 7:38 AM

I agree, generally having one assert per test is a good idea. Also, if you're changing code that much, now's the time to refactor where it makes sense to not only make the code cleaner, but also make the testing easier. Smaller classes are easier to test, and make it much easier to write appropriate tests that don't give false positives.

re: TDD - Red-Light-Green_Light:: A critical view
by Renso at 6/28/2010 7:17 AM

Thanks for the feedback, Ryan. I think you said the magic word: "small". I have always liked large methods where you can see "everything" that is going on instead of having to jump around in the app to find small methods. However, although SRP (single responsibility principle) leads to lots of small classes and methods, it is great for maintenance, makes the tests really small and, as you said, hopefully means not many are affected when something changes.

Erik, thanks for the input, we seem to all agree.

However, my original point is that when I have to fix a test because of a code change, the red-light-green-light principle is violated, because I am going from green light to green light. So either deleting the test, or making sure it first fails (red light) for the correct reasons before it gets the green light, is critical. What do you think?
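One way to sketch "fails for the correct reasons, then passes" in Python is shown below. The helper `red_then_green` and the `is_even` functions are hypothetical, and treating an `AssertionError` as the "correct" kind of red is an assumption of this sketch, not an established rule.

```python
def red_then_green(test, broken_impl, real_impl):
    """Return True only if `test` fails an assertion against `broken_impl`
    and then passes against `real_impl`."""
    try:
        test(broken_impl)
    except AssertionError:
        pass              # red for the intended reason
    except Exception:
        return False      # red, but for an unrelated reason
    else:
        return False      # never went red, so the test proves nothing
    try:
        test(real_impl)
    except Exception:
        return False      # not green against the real code
    return True


def is_even_real(n):
    return n % 2 == 0


def is_even_broken(n):
    return True  # deliberately wrong, used only to force a red light


def weak_test(is_even):
    assert is_even(4)


def useful_test(is_even):
    assert not is_even(3)


print(red_then_green(weak_test, is_even_broken, is_even_real))    # False
print(red_then_green(useful_test, is_even_broken, is_even_real))  # True
```

Running a fixed test through a check like this is cheaper than deleting and rewriting it, although it only covers the particular breakage you thought to simulate.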