Putting the Unit in Unit Tests

Some excellent comments to my last post. Please allow me to respond:

Should you test your persistence layer? Yes.
Should you test your system from end to end? Yes.
Should you test complex edge cases? Yes.

If you assumed the answer to any (sensible) question that starts with ‘Should you test…’ is ‘Yes’, you wouldn’t be too far from the truth.

However…

Should you test any of those things in your Unit tests? No. Unit tests (to me) are the things that the developers write and run many times a day, and that should always pass. Testing against DBs is hideously slow: orders of magnitude too slow for unit tests. Tests against DBs can too easily fail for the wrong reasons. Tests that require masses of setup code and mock objects just to run are too brittle. Lots of mock objects can hurt you when refactoring, as they tend to break when you pull their superclasses apart.

Also note that I didn’t say ‘Don’t use Mocks’. Sometimes they are required. MockPersistenceLayer being a good example. Although if you have interface/impl separation (which you should for your persistence layer), then ‘InMemoryPersistenceLayer’ or ‘HashMapBackedPersistenceLayer’ would be equally good names, and less open to misinterpretation than MockPersistenceLayer.
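For illustration, the separation might look something like this (the names and methods are invented, not from any real project):

// PersistenceLayer.java -- the production code depends only on this interface.
public interface PersistenceLayer {
    void save(String id, Object entity);
    Object load(String id);
}

// InMemoryPersistenceLayer.java -- a trivial map-backed implementation for
// tests. The name says exactly what it is; 'Mock' invites misinterpretation.
import java.util.HashMap;
import java.util.Map;

public class InMemoryPersistenceLayer implements PersistenceLayer {
    private final Map store = new HashMap();

    public void save(String id, Object entity) {
        store.put(id, entity);
    }

    public Object load(String id) {
        return store.get(id);
    }
}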

The other kind of Mocks, the ones that verify their methods were called in a certain sequence with certain parameters can be useful, but, as I said, can be brittle and prone to breakage during refactoring, so should be used with caution. They also make it less clear what’s being tested. Chances are the system would benefit from finer grained objects that could be tested individually, which should reduce the need for a full Mock.

It’s perfectly OK to have a separate set of JUnit tests that aren’t ‘Unit’ tests, ones that (for example) test the persistence of each of your objects exactly once. Run them as often as you want, add them to your commit script if you feel the need, just don’t bolt them into the main body of unit tests.
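In JUnit 3 terms, that could be as simple as a suite class that the main test suite never mentions (the two persistence test classes here are invented for the example):

import junit.framework.Test;
import junit.framework.TestSuite;

// A deliberately separate suite for the slow, database-hitting tests.
// Nothing in the main unit test suite refers to it, so the fast tests
// stay fast. Run it from your commit script if you feel the need.
public class PersistenceTestSuite {
    public static Test suite() {
        TestSuite suite = new TestSuite("Persistence tests (not unit tests)");
        suite.addTestSuite(CustomerPersistenceTest.class);
        suite.addTestSuite(OrderPersistenceTest.class);
        return suite;
    }
}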

Project guidelines

Just some thoughts, mostly for my own benefit. I reserve the right to change my mind, and even do complete U-turns where necessary.

  • Use CruiseControl, or some other form of continuous integration.
  • Use Clover, and fail check-ins if there is any code not covered by a unit test.
  • Block System.out.println with a CVS commit script.
  • Ban unit tests from hitting the database.
  • Don’t write tests that require more than 10 lines of (non production) setup code per test method.
  • Don’t write code that requires a (JUnit) setUp method that is bigger than all your test methods combined.
  • Use mock objects very sparingly. Consider them indications of code smell.
  • Don’t use anonymous subclasses in testcases just to loosen method permissions for testing, or to stub out hard-to-test behaviour. Code smell warning.
  • Do put your test code in a parallel directory structure, in the same packages as your production code.
  • Do test protected and package-private methods (there’s a sketch after this list).
  • Don’t worry too much about testing private methods – they must be reachable from a public, package-private or protected method, or they’re redundant and can be deleted.
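To illustrate those last few points, a sketch (package and class names invented):

// src/com/example/Order.java
package com.example;

public class Order {
    // Package-private: invisible outside com.example, but still worth testing.
    int totalInPence() {
        return 4200;
    }
}

// test/com/example/OrderTest.java -- parallel directory, same package,
// so the test can call the package-private method with no trickery.
package com.example;

import junit.framework.TestCase;

public class OrderTest extends TestCase {
    public void testTotalInPence() {
        assertEquals(4200, new Order().totalInPence());
    }
}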

Don’t break the Law of Demeter

…Or the OO cops will get you. The best summary of the Law of Demeter I’ve seen comes from c2.com. It goes something like this:

You can play with yourself.
You can play with your own toys (but you can’t take them apart).
You can play with toys that were given to you.
And you can play with toys you’ve made yourself.

That’s it. To recast it in geek-speak:

An object can call its own methods.
An object can call methods on its own fields. (But not on its fields’ fields)
An object can call methods on parameters passed to it.
An object can call methods on objects it has instantiated.

That’s it. You can’t call methods on the fields of other objects, so anything that looks like foo.getBar().getBaz().doSomething() is not allowed. What you can do is add a method to foo called ‘doSomething’ that delegates to a ‘doSomething’ method on its ‘bar’ field, which in turn delegates to its ‘baz’ field. That doesn’t break the law, and it keeps foo’s caller from knowing about bar and baz, and foo from knowing about baz.
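Spelt out in code (all the names are invented):

class Client {
    void use(Foo foo) {
        // Breaks the law: the caller reaches through foo's internals.
        // foo.getBar().getBaz().doSomething();

        // Within the law: the caller talks only to foo.
        foo.doSomething();
    }
}

class Foo {
    private Bar bar = new Bar();

    void doSomething() {
        bar.doSomething(); // Foo knows about bar, but nothing about baz.
    }
}

class Bar {
    private Baz baz = new Baz();

    void doSomething() {
        baz.doSomething(); // only Bar knows about baz.
    }
}

class Baz {
    void doSomething() {
        // ...where the work actually happens.
    }
}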

It’s all about preserving encapsulation.

If you take it to the extreme, you find that getter methods break the Law of Demeter. Ask yourself why other objects need to get at your object’s fields. Can’t they just tell your object what they want it to do directly? Setter methods aren’t much better: they still expose an object’s state to the world.
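The usual cure is ‘tell, don’t ask’. A contrived example:

class Account {
    private int balanceInPence;

    // Asking: callers do getBalance(), make the decision themselves, then
    // push the new state back in with a setter. The business rule leaks
    // out of Account.

    // Telling: callers state their intent, and Account keeps its state,
    // and the rules about that state, to itself.
    void withdraw(int amountInPence) {
        if (amountInPence > balanceInPence) {
            throw new IllegalStateException("insufficient funds");
        }
        balanceInPence -= amountInPence;
    }
}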

Something new every day

IDEA is just the coolest IDE since sliced bread. Today I discovered the ‘invert if’ and ‘split if’ refactorings. Just put the text cursor next to an ‘if’ statement and IDEA will ask if you would like to invert it or split it! Consider the following:

if (foo != null && bar != null) {
    // Do some stuff
    return true;
}
return false;

After ‘split if’:

if (foo != null) {
    if (bar != null) {
        // Do some stuff
        return true;
    }
}
return false;

After two ‘invert ifs’:

if (foo == null) {
    return false;
}
if (bar == null) {
    return false;
}

// Do some stuff
return true;

I think this is much clearer and more open to further refactorings, as it clearly separates the guard clauses from what the method is actually supposed to be doing. Cool.

Further JUnit

Following on from Bill’s excellent comments, and chats with Geoff and Alan, I agree that there is more than one way to consider JUnit. One is JUnit as a standalone unit testing tool, with some extensions bolted on. The other is JUnit as a framework. As a developer’s unit testing tool, JUnit works very well, and does exactly what it says on the tin. As a framework, it is not ideal. It started life as a Java implementation of SUnit, the Smalltalk testing framework, and has been refactored over time to the point where it is at least possible to write plug-in extensions, if not necessarily easy.

Alan made the excellent point that JUnit could (should?) be simply a straightforward test framework that makes assertions and fires events. The mechanism of orchestrating the tests and handling the events could be entirely separate. Which would possibly have made SuiteRunner easier to implement as a layer on top of JUnit.
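A rough sketch of that separation (entirely hypothetical; this is not JUnit’s actual API):

// The framework's only jobs: run tests, make assertions, fire events.
interface TestEventListener {
    void testStarted(String testName);
    void testPassed(String testName);
    void testFailed(String testName, Throwable cause);
}

// Orchestration and reporting live elsewhere. A console reporter, an IDE
// plug-in, or SuiteRunner could each be just another listener.
class QuietConsoleListener implements TestEventListener {
    public void testStarted(String testName) { /* silent */ }
    public void testPassed(String testName) { /* silent */ }
    public void testFailed(String testName, Throwable cause) {
        System.err.println(testName + " FAILED: " + cause.getMessage());
    }
}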

Stop dissing JUnit

JUnit seems to have come under fire recently, with articles such as this, and posts like this one from Geoff and this one from Cedric. The Artima article complains that JUnit’s reporting features are underdeveloped, and Cedric and Geoff want to see success messages as well as failure ones.

JUnit is a ‘unit testing’ tool. It’s supposed to be run many times a day. Its output is supposed to be transient. All the tests are supposed to pass all the time. Red bar / green bar. Pass or fail. If all the tests are passing, it should remain silent. If a test fails, print a useful message. If you have a suite of thousands of tests, how easy is it to find one failure message if it’s mired inside 999 inane ‘test passed, foo does equal foo’ printouts?

The same goes for reporting. It is actually possible to get detailed reports from JUnit as XML, which can be processed by an Ant task to produce nice-looking web pages, but if it’s being used correctly, all the tests will always be at 100%, as you don’t commit if any of the tests are failing, right?

Bolting on extra features blurs the line between functional testing and unit testing, and I for one am happy for JUnit to remain clearly focussed on doing one job very well, which it does admirably.

Side note: the Artima article is called ‘Why we refactored JUnit’. Following neatly on from my earlier post, what they did was neither big-R nor small-r refactoring. What they actually did was write a brand new, JUnit-testcase-compatible tool from the ground up. Refactoring is defined as ‘improving the design of existing code’. In the strictest sense, the only people who can refactor JUnit are the JUnit committers and contributors, and the result would still have been JUnit, with all the same behaviour, but with a cleaner internal structure.

Refactoring refactored

Something Alan said a few weeks ago has just bubbled to the surface: there appear to be two main ways that people think about refactoring. I’m going to call them ‘Refactoring’ (big ‘R’) and ‘refactoring’ (small ‘r’).

The apparently more common usage is the big-R form, Refactoring as a task in its own right. Typical comments would be “We have to spend the next two weeks Refactoring this system”. The small-r form is done as part of everyday programming activities, at the same level as writing statements and method calls. Nobody says, “Today my tasks will include writing for-loops, methods and if-else statements.” (Although that is a great way of describing your activities without actually saying anything about what you will be doing.)

The XP / Test-first crowd tend to think of refactoring in the small-r sense. It’s just one piece of the Test-Code-(r)efactor cycle that they go through many times a day. In this way (in a perfect world), the code is kept clean and lean enough that Refactoring (big-R) is never required. Having to plan a Refactoring session for some code probably means that it’s in what the XP books call ‘refactoring debt’ (it’s like oxygen debt for software). The system wasn’t refactored mercilessly enough (if at all) while it was being grown, and, just like a human in oxygen debt, progress slows down and pain increases, until it becomes necessary to stop and do something about it, in a big Refactoring session.

I am of the opinion that just as aerobic exercise (eg. jogging) is less painful and can be done for longer than anaerobic exercise (eg. sprinting), its better to keep to the discipline of refactoring as you go (ideally as part of the test-code-refactor cycle), and try to minimise the incidences of Refactoring required (during which customer-visible progress tends to slow or stop).