Thursday 28 October 2010

TDD: The Starwars Answer vs The MacGyver Principle


"But what if you write a bug - like a typo or something - into the test? I mean, your tests aren't going to be perfect, are they?"

This is a genuine question someone - understandably - asked me earlier this week when I was introducing them to the idea of TDD.

I have two answers to this question:

1) The math answer:

If P(Error) is the probability that any given line of code you write contains an error, and for every t lines of test code you write t·n lines of production code (where n is greater than 1), then a bug can only slip through when an error in the production code coincides with an error in the very test that covers it - and you'll catch and fix t·n bugs in production code for every t bugs in the tests.

And since, for independent errors, P(A and B) = P(A) * P(B), you vastly reduce the number of undetected bugs caused by genuine code errors - eg accidental assignment, wrong operators, wrong variable names etc.
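
To put illustrative numbers on that (a sketch - the one-error-in-a-hundred-lines rate is invented, and it assumes the errors are independent):

P(error in a line of production code) = 0.01
P(error in the test line that covers it) = 0.01
P(both at once) = 0.01 * 0.01 = 0.0001

So if one line in a hundred contains a mistake, only one bug in ten thousand slips past both the code and its test.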

If you don't speak Math, the outcome is many, many fewer small bugs.

2) The Starwars Answer:

"These are not the bugs you are looking for."

No - really, typos and flipped negatives are cool to find, and they do cause many wasted hours of debugging, but this isn't why I do TDD.

TDD is most powerful when it really hurts. The pain comes because you realise that you have written, or planned to write, some code that is hard to test. TDD pokes you right in the coupling. It stabs you with your own statics and shoves your crappy assumptions right...

... you get the idea?

When you write a test for a class, or a unit of functionality, you are trying to exercise it outside of the application. So all those hooks and shortcuts and bits of spaghetti that seem so handy inside your app swiftly start to look like what they are: problems.
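
For example - purely a sketch, with every name invented - here's the same class before and after the dependency stops being grabbed from a static:

// Before - hard to test outside the app, because it reaches out to a global
public class BasketTotaller
{
    public function totalWithTax(subtotal:Number):Number
    {
        return subtotal * TaxRates.instance.currentRate; // hypothetical static singleton
    }
}

// After - the dependency arrives through the constructor,
// so a test can hand in any ITaxRates it likes
public class BasketTotaller
{
    private var _rates:ITaxRates;

    public function BasketTotaller(rates:ITaxRates)
    {
        _rates = rates;
    }

    public function totalWithTax(subtotal:Number):Number
    {
        return subtotal * _rates.currentRate;
    }
}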

But there's something TDD brings to your attention that's even harder to swallow...


TDD goes against the MacGyver principle

TDD exposes the numerous pathways through your code. If you're testing something as simple as a form with a set of radio buttons, you have to test every possible option for that radio selection. You imagined you were going to write half a dozen tests for this class and suddenly you've got 15. And you can still think of more special cases that aren't covered.
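
In FlexUnit-ish terms - every name here is made up - the multiplication looks like this:

[Test]
public function selecting_card_sets_method_to_card():void
{
    form.select(PaymentOption.CARD);
    assertEquals(PaymentOption.CARD, form.selectedMethod);
}

[Test]
public function selecting_paypal_sets_method_to_paypal():void
{
    form.select(PaymentOption.PAYPAL);
    assertEquals(PaymentOption.PAYPAL, form.selectedMethod);
}

// ... and one of these for every other option, plus the default state,
// plus what happens when you re-select, and so on ...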

This is a head-fuck because we like to pretend, to ourselves, that what we're trying to do is much, much easier and less complex than it actually is. I suspect there are very few genuine pessimists* in programming. We are self-selected MacGyver types with a tendency to see the solutions in any situation.

This process relies on the confidence trick of pretending to yourself that what you're undertaking is pretty straightforward, really. We have to turn a blind eye to the factorial expansion of pathways through our code presented by each new 'tiny' feature, because the alternative is lying awake at night wondering whether we've really covered every important combination of user actions, and knowing that we're never going to hit the deadline.**

Many Flash programmers are just as badass as MacGyver. They bravely code ahead, pushing the concept of failure to one side - they know they can cross between these two structures using only a coathanger and some magic strings (static) because there is no room for any other possibility.

And hell, it works out a lot of the time. We all built some awesome stuff in AS1, when we didn't even have compile-time checking or type safety. But if you're still MacGyver coding today, I have one piece of advice: don't look down.



* There are plenty of us who are cynical, but I think that's distinct from pessimism. We probably believe there are solutions to most problems, but that people are too damn stupid / greedy to allow them to be implemented.

** In reality, nobody ever hits the deadline with MacGyver coding, but they at least agree to the deadline, which most of the time is all the boss is looking for. With TDD, nobody agrees to the deadline the boss was hoping for. As always, TDD just shifts that pain forward in the timeline.


Friday 15 October 2010

Don't dehydrate your code


DRY (don't repeat yourself) is one of the first principles you learn as a fledgling coder.

When you build your first loop, you're embracing DRY, and all the wonderful code-shrinking that comes with it. And it feels good, and we're taught to seek out repetition as a terrible code smell, and do away with it as soon as possible.

But it's possible to be too DRY - no, really, it is.

When considering repetition for refactoring, there are two different questions we can ask:
  1. 'Are these two blocks of code the same?'
  2. 'Do these two blocks of code serve the same purpose?'

The different questions can lead to different answers for the same blocks of code. If the answer to question 2 is 'no', then you're potentially creating another, harder refactor down the line, when you realise that the operation needs to change for one case and not for the other.

These too-dry refactorings leave their own particular code smell: optional parameters. They're not always a sign of overly dry code - the robotlegs Context takes a useful optional parameter for 'autoStartup' - but when an optional parameter is littered through the body of a function, it can be a sign that your code has become dehydrated and you'd be better off splitting it back out again - either living with a little repetition or slicing the functionality up differently.
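
A sketch of the smell, with invented names throughout (formatCurrency included) - one 'shared' function kept alive by a flag, then split back out:

// Dehydrated: the optional parameter is consulted all the way through
public function renderPrice(item:Item, forInvoice:Boolean = false):String
{
    var price:Number = forInvoice ? item.netPrice : item.grossPrice;
    var prefix:String = forInvoice ? item.sku + ' - ' : '';
    var suffix:String = forInvoice ? ' (ex VAT)' : '';
    return prefix + formatCurrency(price) + suffix;
}

// Rehydrated: two functions, each free to change without breaking the other
public function renderBasketPrice(item:Item):String
{
    return formatCurrency(item.grossPrice);
}

public function renderInvoicePrice(item:Item):String
{
    return item.sku + ' - ' + formatCurrency(item.netPrice) + ' (ex VAT)';
}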

We make a similar call when we make decisions about inheritance and interfaces.

ClassB might have all the same functions as ClassA, and a few extras of its own, but unless ClassB truly "is a" ClassA, then there's no reason why the two should evolve together in the future. Better to tolerate the repetition because it correctly represents the cognitive model behind the application.

Similarly, unless ClassA and ClassB share obligations, and could potentially stand in for each other without your application logic losing sense, they shouldn't implement the same interface. Even if they have the same functions. Yes, even if they have the same functions.
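
For example (two invented classes): both of these have a close() method with an identical signature, but nothing in your application logic could sensibly swap one for the other, so they shouldn't share an ICloseable:

public class Door
{
    public function close():void { /* swing shut */ }
}

public class Invoice
{
    public function close():void { /* settle up and stop accepting line items */ }
}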

Shut up with the "it's less code" thing

Of course all of this requires us to recognise that "It's less code" is never a refactoring justification in itself. Often, great refactorings create more code, not less. The objective is always to make your application easier to maintain, change and grow.

So - unless you're writing your code for a programmable calculator from the 1980s, any time you hear yourself think or say (or write, I see this a ton on the robotlegs support forum) "but it's less code this way..." just give yourself a little slap. There are often good reasons to take the fewer-classes approach, but they need to be more fleshed out than 'less code'.

The scarce resource is your brain. Your attention, your cognition, your working memory, your mental models. And of course your time. An over-dry refactoring (particularly one done for the sake of 'less code') that requires a reversal later is expensive in terms of all these scarce resources.

Embrace (selective) repetition

A criticism of robotlegs is that you can end up with repetition of very simple blocks of code in your Commands and Mediators, as well as a lot of custom Events. It *feels* weird to type the same code two or three times to translate three different Events into the same corresponding action. But, in my mind, this is part of the power of the Command pattern.

Each Command encapsulates the logic steps required to respond to that situation - and they can freely change in future without impacting on each other. The code repeated in your Commands is usually cheap and simple - if it's not, think about farming some of the logic out to a helper of some kind.
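
To make that concrete, here's a sketch of two near-identical robotlegs Commands (the service interface and all the names are invented) - the repetition is the point:

// both extend org.robotlegs.mvcs.Command
public class LoadAccountOnLoginCommand extends Command
{
    [Inject]
    public var accountService:IAccountService; // hypothetical service interface

    override public function execute():void
    {
        accountService.load();
    }
}

public class LoadAccountOnSessionRestoreCommand extends Command
{
    [Inject]
    public var accountService:IAccountService;

    override public function execute():void
    {
        accountService.load(); // identical today - free to diverge tomorrow
    }
}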

So don't sweat the 'glue' code having similarity between Commands and Mediators sometimes. Code that is dehydrated is just as tricky (sometimes trickier) to work with as code that needs a little DRYing off.

Thursday 7 October 2010

TDD: Make your intentions clear(er)

A habit I've got into recently is creating more and more classes specifically for the purpose of testing.

If the class extends a class which is meaningful to my application it's called ThingSupport (where Thing is the name of the Class it extends).

If the class is extending a more generic class - eg Sprite, or Command with an empty execute - then it's called SampleThing. (Again, Thing is replaced with the name of the Class it extends).

So - my source library now contains a growing number of classes like:

SampleCommandA
SampleCommandB
SampleVO
SampleEvent

UserVOSupport
MainMenuItemSupport
AccountModelSupport

The Support classes tend to populate the constructor parameters.

So if the constructor of UserVO is

public function UserVO(key:uint, username:String, password:String, jobTitle:String /* etc */)

then the constructor of UserVOSupport is

public function UserVOSupport(key:uint = 1)
{
    super(key, 'username_' + key, 'password_' + key, 'jobTitle_' + key /* etc */);
}

Now I can create a dozen UserVOs with distinct properties, passing only an incrementing uint to the constructor - something like this sketch:
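
var users:Array = [];
for (var i:uint = 1; i <= 12; i++)
{
    users.push(new UserVOSupport(i)); // each gets a distinct username, password, jobTitle...
}

Lovely ... but...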

...it's not about typing less

The ThingSupport classes do two things: they make it easier for me to instantiate the class, and they isolate my tests from changes to the constructors of the Thing classes. If I update the Thing constructor to include another param, or change the order, I don't have to touch my tests - I just make a matching change in one place: ThingSupport.

When my coder friends protest that they find TDD tiresome because every change means loads of maintenance on your tests, it makes me think they're not isolating their tests well enough.


So what about the Sample classes?

What's the point in creating a SampleEvent? If the test requires a particular class, why bother to create a sample class as well? Why not just use Event?

This is the newest thing I've added to my workflow - and hey, by the end of next week I might have changed my mind - but I'm experimenting with using it to make my intentions clearer in my tests.

As in - what's *real* and what's just test scaffolding. Particularly in integration tests where multiple classes are working together. The use of Sample as a prefix is helping me keep my head straight on which elements are being tested and which are simply provided to wire things up.

That way, when I look through my code and it uses SampleCommandA I know that the SampleCommandA is not part of what's being tested - it's just fodder for the test.
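
In practice a Sample class is as dumb as it can legally be - something like this (a robotlegs-flavoured sketch):

public class SampleCommandA extends Command
{
    override public function execute():void
    {
        // deliberately empty - this class exists only to be mapped and fired in tests
    }
}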


Couldn't you just use mocking?

I do use mocking (with mockolate) a lot - though really I tend towards stubbing and verification, via mockolate. I use it to mock interfaces for services, so that I can verify that a method on the service has been called by a command - with the correct treatment of the event payload - even before I've implemented the concrete service behind the interface.

I also mock (for verification) framework elements - injector, CommandMap - to verify that runtime mappings have been made correctly - again, usually in Commands. It's easy to think of Commands as being so banal that they're barely worth testing, but a Command that doesn't actually run the method on the service, or doesn't manipulate the event payload correctly, or fails to make a mapping could potentially lead to hours of debugging once compiled into the wider app.


Focus on minimising your WTFs per minute

As @mark_star tweeted this week: "The only valid measurement of code quality is WTFs/minute" (from Clean Code).

I believe the use of Support and Sample classes is significantly lowering my WTF rate. It feels good.

I'm starting to think that a focus on "less typing = better" is really dangerous in a coder. I can type at 80 wpm. Can I code at 80 wpm? Can I f***! This 'more code = slower' belief is part of what keeps people from embracing TDD.

But it's a dangerous obsession for more than one reason. The search for the perfect IDE may actually lead to an increase in the kinds of WTFs that go on for hours and even days.

@_ondina posted some wisdom on the robotlegs board today about the value of being intimate enough with your code that you can write it by hand. We were discussing code generation tools for robotlegs at the time, and she made the valid point that if you rely too much on code generation, small errors - [inject] vs [Inject] - creep into your code and you have no way of diagnosing them, because you haven't built up enough experience of what looks right for your brain's pattern-recognition system to tell you what's wrong.

Food for thought.