Tuesday 28 September 2010

Harnessing the Drive in Test Driven Development

I had an interesting exchange with someone on Twitter last night - he'd been working on some math that kind of looked right to him now, but he had no way of knowing whether it was 'right' or not.

I teased him that he should have done TDD, but he felt that TDD meant you had to know your API ahead of coding, and his situation was evolving, so that ruled out TDD.

I was arguing that actually TDD is ideal for when you're not quite sure where you're headed - a viewpoint that didn't fly with his experience - so this is an attempt to further explain that sentiment.


Your brain and body will try to resist TDD

A common barrier to adopting TDD (this is what my colleagues and peers come back with over and over) is "I don't know enough about what I'm doing to write the test yet."

My response is that if you don't know enough about what you're doing to write the test yet, you sure as hell don't know enough to write the code yet!

Test Driven Development shifts a ton of that 'wtf am I trying to do?' pain to the front of the process. And that's hard to get used to. It exposes what you don't have clarity about - when you really just want to get on and pretend you do know where you're headed.

So - how can TDD possibly help when you don't have a clear idea of your direction?


TDD means more than one kind of test

I do three kinds of tests: end-to-end tests, integration tests and unit tests. Combining all three test varieties is the key to driving evolving development. (Which, IMHO, is the only kind there really is.)

I write the end-to-end tests first. An end-to-end test describes a user story. If you don't know what your user stories are then you need to get those together before you take another step. User stories will be numerous, even in the simplest app, so don't overwhelm yourself - just start with the shortest, simplest user story in your requirements.


User story / end-to-end tests

In a complex app, the user stories are rarely actually end-to-end (startup to shutdown) but they capture a unit of meaningful interaction with the application.

There is only one user story end-to-end test in each test case. Some examples from my current app include:

LibrarySimpleSearchReturnsInvalid
LibrarySimpleSearchProducesNoResults
LibrarySimpleSearchProducesResults
LibraryShowAllShowsAllItems
LibraryAdvancedSearchProducesResultsByType
LibraryAdvancedSearchProducesResultsByExclusion

... you get the idea.

In each case, the test recreates the user experience from the moment of opening the library (which is a searchable, browsable set of resources of different types - jpg, document, video etc) to the moment the search results are received.

This means it's a UI-driven test. I have the test code enter text, push buttons etc, and I delve into the display list to verify that the correct items / text have appeared on screen at the end. Usually this verification is asynchronous, to allow time for transitions.
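To make that concrete, here's roughly the shape of one of those tests. This is a hedged sketch rather than my actual code: findViewOnStage, enterText, clickButton and runAfterDelay are hypothetical helpers standing in for whatever your UI-driving tool provides (RobotEyes, in my case), and SearchWindow / ResultsViewer are invented view classes.

package tests {
    import asunit.framework.TestCase;

    public class LibrarySimpleSearchProducesResults extends TestCase {

        public function testSearchForKnownTermShowsResultsOnScreen():void {
            // drive the UI exactly as a user would
            var search:SearchWindow = findViewOnStage(SearchWindow);  // hypothetical stage lookup
            enterText(search.searchInput, "whales");                  // hypothetical: types into the field
            clickButton(search.goButton);                             // hypothetical: dispatches a real click

            // verify asynchronously, so transitions have time to finish
            runAfterDelay(1000, function():void {                     // hypothetical async helper
                var results:ResultsViewer = findViewOnStage(ResultsViewer);
                assertTrue("expected result items on screen", results.numItems > 0);
            });
        }
    }
}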


Integration / functional area tests

These test a component of functionality. For example the search window, or the results viewer.

Unlike unit tests, they make use of real concrete instances of the classes needed to fully instantiate and work with the components being tested. If the functional area depends on the framework to wire it together, the framework (robotlegs in my case) is instantiated in order to wire up the component.

In my current app I have an integration test for the main menu:

NestedMenuTest

This menu has multiple layers of nested items and has to collapse all / expand all / auto-collapse and so on in response to checkbox clicks. My integration tests check that the scrolling behaves itself when the items are being expanded/collapsed.

test_max_scroll_then_collapseAll_resolves
test_mid_scroll_then_expandAll_keeps_top_visible_item_in_place_and_scales_scroller

and so on...


Usually, integration tests are event driven - I kick it all off by manually firing an event. Often, but not always, they require you to use the display list to verify the results.
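A rough sketch of the shape, with hypothetical names again - MenuTestContext, NestedMenu, MenuEvent, scrollTo and scrollPosition are stand-ins, not my actual classes:

package tests {
    import asunit.framework.TestCase;
    import flash.display.Sprite;

    public class NestedMenuTest extends TestCase {

        private var contextView:Sprite;

        override protected function setUp():void {
            super.setUp();
            // real framework wiring, real classes, no mocks
            contextView = new Sprite();
            new MenuTestContext(contextView); // hypothetical robotlegs context that builds the menu
        }

        public function test_max_scroll_then_collapseAll_resolves():void {
            // dig into the display list for the real menu instance
            var menu:NestedMenu = contextView.getChildAt(0) as NestedMenu;
            menu.scrollTo(menu.maxScroll); // hypothetical menu API

            // kick it all off by manually firing the event the checkbox would fire
            menu.dispatchEvent(new MenuEvent(MenuEvent.COLLAPSE_ALL)); // hypothetical event type

            assertTrue("scroll position resolves within the collapsed bounds",
                       menu.scrollPosition <= menu.maxScroll);
        }
    }
}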


Unit / API tests

These test that a specific class does what it is supposed to. They test all the public (API) functions of a class, sometimes multiple times if there are errors to be thrown or alternative paths through the class itself.

There is a received wisdom that says to test all of the API except for property accessors. I tend to test my property accessors as well, because there is no limit to what I can screw up, and it's faster to get them right at this point than when the error emerges later.

Instead of using concrete instances of their complex dependencies (eg services), my unit tests make use of mocks (using Drew Bourne's Mockolate) to verify that they've acted upon those classes correctly.

If I was testing that a command pulled values from the event that triggered it, did some jiggery pokery with these values and then used the results in calling a method on the appropriate service, I would mock the services to verify that call, rather than try to test against the results of the call.

Here, I've mocked the two services, lessonLoaderService / joinedLessonLoaderService:
public function testLoadsLessonIfRequestEventNotJoinedLesson():void {
    // a request event whose data says this is NOT a joined lesson
    var lessonLoadRequestData:ILessonLoadRequestData = new LessonLoadRequestDTO("test", "testSwfPath", true, '', false);
    var testEvent:LessonDownloadEvent = new LessonDownloadEvent(LessonDownloadEvent.LESSON_DOWNLOAD_READY, lessonLoadRequestData);
    instance.event = testEvent;
    instance.execute();

    // verify the command called the plain loader service, exactly once, with the right path
    verify(instance.lessonLoaderService).method("loadLesson").args(equalTo('testSwfPath'));
    verify(instance.lessonLoaderService).method("loadLesson").once();
}

public function testLoadsJoinedIfRequestEventIsJoinedLesson():void {
    // identical request data, except the final 'joined' flag is now true
    var lessonLoadRequestData:ILessonLoadRequestData = new LessonLoadRequestDTO("test", "testSwfPath", true, '', true);
    var testEvent:LessonDownloadEvent = new LessonDownloadEvent(LessonDownloadEvent.LESSON_DOWNLOAD_READY, lessonLoadRequestData);
    instance.event = testEvent;
    instance.execute();

    // this time the JOINED loader service should receive the call
    verify(instance.joinedLessonLoaderService).method("loadLesson").args(equalTo('testSwfPath'));
    verify(instance.joinedLessonLoaderService).method("loadLesson").once();
}
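For context, the command and its mocked services are created in setUp. A rough sketch, assuming Mockolate's nice() helper and that the service class has already been run through Mockolate's asynchronous prepare() step (see the Mockolate docs for the real incantation) - LoadLessonCommand and ILessonLoaderService are names guessed from the tests above:

override protected function setUp():void {
    super.setUp();
    instance = new LoadLessonCommand();                        // the command under test (name assumed)
    // nice mocks stand in for the real services, so the tests can verify calls
    instance.lessonLoaderService = nice(ILessonLoaderService); // assumes prepare() has already completed
    instance.joinedLessonLoaderService = nice(ILessonLoaderService);
}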


Putting it all together

Often, we put applications together from the bottom up. With half an eye on the requirements, we start thinking about interfaces and event types and functional areas. This works, but it can also result in some YAGNI code, as well as in throwing out code that seemed like it was going to be relevant until you realised that the requirements weren't complete.

I think there's more sanity in a workflow that runs this way:

1) User story
2) End-to-end test that verifies this user story (or part of it - this can evolve)
3) Integration tests for the functional areas required to fulfil the end-to-end-test
4) Unit tests for the classes required to provide the functional areas
5) Code to pass 4, to pass 3, to pass 2, to verify against 1

... and when you have no fails, then add to 2, add to 3, add to 4, do 5, rinse and repeat etc.

Doing it this way, the consequences are:
  1. A lot of head scratching early in each cycle (creating end-to-end tests is hard).
  2. Usually having failing tests in your test suite, until you're done adding a feature/user story.
  3. Always being able to tell what the hell you were doing when you last stopped ... because your failing tests make that obvious.
  4. Never writing code that doesn't actually add value to the project by contributing to the implementation of a user story.
  5. Always working towards a 'shippable' goal, which is good for the client (and your cash flow if you bill against features) and also allows real user feedback to improve the work still to be done.
  6. Reduced cognitive load for you at a micro level - you fix problems in the code while that part of the code is what you're focussed on.
  7. Reduced cognitive load for you at a macro level - you don't have to hold the 'where am I going' part in your head, or remember to test manually, because your user story tests have that covered.


I would argue that as a consequence of those last two there's a bigger reward: being able to show up more fully for the rest of your life. A bug, or a concern about whether I've really implemented feature X correctly, impacts on my ability to be present for my family. I'm kind of not-really-there at dinner, because 90% of my brain is background processing code stuff. This still happens with TDD, but it happens a lot less.

So, if you don't find that TDD is improving your code and your process - not just your output but also your enjoyment - then my (cheeky) suggestion is that you've not discovered what it's really about yet.

In my experience, TDD is fun. Chemicals in my brain fun.


PS. If you haven't already got it, this book is essential reading on harnessing the Drive in TDD: http://www.growing-object-oriented-software.com/ Props to Joel Hooks for recommending it.

11 comments:

shaun said...

Nice post! The embedded code is very hard to read though as the indenting has been stripped. How about embedding a Gist?

nek said...

Amazing! I've just taken my first step on the same road and here is a crystal-clear roadmap. Also I think you have a mixed BDD/TDD technique.

From my experience it's good to have end-to-end tests treat your system as a black box and just emulate user actions.
I use cucumber (http://cuke.info) to do that with my flex/air apps (with the help of the amazing melomel lib http://melomel.info).

Can you tell me, from the height of your TDD experience, whether this is a good idea?

Anonymous said...

Thanks for the article, @stray. I think I see how you can use TDD to help guide API development; especially if you use a mocking framework to mock up what you think would be a good API, then test against that.

However, I'm still not convinced that TDD is a net win: it's not clear to me that the value added by writing, debugging, re-writing, re-debugging (when specs change) and maintaining tests written in TDD style (eg, exhaustive and up-front) outweighs the costs associated with it.

For example (full disclosure: I read this comparison somewhere else, I didn't think of it), consider two “famous” sudoku solvers, Norvig's[0] and Jeffries'[1].

Jeffries tries to write his Sudoku solver in TDD-style… But, five blog posts and a whole lot of code later (I stopped counting at 400 lines), he has what I can only guess is a partially finished (but thoroughly tested!) sort-of-Sudoku-solver.

Norvig, on the other hand, writes a complete Sudoku solver in 200 lines of code, which happen to include a few, albeit less exhaustive, tests.

Of course, this isn't an entirely fair comparison (first, because not much is a fair comparison when one of the comparands is Norvig, but also because the goals of the articles are slightly different)… But it does illustrate a situation where TDD clearly didn't produce better code.

[0]: http://norvig.com/sudoku.html
[1]: http://xprogramming.com/articles/oksudoku/

Stray said...

@shaun - good suggestion. This template totally hates me. I'll do the Gist thing.

@nek - great! Always nice when it's good timing for someone.

Yes - the end-to-end tests just treat the app as a black box. They are essentially a set of automated eyes and fingers that can only do what a user would do.

Melomel/cucumber looks interesting.

I'm currently doing my UI/end-to-end tests via http://github.com/Stray/RobotEyes

It's primitive but works fine, and allows me to include my end-to-end tests in my normal ASUnit test suite. I'm adding functionality as I need it.

Stray said...

@Wolever - I'm still not buying 'exhaustive and upfront' on the TDD style. I write 1 test, then the code that passes that test... yes, eventually it's exhaustive, and each individual test is written just prior to the code that implements it, but 'exhaustive and upfront' sounds like I sit down and write 7000 tests, and then start the code!

If your test code is tough to maintain then it's because you've thrown all your normal coding best practices re keeping DRY etc out the window (amazing how easy it is to do that).

So - I have a 'support' source directory that contains objects like 'BadUserVO' and 'GoodUserVO' and so on, which are used in my tests. If the signature of the UserVO has to change, I only have to change my actual objects in one place. I use helper functions a lot. I have classes which simply support my tests and reduce code repetition and complexity.
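For example, something as small as this (a hypothetical sketch - the UserVO constructor arguments are invented):

// support object: if the UserVO signature changes, the test code
// only has to be updated in this one place
public class GoodUserVO extends UserVO {
    public function GoodUserVO() {
        super("goodUser", "good@example.com", true); // hypothetical 'valid' values
    }
}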

The test suite on my current app is about 7500 tests. Believe me, at the point where it was about 1000 I felt the 'change' pain and got my crap together to implement the same practices in my test code as my app code!

Norvig vs Jeffries is interesting, but I don't buy it as proof of anything other than that some solutions, and some programmers, are better than others at any given moment in time!

Anyway - a 200 line app is so far from what most people are building that it's not really representative. It's a nice story, but I'm a scientist. One anecdote is nothing other than an invitation for a robust investigation.

Debugging is an interesting one. I have only 'debugged' three or four times this year. In all of those instances, the source of the bug has turned out to be the flash framework itself, and not something I can do anything about - just something I need to work around. (Crappy swf loading etc).

I trace stuff out obviously, and I have logging, but I do very little debugging, and certainly don't have to return to debugging once a test has passed. And when I am looking at traces, I'm only compiling that one class, for that one test, so it's a lot less to wade through.

The bit about specs-changing - I think TDD helps me embrace change. I can make changes, whether they're to the requirements or just refactoring, confident that if I accidentally break something that was previously working then my tests will tell me. That feels very different to my experience without TDD!

Still - I think it's about the confluence of the project and the coder and the coder's circumstances. I am constantly being interrupted, and my project is huge, so I need testing to make me efficient - it keeps my focus small, and lowers the cognitive entry point of picking up after an interruption.

There are only two reasons why TDD would be slower: either you're doing it wrong, or you're an awesomely competent coder with the concentration of a laser and plenty of opportunities to work long periods without that being broken!

If you never write crappy code, never forget to actually implement a function, never forget to set the variable value, always understand the full implications of your changes before you make them... (or even if those things are true 95% of the time) ... then TDD won't add anything except an unbelievably robust and complete set of documentation about how to use every function of every class in your application.

I happen to think that this is pretty useful too - for me (returning after time away) or for another coder to pick up my project.

Squee-D said...

I think what most people miss is that your tests are supposed to describe intent. It's like the amazing clarity you get trying to explain a problem to a friend, and before the friend can help, you've arrived at the answer because you've had to organise your thoughts.

Except, with TDD, you organise your thoughts into tests, which can tell you whether you are "Achieving your intent". The test is your "definition of intent"

The iterative part of the process (the thing that makes it 'not all up front') is that when you feel you should be "achieving your intent" but your tests are failing, you have immediate feedback, and it becomes very clear whether your code is wrong, or your "definition of intent" is wrong.

Now I achieve this kind of feedback without TDD, I use descriptive domain specific language a lot in my actual code, making my intent clearer, however I still don't have an indicator for when my intent is broken by later changes. I'd love to have more tests around my code, and my excuses for not doing so are a bit weak.

I must say though, until I figured out that a class should represent a very small unit of work (Single Responsibility Principle), and had a toolbelt for making them small, I had far better excuses for not using TDD. TDD does not work with monster classes - it's too hard to maintain. Way too hard. Mostly because code logic ends up in the tests out of necessity.

So if you're aspiring to TDD, it will go a lot more smoothly if you embrace SRP as well.

Stray said...

You're right Squee-D - TDD (with some BDD) is the process by which I figure out how I intend to fulfil a user story.

The test process is not totally dissimilar to the talking to friend / dog process. It does feel like a dialog in many ways.

I also use as much descriptive code as I can - both in the tests and the code to pass the tests. It does really help to have function names that describe what you're trying to achieve rather than just how you think it works :)

I'm very into SRP. I have a lot of small classes. Not everybody's taste but it works for me!

nek said...

Stray: I've actually been watching your RobotEyes repo for some time already. :)
I was considering using it, but chose cucumber with its gherkin DSL as something our business analyst could understand easily. But I think I could use RobotEyes for the projects where I write all the user stories myself.

And thanks a lot for http://www.growing-object-oriented-software.com/ it's simply amazing!

Stray said...

Hi Nek,

yes, RobotEyes is more suitable for the developer - it needs a test framework like ASUnit and you have to write your tests in AS3.

So - ideal if you want to integrate your end-to-end tests into your test suite, but not ideal if you want non-coders to get involved.

The cucumber/gherkin DSL is awesomely understandable. I imagine it would be possible to build a little AIR app that allowed the end user to create RobotEyes user story tests, but I'm not sure how soon I'd get a chance to build it. Nice idea though!

Yennick Trevels said...

Hi Stray,
What libraries other than FlexUnit and Mockolate do you use while TDD'ing?
Do you have any experience with UInput (http://bitbucket.org/loomis/uinput/wiki/Home) or MoreFluent(http://bitbucket.org/loomis/morefluent/wiki/Home)? If so, what's your experience with them?

Stray said...

Hi Slevin,

I actually use ASUnit rather than FlexUnit - I like to have hackable source and I find ASUnit really great to use.

I have my own library for end-to-end testing - RobotEyes - which is pretty primitive, but does the job. It allows you to search for an on-stage instance of 'something', by property if you have more than one of that 'thing', and then it gives you back a driver which allows you to click, enter text, check properties, listen for events etc.

From what I can see, RobotEyes is basically the same as UInput in its approach, except that RobotEyes is AS3-based, where UInput is Flex-driven.

MoreFluent looks interesting - I'll dig in to that this week - thanks for that, hadn't crossed my path before :)

The other essential in my test kit is sprouts - I use sprouts + some custom rake tasks to let me run tests using a filter.

eg: rake testpackage['.tools.'] will only run the tests for everything in the tools package.

rake testpackage['View'] will test anything with View in the class name.

rake testpackage['SomeClassTest'] runs exactly the class test.

That's super useful when you're on a big project and your test suite is in the thousands.

It also helps keep me focussed!

I'll have a play with MoreFluent and blog my experiments here.

Cheers,

Stray