Tuesday 28 September 2010

Harnessing the Drive in Test Driven Development

I had an interesting exchange with someone on Twitter last night - he'd been working on some math that kind of looked right to him now, but he had no way of knowing whether it was 'right' or not.

I teased him that he should have done TDD, but he felt that TDD meant you had to know your API ahead of coding, and his situation was evolving, so that ruled out TDD.

I was arguing that actually TDD is ideal for when you're not quite sure where you're headed - a viewpoint that didn't fly with his experience - so this is an attempt to further explain that sentiment.


Your brain and body will try to resist TDD

A common barrier to adopting TDD (this is what my colleagues and peers come back with over and over) is "I don't know enough about what I'm doing to write the test yet."

My response is that if you don't know enough about what you're doing to write the test yet, you sure as hell don't know enough to write the code yet!

Test Driven Development shifts a ton of that 'wtf am I trying to do?' pain to the front of the process. And that's hard to get used to. It exposes what you don't have clarity about - when you really just want to get on and pretend you do know where you're headed.

So - how can TDD possibly help when you don't have a clear idea of your direction?


TDD means more than one kind of test

I do three kinds of tests: end-to-end tests, integration tests and unit tests. Combining all three test varieties is the key to driving evolving development. (Which, IMHO, is the only kind there really is.)

I write the end-to-end tests first. An end-to-end test describes a user story. If you don't know what your user stories are then you need to get those together before you take another step. User stories will be numerous, even in the simplest app, so don't overwhelm yourself - just start with the shortest, simplest user story in your requirements.


User story / end-to-end tests

In a complex app, the user stories are rarely actually end-to-end (startup to shutdown) but they capture a unit of meaningful interaction with the application.

There is only one user-story end-to-end test in each test case. Some examples from my current app include:

LibrarySimpleSearchReturnsInvalid
LibrarySimpleSearchProducesNoResults
LibrarySimpleSearchProducesResults
LibraryShowAllShowsAllItems
LibraryAdvancedSearchProducesResultsByType
LibraryAdvancedSearchProducesResultsByExclusion

... you get the idea.

In each case, the test recreates the user experience from the moment of opening the library (which is a searchable, browsable set of resources of different types - jpg, document, video etc) until they receive their search results.

This means that it's a UI-driven test. I have the test code enter text, push buttons and so on, and then I delve into the display list to verify that the correct items / text have appeared on screen at the end - usually asynchronously, to allow time for transitions.
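
To make that concrete, here's a rough sketch of what one of those tests looks like, assuming FlexUnit 4's Async helper. The libraryView reference, its child names, the search term and the one-second delay are all illustrative stand-ins, not the real app code:

// Assumed imports: flash.events.MouseEvent, flash.events.TimerEvent, flash.utils.Timer,
// org.flexunit.async.Async, org.flexunit.asserts.assertNotNull
[Test(async)]
public function librarySimpleSearchProducesResults():void {
    // Drive the UI the way a user would: type a term and click search.
    libraryView.searchInput.text = "frogs";
    libraryView.searchButton.dispatchEvent(new MouseEvent(MouseEvent.CLICK));

    // Give transitions time to finish before inspecting the display list.
    var timer:Timer = new Timer(1000, 1);
    timer.addEventListener(TimerEvent.TIMER_COMPLETE,
                           Async.asyncHandler(this, verifyResultsOnScreen, 3000));
    timer.start();
}

private function verifyResultsOnScreen(event:TimerEvent, passThroughData:Object):void {
    // Delve into the display list to check the results actually appeared.
    assertNotNull("results list should be on screen",
                  libraryView.getChildByName("resultsList"));
}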


Integration / functional area tests

These test a component of functionality. For example the search window, or the results viewer.

Unlike unit tests they make use of real concrete instances of the classes needed to fully instantiate and work with the components being tested. If the functional area depends on the framework to wire it together, the framework is instantiated (robotlegs in my case) in order to wire up the component.

In my current app I have an integration test for the main menu:

NestedMenuTest

This menu has multiple layers of nested items and has to collapse all / expand all / auto-collapse and so on in response to checkbox clicks. My integration tests check that the scrolling behaves itself when the items are being expanded/collapsed.

test_max_scroll_then_collapseAll_resolves
test_mid_scroll_then_expandAll_keeps_top_visible_item_in_place_and_scales_scroller

and so on...


Usually, integration tests are event-driven - I kick everything off by manually firing an event. Often, but not always, they require you to use the display list to verify the results.
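
As a rough sketch - NestedMenu, MenuEvent, the scroll API and the data-building helper are hypothetical stand-ins for the real classes - one of those checks might look like this:

// Assumed import: org.flexunit.asserts.assertTrue
[Test]
public function test_max_scroll_then_collapseAll_resolves():void {
    // Integration test: real concrete instances, no mocks.
    var menu:NestedMenu = new NestedMenu(buildDeeplyNestedMenuData());

    // Scroll to the bottom, then kick things off by firing the event the checkbox would fire.
    menu.scrollTo(menu.maxScroll);
    menu.dispatchEvent(new MenuEvent(MenuEvent.COLLAPSE_ALL));

    // Collapsing shrinks the scroll range, so the old position must resolve into the new one.
    assertTrue("scroll position resolves within the collapsed range",
               menu.scrollPosition <= menu.maxScroll);
}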


Unit / API tests

These test that a specific class does what it is supposed to. They test all the public (API) functions of a class, sometimes multiple times if there are errors to be thrown or alternative paths through the class itself.

There is a school of wisdom that says to test all of the API except property accessors. I tend to test my property accessors as well, because there is no limit to what I can screw up, and it's faster to get them right at this point than when the error emerges later.
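
Something along these lines - assuming, purely for the sake of the example, that the DTO you'll see further down exposes its path through a swfPath getter:

// Assumed import: org.flexunit.asserts.assertEquals
[Test]
public function swfPath_round_trips_through_the_constructor():void {
    // swfPath is a hypothetical getter here - the point is simply that accessors get tested too.
    var dto:LessonLoadRequestDTO = new LessonLoadRequestDTO("test", "testSwfPath", true, '', false);
    assertEquals("testSwfPath", dto.swfPath);
}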

Instead of using concrete instances of their complex dependencies (eg services), my unit tests make use of mocks (using Drew Bourne's Mockolate) to verify that they've acted upon those classes correctly.

If I was testing that a command pulled values from the event that triggered it, did some jiggery pokery with these values and then used the results in calling a method on the appropriate service, I would mock the services to verify that call, rather than try to test against the results of the call.

Here, I've mocked the two services, lessonLoaderService / joinedLessonLoaderService:
public function testLoadsLessonIfRequestEventNotJoinedLesson():void {
    // A request whose final constructor argument (is this a joined lesson?) is false...
    var lessonLoadRequestData:ILessonLoadRequestData = new LessonLoadRequestDTO("test", "testSwfPath", true, '', false);
    var testEvent:LessonDownloadEvent = new LessonDownloadEvent(LessonDownloadEvent.LESSON_DOWNLOAD_READY, lessonLoadRequestData);
    instance.event = testEvent;
    instance.execute();

    // ...should produce exactly one call to the plain lesson loader, with the swf path pulled from the event.
    verify(instance.lessonLoaderService).method("loadLesson").args(equalTo('testSwfPath'));
    verify(instance.lessonLoaderService).method("loadLesson").once();
}

public function testLoadsJoinedIfRequestEventIsJoinedLesson():void {
    // The same request with the joined-lesson flag set to true...
    var lessonLoadRequestData:ILessonLoadRequestData = new LessonLoadRequestDTO("test", "testSwfPath", true, '', true);
    var testEvent:LessonDownloadEvent = new LessonDownloadEvent(LessonDownloadEvent.LESSON_DOWNLOAD_READY, lessonLoadRequestData);
    instance.event = testEvent;
    instance.execute();

    // ...should go to the joined-lesson loader instead.
    verify(instance.joinedLessonLoaderService).method("loadLesson").args(equalTo('testSwfPath'));
    verify(instance.joinedLessonLoaderService).method("loadLesson").once();
}
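
For completeness, this is roughly how those mocks get set up, assuming Mockolate's MockolateRule (which relies on FlexUnit 4.1's [Rule] support). The command class name, the interface names and the package are stand-ins for the real ones:

package tests {
    import mockolate.runner.MockolateRule;

    public class LoadLessonCommandTest {

        [Rule]
        public var mocks:MockolateRule = new MockolateRule();

        // Mockolate prepares and injects nice mocks into these [Mock] properties.
        [Mock]
        public var lessonLoaderService:ILessonLoaderService;

        [Mock]
        public var joinedLessonLoaderService:ILessonLoaderService;

        private var instance:LoadLessonCommand;

        [Before]
        public function setUp():void {
            instance = new LoadLessonCommand();
            // The command exposes its dependencies as public injection points (robotlegs-style),
            // so the test can hand it the mocks directly.
            instance.lessonLoaderService = lessonLoaderService;
            instance.joinedLessonLoaderService = joinedLessonLoaderService;
        }
    }
}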


Putting it all together

Often, we put applications together from the bottom up. With half an eye on the requirements, we start thinking about interfaces and event types and functional areas. This works, but it can also result in some YAGNI code, as well as in throwing out code that seemed relevant right up until you realised the requirements weren't complete.

I think there's more sanity in a work flow that runs this way:

1) User story
2) End-to-end test that verifies this user story (or part of it - this can evolve)
3) Integration tests for the functional areas required to fulfil the end-to-end-test
4) Unit tests for the classes required to provide the functional areas
5) Code to pass 4, to pass 3, to pass 2, to verify against 1

... and when you have no fails, then add to 2, add to 3, add to 4, do 5, rinse and repeat etc.

Doing it this way, the consequences are:
  1. A lot of head scratching early in each cycle (creating end-to-end tests is hard).
  2. Usually having failing tests in your test suite, until you're done adding a feature/user story.
  3. Always being able to tell what the hell you were doing when you last stopped ... because your failing tests make that obvious.
  4. Never writing code that doesn't actually add value to the project by contributing to the implementation of a user story.
  5. Always working towards a 'shippable' goal, which is good for the client (and your cash flow if you bill against features) and also allows real user feedback to improve the work still to be done.
  6. Reduced cognitive load for you at a micro level - you fix problems in the code while that part of the code is what you're focussed on.
  7. Reduced cognitive load for you at a macro level - you don't have to hold the 'where am I going' part in your head, or remember to test manually, because your user story tests have that covered.


I would argue that as a consequence of those last two there's a bigger reward: being able to show up more fully for the rest of your life. A bug, or a concern about whether I've really implemented feature X correctly, impacts on my ability to be present for my family. I'm kind of not-really-there at dinner, because 90% of my brain is background processing code stuff. This still happens with TDD, but it happens a lot less.

So, if you don't find that TDD is improving your code and your process - not just your output but also your enjoyment - then my (cheeky) suggestion is that you've not discovered what it's really about yet.

In my experience, TDD is fun. Chemicals in my brain fun.


PS. If you haven't already got it, this book is essential reading on harnessing the Drive in TDD: http://www.growing-object-oriented-software.com/ Props to Joel Hooks for recommending it.

Friday 3 September 2010

AIR swf loading anomalies

The application I do most of my work on (an AIR 1.5 app) loads swfs in two ways.

There are modules of functionality that are loaded into the application sandbox using the secure loading method that I've blogged about here before.

There are also 'lessons' which consist of multiple swfs that are loaded into the non-application sandbox. If a lesson consists of 4 swfs then they are all loaded, and then they are added to / removed from the stage as the user browses through the lesson. When another lesson is loaded the previous lesson swfs are first unloaded.

These swfs are having code injected into their timelines, but they also contain code at timeline depth and in nested MovieClips. This is generally just simple 'show this label on MouseOver' functionality. Nothing fancy.

I found a few months ago that loading the swfs serially - as you would expect to do [by waiting for one to INIT before loading the next] - caused the code in the nested MovieClips to *often* fail, on both Mac and PC. I tried to track down the cause, and through my experiments I discovered that loading the swfs in parallel seemed to solve the problem. So, unable to find a reason, I went with the approach that worked.
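
For context, the parallel approach is nothing exotic - roughly this, with the non-application-sandbox LoaderContext details left out:

// Minimal sketch of parallel loading: start every load at once and count INIT events,
// rather than waiting for each swf to INIT before starting the next.
// Assumed imports: flash.display.Loader, flash.events.Event, flash.net.URLRequest
private var loaders:Array = [];
private var initCount:uint = 0;

private function loadLessonSwfs(paths:Array):void {
    loaders = [];
    initCount = 0;
    for each (var path:String in paths) {
        var loader:Loader = new Loader();
        loader.contentLoaderInfo.addEventListener(Event.INIT, onSwfInit);
        loader.load(new URLRequest(path));
        loaders.push(loader);
    }
}

private function onSwfInit(event:Event):void {
    initCount++;
    if (initCount == loaders.length) {
        // Every swf has INITed - the lesson can now be assembled.
    }
}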

Now the problem has reared its ugly head again - but only on PC. On Mac, all code works fine all the time.

On PC (Windows XP), the timeline code and the injected code are always fine. But MovieClip code - whether written in a separate class file or just put on the MC timeline - is borked in SOME swfs, depending on what order the swfs INIT in.

So far I've discovered the following:

1) Loading the swfs serially, MC code is frequently borked on both Mac and PC.

2) Loading the swfs in parallel, whether code is borked depends on the INIT order. [Might this indicate that the swf isn't fully loaded when it INITs first?]

3) Whether the code lives in a class file attached to the MC or sits on the MC timeline, it's just not there - even a stop() doesn't work.

4) I've replicated the problem with simple assets (a 2-frame MC with differently coloured squares and a stop() on frame 1), guaranteed not to clash with any other assets in any other swf being loaded.

5) The swfs most likely to be borked are those with the largest file size. BUT removing some assets to reduce the file size doesn't appear to help.

6) The swfs are currently being exported from Flash Pro CS4. I'm downloading the CS5 trial just in case it helps. They're published for Flash Player 10.

7) Some swfs are only borked if they INIT first. Some are able to tolerate being the first to INIT without problems. Some are borked if they're first OR second (of six in that case) to load.

8) Adding a pause after all the swfs have loaded, before I do anything else, makes no difference.

9) The AIR runtime is 2.0.3.


--- EDIT - conclusion

I think I've found the source of the problem.

The load order and the broken movieclip code are both symptoms of the same problem: the swf has not loaded into memory correctly.

The broken swfs are usually the largest, and when they INIT ahead of much smaller swfs it's probably a sign of incomplete loading, hence the correlation with level-1 code not running. (Level 0 is the timeline, level 1 is code inside MCs on that timeline, and so on.)

Having run the tests hundreds of times, I've found that this problem occurs approximately 50% of the time in a Parallels XP virtual PC, and occasionally on a Windows 7 machine that has been down-versioned to Windows XP (and hiccups in many other areas). So far it does not happen on a MacBook Pro Boot Camp XP install, or a stable Vista install.

I guess it's a genuine thread crash in the AIR app, most likely some sort of memory issue?

I have not been able to find a way to prevent the problem from happening. Instead I have a work around:

Each swf contains an MC on the timeline, on frame 1, with a single line of code whose sole purpose is to notify the application (via the sandbox bridge) that level-1 code has loaded correctly. If this notification has not been received for a particular swf by the time the 'COMPLETE' event fires, the swf is deemed to have loaded incorrectly, and is unloaded from memory and loaded again.

This process is repeated until the load is correct. The files are local so the delay to the user is minimal.
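
In rough code terms - the bridge function name, the url book-keeping and the surrounding class are illustrative, not the app's real API - the sentinel on the swf side is a one-liner:

// Frame 1 of the sentinel MC inside each lesson swf: if this runs, level-1 code made it into memory.
var bridge:Object = root.loaderInfo.parentSandboxBridge;
if (bridge) bridge.level1CodeReady(root.loaderInfo.url);

And the application side exposes that bridge function before loading, then checks for the notification when COMPLETE fires:

// Assumed imports: flash.display.Loader, flash.display.LoaderInfo, flash.events.Event, flash.net.URLRequest
// (The LoaderContext / sandbox details are simplified away.)
private var verifiedUrls:Object = {};

private function startLessonSwfLoad(path:String):void {
    var loader:Loader = new Loader();
    loader.contentLoaderInfo.parentSandboxBridge = { level1CodeReady: onLevel1CodeReady };
    loader.contentLoaderInfo.addEventListener(Event.COMPLETE, onLessonSwfComplete);
    loader.load(new URLRequest(path));
}

private function onLevel1CodeReady(url:String):void {
    verifiedUrls[url] = true;
}

private function onLessonSwfComplete(event:Event):void {
    var info:LoaderInfo = LoaderInfo(event.target);
    if (!verifiedUrls[info.url]) {
        // No word from the sentinel: treat the load as broken, unload and go again.
        info.loader.unload();
        info.loader.load(new URLRequest(info.url));
    }
}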