Thursday, 28 October 2010

TDD: The Star Wars Answer vs The MacGyver Principle

"But what if you write a bug - like a typo or something - into the test? I mean, your tests aren't going to be perfect, are they?"

This is a genuine question someone - understandably - asked me earlier this week when I was introducing them to the idea of TDD.

I have two answers to this question:

1) The math answer:

If P(Error) is the probability of any line of code you write containing an error, and you write t lines of test code and tn lines of production code, where n is greater than 1, then a bug can only slip through when an error in the production code coincides with an error in the test that covers it. You can also expect to fix about n bugs in production code for every bug in the tests.

And as P(A and B) = P(A) * P(B) for independent events, you vastly reduce the number of undetected bugs caused by genuine code errors - eg accidental assignment, wrong operations, wrong variable names etc.

If you don't speak Math, the outcome is many, many fewer small bugs.
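
As a rough worked example of that arithmetic (the numbers are illustrative assumptions, not measurements): suppose one line in a hundred contains an error, and that errors in a test and in the code it exercises are independent. For a bug to go unnoticed, both have to be wrong, and wrong in a way that cancels out, so the product below is an upper bound:

```latex
\[
P(\text{Error}) = 0.01
\quad\Longrightarrow\quad
P(\text{undetected}) \le P(\text{Error}) \times P(\text{Error}) = 0.01 \times 0.01 = 0.0001
\]
```

That's one chance in ten thousand, against one in a hundred for the same line written without a test.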

2) The Star Wars Answer:

"These are not the bugs you are looking for."

No - really, typos and wrong negatives are cool to find, and do cause many wasted hours debugging, but this isn't why I do TDD.

TDD is most powerful when it really hurts. The pain comes because you realise that you have written, or planned to write, some code that is hard to test. TDD pokes you right in the coupling. It stabs you with your own statics and shoves your crappy assumptions right...

... you get the idea?

When you write a test for a class, or a unit of functionality, you are trying to do that outside of the application. So all those hooks and shortcuts and bits of spaghetti that seem so handy inside your app swiftly start to look like what they are: problems.

But there's something TDD brings to your attention that's even harder to swallow...

TDD goes against the MacGyver principle

TDD exposes to you the numerous pathways through your code. If you're testing something as simple as a form with a set of radio buttons, you have to test every possible option for that radio button selection. You imagined you were going to write half a dozen tests for this class and suddenly you've got 15. And you can still think of more special cases that aren't covered.

This is a head-fuck because we like to pretend, to ourselves, that what we're trying to do is much, much easier and less complex than it actually is. I suspect there are very few genuine pessimists* in programming. We are self-selected MacGyver types with a tendency to see the solutions in any situation.

This process relies on the confidence trick of pretending to yourself that what you're undertaking is pretty straightforward, really. We have to turn a blind eye to the combinatorial explosion of pathways through our code presented by each new 'tiny' feature, because the alternative is lying awake at night wondering whether we've really covered every important combination of user actions, and knowing that we're never going to hit the deadline.**
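
To put rough numbers on that expansion, here is a minimal sketch in Python (not the ActionScript used elsewhere on this blog). The form controls and their options are invented for illustration; the only point is that the number of distinct paths is the product of the option counts, so each 'tiny' addition multiplies the total.

```python
from itertools import product

# Hypothetical form: each key is a control, each list its possible states.
form_options = {
    "delivery": ["standard", "express", "collect"],
    "gift_wrap": ["yes", "no"],
    "payment": ["card", "invoice", "voucher"],
}


def all_paths(options):
    """Return every combination of one choice per control."""
    names = list(options)
    return [dict(zip(names, combo)) for combo in product(*options.values())]


paths = all_paths(form_options)
print(len(paths))  # 3 * 2 * 3 combinations

# One more 'tiny' two-state checkbox doubles the number of paths.
form_options["newsletter"] = ["yes", "no"]
print(len(all_paths(form_options)))
```

Not every combination deserves its own test, but it helps to see the size of the space you're sampling from.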

Many Flash programmers are just as badass as MacGyver. They bravely code ahead, pushing to the side the concept of failure - they know they can cross between these two structures using only a coathanger and some magic strings (static) because there is no room for any other possibility.

And hell, it works out a lot of the time. We all built some awesome stuff in AS1 when we didn't even have compile-time-checking or type safety. But if you're still MacGyver coding today, I have one piece of advice: Don't look down.

* There are plenty of us who are cynical, but I think that's distinct from pessimism. We probably believe there are solutions to most problems, but that people are too damn stupid / greedy to allow them to be implemented.

** In reality, nobody ever hits the deadline with MacGyver coding, but they at least agree to the deadline, which most of the time is all the boss is looking for. With TDD, nobody agrees to the deadline the boss was hoping for. As always, TDD just shifts that pain forward in the timeline.

Friday, 15 October 2010

Don't dehydrate your code

DRY (don't repeat yourself) is one of the first principles you learn as a fledgling coder.

When you build your first loop, you're embracing DRY, and all the wonderful code-shrinking that comes with it. And it feels good, and we're taught to seek out repetition as a terrible code smell, and do away with it as soon as possible.

But it's possible to be too DRY - no, really, it is.

When considering repetition for refactoring, there are two different questions we can ask:
  1. 'Are these two blocks of code the same?'
  2. 'Do these two blocks of code serve the same purpose?'

The different questions can lead to different answers for the same blocks of code. If the answer to question 2 is 'no' then you potentially create another, harder, refactor down the line, when you realise that the operation needs to change for one case, and not for another.

These too-dry refactorings leave their own particular code smell - optional parameters. They're not always a sign of overly-dry code - the robotlegs context takes a useful optional parameter for 'autostartup' - but when the optional parameter is littered through the code inside the function, it can be a sign that your code has become dehydrated and you'd be better off splitting it back out again - and either living with a little repetition or slicing the functionality up differently.
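
A minimal sketch of that smell, in Python rather than ActionScript, with function names made up for the purpose. The first version answers 'are these two blocks the same?' and merges them; the second answers 'do they serve the same purpose?' and keeps them apart.

```python
# Over-dry: one function serves two purposes, steered by an optional flag
# that leaks into every step of the body.
def format_name(first, last, for_invoice=False):
    name = f"{last}, {first}" if for_invoice else f"{first} {last}"
    if for_invoice:
        name = name.upper()
    return name


# Rehydrated: two small functions, one per purpose. The shared shape is
# repeated, but each can now change without touching the other.
def display_name(first, last):
    return f"{first} {last}"


def invoice_name(first, last):
    return f"{last}, {first}".upper()


print(display_name("Ada", "Lovelace"))
print(invoice_name("Ada", "Lovelace"))
```

If the invoice format changes next month, only invoice_name() moves, and nobody has to re-read the flag logic to work out what the display case will do.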

We make a similar call when we make decisions about inheritance and interfaces.

ClassB might have all the same functions as ClassA, and a few extras of its own, but unless ClassB truly "is a" ClassA, then there's no reason why the two should evolve together in the future. Better to tolerate the repetition because it correctly represents the cognitive model behind the application.

Similarly, unless ClassA and ClassB share obligations, and could potentially stand in for each other without your application logic losing sense, they shouldn't implement the same interface. Even if they have the same functions. Yes, even if they have the same functions.

Shut up with the "it's less code" thing

Of course all of this requires us to recognise that "It's less code" is never a refactoring justification in itself. Often, great refactorings create more code, not less. The objective is always to make your application easier to maintain, change and grow.

So - unless you're writing your code for a programmable calculator from the 1980s, any time you hear yourself think or say (or write, I see this a ton on the robotlegs support forum) "but it's less code this way..." just give yourself a little slap. There are often good reasons to take the fewer-classes approach, but they need to be more fleshed out than 'less code'.

The scarce resource is your brain. Your attention, your cognition, your working memory, your mental-models. And of course your time. An over-dry refactoring (particularly for the sake of 'less code') that requires a reversal later is expensive in terms of all these scarce resources.

Embrace (selective) repetition

A criticism of robotlegs is that you can end up with code repetition of very simple blocks of code in your Commands and Mediators, as well as a lot of custom Events. It *feels* weird to type the same code twice or three times to translate 3 different Events into the same corresponding action. But, in my mind, this is part of the power of the Command pattern.

Each Command encapsulates the logic steps required to respond to that situation - and they can freely change in future without impacting upon each other. The code repeated in your Commands is usually cheap and simple - if it's not then think about farming some of the logic out to a helper of some kind.

So don't sweat the 'glue' code having similarity between Commands and Mediators sometimes. Code that is dehydrated is just as tricky (sometimes trickier) to work with as code that needs a little DRYing off.

Thursday, 7 October 2010

TDD: Make your intentions clear(er)

A habit I've got into recently is creating more and more classes specifically for the purpose of testing.

If the class extends a class which is meaningful to my application it's called ThingSupport (where Thing is the name of the Class it extends).

If the class is extending a more generic class - eg Sprite, or Command with an empty execute - then it's called SampleThing. (Again, Thing is replaced with the name of the Class it extends).

So - my source library now contains a growing number of classes like:



The Support classes tend to populate the constructor parameters.

So if the constructor of UserVO is

public function UserVO(key:uint, username:String, password:String, jobTitle:String /* etc */)

then the constructor of UserVOSupport is

public function UserVOSupport(key:uint = 1)
{
    super(key, 'username_' + String(key), 'password_' + String(key) /* etc */);
}

Now I can create a dozen UserVOs with distinct properties by passing only an incrementing uint to the constructors. Lovely ... but...

It's not about typing less

The ThingSupport classes do two things: they make it easier for me to instantiate the class, but also they isolate my tests from changes to the constructor of the Thing classes. If I update the Thing constructor to include another param, or change the order, I don't have to touch my tests - I just make a matching change in one place - ThingSupport.

When my coder-friends protest that they find TDD tiresome because if you make changes you have to do loads of maintenance on your tests, this makes me think that they're not isolating their tests well enough.

So what about the Sample classes?

What's the point in creating a SampleEvent? If the class/test requires a particular class why bother to create a sample class as well? Why not just use Event?

This is the newest thing I've added to my workflow - and hey, by the end of next week I might have changed my mind - but I'm experimenting with using it to make my intentions clearer in my tests.

As in - what's *real* and what's just test scaffolding. Particularly in integration tests where multiple classes are working together. The use of Sample as a prefix is helping me keep my head straight on which elements are being tested and which are simply provided to wire things up.

That way, when I look through my code and it uses SampleCommandA I know that the SampleCommandA is not part of what's being tested - it's just fodder for the test.

Couldn't you just use mocking?

I do use mocking (with Mockolate) a lot. Actually I tend to use stubbing / verification - via Mockolate. I use it to mock interfaces for services so that I can verify that a method on that service has been run by a command with the correct treatment of the event payload, even before I've implemented the concrete service to the interface.

I also mock (for verification) framework elements - injector, CommandMap - to verify that runtime mappings have been made correctly - again, usually in Commands. It's easy to think of Commands as being so banal that they're barely worth testing, but a Command that doesn't actually run the method on the service, or doesn't manipulate the event payload correctly, or fails to make a mapping could potentially lead to hours of debugging once compiled into the wider app.

Focus on minimising your WTFs per minute

As @mark_star tweeted this week: "The only valid measurement of code quality is WTFs/minute" (clean code).

I believe the use of Support and Sample classes is significantly lowering my WTF rate. It feels good.

I'm starting to think that a focus on "less typing = better" is really dangerous in a coder. I can type at 80 wpm. Can I code at 80 wpm? Can I f***! This 'more code = slower' belief is part of what keeps people from embracing TDD.

But it's a dangerous obsession for more than one reason. The search for the perfect IDE may actually lead to an increase in the kinds of WTFs that go on for hours and even days.

@_ondina posted some wisdom on the robotlegs board today about the value of being intimate enough with your code that you can write it by hand. We were discussing code generation tools for robotlegs at the time - and she made the valid point that if you rely too much on code generation tools then there's a chance that small errors - [inject] vs [Inject] - creep into your code and you have no way of diagnosing them, because you haven't built up enough experience of what looks right for your brain's pattern recognition system to let you know what is wrong.

Food for thought.

Tuesday, 28 September 2010

Harnessing the Drive in Test Driven Development

I had an interesting exchange with someone on twitter last night - he'd been working on some math that kind of looked right to him now, but he had no way of knowing whether it was 'right' or not.

I teased him that he should have done TDD, but he felt that TDD meant you had to know your API ahead of coding, and his situation was evolving, so that ruled out TDD.

I was arguing that actually TDD is ideal for when you're not quite sure where you're headed - a viewpoint that didn't fit with his experience - so this is an attempt to further explain that sentiment.

Your brain and body will try to resist TDD

A common barrier to adopting TDD (this is what my colleagues and peers come back with over and over) is "I don't know enough about what I'm doing to write the test yet."

My response is that if you don't know enough about what you're doing to write the test yet, you sure as hell don't know enough to write the code yet!

Test Driven Development shifts a ton of that 'wtf am I trying to do?' pain to the front of the process. And that's hard to get used to. It exposes what you don't have clarity about - when you really just want to get on and pretend you do know where you're headed.

So - how can TDD possibly help when you don't have a clear idea of your direction?

TDD means more than one kind of test

I do 3 kinds of tests. End-to-end tests, integration tests and unit tests. Combining all 3 test varieties is the key to driving evolving development. (Which, IMHO, is the only kind there really is.)

I write the end-to-end tests first. An end-to-end test describes a user story. If you don't know what your user stories are then you need to get those together before you take another step. User stories will be numerous, even in the simplest app, so don't overwhelm yourself - just start with the shortest, simplest user story in your requirements.

User story / end-to-end tests

In a complex app, the user stories are rarely actually end-to-end (startup to shutdown) but they capture a unit of meaningful interaction with the application.

There is only one user story end-to-end test in each test case. Some examples from my current app include:


... you get the idea.

In each case, the test recreates the user experience from the moment of opening the library (which is a searchable, browsable set of resources of different types - jpg, document, video etc) until they receive their search results.

This means that it's a UI driven test. I have the test code enter text, push buttons etc, and I delve into the display list to verify that the correct items / text etc have appeared on screen at the end, and usually this is asynchronous to allow time for transitions.

Integration / functional area tests

These test a component of functionality. For example the search window, or the results viewer.

Unlike unit tests they make use of real concrete instances of the classes needed to fully instantiate and work with the components being tested. If the functional area depends on the framework to wire it together, the framework is instantiated (robotlegs in my case) in order to wire up the component.

In my current app I have integration tests for the main menu:


This menu has multiple layers of nested items and has to collapse all / expand all / auto-collapse and so on in response to checkbox clicks. My integration tests check that the scrolling behaves itself when the items are being expanded/collapsed.


and so on...

Usually, integration tests are event driven - I kick it all off by manually firing an event. Often, but not always, they require you to use the display list to verify the results.

Unit / API tests

These test that a specific class does what it is supposed to. They test all the public (API) functions of a class, sometimes multiple times if there are errors to be thrown or alternative paths through the class itself.

There is a wisdom that says test all API except for property accessors. I tend to test my property accessors as well, because there is no limit to what I can screw up and it's faster to get them right at this point than when the error emerges later.

Instead of using concrete instances of their complex dependencies (eg services), my unit tests make use of mocks (using Drew Bourne's Mockolate) to verify that they've acted upon those classes correctly.

If I was testing that a command pulled values from the event that triggered it, did some jiggery pokery with these values and then used the results in calling a method on the appropriate service, I would mock the services to verify that call, rather than try to test against the results of the call.

Here, I've mocked the two services, lessonLoaderService / joinedLessonLoaderService:
public function testLoadsLessonIfRequestEventNotJoinedLesson():void {
    var lessonLoadRequestData:ILessonLoadRequestData = new LessonLoadRequestDTO("test", "testSwfPath", true, '', false);
    var testEvent:LessonDownloadEvent = new LessonDownloadEvent(LessonDownloadEvent.LESSON_DOWNLOAD_READY, lessonLoadRequestData);
    instance.event = testEvent;
    // ... run the command and verify the call on the mocked lessonLoaderService (not shown)
}

public function testLoadsJoinedIfRequestEventIsJoinedLesson():void {
    var lessonLoadRequestData:ILessonLoadRequestData = new LessonLoadRequestDTO("test", "testSwfPath", true, '', true);
    var testEvent:LessonDownloadEvent = new LessonDownloadEvent(LessonDownloadEvent.LESSON_DOWNLOAD_READY, lessonLoadRequestData);
    instance.event = testEvent;
    // ... run the command and verify the call on the mocked joinedLessonLoaderService (not shown)
}
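
For readers who don't use Mockolate, the same verify-the-call idea can be sketched with Python's standard unittest.mock. This is an illustration only: LoadLessonCommand, its load() method and the dictionary-shaped event are hypothetical stand-ins, not the classes from my app.

```python
from unittest.mock import Mock


class LoadLessonCommand:
    """Hypothetical command: routes a load request to one of two services."""

    def __init__(self, event, lesson_loader, joined_lesson_loader):
        self.event = event
        self.lesson_loader = lesson_loader
        self.joined_lesson_loader = joined_lesson_loader

    def execute(self):
        request = self.event["request"]
        service = self.joined_lesson_loader if request["joined"] else self.lesson_loader
        service.load(request["swf_path"])


def run_command(joined):
    # Both services are mocks: nothing is really loaded.
    lesson_loader, joined_lesson_loader = Mock(), Mock()
    event = {"request": {"swf_path": "testSwfPath", "joined": joined}}
    LoadLessonCommand(event, lesson_loader, joined_lesson_loader).execute()
    return lesson_loader, joined_lesson_loader


# Not a joined lesson: only the plain loader should be called.
lesson_loader, joined_lesson_loader = run_command(joined=False)
lesson_loader.load.assert_called_once_with("testSwfPath")
joined_lesson_loader.load.assert_not_called()

# Joined lesson: only the joined loader should be called.
lesson_loader, joined_lesson_loader = run_command(joined=True)
joined_lesson_loader.load.assert_called_once_with("testSwfPath")
lesson_loader.load.assert_not_called()
```

The test never looks at what the service returns. It only checks that the command picked the right service and handed it the right value, which is all the command is responsible for.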


Putting it all together

Often, we put applications together from the bottom up. With half an eye on the requirements, we start thinking about interfaces and event types and functional areas. This works, but it can also result in some YAGNI code, as well as throwing code out that seemed like it was going to be relevant until you realised that the requirements weren't complete.

I think there's more sanity in a work flow that runs this way:

1) User story
2) End-to-end test that verifies this user story (or part of it - this can evolve)
3) Integration tests for the functional areas required to fulfil the end-to-end-test
4) Unit tests for the classes required to provide the functional areas
5) Code to pass 4, to pass 3, to pass 2, to verify against 1

... and when you have no fails, then add to 2, add to 3, add to 4, do 5, rinse and repeat etc.

Doing it this way, the consequences are:
  1. A lot of head scratching early in each cycle (creating end-to-end tests is hard).
  2. Usually having failing tests in your test suite, until you're done adding a feature/user story.
  3. Always being able to tell what the hell you were doing when you last stopped ... because your failing tests make that obvious.
  4. Never writing code that doesn't actually add value to the project by contributing to the implementation of a user story.
  5. Always working towards a 'shippable' goal, which is good for the client (and your cash flow if you bill against features) and also allows real user feedback to improve the work still to be done.
  6. Reduced cognitive load for you at a micro level - you fix problems in the code while that part of the code is what you're focussed on.
  7. Reduced cognitive load for you at a macro level - you don't have to hold the 'where am I going' part in your head, or remember to test manually, because your user story tests have that covered.

I would argue that as a consequence of those last two there's a bigger reward: being able to show up more fully for the rest of your life. A bug, or a concern about whether I've really implemented feature X correctly, impacts on my ability to be present for my family. I'm kind of not-really-there at dinner, because 90% of my brain is background processing code stuff. This still happens with TDD, but it happens a lot less.

So, if you don't find that TDD is improving your code and your process - not just your output but also your enjoyment - then my (cheeky) suggestion is that you've not discovered what it's really about yet.

In my experience, TDD is fun. Chemicals in my brain fun.

PS. If you haven't already got it, this book is essential reading on harnessing the Drive in TDD: Props to Joel Hooks for recommending it.

Friday, 3 September 2010

AIR swf loading anomalies

The application I do most of my work on (an AIR 1.5 app) loads swfs in 2 ways.

There are modules of functionality that are loaded into the application sandbox using the secure loading method that I've blogged about here before.

There are also 'lessons' which consist of multiple swfs that are loaded into the non-application sandbox. If a lesson consists of 4 swfs then they are all loaded, and then they are added to / removed from the stage as the user browses through the lesson. When another lesson is loaded the previous lesson swfs are first unloaded.

These swfs are having code injected into their timelines, but they also contain code at timeline depth and in nested MovieClips. This is generally just simple 'show this label on MouseOver' functionality. Nothing fancy.

I found a few months ago that loading the swfs serially - as you would expect to do [by waiting for one to INIT before loading the next] - caused the code in the nested MovieClips to *often* fail - on Mac or PC. I tried to track down the cause of this, and through my experiments I discovered that loading the swfs in parallel seemed to solve the problem. So, unable to discover a reason, I went with the approach that worked.

Now the problem has reared its ugly head again - but only on PC. On Mac, all code works fine all the time.

On PC (Windows XP), the timeline code, and the injected code, are always fine. But MovieClip code - whether done in a separate class file or just put into the MC timeline, is borked in SOME swfs, depending on what order the swfs INIT in.

So far I've discovered the following:

1) Loading the swfs serially, MC code is frequently borked on Mac or PC.

2) Loading the swfs in parallel, code borking is dependent on the INIT order. [Might this indicate that the swf isn't fully loading when it INITs first?]

3) Whether the code is directly in the MC, or just in the MC timeline, it's just not there - even a stop() doesn't work.

4) I've replicated the problem with simple assets (a 2 frame MC with diff coloured squares and a stop() on frame 1), guaranteed not to clash with any other assets in any other swf being loaded.

5) The swfs most likely to be borked are those with the largest file size. BUT removing some assets to reduce the file size doesn't appear to help.

6) The swfs are currently being exported from Flash Pro CS4. I'm downloading the CS5 trial just in case it helps. They're FP10.

7) Some swfs are only borked if they INIT first. Some are able to tolerate being the first to INIT without problems. Some are borked if they're first OR second (of six in that case) to load.

8) Adding a pause after all the swfs have loaded, before I do anything else, makes no difference.

9) AIR runtime is 2.0.3

--- EDIT - conclusion

I think I've found the source of the problem.

The load order and the broken movieclip code are both symptoms of the same problem: the swf has not loaded into memory correctly.

The broken swfs are usually the largest, and when they INIT ahead of much smaller swfs it's probably a sign of incomplete loading, hence the correlation with Level-1 code not running. (Level-0 is timeline, level-1 is code inside MCs on that timeline etc).

Having run the tests hundreds of times I've found that this problem occurs approximately 50% of the time in a Parallels-XP virtual PC, and occasionally on a Windows 7 machine that has been down-versioned to Windows XP (and hiccups in many other areas). So far it does not happen on a MacBook Pro Boot Camp XP install, or a stable Vista install.

I guess it's a genuine thread crash in the AIR app, most likely some sort of memory issue?

I have not been able to find a way to prevent the problem from happening. Instead I have a workaround:

Each swf contains a MC on the timeline on frame 1 with a line of code with the sole purpose of notifying the application (via the sandbox bridge) that level-1 code has been loaded correctly. If this notification is not received for a particular swf by the time the 'COMPLETE' event fires, the swf is deemed to have loaded incorrectly, and is unloaded from memory and loaded again.

This process is repeated until the load is correct. The files are local so the delay to the user is minimal.
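
The shape of that workaround, sketched in Python rather than ActionScript. load_swf and unload_swf are hypothetical callables standing in for the real loader and sandbox bridge, and the max_attempts cap is my own addition for the sketch; the real app simply repeats until the load is good.

```python
def load_until_verified(load_swf, unload_swf, max_attempts=20):
    """Reload a swf until its nested code reports in; return attempts used.

    load_swf() is a hypothetical callable that loads the file and returns True
    only if the 'level-1 code is alive' notification arrived before COMPLETE.
    """
    for attempt in range(1, max_attempts + 1):
        if load_swf():
            return attempt
        unload_swf()  # bad load: throw it away and try again
    raise RuntimeError(f"swf still broken after {max_attempts} attempts")


# Stand-in loader: two bad loads, then a good one.
outcomes = iter([False, False, True])
unloads = []
attempts = load_until_verified(
    load_swf=lambda: next(outcomes),
    unload_swf=lambda: unloads.append("unloaded"),
)
print(attempts, len(unloads))  # prints: 3 2
```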

Tuesday, 13 July 2010

Using actionscript 3 to inject code into the timeline

Two undocumented handy features from AS3 for working with the MovieClip timeline:

frame1()

Which runs the code on frame 1 of the timeline, with corresponding functions for any frame that contains code - frame99(), frame354() etc.

addFrameScript(frameNumber:int, newFrameActions:Function):void

Which adds the code specified in newFrameActions at the frameNumber. Note that this is a zero-based frame number, yet frame1() is one-based. To add code to the first frame of an MC and then call it, you would do:

mc.addFrameScript(0, firstFrameActions);

addFrameScript will overwrite any existing actions, so if you want to augment and not just replace the code that is there, you'll need to first grab the existing frame code and then include that in your new function like this:

var oldFrameActions:Function = function():void {};

// Keep hold of any existing frame code (frame23() is one-based).
if ((mc['frame23'] != null) && (mc['frame23'] is Function)) {
    oldFrameActions = mc['frame23'];
}

var newFrameActionsFrame23:Function = function():void {
    oldFrameActions();
    // add new code here...
};

// addFrameScript() is zero-based, so frame 23 is index 22.
mc.addFrameScript(22, newFrameActionsFrame23);

Of course all of that is much better wrapped up in some portability that doesn't care whether it's frame 23 or frame 99 etc. I'm using a class for that -

You feed it a MovieClip and you'd probably want to hold a reference inside that MovieClip as well because one of the other uses is in overriding gotoAndPlay() so that it will rerun the code in the current frame without having to move the playhead (assuming frameScriptManager is a reference to an instance of the class in the gist):

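override public function gotoAndPlay(frame:Object, scene:String = null):void {
    var frameNo:int = this.currentFrame;

    super.gotoAndPlay(frame, scene);

    // Playhead didn't move: rerun the current frame's code by hand.
    if (this.currentFrame == frameNo) {
        var frameFunction:Function = frameScriptManager.getFrameScriptAtFrame(frameNo);
        try {
            frameFunction();
        } catch (error:Error) {
            // (handling not shown)
        }
    }
}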



The class, and in particular the getFrameScriptAtFrame() function also deals with the major gotcha: even once you've added additional code to the timeline at frame 1, the frame1() function continues to only run the actions that were present in the MovieClip (or Fla) timeline when it was published.

An alternative to rerun the current frame actions is to call Stage.invalidate() before the gotoAndPlay() - which is OK unless your content, like ours, is loaded into a sandbox that doesn't have access to Stage.

--- ADDENDUM ---

I have now also discovered that the visibility of the frame1() function is internal - so unless the FrameScriptManager is compiled into the same package as the SWF / MovieClip base class, these will show up as undefined (annoyingly no more helpful error is thrown).

So - the workaround is that your MC to be injected should implement the following function:
public function getFrameScriptAt(frameNumber:int):Function
{
    return this['frame' + frameNumber];
}

This is also specified in the IInjectableTimeline interface I've added to the Gist file. However, the FrameScriptManager class itself doesn't require an IInjectableTimeline, but will accept a vanilla MovieClip, because the complications of ApplicationDomains for content loaded at runtime, plus variation in where the FrameScriptManager is compiled (in the parent swf or in the child swf), make it tricky to roll a one-size-fits-all solution with strong typing.

So - the interface is there to help you keep yourself honest. Unfortunately there's no compiler enforcement in this case.


What's the point of all this? Well, in our case the swfs being targeted for timeline code injection are animated lessons. These lessons also need some code in them to address the application they run in - so that they can ask the main message window to show a title, or offer a download, or some instructions, or set the status of the play / pause button at the end of a section. Previously we did this using frame code, but this was hard to test and maintain. Instead we now specify most titles, instructions, download paths etc in an external xml file which is quick to edit.

At runtime the xml file for each lesson is processed and the relevant actions are added to the timeline of the lesson swfs, meaning that as far as the application is concerned these new, easier to manage (for us and our client) lessons behave exactly like the old ones.
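
A minimal sketch of that processing step, in Python rather than ActionScript. The xml layout, the action names and the send_to_app callback are all invented for the example; our real lesson files and the bridge back to the application look different. The idea is simply one generated function per frame, ready to hand to something like addFrameScript.

```python
import xml.etree.ElementTree as ET

# Hypothetical lesson file; the element and attribute names are invented.
LESSON_XML = """
<lesson>
  <frame number="1" action="showTitle" value="Welcome"/>
  <frame number="23" action="offerDownload" value="worksheet.pdf"/>
  <frame number="23" action="setPlayButton" value="paused"/>
</lesson>
"""


def build_frame_scripts(xml_text, send_to_app):
    """Map each frame number to one function that fires that frame's actions."""
    actions_by_frame = {}
    for node in ET.fromstring(xml_text).iter("frame"):
        frame = int(node.get("number"))
        actions_by_frame.setdefault(frame, []).append((node.get("action"), node.get("value")))

    def make_script(actions):
        # Bind the action list now so each frame gets its own closure.
        def script():
            for action, value in actions:
                send_to_app(action, value)
        return script

    return {frame: make_script(actions) for frame, actions in actions_by_frame.items()}


messages = []
scripts = build_frame_scripts(LESSON_XML, lambda action, value: messages.append((action, value)))
scripts[23]()  # what the injected code would do when the playhead hits frame 23
print(messages)
```

Because the scripts are plain functions built from data, they can be tested without a timeline at all, which was the whole reason for moving away from hand-written frame code.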

Friday, 26 February 2010

Programming for the slowest processor

Last week, er... no, actually it was this week, I had a bang on the head. Black out, concussion, skull x-rays, the works.

Everything is intact, but my short term memory is temporarily a bit rubbish. My concentration span is minutes at most.

But it has brought into focus something I've been working on for a while - changing my processes to really concentrate on minimising the load on the slowest processor: my brain.

The AS3 application I'm currently building is not a fast-firing game or a web-delivered viral. File size and the speed of event firing aren't critical success factors, so I'm taking the opportunity to experiment with how I programme, to make it easier for me to do my job as well as I can.

Over the next couple of weeks, when I get a chance, I'll be posting a series of entries on:

1. Using the [robotlegs] framework to cut cognitive boilerplate (it's about thinking, not typing!)

2. Reducing cognitive overhead through ultra-verbose programming.

3. Getting the compiler to pull its weight - including AS3 signals.

4. Using end-to-end testing and the roboteyes tools to get more sleep.

5. Employing project sprouts to make true TDD (unit, integration and end-to-end testing) easier than falling off a log.

The emphasis is on making sure that I can do my job well, and enjoy doing it, even on days when I am much less than my best. Not because I foresee a lot of head injuries in my future, but because as a team leader, business owner, parent, partner and dog owner, there turn out to be a lot of days when I'm interrupted.

I could pretend that all of this is in the interests of my clients. Sure, they benefit. But this new way of working is about the advantages to me today, and the future developers on my project (often also me) - whether that future is 10 minutes, 10 days or 10 months away.

Part 1 coming this weekend.