The Skeptical Methodologist

Software, Rants and Philosophy

Solving the “Problems” with TDD

First, read Dalke Scientific’s criticisms of TDD, or Test-Driven-Development.

The important thing to note is that when we talk about TDD, we’re talking about writing a test to drive the design of an API: you ‘test first’, before the implementation is written.  This is often contrasted with testing after an implementation is known.

This matters, because reading DS’s criticisms in this light makes sense of many of the problems the author finds with TDD approaches.  Namely, as is often said, TDD should not supplant the many other testing strategies for code.  DS points out, rightly so, that some authors like Uncle Bob and Kent Beck have oversold TDD as yet another silver bullet.  Not all steps of the development process naturally flow from TDD.  In fact, as some have pointed out, TDD has little rigorous advice about incorporating domain knowledge, or exploring a domain in which you have very little experience.  You can’t TDD your way through a Sudoku solver, or a ray tracer.  Instead, you need to understand the algorithms behind those solutions.

TDD is specifically about driving clean, reusable interfaces from well understood requirements.  Hah!  Since when are requirements ever well understood?  But that’s ok: since TDD is frequently packaged with a good dose of refactoring, we should expect any accomplished TDD’er to also respond well to changing requirements.  So, let’s rephrase that: TDD is about driving clean, reusable interfaces from requirements.  That’s all.  It is no guarantee of code quality.  Furthermore, as DS points out, it is no guarantee of code coverage either!  Simplistic TDD would state that no line is written without a test requiring its behavior.  That is true.  But recall, as I just stated, TDD is packaged with refactoring, which itself (as DS rightly notes) makes no guarantee on test coverage.

On an opposite note, one place where I disagree with DS is where he makes a big deal out of the performance requirements of finding prime numbers.  TDD solutions are (or should be) entirely derived from requirements, since the tests are more or less the requirements restated in executable form.  If Uncle Bob’s self-imposed requirements on his prime finder don’t specify processing speed, then his approach is legitimate.  What DS should have argued is that in the case of algorithmic code like a prime finder, generally speaking performance matters and would have been specified formally in one way or another (which implies that performance should be tested under a TDD scheme, and that such a test would have ruled Uncle Bob’s solution out).  I do sympathize with DS’s aesthetic revulsion to Uncle Bob reinventing an already understood algorithm in a slower form to prove that TDD works, but that’s a side issue.
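To make the point about performance concrete, here is a minimal sketch of requirements written down as tests.  Everything in it is my own assumption for illustration: the name primes_up_to, the limit of 200,000, and the two second budget are made up, and the sieve is just one implementation that happens to satisfy them, not Uncle Bob’s code.

```python
import time


def primes_up_to(n):
    """Hypothetical implementation under test: a Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Cross off every multiple of i, starting at i * i.
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def test_functional_requirements():
    # Requirement: return every prime <= n, in ascending order.
    assert primes_up_to(1) == []
    assert primes_up_to(2) == [2]
    assert primes_up_to(20) == [2, 3, 5, 7, 11, 13, 17, 19]


def test_performance_requirement(limit=200_000, budget_seconds=2.0):
    # Requirement (assumed): the call must finish within a stated time budget.
    start = time.perf_counter()
    primes_up_to(limit)
    elapsed = time.perf_counter() - start
    assert elapsed < budget_seconds
```

If speed is never written down as a requirement, only the first test exists, and a slow but correct solution passes.  Timing tests like the second one are sensitive to the machine they run on, so the budget here is deliberately generous.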

So, I’d like to reset the debate around this claim:  TDD is a good “enterprise” technique to drive clean, reusable interfaces with high, but not perfect, code coverage.  I use the dreaded term “enterprise” here to mean “corporate” style requirements that are somewhat well understood, as opposed to blue-sky or academic research requirements; the resulting code also tends to be brute force heavy and algorithmically light.  Oddly enough, the ‘plumbing’ code this description covers makes up a huge percentage of what we, as software developers, are expected to write.  So TDD certainly has its place.

Where is that place?  Let’s go a little into testing theory real quick and it will jump out at us.  Traditionally, there are ‘verification’ tests and ‘validation’ tests.  Verification asks “did we build the thing right?”, while validation asks “did we build the right thing?”.  Verification asks questions about quality, about coverage, about safety and reliability.  It makes sure that, whatever we built, we did it ‘well’, that it has ‘quality’ or ‘craftsmanship’.  Validation, on the other hand, asks whether the thing we built was the thing the customer asked for in the first place.

There’s another dimension to testing, commonly referred to as white-box vs black-box.  Black box tests treat the implementation as an unknown and strictly test the interface.  White box techniques look at the implementation specifically and try to break it.  Both are beneficial.  Treating software as a black box generally lets a tester attack it more aggressively: with no ‘clues’ from the implementation about where it might break, they just throw everything they have at boundaries and faults.  White box is beneficial for exactly the opposite reason: the implementation gives the tester hints on what might break via reverse reasoning (assume it broke and then work backwards to see what inputs would break it).  Black box testing also lends itself to automated tests, like fuzz or smoke tests, while white box styles allow better ‘static analysis’ techniques.
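A rough sketch of the difference, using a toy clamp function I made up for the purpose (none of these names come from DS’s article).  The black box test knows only the interface contract and throws seeded random inputs at it; the white box test was written by reading the branches and aiming one case at each.

```python
import random


def clamp(value, low, high):
    """Toy function under test: force value into the range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


def black_box_fuzz(trials=1000, seed=0):
    # Knows only the contract: the result lies in [low, high], and values
    # already in range come back unchanged.  Inputs are random but seeded.
    rng = random.Random(seed)
    for _ in range(trials):
        low = rng.randint(-100, 100)
        high = rng.randint(low, 200)
        value = rng.randint(-1000, 1000)
        result = clamp(value, low, high)
        assert low <= result <= high
        if low <= value <= high:
            assert result == value
    return trials


def white_box_branches():
    # Written by reading the implementation: one case per branch, plus the
    # two boundaries where the comparisons flip.
    assert clamp(-1, 0, 10) == 0    # first branch
    assert clamp(11, 0, 10) == 10   # second branch
    assert clamp(5, 0, 10) == 5     # fall-through
    assert clamp(0, 0, 10) == 0     # lower boundary
    assert clamp(10, 0, 10) == 10   # upper boundary
```

Neither style subsumes the other: the fuzz loop may never hit an exact boundary, and the branch tests only cover the cases the reader of the code thought to write.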

You ‘need’ to cover all 4 quadrants of testing for a good strategy.  DS points out the ‘flaws’ in TDD, but in the context of testing theory, these are not flaws.  The characteristics of TDD put it squarely in the “black box validation” camp, along with techniques like requirements traceability and use case analysis.  I’d say in that company, it does quite well.  The “traditional” unit tests DS refers to fall more in the “white box verification” camp, i.e., you know the implementation and you’re specifically checking it for quality.  This quadrant needs to be supported too, and there are a myriad of techniques to do so (some of my favorites include Joel’s ‘completely separate testing team’ approach as well as Design-By-Contract).

So, to sum up, the “problems” with TDD aren’t problems at all: they’re endemic to the quadrant of testing TDD is associated with.  TDD has been oversold as a magic bullet by its promoters, that is for certain, but it is not synonymous with “good testing” as the original author states.  It has a very specific place in a “good testing” strategy, namely, taking requirements and directly stating them as automated validation tests of any potential solution’s interface.

December 29, 2009 Posted by austinwiltshire | Uncategorized | | 1 Comment

To Spec or not to Spec

Like software patterns, there are many different software creative processes that tend to be re-invented over and over again.  Just as some software patterns are really different versions of each other or, more importantly, symptoms of weaknesses in the implementing language, I believe some processes only exist due to weaknesses in the chosen technology.  An example of one of these processes is the ‘spec’ or specification.

Specifically, I’m talking about the technical specifications described in the above link, not functional specifications.  I believe technical specifications are a signal of weakness due to the following thought process – a technical spec can be deemed ‘good’ when it is unambiguous and has a relationship with actual implementation nearing 1-to-1.  If either of those two facts is untrue, then that means there is some element of interpretation in the spec, which further means that two coders may implement the spec in two different ways, making different assumptions, thus leading to defects.

So let’s assume we have a good spec, one in which we have a close to 1-to-1 mapping between specifications and implementation, and a lack of ambiguity.  There are two thoughts I have on this.  One is, how did we get here?  The other is, how is this any different from code itself?  I’ll deal with the second first, just to spice things up.

A good spec really isn’t any different than good pseudocode; in fact you’ll find a lot of English descriptions of looping behavior, if statements, and other primitives found in most imperative languages.  These can be turned into implementation rather mechanically, so the question arises: why didn’t you just implement the spec in the first place rather than writing the entire thing out in English?  What exactly does a spec get you that doing the exact same process in an actual implementation language doesn’t?

The other question is, how do you create a good spec?  Well, most likely you write a working draft, then send it out for reviews.  Various other designers look over it, read the spec, then send back some comments.  Maybe you missed a corner case, maybe you didn’t think of a certain interaction.  Over time, the spec is rewritten to be more and more specific and consistent.

Let me know if you disagree with any of the above logic so far, because now I’m going to add a twist.  Let’s say, instead of writing a spec, we actually wrote implementation code from the get go, except we used the same processes described above.  That is, we iteratively rewrote it, sought frequent feedback from humans or otherwise, and we stopped when there was a 1-to-1 mapping between the code and the problem at hand, and a complete lack of ambiguity (rather easier to achieve with a programming language).

We just re-invented rapid prototyping.  You see, spec writing is what people are forced to do when they work with languages or technologies that don’t allow rapid prototyping.  What exactly, when enumerated, would the technologies that do allow it be?  I’d venture a guess that things like a REPL and a high level ‘scripting’ interface would be a good subset: languages like Ruby or Python or, heuristically, anything that doesn’t have a long ‘compilation’ cycle.  There are many compiled languages, for example, that turn around quickly enough to allow rapid prototyping, so I’m not JUST saying scripting languages.  Languages with more boilerplate, more line noise, and a longer turn around cycle (i.e., C/C++) lend themselves to writing specs over rapid prototyping.

Rapid prototyping has many other benefits over spec writing.  For instance, during spec writing, human reviewers are not only finding high level logic errors, but also syntax problems, type problems, and the like.  These are all problems that a high level language will automatically find.  The enumeration of ‘what’ could be passed into a specific function mentioned in the article linked to previously is a prime example of the kind of type checking Haskell would do for you, flagging the problem on the spot.

Moreover, spec writing and rapid prototyping aren’t competing ideas.  In other words, a spec COULD be a rapid prototyped code base!  If we need to build an embedded system using a subset of C, for example, rapid prototyping in C itself would be a rather onerous process compared to our REPL and high level constructs.  But no one said the spec couldn’t itself be written in a high level language, to be ported to low level C once it was deemed complete.  Therefore, rapid prototyping should be used even on projects where a spec is used, and the two should be practically substitutable.
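As a small illustration of a spec written as runnable code, here is a sketch of what the ‘spec’ for one embedded routine might look like in Python before anyone ports it to C.  The routine (a saturating 8-bit add) and its name are invented for the example; the point is only that the document under review can execute and check itself.

```python
def saturating_add_u8(a, b):
    """Executable 'spec': add two unsigned 8-bit values, clamping at 255."""
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("operands must fit in an unsigned 8-bit value")
    return min(a + b, 255)


def check_spec():
    # The input space is small enough to review the spec exhaustively,
    # which no English paragraph can offer.
    for a in range(256):
        for b in range(256):
            result = saturating_add_u8(a, b)
            assert 0 <= result <= 255                  # never overflows
            assert result == saturating_add_u8(b, a)   # order does not matter
            assert result >= max(a, b)                 # never shrinks an operand
    return True
```

Once the C port exists, the same table of inputs and outputs could be replayed against it, so the prototype doubles as the test oracle.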

Of course, all process patterns like these are just that, patterns, and there really should be no ‘make software using process x’ rules since x may not always apply.  Sometimes you need some exploratory programming (very similar to rapid prototyping), or sometimes you may know how to express something in English but not in code (such as in technical fields).  Patterns, even process patterns, ought to be seen more as a ‘bag of tricks’ similar to the skills mathematicians build up – tricks to simplify or restate problems in other domains.  Too often we’re convinced that our problems are like arithmetic, and there’s just one simple algorithm or process that needs to be exercised again and again, but instead they’re more like problems found in the higher echelons of math.  They require insight, tricks, squinting, hand waving, and eventually, a eureka.

November 19, 2009 Posted by austinwiltshire | Uncategorized | | No Comments Yet

IoC without the fluff

Things like dependency injection and IoC are cool, and they do decouple your code and make it easier to extend and test.  But the problem is that the software community always tends to reinvent the same solutions to the same problems and then give them a totally different name.  I mean, what are the real differences between Haskell Type Classes and the now defunct C++1x ‘concept’… er… concept?  (Seriously guys, we’re running out of different synonyms for ‘thing’.  We’ve got object, class, type, type class, structure…)  Here’s what IoC and DI are, and how they interrelate, without using silly words like ‘container’, which is just another word for ‘thing’ pretty much in this context.  We have a function, let’s say, and I’m going to use python notation since it’s short and sweet.

x = 0

def func(y):
    return x + y

Everyone can figure out what that function does.  But how will it react in a large system?  Who the hell knows, it uses a global variable that can be modified at any time, from anywhere.  Our function ‘func’ is now coupled to a global variable, as is, by transitivity, anyone who uses it.  So we change it to this far more elegant version:

def func(x, y):
    return x + y

Voila!  In DI terms, we’ve ‘injected’ the ‘dependency’ of func on the variable x.  Now func is only dependent on two locally defined parameter variables, a far more elegant and simple way to do things.  Let’s fast forward to OO for a second.  Assume we have a class foo…

class bar:
    def __init__(self):
        self.myFoo = foo()

    def func(self, x):
        # x is the argument, not an attribute of self
        return x + self.myFoo

This is OO, so it’s nice and decoupled, right?  Not really.  If you unroll the object, thinking of it basically as a poor man’s closure, you realize that func is actually doing something like this (assuming no one changes self.myFoo):

def func(x):
    return foo() + x

Global functions, when they’re pure, are nice to have.  But you can still see the problem of the coupling between func and foo, now.  If foo is not pure, then we have side effect issues, but moreover, we still have just plain maintainability/extensibility issues with the coupling between the two.  The number one use case pointed to by dependency-injection types is that of testing.  What if we want to test func, but we don’t want to test foo?  Maybe foo takes a long time.  Or maybe we haven’t even written foo yet!  (Now you can see the connection between TDD, dependency injection, and why the two might change your style of design.  You can test top level objects without even implementing ‘low-level’ ones yet.  But that’s tangential.)  We can do the same thing we did in the first case though to break that coupling:

def func(x, foo_thing):
    return x + foo_thing()

Oh hey, higher order programming.  Who’d have thought that would show up here?  Let’s roll our stuff back into familiar OO terms:

class bar:
    def __init__(self, foo_thing):
        self.myFoo = foo_thing()

    def func(self, x):
        return self.myFoo + x
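Here is a quick sketch of the testing payoff.  The names slow_foo and stub_foo are hypothetical stand-ins of my own: one plays the expensive (or not yet written) dependency, the other is the cheap fake a test hands in instead.

```python
class bar:
    def __init__(self, foo_thing):
        # foo_thing is any zero-argument callable; bar no longer cares which.
        self.myFoo = foo_thing()

    def func(self, x):
        return self.myFoo + x


def slow_foo():
    # Stand-in for the real dependency: imagine a long computation or a
    # network call.  It may not even be written yet.
    raise NotImplementedError("real foo is expensive or not written yet")


def stub_foo():
    # Cheap fake used only by the test.
    return 10


def test_func_without_real_foo():
    # bar.func is tested in isolation; slow_foo is never called.
    assert bar(stub_foo).func(5) == 15
```

Production code would construct bar(slow_foo), or whatever the real thing ends up being; the class itself does not change.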

Dependency, consider yourself injected.  Ok, fine, what the hell is IoC then?  Why does it have to have such a crazy name, and what are these containers, and wah wah wah!  Well, let’s go back to our familiar func and make it a little more complex:

x = 0
y = 0
z = 0
a = 0
b = 0
c = 0

def func(var):
    return var + a + b + c + x + y + z

Gross!  Global dependencies on like, a bajillion different things!  Let’s inject those dependencies!

def func(var, x, y, z, a, b, c):
    return var + a + b + c + x + y + z

Ah, much better.  Er, wait… now my buddy, who had code like this:

def func_2(var1, var2):
    return func(var1) + func(var2)

Has had his interface explode to:

def func_2(var1, var2, a, b, c, x, y, z):
    # pass the values in the order func declares them: x, y, z, a, b, c
    return func(var1, x, y, z, a, b, c) + func(var2, x, y, z, a, b, c)

And his callers have their interfaces explode, and their callers have theirs, etc…  But wait, any of you OO gurus notice that a,b,c and x,y,z never seem to make any sense without each other?  They always seem passed with each other, they seem semantically related, who knows what else.  Why, I’ll bet our design could be more… what’s that nice word elegant designs are described as?  Oh yes, cohesive!

class point:
    def __init__(self, x, y, z):
        self.x = x; self.y = y; self.z = z  #just leave them as plain-old-data and public

Oh wait, and two points, well, we can call that a…

class distance: #er, maybe not the best name…
    def __init__(self, pt1, pt2):
        self.pt1 = pt1; self.pt2 = pt2
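To see what the bundling buys, here is one way func and func_2 might look once the six values travel together.  Treating a, b, c as the first point and x, y, z as the second is purely my assumption for the sketch.

```python
class point:
    def __init__(self, x, y, z):
        self.x = x; self.y = y; self.z = z


class distance:
    def __init__(self, pt1, pt2):
        self.pt1 = pt1; self.pt2 = pt2


def func(var, dist):
    # Same sum as before, but the six values arrive as one argument.
    p, q = dist.pt1, dist.pt2
    return var + p.x + p.y + p.z + q.x + q.y + q.z


def func_2(var1, var2, dist):
    # The caller's signature grows by one parameter instead of six.
    return func(var1, dist) + func(var2, dist)
```

Calling it looks like func_2(1, 2, distance(point(1, 2, 3), point(4, 5, 6))), and every caller further up the chain only has to pass the one object along.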

Now our interfaces shrink down, only needing one extra parameter, a ‘distance’, which I’m sure in our func, foo, bar whatever application we’re building, makes good sense semantically and is drawn from our domain.  Likewise, let’s say that we’re using dependency injection to make our objects a lot less coupled.

class bar:
    def __init__(self, obj1, obj2, obj3, obj4, obj5):
        self.obj1 = obj1; self.obj2 = obj2; self.obj3 = obj3; self.obj4 = obj4; self.obj5 = obj5

Wow, those dependencies sure are injected all right.  But that class has been injected so much, it’s beginning to look like a heroin addict.  These objects all seem to have something to do with each other, so why don’t we bundle them up in a cohesive object?

class bar:
    def __init__(self, obj_builder_thingie): #TODO: think of a better name
        self.obj1 = obj_builder_thingie.obj1(); …

Ah, now our design is decoupled and cohesive.  Hallmarks of good design.  (Of course, simply bundling arbitrary objects together isn’t necessarily cohesive.  We’re assuming in this case, whenever we’ve bundled something together, those objects have had something to do with each other semantically or in the domain.)  For some reason, whenever we bundle these objects together, our Java buddies (and C# buddies) want to call this ‘Inversion of Control’, mostly because it’s now the caller/creator’s problem of figuring out how it wants to ‘configure’ the innards of our bar class.  They sometimes will go even further and make obj_builder_thingie not just a normal object, but a framework for going out and reading XML files to dynamically figure out what obj1() and obj2() should return.  That way, instead of editing a nice clean python unit test file to configure and modify your tests, you can edit a huge ugly xml file with lots of angled brackets to configure and modify your tests.  Sounds much more enterprisey to me!  Job security here I come!
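For the XML-averse, here is a minimal sketch of the same idea with the builder as a plain Python object.  The two builder classes, the two-object bar, and its func method are all invented for illustration; a real bar would pull out however many collaborators it needs.

```python
class bar:
    def __init__(self, obj_builder_thingie):
        # bar asks the builder for its collaborators instead of creating them.
        self.obj1 = obj_builder_thingie.obj1()
        self.obj2 = obj_builder_thingie.obj2()

    def func(self, x):
        return self.obj1 + self.obj2 + x


class default_builder:
    # What the application would wire in (values are placeholders).
    def obj1(self):
        return 100

    def obj2(self):
        return 200


class fake_builder:
    # What a unit test wires in: small, predictable values.
    def obj1(self):
        return 1

    def obj2(self):
        return 2
```

A test then reads bar(fake_builder()).func(3) and expects 6, while the application constructs bar(default_builder()).  Whoever builds bar decides its innards, which is the ‘inversion’ in question, and nobody had to open an XML file.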

So, let’s recap.  Dependency injection is moving a ‘global’ to a ‘parameter’, which, insofar as primitive types and values are concerned, has always seemed like a good idea.  But for OO guys, moving global functions (or an object representing those functions) seems like a brand spankin’ new idea, when really functional guys have been moving global functions to parameters for years.  C’est la vie, at least we’re learning something from each other.  The specific case bandied about is always how this makes testing easier because you can ‘inject’ which objects you’d like to collaborate with in the constructor (or a setter, etc… but setters are evil).  But this solution can be generalized to ‘inject’ any configurable behavior you want in the form of variables, functions or classes themselves in a dynamic language like python.  The side effect is your parameter lists get huge, and while your code is decoupled, it’s decoupled kind of like a box of legos with no instructions on how to build your starship.  So you use your ole’ object-oriented smarts to start grouping some of those parameters together into objects themselves, and now your ‘injected class’ begins looking through the objects that are passed into it to find what it needs, rather than everything being unrolled on the parameter list.  IoC is basically, then, dependency injection (or the voracious removal of anything global) on a large scale and must, then, solve the problems (explosive parameter lists) that come with that.

That’s it in a nutshell.  Hopefully I haven’t confused you further.  Dependency injection is just another step in the good practice of “make things as local as possible”, or decoupling.  Parameter lists are more local than global, even for functions like class constructors.  IoC is the counterbalance to massive decoupling, and that’s to make the program more ‘cohesive’ again by grouping up many of these new things being passed in parameter lists into objects and hierarchies themselves.  Any questions?

October 10, 2009 Posted by austinwiltshire | Uncategorized | | No Comments Yet

Smart Pointers for Dumb Developers

Edouard in his post claims that smart pointers, like shared_ptr and auto_ptr, are overused.  While he actually makes quite a few legitimate critiques, I think he misses the fundamental problem that hits nearly all of the computer scientist’s beautiful abstractions: they’re all dangerous if they’re overused.  C++’s smart pointers are no exception.  I do wish to come to their defense, though, and offer what might seem like obvious guidelines for their use.

Smart pointers are the poor man’s garbage collection.  Or, in better terms, they are the average C++ developer’s garbage collection.  C++ is a language that is a little bit too ‘expert friendly’, and dynamic allocation, similar to multithreaded programming, is hard(tm).  Smart pointers in many cases allow your average developer to turn a horrible bug – memory leaks or null dereferences – into a less heinous inefficiency.  That’s a win.  They are no complete replacement for object lifecycle analysis though, which is Edouard’s main point.  Of course, raw pointers that are intelligently managed will always outperform smart pointers.  But what if you don’t have the smartest developers in the world?  Or what if you don’t have the time/money to throw your smart developers at making sure you squeeze every last FLOP out of that processor, but instead need them to focus on implementing new features, etc?  It’s all economics.

I’d like to offer my own taxonomy to budding developers.  Kind of a road map to memory usage in C++ to find a good middle ground between performance and safety.

Your first rule of thumb, and the first place you should go to build any object, is the stack.  The stack is well understood, putting objects on the stack is usually pretty cheap, and a good chunk of the time, most objects you need are transient for a few statements and then aren’t needed any more.  For a while there, it was simply considered more ‘OO’ to put things on the heap (thank you very much, Java).  Thankfully, those days are gone.

What CAN’T be put on the stack?  Well, large objects run the risk of pushing the stack to its limits, since most of the time, the stack is smaller than the heap.  In addition, the stack requires a class to be concrete, which makes polymorphism a little more tricky.  Finally, again with large objects, passing them by value to functions or methods can be expensive, as their copy constructors are called.

For a large object that you want to put on the heap, enter the scoped pointer.  The scoped pointer acts like a stack variable in its construction and cleanup, but is created on the heap.  You should also question why you’re putting so many large objects on the heap.

To handle polymorphism, for the most part, you’re going to have to use pointers.  An abstract base class pointer can point to a derived class, so it’s a question of what kind of management you want wrapping that pointer.  Scoped pointers don’t really help any with polymorphism, but our two friends, one of which is disparaged by Edouard, come to save us.  In the case of ‘factory’ like functions, or functions that create an object on the heap and ‘pass off’ ownership to the caller, we’d like to use an auto_ptr, since this type of smart pointer enforces this pattern.  In the case where we’re getting a reference to an object, but the method called still wants to refer to a shared place in memory, then we usually want a shared_ptr.  The original poster is right, though: in many cases, through careful analysis, both of these pointer types can be done away with by simply studying the expected lifecycles of the object underneath.  But this is not always possible within the constraints of developing software.

Keep in mind the const reference trick, too.  Returning a reference to a stack variable is a bad idea, since the variable has been destroyed by the time the caller uses it.  However, the standard states that binding a temporary, such as an object a function returns by value, to a CONST REFERENCE extends the temporary’s lifetime to that of the reference, i.e., to the end of the enclosing scope.  That actually opens up efficient usage of some forms of polymorphism, since the memory’s allocated on the stack yet we can still get a polymorphic (base class) reference to it.  I’d use this technique with care, though: it does not rescue a function that itself returns a reference to one of its locals.

Ultimately, such low level control and the plethora of options available to the developer make memory management in C++ tricky, yet powerful and abstract.  There are different use cases for all the potential views of memory, and subtleties to each use.  Shared_ptrs can more or less cover almost all these uses, but there’s a cost associated with not doing any real analysis of your memory footprint.  For the average developer, though, if you want to make sure they don’t introduce any risks, giving them a poor man’s garbage collector is probably your best option.

(*Though you still need to peer review their code and check for cycles, the notorious corner case in which shared_ptr will still leak.)

September 1, 2009 Posted by austinwiltshire | Uncategorized | | No Comments Yet

The Science of Science

“Metacognition” is a word that basically means thinking about thinking.  You take part in metacognition when you actively try to spot patterns in your own behavior that are acting against your goals, and then try to create solutions that work around these problems.  It’s self reflection, but also action that takes place due to that reflection.  It might be uniquely human.

There is an analogue in society, though, when any sort of social movement begins to use its own doctrine to look back on itself.  This is generally restricted, though, and does not include self reflective studies using other means, for example, a political movement using scientific evidence to rethink its position on things.  I’d suppose we could call that ‘iterative’ progress.  Instead, it’s when you turn a social movement with an intrinsic ability for self reflection back on itself.  Say, let’s psychoanalyze Freud, for example.  Or, more importantly, let’s apply the scientific method to science itself.

In science, let’s say medical science, we have a hypothesis that X causes B.  Now, medical science is one of a few ‘special’ branches of science that are motivated by more than just the search for truth itself.  Generally, if X causes B, B is a bad thing and we want to stop it.  Knowing that X causes it means that we can have more control over B and reduce suffering.  So when we do medical science, we more or less want to do it as ‘best’ we can, to get to the bottom of various diseases and halt human suffering.

So we hypothesize that the current way we do medical science is the best, i.e., that it yields the most truth and that these truths are the most effective at reducing human suffering.  If a researcher were able to find a pattern in medical science itself that yielded evidence weakening that hypothesis, that researcher would be doing science on science itself.

But like medical science, whose treatments and drugs improve (by some small degree) with every paper published, does science not itself recursively improve through its own self analysis?  By applying the results of science and mathematics (in the case of the above link, network theory) to the processes that produced those results, science becomes STRONGER.  While the scientific method as we know it from the Enlightenment is a sound way to find truth about our world, we must always remember that the evidence we produce from any one experiment is only as good as the assumptions that went into it.  And there are many assumptions in the scientific method itself, which we learn more about only through self reflection.

Medical science, not to pick on it, is rife with these enlightening moments.  They range from discovering that taking a sugar pill makes people feel better (the placebo effect), to realizing that people who follow doctor’s orders tend to be healthier than those who don’t, even if those doctor’s orders are nonsensical.  We have invented the double blind placebo controlled trial to control for the scientists’ own bias as well as humans’ predilection for sugar pills.  We’ve invented meta-analysis to momentarily reflect on our own science to try to combine and better glean results from a group of studies.  We’ve invented the idea of conflict of interest, the open access journal, as well as many other tools and techniques that were the result of turning the scientific method back on itself.

You can see the result of this recursive growth in transistors.  We’ve built computers, and then only months later, used those very computers to design the next generation.  This same recursive growth is possible in science too: the very results mentioned above show that we CAN do science better.  Lives are at stake in many cases, and if not lives, then large sums of money.  We can only do so much good science, produce so many real results, in a year for a given dollar.  We should focus on making these dollars go as far as possible by applying the results of (and funding more of) these sorts of self reflective studies.  The more bias we remove from our scientific process, the cheaper science gets for any dollar put in.  The cheaper it gets, the more we can do.  The more we can do, the more we learn.  The more we learn, the more lives we save and the more wealth we create.

And the cycle continues towards singularity.  Kurzweil is right, but he never mentions culture.  Culture, and the products of human culture, are also capable of exponential growth, and I don’t think we’re anywhere near an inflection point.

August 8, 2009 Posted by austinwiltshire | Uncategorized | | 1 Comment