The Skeptical Methodologist

Software, Rants and Management

Testing and Language Design

Design patterns are held by some, including me, to be evidence of ‘code smell’.  That is, many so-called patterns exist because a language makes it hard to do something that ought to be easy.  A prime example would be the strategy and command patterns, both of which virtually disappear in a language with first-class functions.
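To make that concrete, here is a minimal Python sketch of a ‘strategy’ that is nothing more than a function passed as an argument.  The names (checkout, full_price, half_off) are invented for this example; the point is only that no interface or class hierarchy is needed.

```python
from typing import Callable

# A "strategy" is simply any function from a price to a price.
PricingStrategy = Callable[[float], float]

def full_price(price: float) -> float:
    return price

def half_off(price: float) -> float:
    return price / 2

def checkout(prices: list[float], strategy: PricingStrategy) -> float:
    """Total the prices after applying whichever strategy was passed in."""
    return sum(strategy(p) for p in prices)

print(checkout([10.0, 20.0], full_price))       # 30.0
print(checkout([10.0, 20.0], half_off))         # 15.0
print(checkout([10.0, 20.0], lambda p: p - 1))  # 28.0
```

In a language without first-class functions, each of those three strategies would typically need its own class implementing a shared interface.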

These patterns hit Java hard, but they strike C++ too.  C++ gets away a little better than Java because it allows you to ‘hack in’ a few things, like passing around function pointers, that Java disallows.  This alone, I think, is primarily responsible for Java’s verbosity.

Anywho, that’s neither here nor there.  The point is that many things we do in software over and over again are evidence of a missing abstraction.  If you ever have to write anything twice that could have been written once and reused, and you are using the language correctly, then you’ve hit a limit of that language.  The language is not doing the work for you that it should, and now the ‘work’ is back on the programmer rather than the compiler.

This is bad.  But we have to deal with it.  Ideally, if we used “the one perfect language”, all work would be design and there would be no “busy work” of having to implement a design that’s already been implemented once.  We’d just reuse the other design, switching out the parts that are variable.  Object orientation was supposed to do this for us, but there are obviously limits.  Instead, what’s more likely is that we’re just going to have to keep coming up with new things that can be abstracted as language design progresses: from routines, to objects, to functions, to aspects, and so on.

What does this have to do with testing?  A lot of people HATE testing.  Just check out a unit testing or mocking framework in Java or C++, and you’ll see that while those frameworks help a little, there’s still a GREAT deal of code repetition.  Compare these unit test frameworks to Python’s doctest.  In one, we have to write out all the syntax of starting up a new test, associating it with a class, giving the inputs, the outputs, the assertions, etc…  In the other, I just pretend I’m working in an interactive shell with a method I’ve already implemented.  I pretend the work is done and ask: if the work were done, how would the function behave?  Well… I call it with certain inputs, and I expect the shell to spit out certain outputs.
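For readers who haven’t seen doctest, a small example looks something like the following.  The function fib is just an illustration chosen for this sketch; the `>>>` lines are what I would type at the shell, and the lines under them are what I expect to see back.

```python
import doctest

def fib(n):
    """Return the n-th Fibonacci number, counting from fib(0) = 0.

    >>> fib(0)
    0
    >>> fib(1)
    1
    >>> fib(10)
    55
    """
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

if __name__ == "__main__":
    # Finds the examples in this module's docstrings and runs them.
    doctest.testmod()
```

There is no test class, no setup method and no assertion call; the example session is the test.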

This type of testing isn’t perfect, but it does show something in Python that is outright missing in other languages.  It is implemented entirely outside the language itself, though, via the doctest library and a prebuilt parser.  What language features would we need to implement this style of testing in the language itself?

Another feature, one that is completely a part of the language in Eiffel, is Design By Contract, and it is very similar to this style of testing.  In Design By Contract, I enforce, in the language, certain assumptions about my inputs, my outputs, and the invariants of my class.  This can of course be done via libraries (and the assertions those libraries host) in other languages, but the difference is that one language includes these kinds of tests as ‘first class’ members whereas another just ‘gets away’ with a library-based implementation.  Nothing is wrong with libraries, but if you implement something as cross-cutting and global as testing via a library, you make it very difficult for other features like reflection and introspection to work with it.  If we had an introspective version of Eiffel, I would expect it to be able to tell me what assumptions any particular method or class made.  In C++, even if we had introspection, it’d be much more difficult for an introspective system to figure out what assertions I’m making and why.
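As a rough illustration of the difference, here is a library-style contract in Python that at least keeps the contract around as data, so a tool can ask a function what it assumes.  Everything here (contract, ContractViolation, the `__contract__` attribute) is a hypothetical helper written for this sketch; it is not Eiffel’s mechanism and not the API of any real library.

```python
import functools

class ContractViolation(Exception):
    """Raised when a precondition or postcondition does not hold."""

def contract(pre=None, post=None):
    """Attach a precondition (on the arguments) and a postcondition (on the result)."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractViolation(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            if post is not None and not post(result):
                raise ContractViolation(f"postcondition of {func.__name__} violated")
            return result
        # Keep the contract as plain data so other tools can inspect it.
        wrapper.__contract__ = {"pre": pre, "post": post}
        return wrapper
    return decorate

def non_negative(x):
    return x >= 0

@contract(pre=non_negative, post=non_negative)
def integer_sqrt(x):
    return int(x ** 0.5)

print(integer_sqrt(16))                           # 4
print(integer_sqrt.__contract__["pre"].__name__)  # non_negative
```

A bare assert buried in the function body gives an introspective tool nothing comparable to look at, which is the gap being described above.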

This is because assertions can be used for all sorts of things, but a contract, as used by Eiffel, means ONE thing, and can always be assumed to mean one thing.  Just like in mathematics, when we make a constraint on certain things, proofs become much easier and the constraint actually gives us MORE freedom.

I imagine, for instance, that the inclusion of ‘Aspects’ in a language, that is, the ability to put wrappers around any function or method after the fact, such that code is called before it, after it, or both, would make testing far easier.  Or, alternatively, Python makes use of its introspection to do doctests; what if we removed the need for a parser and made tests a first class member of a language?
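Python’s decorators give a small taste of this.  The sketch below wraps an existing function, after the fact, with code that runs before and after it; the helper names (around, record_call, record_result) are made up for the example, and a real aspect system would do far more, such as selecting many functions at once by pattern.

```python
import functools

def around(before=None, after=None):
    """Wrap a function so `before` sees its arguments and `after` sees its result."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if before is not None:
                before(func.__name__, args, kwargs)
            result = func(*args, **kwargs)
            if after is not None:
                after(func.__name__, result)
            return result
        return wrapper
    return decorate

calls = []  # a test could inspect this log afterwards

def record_call(name, args, kwargs):
    calls.append(("call", name, args))

def record_result(name, result):
    calls.append(("return", name, result))

def add(a, b):
    return a + b

# Applied after the fact, without editing the body of `add`.
add = around(before=record_call, after=record_result)(add)

print(add(2, 3))  # 5
print(calls)      # [('call', 'add', (2, 3)), ('return', 'add', 5)]
```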

There’s an idea called predicate dispatch, where I don’t just call a function based on its name (like Python or C), or the number and types of its arguments (like overloading in C++ and Java), or its encapsulating class (like methods in C++ and Java).  Instead, I can call a function based on the most specific match across the runtime types of all its arguments (called multimethods) AND certain predicates about those arguments!

A better example would be in the world of Haskell, where a function can be defined by several equations under one name, and the language picks the equation whose patterns match the arguments.  That is, I can declare a recursive function defining the Fibonacci sequence not by building a single function with an if statement inside checking whether my base condition is met, but by writing two definitions.  One is defined for the base case and one is defined for the recursive case.  Then, on each function call, the language itself does the lookup to see which case I need to run, the recursive or the base case.
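The same shape can be imitated in Python to show the idea, though only as a toy.  In the sketch below, PredicateFunction and its `when` method are hypothetical helpers written for this post, and cases are simply tried in the order they were registered; a real predicate dispatch system would pick the most specific matching predicate instead.

```python
class PredicateFunction:
    """A function made of (predicate, implementation) cases, tried in order."""

    def __init__(self, name):
        self.name = name
        self.cases = []  # list of (predicate, implementation) pairs

    def when(self, predicate):
        def register(impl):
            self.cases.append((predicate, impl))
            return self  # keep the name bound to the dispatcher, not to impl
        return register

    def __call__(self, *args):
        for predicate, impl in self.cases:
            if predicate(*args):
                return impl(*args)
        raise LookupError(f"no case of {self.name} matches {args!r}")

fib = PredicateFunction("fib")

@fib.when(lambda n: n == 0)
def fib(n):
    return 0

@fib.when(lambda n: n == 1)
def fib(n):
    return 1

@fib.when(lambda n: n > 1)
def fib(n):
    # `fib` here is the dispatcher, so the lookup happens again on each call.
    return fib(n - 1) + fib(n - 2)

print(fib(10))  # 55
```

Calling fib(-1) raises LookupError, because no registered case covers negative arguments; none of the three implementations contains a branch of its own.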

This moves branch testing up to the language level, again giving introspective tools like static analyzers much more information and making it easier for them to find bugs.  Furthermore, it simplifies code and makes testing easier.  A single function no longer needs tests that exercise all of its branches; instead, a multitude of smaller functions simply need to ensure that their own tests are satisfied, separating concerns and easing reading.

Certainly, if we can store all of this predicate information, we’ve almost re-implemented Design By Contract (now as a pattern matching mechanism, so it’s not just doing debugging for us but also speeding our development!).  Is it that much of a stretch to use a similar mechanism for testing?  Of course, with the same predicate dispatch, our actual NEED for testing might in fact go down.  Much testing is aimed at border conditions, and in a predicate dispatch system we’re going to write an entirely different function for each border case (for instance, the base cases of the recursive Fibonacci sequence, f(0) and f(1)).

But we’re likely to still have testing situations, even if we’ve moved a good chunk of testing out to static analysis and compile-time errors (“You don’t have a function defined for f when the argument is -1!”).  These tests would probably be best put, doctest style, in the ‘function declaration’ of most of these predicate dispatch frameworks.  That is, predicate dispatch in frameworks like PyProtocols still asks you to define a new ‘generic function’ that can then be overloaded.  It is in these generic functions that we can push in expected inputs, outputs and checks for invariants.

In a way, it’s like class inheritance.  By defining the tests at the function level, we are enforcing a mechanism to say “Whatever I overload on this function, it should not violate that f(1) = 1 and f(0) = 0, etc…”  (Of course, like I mentioned before, this particular example might be handled completely by predicate dispatch in the first place.)  These tests can be run at compile time to make sure that for each argument I’ve listed as an expected input, there is in fact a function that not only resolves to deal with that argument, but also produces the output I expect.
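Here is one way that might look, sketched in Python.  GenericFunction, its `examples` argument and its `check` method are all hypothetical names invented for this illustration and are not PyProtocols’ API.  Python has no compile step to hook into, so check() stands in for what a compiler or build tool would do.

```python
class GenericFunction:
    """A generic function that carries its own input/output examples."""

    def __init__(self, name, examples):
        self.name = name
        self.examples = examples  # {tuple of arguments: expected output}
        self.cases = []           # list of (predicate, implementation) pairs

    def when(self, predicate):
        def register(impl):
            self.cases.append((predicate, impl))
            return self
        return register

    def __call__(self, *args):
        for predicate, impl in self.cases:
            if predicate(*args):
                return impl(*args)
        raise LookupError(f"no case of {self.name} matches {args!r}")

    def check(self):
        """List every declared example that has no matching case or a wrong result."""
        problems = []
        for args, expected in self.examples.items():
            try:
                actual = self(*args)
            except LookupError:
                problems.append(f"{self.name}{args}: no case defined")
                continue
            if actual != expected:
                problems.append(f"{self.name}{args}: expected {expected}, got {actual}")
        return problems

# The declaration states the expectations before any overload exists.
fib = GenericFunction("fib", examples={(0,): 0, (1,): 1, (10,): 55})

@fib.when(lambda n: n == 0)
def fib(n):
    return 0

print(fib.check())  # two problems: nothing handles fib(1) or fib(10) yet

@fib.when(lambda n: n == 1)
def fib(n):
    return 1

@fib.when(lambda n: n > 1)
def fib(n):
    return fib(n - 1) + fib(n - 2)

print(fib.check())  # []
```

Whatever overloads are added later, the examples on the declaration keep being checked against them.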

Putting these input/output style doctests at the generic function level will not only help documentation (just as doctest already does), since the user won’t have to delve into all the overloaded members to get a good understanding of the function, but will also aid design, since the programmer will first have to think “What are the use cases of this method/class?” at the declaration level, which is probably the step that gains the most quality in the whole software design process.  The validation tests at compile time are just icing on the cake and can provide coverage, profiling and validation.

Furthermore, the definition of these generic functions’ inputs and outputs ALSO gives us mock objects, for free!  If I can define a function’s expected inputs and outputs, then I can simply mock that function based on those inputs and outputs, using pattern matching on the inputs.  If I declare a function but do not define it, and that declaration has tests, of course I cannot use the function, except in a mock case.  If you think of an object like a photograph, the fixture is the positive and the mock object is the negative.  Both are defined, really, in a test.  Testing a class just by plugging in its mock self is fruitless, since you’re just getting back exactly what you wrote down.  But by using the tests you provide to test the real class, or by testing another class using the ‘mock’ version of the class, you can test other objects that rely on functions you haven’t even written yet, and test those objects in isolation.
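A minimal sketch of that ‘free’ mock, again in Python with invented names (Declaration, fetch_price, basket_total): the declaration holds only example inputs and outputs, and the mock simply replays them.

```python
class Declaration:
    """A declared-but-unimplemented function with example inputs and outputs."""

    def __init__(self, name, examples):
        self.name = name
        self.examples = examples  # {tuple of arguments: expected output}

    def mock(self):
        """Build a stand-in that replays the declared examples and nothing else."""
        def stand_in(*args):
            if args not in self.examples:
                raise LookupError(
                    f"{self.name}{args!r} is not covered by the declared examples"
                )
            return self.examples[args]
        return stand_in

# Declared, with tests, but never implemented.
fetch_price = Declaration("fetch_price", {("apple",): 3, ("pear",): 5})

def basket_total(items, price_of):
    """Code under test; it depends on a price lookup that does not exist yet."""
    return sum(price_of(item) for item in items)

print(basket_total(["apple", "pear", "apple"], fetch_price.mock()))  # 11
```

basket_total is tested in isolation here even though nobody has written the real fetch_price.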

IDE support can and should be given for this sort of thing, again utilizing introspection, to recognize when a test is not fulfilled and give the designer the option to ‘capture’ results.  That is, if the setup for a specific test might be time consuming, just run the program and capture the test on the fly.  This would work for webpage parsers, for example.  If I want to test a parser, but I don’t want to go through the time of actually building a webpage to test it against, I should be able to just do a dry run, then when the test fails, I capture whatever webpage I’ve already downloaded and write my tests against that.
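Without any IDE at all, the capture step can be approximated with a recording wrapper.  This is only a sketch under my own assumptions: capture and count_links are hypothetical names, and count_links is a deliberately trivial stand-in for a real parser.

```python
import functools

def capture(func):
    """Record every (arguments, result) pair seen during a real run."""
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        wrapper.captured.append((args, result))
        return result
    wrapper.captured = []
    return wrapper

@capture
def count_links(html):
    # Trivial stand-in for a real webpage parser.
    return html.count("<a ")

# The "dry run": in practice this page would come from a real download.
count_links('<p><a href="/x">x</a> and <a href="/y">y</a></p>')

# Each captured pair can now be replayed as a regression test.
for args, result in count_links.captured:
    assert count_links.__wrapped__(*args) == result

print(count_links.captured[0][1])  # 2
```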

There are a lot of ideas here, all of which, I believe, can make testing easier to do and more effective.  We’ve got a bunch of different but similar pictures of testing, from DBC to doctests to unit testing frameworks, and it reminds me of that story about the blind men and the elephant.  We’ve all got different ideas, and no single one will fulfill all of our needs, because they are all different parts of the same big beast.  Putting testing into a language itself rather than treating it as an afterthought helps us recognize how key testing is to the design process, push even more bug catching behavior to static analysis, and make it as easy as possible for the developer to create high quality code, fast.  Unfortunately, most of these things simply can’t be done with current languages due to the lack of features: not just a lack of ways to push tests into code, but of the features needed to use those tests, like introspection and aspects.  As languages become more mature, I believe it will be easier and easier to make tests, as we know them today, not an optional thing a good designer does to find bugs, but an integral part of the design and code process, shrinking development times and letting us designers spend more time doing what we love rather than writing out testMyClass inherits TESTCLASS yet one more time.

June 13, 2008 - Posted in Uncategorized
