To Spec or not to Spec
Like software patterns, there are many different software creative processes that tend to be re-invented over and over again. Just as some software patterns are really variations of one another or, more importantly, workarounds for weaknesses in the implementing language, I believe some processes only exist due to weaknesses in the chosen technology. An example of one of these processes is the ‘spec’ or specification.
Specifically, I’m talking about the technical specifications described in the above link, not functional specifications. I believe technical specifications are a signal of weakness due to the following thought process – a technical spec can be deemed ‘good’ when it is unambiguous and has a relationship with actual implementation nearing 1-to-1. If either of those two facts is untrue, then that means there is some element of interpretation in the spec, which further means that two coders may implement the spec in two different ways, making different assumptions, thus leading to defects.
So let’s assume we have a good spec, one in which we have a close to 1-to-1 mapping between specifications and implementation, and a lack of ambiguity. There are two thoughts I have on this. One is, how did we get here? The other is, how is this any different from code itself? I’ll deal with the second first, just to spice things up.
A good spec really isn’t any different from good pseudocode. In fact, you’ll find a lot of English descriptions of looping behavior, if statements, and the other primitives found in most imperative languages. These can be turned into implementation rather mechanically, so the question arises: why didn’t you just implement the spec in the first place rather than writing the entire thing out in English? What exactly does a spec get you that doing the exact same process in an actual implementation language doesn’t?
The other question is, how do you create a good spec? Well, most likely you write a working draft, then send it out for reviews. Various other designers look over it, read the spec, then send back some comments. Maybe you missed a corner case, maybe you didn’t think of a certain interaction. Over time, the spec is rewritten to be more and more specific and consistent.
Let me know if you disagree with any of the above logic so far, because now I’m going to add a twist. Let’s say, instead of writing a spec, we actually wrote implementation code from the get go, except we used the same processes described above. That is, we iteratively rewrote it, sought frequent feedback from humans or otherwise, and we stopped when there was a 1-to-1 mapping between the problem at hand and the code, along with a complete lack of ambiguity (rather easier to achieve with a programming language).
We just re-invented rapid prototyping. You see, spec writing is what people are forced to do when they work with languages or technologies that don’t allow rapid prototyping. What exactly, when enumerated, would the technologies that do allow it be? I’d venture a guess that things like a REPL and a high level ‘scripting’ interface would be a good subset. Languages like Ruby or Python, or heuristically, anything that doesn’t have a long ‘compilation’ cycle. There are many compiled languages, for example, that turn around quickly enough to allow rapid prototyping, so I’m not JUST saying scripting languages. Languages with more boilerplate, more line noise, and a longer turnaround cycle (e.g., C/C++) lend themselves to writing specs over rapid prototyping.
Rapid prototyping has many other benefits over spec writing. For instance, during spec writing, human reviewers are not only finding high level logic errors, but also syntax problems, type problems, and the like. These are all problems that a high level language will find automatically. The enumeration of ‘what’ could be passed into a specific function, mentioned in the article linked to previously, is a prime example of the kind of type error that Haskell would flag on the spot.
Moreover, spec writing and rapid prototyping aren’t competing ideas. In other words, a spec COULD be a rapidly prototyped code base! If we need to build an embedded system using a subset of C, for example, rapid prototyping would be a rather onerous process compared to our REPL and high level constructs. But no one said the spec couldn’t itself be written in a high level language and then ported to low level C once it was deemed complete. Therefore, rapid prototyping should be used even on projects where a spec is used, and the two should be practically substitutable.
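To make that concrete, here’s a minimal sketch of what an ‘executable spec’ might look like. Everything in it is made up for illustration: the function saturating_add_u8, the 8-bit range and the table of corner cases are my own assumptions, not taken from any real project. The point is only that the reference behavior and its corner cases live in runnable code that a later C port can be checked against.

```python
def saturating_add_u8(a: int, b: int) -> int:
    """Hypothetical reference behaviour: add two unsigned 8-bit values,
    clamping the result at 255 instead of wrapping around."""
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("inputs must fit in an unsigned 8-bit value")
    return min(a + b, 255)


# Corner cases a reviewer would otherwise have to spot in English prose.
SPEC_CASES = [
    ((0, 0), 0),
    ((1, 2), 3),
    ((200, 55), 255),   # lands exactly on the limit
    ((200, 56), 255),   # one past the limit, must clamp
    ((255, 255), 255),  # largest possible inputs
]


def check_spec(impl):
    """Run any candidate implementation against the corner-case table."""
    for args, expected in SPEC_CASES:
        assert impl(*args) == expected, (args, expected)
    return True


if __name__ == "__main__":
    print(check_spec(saturating_add_u8))
```

The same table could later be pointed at a wrapper around the ported C routine, so the ‘spec’ keeps earning its keep after the port.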
Of course, all process patterns like these are just that, patterns, and there really should be no ‘make software using process x’ rules since x may not always apply. Sometimes you need some exploratory programming (very similar to rapid prototyping), or sometimes you may know how to express something in English but not in code (such as in technical fields). Patterns, even process patterns, ought to be seen more as a ‘bag of tricks’ similar to the skills mathematicians build up – tricks to simplify or restate problems in other domains. Too often we’re convinced that our problems are like arithmetic, and there’s just one simple algorithm or process that needs to be exercised again and again, but instead they’re more like problems found in the higher echelons of math. They require insight, tricks, squinting, hand waving, and eventually, a eureka.
IoC without the fluff
Things like dependency injection and IoC are cool, and they do decouple your code and make it easier to extend and test. But the problem is that the software community always tends to reinvent the same solutions to the same problems and then give them a totally different name. I mean, what are the real differences between Haskell Type Classes and the now defunct C++1x ‘concept’… er… concept? (Seriously guys, we’re running out of different synonyms for ‘thing’. We’ve got object, class, type, type class, structure…) Here’s what IoC and DI are and how they interrelate, without using silly words like ‘container’, which in this context is pretty much just another word for ‘thing’. We have a function, let’s say, and I’m going to use Python notation since it’s short and sweet.
x = 0
def func(y):
    return x+y
Everyone can figure out what that function does. But how will it react in a large system? Who the hell knows, it uses a global variable that can be modified at any time, from anywhere. Our function ‘func’ is now coupled to a global variable, as is, by transitivity, anyone who uses it. So we change it to this far more elegant version:
def func(x,y):
    return x+y
Voila! In DI terms, we’ve ‘injected’ the ‘dependency’ of func on the variable x. Now func is only dependent on two locally defined parameter variables, a far more elegant and simple way to do things. Let’s fast forward to OO for a second. Assume we have a class foo…
class bar:
    def __init__(self):
        self.myFoo = foo()
    def func(self, x):
        return x + self.myFoo
This is OO, so it’s nice and decoupled right? Not really. If you unroll the object, thinking of it basically as a poor man’s closure, you realize that func is actually doing something like this (assuming no one changes self.myFoo)
def func(x):
    return foo() + x
Global functions, when they’re pure, are nice to have. But you can still see the problem of the coupling between func and foo, now. If foo is not pure, then we have side effect issues, but moreover, we still have just plain maintainability/extensibility issues with the coupling between the two. The number one use case pointed to by dependency-injection types is that of testing. What if we want to test func, but we don’t want to test foo? Maybe foo takes a long time. Or maybe we haven’t even written foo yet! (Now you can see the connection between TDD and dependency injection, and why the two might change your style of design. You can test top level objects without even implementing the ‘low-level’ ones yet. But that’s tangential.) We can do the same thing we did in the first case, though, to break that coupling:
def func(x, foo_thing):
    return x + foo_thing()
Oh hey, higher order programming. Who’d have thought that would show up here? Let’s roll our stuff back into familiar OO terms:
class bar:
    def __init__(self, foo_thing):
        self.myFoo = foo_thing()
    def func(self, x):
        return self.myFoo + x
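Here’s a rough sketch of the testing payoff mentioned earlier. The names real_foo and fake_foo are hypothetical stand-ins I’ve made up: one for the slow (or not yet written) collaborator, one for the cheap substitute a test would hand in instead.

```python
class bar:
    def __init__(self, foo_thing):
        # The collaborator is handed in rather than hard-wired.
        self.myFoo = foo_thing()

    def func(self, x):
        return self.myFoo + x


def real_foo():
    # Hypothetical stand-in for something slow, or not written yet.
    raise NotImplementedError("foo has not been written yet")


def fake_foo():
    # Cheap, predictable substitute used only by the test.
    return 10


def test_bar_func():
    b = bar(fake_foo)          # inject the fake; real_foo is never called
    assert b.func(5) == 15
    return True


if __name__ == "__main__":
    print(test_bar_func())
```

bar can be exercised before foo even exists, which is exactly the design pressure TDD tends to apply.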
Dependency, consider yourself injected. Ok, fine, what the hell is IoC then? Why does it have to have such a crazy name, and what are these containers, and wah wah wah! Well, let’s go back to our familiar func and make it a little more complex:
x = 0
y = 0
z = 0
a = 0
b = 0
c = 0
def func(var):
    return var + a + b + c + x + y + z
Gross! Global dependencies on like, a bajillion different things! Let’s inject those dependencies!
def func(var, x, y, z, a, b, c):
    return var + a + b + c + x + y + z
Ah, much better. Er, wait… now my buddy, who had code like this:
def func_2(var1, var2):
    return func(var1) + func(var2)
Has had his interface explode to:
def func_2(var1, var2, x, y, z, a, b, c):
    # pass the values in the same order func declares them
    return func(var1, x, y, z, a, b, c) + func(var2, x, y, z, a, b, c)
And his callers have their interfaces explode, and their callers have theirs, etc… But wait, any of you OO gurus notice that a,b,c and x,y,z never seem to make any sense without each other? They always seem passed with each other, they seem semantically related, who knows what else. Why, I’ll bet our design could be more… what’s that nice word elegant designs are described as? Oh yes, cohesive!
class point:
    def __init__(self, x, y, z):
        self.x = x; self.y = y; self.z = z #just leave them as plain-old-data and public
Oh wait, and two points, well, we can call that a…
class distance: #er, maybe not the best name…
    def __init__(self, pt1, pt2):
        self.pt1 = pt1; self.pt2 = pt2
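As a sketch of where this is heading, func and func_2 might end up taking the bundled object instead of six loose values. Which of the two points stands for a, b, c and which for x, y, z is an arbitrary choice of mine here; the sum comes out the same either way.

```python
class point:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z


class distance:
    def __init__(self, pt1, pt2):
        self.pt1 = pt1
        self.pt2 = pt2


def func(var, dist):
    # Six former parameters now travel together inside one object.
    p, q = dist.pt1, dist.pt2
    return var + p.x + p.y + p.z + q.x + q.y + q.z


def func_2(var1, var2, dist):
    # Callers only grow by a single extra parameter.
    return func(var1, dist) + func(var2, dist)


if __name__ == "__main__":
    d = distance(point(1, 2, 3), point(4, 5, 6))
    print(func(0, d), func_2(1, 2, d))
```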
Now our interfaces shrink down, only needing one extra parameter, a ‘distance’, which I’m sure in our func, foo, bar whatever application we’re building, makes good sense semantically and is drawn from our domain. Likewise, let’s say that we’re using dependency injection to make our objects a lot less coupled.
class bar:
    def __init__(self, obj1, obj2, obj3, obj4, obj5):
        self.obj1 = obj1; self.obj2 = obj2; self.obj3 = obj3; self.obj4 = obj4; self.obj5 = obj5
Wow, those dependencies sure are injected all right. But that class has been injected so much, it’s beginning to look like a heroin addict. These objects all seem to have something to do with each other, so why don’t we bundle them up in a cohesive object?
class bar:
    def __init__(self, obj_builder_thingie): #TODO: think of a better name
        self.obj1 = obj_builder_thingie.obj1(); …
Ah, now our design is decoupled and cohesive. Hallmarks of good design. (Of course, simply bundling arbitrary objects together isn’t necessarily cohesive. We’re assuming in this case, whenever we’ve bundled something together, those objects have had something to do with each other semantically or in the domain.) For some reason, whenever we bundle these objects together, our Java buddies (and C# buddies) want to call this ‘Inversion of Control’, mostly because it’s now the caller/creator’s problem of figuring out how it wants to ‘configure’ the innards of our bar class. They sometimes will go even further and make obj_builder_thingie not just a normal object, but a framework for going out and reading XML files to dynamically figure out what obj1() and obj2() should return. That way, instead of editing a nice clean python unit test file to configure and modify your tests, you can edit a huge ugly xml file with lots of angled brackets to configure and modify your tests. Sounds much more enterprisey to me! Job security here I come!
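For what it’s worth, here’s a small sketch of the plain-object version of that builder, with no XML in sight. The names production_builder and test_builder, the two objects they hand out, and the describe method are all hypothetical; they’re only there to show the caller deciding how bar’s innards get configured.

```python
class production_builder:
    # Hypothetical 'real' configuration.
    def obj1(self):
        return "real database"

    def obj2(self):
        return "real mailer"


class test_builder:
    # Hypothetical configuration a unit test would pass in instead.
    def obj1(self):
        return "fake database"

    def obj2(self):
        return "fake mailer"


class bar:
    def __init__(self, obj_builder_thingie):
        # bar asks the bundle for what it needs; the caller chose the bundle.
        self.obj1 = obj_builder_thingie.obj1()
        self.obj2 = obj_builder_thingie.obj2()

    def describe(self):
        return self.obj1 + " + " + self.obj2


if __name__ == "__main__":
    print(bar(production_builder()).describe())
    print(bar(test_builder()).describe())
```

Swapping the whole configuration is a one-argument change at the call site, which is about all the ‘container’ really buys you.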
So, let’s recap. Dependency injection is moving a ‘global’ to a ‘parameter’, which, as far as primitive types and values go, has always seemed like a good idea. But for OO guys, moving global functions (or an object representing those functions) to parameters seems like a brand spankin’ new idea, when really functional guys have been moving global functions to parameters for years. C’est la vie, at least we’re learning something from each other. The specific case bandied about is always how this makes testing easier because you can ‘inject’ which objects you’d like to collaborate with in the constructor (or a setter, etc… but setters are evil). But this solution can be generalized to ‘inject’ any configurable behavior you want in the form of variables, functions or classes themselves in a dynamic language like Python. The side effect is that your parameter lists get huge, and while your code is decoupled, it’s decoupled kind of like a box of Legos with no instructions on how to build your starship. So you use your ole’ object-oriented smarts to start grouping some of those parameters together into objects themselves, and now your ‘injected class’ begins looking through the objects that are passed into it to find what it needs, rather than everything being unrolled on the parameter list. IoC is basically, then, dependency injection (or the voracious removal of anything global) on a large scale, and it must, then, solve the problems (explosive parameter lists) that come with that.
That’s it in a nutshell. Hopefully I haven’t confused you further. Dependency injection is just another step in the good practice of “make things as local as possible”, or decoupling. Parameter lists are more local than global, even for functions like class constructors. IoC is the counterbalance to massive decoupling, and that’s to make the program more ‘cohesive’ again by grouping up many of these new things being passed in parameter lists into objects and hierarchies themselves. Any questions?
Smart Pointers for Dumb Developers
Edouard in his post claims that smart pointers, like shared_ptr and auto_ptr, are overused. While he actually makes quite a few legitimate critiques, I think he misses the fundamental problem that hits nearly all of the computer scientist’s beautiful abstractions – they’re all dangerous if they’re overused. C++’s smart pointers are no exception. I do wish to come to their defense, though, and offer what might seem like obvious guidelines for their use.
Smart pointers are the poor man’s garbage collection. Or, in better terms, they are the average C++ developer’s garbage collection. C++ is a language that is a little bit too ‘expert friendly’, and dynamic allocation, similar to multithreaded programming, is hard(tm). Smart pointers in many cases allow your average developer to turn a horrible bug – memory leaks or null dereferences – into a less heinous inefficiency. That’s a win. They are no complete replacement for object lifecycle analysis though, which is Edouard’s main point. Of course, raw pointers that are intelligently managed will always outperform smart pointers. But what if you don’t have the smartest developers in the world? Or what if you don’t have the time/money to throw your smart developers at making sure you squeeze every last FLOP out of that processor, but instead need them to focus on implementing new features, etc? It’s all economics.
I’d like to offer my own taxonomy to budding developers. Kind of a road map to memory usage in C++ to find a good middle ground between performance and safety.
Your first rule of thumb, and the first place you should go to build any object, is the stack. The stack is well understood, putting objects on the stack is usually pretty cheap, and a good chunk of the time, most objects you need are transient for a few statements and then aren’t needed any more. For a while there, it was simply considered more ‘OO’ to put things on the heap (thank you very much JAVA). Thankfully, those days are gone.
What CAN’T be put on the stack? Well, large objects run the risk of pushing the stack to its limits, since most of the time, the stack is smaller than the heap. In addition, the stack requires a class to be concrete, which makes polymorphism a little more tricky. Finally, again with large objects, passing them by value to functions or methods can be expensive, as their copy constructors are called.
For a large object that you want to put on the heap, enter the scoped pointer. The scoped pointer acts like a stack variable in its construction and cleanup, but is created on the heap. You should also question why you’re putting so many large objects on the heap.
To handle polymorphism, for the most part, you’re going to have to use pointers. An abstract base class pointer can point to a derived class, so it’s a question of what kind of management you want wrapping that pointer. Scoped pointers don’t really help any with polymorphism, but our two friends – one of which is disparaged by Edouard – come to save us. In the case of ‘factory’-like functions, or functions that create an object on the heap and ‘pass off’ ownership to the caller, we’d like to use an auto_ptr, since this type of smart pointer enforces this pattern. In the case where we’re handed a reference to an object but the method we called still wants to hold on to that same place in memory, we usually want a shared_ptr. The original poster is right, though: in many cases, through careful analysis, both of these pointer types can be done away with by simply studying the expected lifecycle of the object underneath. But this is not always possible within the constraints of developing software.
Keep in mind the const reference trick, too. Returning a reference to a stack variable is a bad idea, since the stack variable has already been destroyed by the time the caller sees it. However, the standard states that binding a CONST REFERENCE to a temporary, such as an object returned by value, extends that temporary’s lifetime to the lifetime of the reference, i.e. the enclosing scope. That actually opens up efficient usage of some kinds of polymorphism, since the memory’s allocated on the stack yet we can still get a polymorphic reference to it. I’d use this technique with care, though.
Ultimately, such low level control and the plethora of options available to the developer make memory management in C++ tricky, yet powerful and abstract. There are different use cases for all the potential views of memory, and subtleties to each use. Shared_ptrs can more or less cover almost all these uses, but there’s a cost associated with not doing any real analysis of your memory footprint. For the average developer, though, if you want to make sure they don’t introduce any risks*, giving them a poor man’s garbage collector is probably your best option.
(*Though you still need to peer review their code and check for cycles, the notorious corner case that shared_ptr will still leak with.)
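The cycle problem is easy to see by analogy in Python, which is the language I’ve been using for examples elsewhere. This is only an analogy and rests on an assumption: it relies on CPython’s reference counting behaving roughly like shared_ptr once the cycle collector is switched off, and on weakref standing in for a non-owning back-pointer. The node class and both function names are made up for the demonstration.

```python
import gc
import weakref


class node:
    def __init__(self):
        self.other = None


def strong_cycle_survives():
    """Two objects that own each other outlive their last outside reference."""
    gc.disable()                      # leave only reference counting active
    try:
        a, b = node(), node()
        a.other, b.other = b, a       # each strongly owns the other
        probe = weakref.ref(a)        # lets us observe a without owning it
        del a, b
        leaked = probe() is not None  # counts never reached zero
        gc.collect()                  # the cycle collector cleans up
        return leaked and probe() is None
    finally:
        gc.enable()


def weak_back_reference_is_freed():
    """Making the back-pointer non-owning breaks the cycle."""
    gc.disable()
    try:
        a, b = node(), node()
        a.other = b
        b.other = weakref.ref(a)      # non-owning back-pointer
        probe = weakref.ref(a)
        del a, b
        return probe() is None        # freed by reference counting alone
    finally:
        gc.enable()


if __name__ == "__main__":
    print(strong_cycle_survives(), weak_back_reference_is_freed())
```

C++ has no cycle collector to fall back on, which is why the peer review above still matters.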
The Science of Science
“Metacognition” is a word that basically means thinking about thinking. You take part in metacognition when you actively try to spot patterns in your own behavior that are acting against your goals, and then try to create solutions that work around these problems. It’s self reflection, but also action that takes place due to that reflection. It might be uniquely human.
There is an analogue in society, though, when any sort of social movement begins to use its own doctrine to look back on itself. This is generally restricted, though, and does not include self reflective studies using other means, for example, a political movement using scientific evidence to rethink its position on things. I’d suppose we could call that ‘iterative’ progress. But instead, it’s when you turn a social movement with an intrinsic ability for self reflection back on itself. Say, let’s psychoanalyze Freud, for example. Or, more importantly, let’s apply the scientific method to science itself.
In science, let’s say medical science, we have a hypothesis that X causes B. Now, medical science is one of a few ‘special’ branches of science that are motivated by more than just the search for truth itself. Generally, if X causes B, B is a bad thing and we want to stop it. Knowing that X causes it means that we can have more control over B and reduce suffering. So when we do medical science, we more or less want to do it as ‘best’ we can, to get to the bottom of various diseases and halt human suffering.
So we hypothesize that the current way we do medical science is the best, i.e., that it yields the most truth and that these truths are the most effective at reducing human suffering. If, starting from that hypothesis, a researcher were able to find a pattern in medical science itself that yielded evidence weakening it, that researcher would be doing science on science itself.
But like medical science, whose treatments and drugs improve (by some small degree) with every paper published, does science not itself recursively improve through its own self analysis? By applying the results of science and mathematics (in the case of the above link, network theory) to the processes that produced those results, science becomes STRONGER. While the scientific method as we know it from the Enlightenment is a sound way to find truth about our world, we must always remember that the evidence we produce from any one experiment is only as good as the assumptions that went into it. And there are many assumptions in the scientific method itself, which we learn more about only through self reflection.
Medical science, not to pick on it, is rife with these enlightening moments. They range from discovering that taking a sugar pill makes people feel better (the placebo effect) to realizing that people who follow doctor’s orders tend to be healthier than those who don’t, even if those orders are nonsensical. We have invented the double blind placebo controlled trial to control for the scientists’ own bias as well as humans’ predilection for sugar pills. We’ve invented meta-analysis to momentarily reflect on our own science to try to combine and better glean results from a group of studies. We’ve invented the idea of conflict of interest, the open access journal, and many other tools and techniques that were the result of turning the scientific method back on itself.
You can see the result of this recursive growth in transistors. We’ve built computers, and then only months later, used those very computers to design the next generation. This same recursive growth is possible in science too – the very results mentioned above show that we CAN do science better. Lives are at stake in many cases, and if not lives, then large sums of money. We can only do so much good science, produce so many real results, in a year for a given dollar. We should focus on making these dollars go as far as possible by applying the results of (and funding more of) these sorts of self reflective studies. The more bias we remove from our scientific process, the cheaper science gets for any dollar put in. The cheaper it gets, the more we can do. The more we can do, the more we learn. The more we learn, the more lives we save and the more wealth we create.
And the cycle continues towards singularity. Kurzweil is right, but he never mentions culture. Culture, and the products of human culture, are also capable of exponential growth, and I don’t think we’re anywhere near an inflection point.
Software Engineering Is Dead
Long live Software Engineering!
The exalted Father of many of the processes we know and love/hate, Tom DeMarco, recently wrote an article describing his second thoughts on many of his prescriptions from his book, Controlling Software Projects: Management, Measurement and Estimation.
Much of what he says should ring true to most of us in the trenches. Attempting to code directly to software metrics is a fool’s errand. Not only do the current methods of collecting metrics frequently have high costs relative to their true value, but they also cause a game of metric cat and mouse where software increasingly fits what metrics say ‘looks good’ but loses all qualities associated with those metrics when developers begin coding ‘to’ the metric. It’s basically teaching to the test!
It’s refreshing to see such an intellectual giant in our field so humbly admit his faults in the past – surely many more could learn from him. We’ve had far too many methodologies come and go while their authors seem to continue to pretend that the few good ideas encased within those methodologies are worth all the cruft that’s built up over the years. Just like in many cases what a software project needs is a complete gut-job or rewrite, some methodologies could use the same.
Let’s not throw out the baby with the bathwater though. I’m convinced that nearly every ‘fad’ that’s appeared in the field of software HAS had some valuable things to teach us. It’s all about keeping the good and dismissing the bad, separating the diamonds from the rough. The Waterfall method as a whole has been shown to create projects drastically over schedule and over budget, but it has also shown that many of our most expensive errors come from misunderstanding requirements/use cases. Likewise, coding to metrics in and of themselves is going to get us nowhere. It’s like writing until Word’s language analyzer gives me back a high ‘reading level’ for my paper. Word can’t analyze the content or tell whether I used the English language correctly. But, all other things being equal, these metrics, when combined with sound human judgement, can make our jobs easier.
If I’m brought on to a failing project as a firefighter, I want to know where the flames most likely are. I can either slowly and methodically scan over the code with my own eyes – most likely spending weeks chasing false positives or style issues (even the most disciplined of ‘real’ developers is going to confuse style with content every now and then) – or I can use a host of automated metric gathering tools to give me hints on where to start.
A few object oriented metrics, like various definitions of coupling and cohesion, probably would give me a good clue on where to start refactoring for maintainability. Some good old basic metrics like line count per function or cyclomatic complexity may give me some good hints on where I ought to start tearing apart some unreadable or stagnant code. Simple metrics like test coverage and test count would give me an eyeball figure of how brittle I should expect this code to be.
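As a taste of how cheap such hints can be, here’s a minimal sketch that uses Python’s standard ast module to report line count and a rough branch-based complexity per function. Treat the complexity figure as an approximation of my own (one plus the number of branching constructs), not a faithful cyclomatic complexity; the function_metrics name and the SAMPLE source are made up for the example, and end_lineno needs Python 3.8 or later.

```python
import ast

# Constructs counted as a 'branch' in this rough approximation.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.IfExp, ast.BoolOp)


def function_metrics(source):
    """Return {function name: (line count, rough complexity)} for source."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            lines = node.end_lineno - node.lineno + 1
            branches = sum(isinstance(child, BRANCH_NODES)
                           for child in ast.walk(node))
            metrics[node.name] = (lines, 1 + branches)
    return metrics


SAMPLE = '''
def tidy(x):
    return x + 1

def tangled(x, y):
    if x > 0 and y > 0:
        for i in range(x):
            if i % 2:
                y += i
    elif x < 0:
        y = -y
    return y
'''

if __name__ == "__main__":
    # Worst offenders first: a hint on where to look, not a verdict.
    ranked = sorted(function_metrics(SAMPLE).items(),
                    key=lambda item: item[1][1], reverse=True)
    for name, (lines, complexity) in ranked:
        print(name, lines, complexity)
```

The numbers only say where to look first; deciding whether tangled is actually a problem is still a job for a human.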
The point is, metrics in and of themselves say nothing about software quality. When interpreted by a skillful developer, they make that developer that much more useful and productive. As I’ve mentioned before, Amdahl’s law says as much about software projects as it does about software itself: any particular ‘solution’ a software product is to deliver has an optimal number of different ‘threads’ of production, and any developers working on the project beyond that number will actually slow it down due to communication overhead. So, assuming we already have an optimal number of developers, the only way we can speed up projects is to increase the speed of each individual developer rather than adding more.
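A back-of-the-envelope sketch of that argument follows. amdahl_speedup is the usual textbook form of Amdahl’s law; the communication term in project_time (a fixed cost for every pair of developers) is purely an assumption of mine to model the overhead, and the numbers in the usage lines are invented.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Textbook Amdahl's law: speedup from spreading the parallel part."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def project_time(parallel_fraction, workers, overhead_per_pair):
    """Relative project time with an assumed cost per pair of developers."""
    serial = 1.0 - parallel_fraction
    parallel = parallel_fraction / workers
    pairs = workers * (workers - 1) / 2   # everyone talks to everyone
    return serial + parallel + overhead_per_pair * pairs


def best_team_size(parallel_fraction, overhead_per_pair, max_workers):
    """Team size that minimises project_time under this toy model."""
    return min(range(1, max_workers + 1),
               key=lambda n: project_time(parallel_fraction, n,
                                          overhead_per_pair))


if __name__ == "__main__":
    print(amdahl_speedup(0.9, 10))
    print(best_team_size(0.9, 0.01, 20))
```

Under this toy model, once the team passes its best size every extra developer makes the project slower, so the only lever left is making each developer faster.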
But this is the rub, isn’t it? In fact, hasn’t this always been the rub? No one ever said metrics were all you needed; it was in fact a misinterpretation by PHBs (pointy-haired bosses) that we could automate and more easily manage software development with these metrics. A hammer will allow a craftsman to do a lot more work than his hands alone, but it will also allow someone unskilled to do a lot more damage. While the point that software is a people problem, not an engineering problem, is beyond the scope of this blog post, perhaps DeMarco’s later focus on the people problem, and now his de-emphasis of the ‘engineering’ problem, will help the community lurch (ever so slowly) towards really understanding how to build software.