IoC without the fluff
Things like dependency injection and IoC are cool, and they do decouple your code and make it easier to extend and test. But the problem is that the software community always tends to reinvent the same solutions to the same problems and then give them a totally different name. I mean, what are the real differences between Haskell type classes and the now-defunct C++1x ‘concept’… er… concept? (Seriously guys, we’re running out of different synonyms for ‘thing’. We’ve got object, class, type, type class, structure…) Here’s what IoC and DI are and how they interrelate, without using silly words like ‘container’, which in this context is pretty much just another word for ‘thing’. We have a function, let’s say, and I’m going to use Python notation since it’s short and sweet.
x = 0
def func(y):
    return x+y
Everyone can figure out what that function does. But how will it react in a large system? Who the hell knows, it uses a global variable that can be modified at any time, from anywhere. Our function ‘func’ is now coupled to a global variable, as is, by associativity (or is it commutativity?) anyone who uses it. So we change it to this far more elegant version:
def func(x,y):
    return x+y
Voila! In DI terms, we’ve ‘injected’ the ‘dependency’ of func on the variable x. Now func is only dependent on two locally defined parameter variables, a far more elegant and simple way to do things. Let’s fast forward to OO for a second. Assume we have a class foo…
class bar:
    def __init__(self):
        self.myFoo = foo()
    def func(self, x):
        return self.myFoo + x
This is OO, so it’s nice and decoupled right? Not really. If you unroll the object, thinking of it basically as a poor man’s closure, you realize that func is actually doing something like this (assuming no one changes self.myFoo)
def func(x):
    return foo() + x
Global functions, when they’re pure, are nice to have. But you can still see the problem of the coupling between func and foo now. If foo is not pure, then we have side effect issues, but moreover, we still have just plain maintainability/extensibility issues with the coupling between the two. The number one use case pointed to by dependency-injection types is that of testing. What if we want to test func, but we don’t want to test foo? Maybe foo takes a long time. Or maybe we haven’t even written foo yet! (Now you can see the connection between TDD, dependency injection, and why the two might change your style of design. You can test top-level objects without even implementing the ‘low-level’ ones yet. But that’s tangential.) We can do the same thing we did in the first case, though, to break that coupling:
def func(x, foo_thing):
    return x + foo_thing()
Oh hey, higher order programming. Who’d have thought that would show up here? Let’s roll our stuff back into familiar OO terms:
class bar:
    def __init__(self, foo_thing):
        self.myFoo = foo_thing()
    def func(self, x):
        return self.myFoo + x
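To make the testing payoff concrete, here is a minimal sketch of the same class being exercised with a stand-in. The names slow_foo and fake_foo are made up for illustration, and I’ve used Bar and my_foo instead of bar and myFoo only to follow the usual Python naming style.

```python
class Bar:
    def __init__(self, foo_thing):
        # foo_thing is any zero-argument callable; it is called once, here.
        self.my_foo = foo_thing()

    def func(self, x):
        return self.my_foo + x


def slow_foo():
    # Hypothetical stand-in for a dependency that is slow or not written yet.
    raise NotImplementedError("foo has not been written yet")


def fake_foo():
    # Hypothetical test double: cheap, predictable, no side effects.
    return 10


# The test never touches slow_foo; it injects the fake instead.
bar = Bar(fake_foo)
print(bar.func(5))  # prints 15
```

Production code would pass the real thing in exactly the same way; Bar cannot tell the difference.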
Dependency, consider yourself injected. Ok, fine, what the hell is IoC then? Why does it have to have such a crazy name, and what are these containers, and wah wah wah! Well, let’s go back to our familiar func and make it a little more complex:
x = 0
y = 0
z = 0
a = 0
b = 0
c = 0
def func(var):
    return var + a + b + c + x + y + z
Gross! Global dependencies on like, a bajillion different things! Let’s inject those dependencies!
def func(var, x, y, z, a, b, c):
    return var + a + b + c + x + y + z
Ah, much better. Er, wait… now my buddy, who had code like this:
def func_2(var1, var2):
    return func(var1) + func(var2)
Has had his interface explode to:
def func_2(var1, var2, x, y, z, a, b, c):
    return func(var1, x, y, z, a, b, c) + func(var2, x, y, z, a, b, c)
And his callers have their interfaces explode, and their callers have theirs, etc… But wait, any of you OO gurus notice that a,b,c and x,y,z never seem to make any sense without each other? They always seem passed with each other, they seem semantically related, who knows what else. Why, I’ll bet our design could be more… what’s that nice word elegant designs are described as? Oh yes, cohesive!
class point:
    def __init__(self, x, y, z):
        self.x = x; self.y = y; self.z = z #just leave them as plain-old-data and public
Oh wait, and two points, well, we can call that a…
class distance: #er, maybe not the best name…
    def __init__(self, pt1, pt2):
        self.pt1 = pt1; self.pt2 = pt2
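With those two classes in hand, the earlier functions might end up looking something like the sketch below. I’m assuming here that x, y, z belong to the first point and a, b, c to the second, which the original globals never actually said; the class names are capitalized only for Python style.

```python
class Point:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z


class Distance:
    def __init__(self, pt1, pt2):
        self.pt1 = pt1
        self.pt2 = pt2


def func(var, dist):
    # Assumption: pt1 carries the old x, y, z and pt2 the old a, b, c.
    p, q = dist.pt1, dist.pt2
    return var + q.x + q.y + q.z + p.x + p.y + p.z


def func_2(var1, var2, dist):
    # One extra parameter instead of six.
    return func(var1, dist) + func(var2, dist)


dist = Distance(Point(1, 2, 3), Point(4, 5, 6))
print(func_2(10, 20, dist))  # prints 72
```

The callers of func_2, and their callers, now pass along a single object rather than a growing tail of loose values.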
Now our interfaces shrink down, only needing one extra parameter, a ‘distance’, which I’m sure in our func, foo, bar, whatever application we’re building, makes good sense semantically and is drawn from our domain. Likewise, let’s say that we’re using dependency injection to make our objects a lot less coupled.
class bar:
    def __init__(self, obj1, obj2, obj3, obj4, obj5):
        self.obj1 = obj1; self.obj2 = obj2; self.obj3 = obj3; self.obj4 = obj4; self.obj5 = obj5
Wow, those dependencies sure are injected all right. But that class has been injected so much, it’s beginning to look like a heroin addict. These objects all seem to have something to do with each other, so why don’t we bundle them up in a cohesive object?
class bar:
    def __init__(self, obj_builder_thingie): #TODO: think of a better name
        self.obj1 = obj_builder_thingie.obj1(); …
Ah, now our design is decoupled and cohesive. Hallmarks of good design. (Of course, simply bundling arbitrary objects together isn’t necessarily cohesive. We’re assuming in this case that whenever we’ve bundled something together, those objects have had something to do with each other semantically or in the domain.) For some reason, whenever we bundle these objects together, our Java buddies (and C# buddies) want to call this ‘Inversion of Control’, mostly because it’s now the caller/creator’s problem to figure out how it wants to ‘configure’ the innards of our bar class. They sometimes will go even further and make obj_builder_thingie not just a normal object, but a framework for going out and reading XML files to dynamically figure out what obj1() and obj2() should return. That way, instead of editing a nice clean Python unit test file to configure and modify your tests, you can edit a huge ugly XML file with lots of angle brackets to configure and modify your tests. Sounds much more enterprisey to me! Job security here I come!
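For the record, the plain-Python version of that ‘configuration’ can be as small as the sketch below. ProductionBuilder, TestBuilder, describe and the strings they return are all invented for illustration; the only thing taken from the example above is that the bundled object exposes obj1() and obj2().

```python
class Bar:
    def __init__(self, builder):
        # Bar asks the bundled object for its collaborators
        # instead of taking each one as a separate parameter.
        self.obj1 = builder.obj1()
        self.obj2 = builder.obj2()

    def describe(self):
        return f"{self.obj1} + {self.obj2}"


class ProductionBuilder:
    # Hypothetical 'real' configuration.
    def obj1(self):
        return "real database"

    def obj2(self):
        return "real mailer"


class TestBuilder:
    # Hypothetical test configuration: same shape, cheap stand-ins.
    def obj1(self):
        return "in-memory database"

    def obj2(self):
        return "fake mailer"


print(Bar(ProductionBuilder()).describe())  # prints real database + real mailer
print(Bar(TestBuilder()).describe())        # prints in-memory database + fake mailer
```

Swapping the whole set of collaborators is one line in the caller, with no angle brackets involved.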
So, let’s recap. Dependency injection is moving a ‘global’ to a ‘parameter’, which, as far as primitive types and values go, has always seemed like a good idea. But for OO guys, moving global functions (or an object representing those functions) seems like a brand spankin’ new idea, when really functional guys have been moving global functions to parameters for years. C’est la vie, at least we’re learning something from each other. The specific case bandied about is always how this makes testing easier because you can ‘inject’ which objects you’d like to collaborate with in the constructor (or a setter, etc… but setters are evil.) But this solution can be generalized to ‘inject’ any configurable behavior you want in the form of variables, functions or classes themselves in a dynamic language like Python. The side effect is your parameter lists get huge, and while your code is decoupled, it’s decoupled kind of like a box of Legos with no instructions on how to build your starship. So you use your ole’ object-oriented smarts to start grouping some of those parameters together into objects themselves, and now your ‘injected class’ begins looking through the objects that are passed into it to find what it needs, rather than everything being unrolled on the parameter list. IoC is basically, then, dependency injection (or the voracious removal of anything global) on a large scale and must, then, solve the problems (explosive parameter lists) that come with that.
That’s it in a nutshell. Hopefully I haven’t confused you further. Dependency injection is just another step in the good practice of “make things as local as possible”, or decoupling. Parameter lists are more local than global, even for functions like class constructors. IoC is the counterbalance to massive decoupling, and that’s to make the program more ‘cohesive’ again by grouping up many of these new things being passed in parameter lists into objects and hierarchies themselves. Any questions?
Smart Pointers for Dumb Developers
Edouard in his post claims that smart pointers, like shared_ptr and auto_ptr, are overused. While he actually makes quite a few legitimate critiques, I think he misses the fundamental problem that hits nearly all of the computer scientist’s beautiful abstractions – they’re all dangerous if they’re overused. C++’s smart pointers are no exception. I do wish to come to their defense, though, and offer what might seem like obvious guidelines for their use.
Smart pointers are the poor man’s garbage collection. Or, in better terms, they are the average C++ developer’s garbage collection. C++ is a language that is a little bit too ‘expert friendly’, and dynamic allocation, similar to multithreaded programming, is hard(tm). Smart pointers in many cases allow your average developer to turn a horrible bug – memory leaks or null dereferences – into a less heinous inefficiency. That’s a win. They are no complete replacement for object lifecycle analysis though, which is Edouard’s main point. Of course, raw pointers that are intelligently managed will always outperform smart pointers. But what if you don’t have the smartest developers in the world? Or what if you don’t have the time/money to throw your smart developers at making sure you squeeze every last FLOP out of that processor, but instead need them to focus on implementing new features, etc? It’s all economics.
I’d like to offer my own taxonomy to budding developers. Kind of a road map to memory usage in C++ to find a good middle ground between performance and safety.
Your first rule of thumb, and the first place you should go to build any object, is the stack. The stack is well understood, putting objects on the stack is usually pretty cheap, and a good chunk of the time, most objects you need are transient for a few statements and then aren’t needed any more. For a while there, it was simply considered more ‘OO’ to put things on the heap (thank you very much, Java). Thankfully, those days are gone.
What CAN’T be put on the stack? Well, large objects run the risk of pushing the stack to its limits, since most of the time, the stack is smaller than the heap. In addition, the stack requires a class to be concrete, which makes polymorphism a little more tricky. Finally, again with large objects, passing them by value to functions or methods can be expensive, as their copy constructors are called.
For a large object that you want to put on the heap, enter the scoped pointer. The scoped pointer acts like a stack variable in its construction and cleanup, but is created on the heap. You should also question why you’re putting so many large objects on the heap.
To handle polymorphism, for the most part, you’re going to have to use pointers. An abstract base class pointer can point to a derived class, so it’s a question of what kind of management you want wrapping that pointer. Scoped pointers don’t really help any with polymorphism, but our two friends – one of which is disparaged by Edouard – come to save us. In the case of ‘factory’-like functions, or functions that create an object on the heap and ‘pass off’ ownership to the caller, we’d like to use an auto_ptr, since this type of smart pointer enforces this pattern. In the case where we’re getting a reference to an object, but the method called still wants to refer to a shared place in memory, then we usually want a shared_ptr. The original poster is right, though: in many cases, through careful analysis, both of these pointer types can be done away with by simply studying the expected lifecycles of the object underneath. But this is not always possible within the constraints of developing software.
Keep in mind the const reference trick, too. Returning a reference to a stack variable is a bad idea, since the variable is destroyed when the function returns. However, the standard states that binding a temporary (such as an object returned by value) to a CONST REFERENCE extends the temporary’s lifetime to that of the reference, i.e. to the end of the enclosing scope. That actually opens up efficient use of polymorphism in some cases, since the memory’s allocated on the stack yet we can still get a polymorphic reference to it. I’d use this technique with care, though.
Ultimately, such low-level control and the plethora of options available to the developer make memory management in C++ tricky, yet powerful and abstract. There are different use cases for all the potential views of memory, and subtleties to each use. Shared_ptrs can more or less cover almost all these uses, but there’s a cost associated with not doing any real analysis of your memory footprint. For the average developer, though, if you want to make sure they don’t introduce any risks, giving them a poor man’s garbage collector is probably your best option.
(*Though you still need to peer review their code and check for cycles, the notorious corner case that shared_ptr will still leak with.)
The Science of Science
“Metacognition” is a word that basically means thinking about thinking. You take part in metacognition when you actively try to spot patterns in your own behavior that are acting against your goals, and then try to create solutions that work around these problems. It’s self reflection, but also action that takes place due to that reflection. It might be uniquely human.
There is an analogue in society, though, when any sort of social movement begins to use its own doctrine to look back on itself. This is generally restricted, though, and does not include self reflective studies using other means, for example, a political movement using scientific evidence to rethink its position on things. I’d suppose we could call that ‘iterative’ progress. Instead, it’s when you turn a social movement with an intrinsic ability for self reflection back on itself. Say, let’s psychoanalyze Freud, for example. Or, more importantly, let’s apply the scientific method to science itself.
In science, let’s say medical science, we have a hypothesis that X causes B. Now, medical science is one of a few ‘special’ branches of science that are motivated by more than just the search for truth itself. Generally, if X causes B, B is a bad thing and we want to stop it. Knowing that X causes it means that we can have more control over B and reduce suffering. So when we do medical science, we more or less want to do it as ‘best’ we can, to get to the bottom of various diseases and halt human suffering.
So we hypothesize that the current way we do medical science is the best, i.e., that it yields the most truth and that these truths are the most effective to reduce human suffering. If, through that hypothesis, a researcher were to be able to find a pattern in medical science itself that yielded weakening evidence, that researcher would be doing science on science itself.
But like medical science, whose treatments and drugs improve (by some small degree) with every paper published, does science not itself recursively improve through its own self analysis? By applying the results of science and mathematics (in the case of the above link, network theory) to the processes that produced those results, science becomes STRONGER due to it. The scientific method as we know it from the Enlightenment is a sound way to find truth about our world, but we must always remember that the evidence we produce from any one experiment is only as good as the assumptions that went into it. And there are many assumptions in the scientific method itself, which we learn more about only through self reflection.
Medical science, not to pick on it, is rife with these enlightening moments. From discovering that taking a sugar pill makes people feel better (the placebo effect), to realizing that people who follow doctor’s orders tend to be healthier than those who don’t, even if those doctor’s orders are nonsensical. We have invented the double-blind placebo-controlled trial to control for the scientists’ own bias as well as humans’ predilection for sugar pills. We’ve invented meta-analysis to momentarily reflect on our own science to try and combine and better glean results from a group of studies. We’ve invented the idea of conflict of interest, the open access journal as well as many other tools and techniques that were the result of turning the scientific method back on itself.
You can see the result of this recursive growth in transistors. We’ve built computers, and then only months later, used those very computers to design the next generation. This same recursive growth is possible in science too – the very results mentioned above show that we CAN do science better. Lives are at stake in many cases, and if not lives, then large sums of money. We can only do so much good science, produce so many real results, in a year for a given dollar. We should focus on making these dollars go as far as possible by applying the results of (and funding more of) these sorts of self reflective studies. The more bias we remove from our scientific process, the cheaper science gets for any dollar put in. The cheaper it gets, the more we can do. The more we can do, the more we learn. The more we learn, the more lives we save and the more wealth we create.
And the cycle continues towards singularity. Kurzweil is right, but he never mentions culture. Culture, and the products of human culture, are also capable of exponential growth, and I don’t think we’re anywhere near an inflection point.
Software Engineering Is Dead
Long live Software Engineering!
The exalted Father of many of the processes we know and lovehate, Tom DeMarco, recently wrote an article describing his second thoughts on many of his prescriptions from his book, Controlling Software Projects: Management, Measurement and Estimation.
Much of what he says should ring true to most of us in the trenches. Attempting to code directly to software metrics is a fool’s errand. Not only do the current methods of collecting metrics frequently have high costs relative to their true value, but they also cause a game of metric cat and mouse where software increasingly fits what metrics say ‘looks good’ but loses all qualities associated with those metrics when developers begin coding ‘to’ the metric. It’s basically teaching to the test!
It’s refreshing to see such an intellectual giant in our field so humbly admit his faults in the past – surely many more could learn from him. We’ve had far too many methodologies come and go while their authors seem to continue to pretend that the few good ideas encased within those methodologies are worth all the cruft that’s built up over the years. Just like in many cases what a software project needs is a complete gut-job or rewrite, some methodologies could use the same.
Let’s not throw out the baby with the bathwater though. I’m convinced that nearly every ‘fad’ that’s appeared in the field of software HAS had some valuable things to teach us. It’s all about keeping the good and dismissing the bad, separating diamonds from the rough. The Waterfall method as a whole has been shown to create projects drastically over schedule and over budget, but it has shown that many of our most expensive errors come from misunderstanding requirements/use cases. Likewise, coding to metrics in and of themselves is going to get us nowhere. It’s like writing until Word’s language analyzer gives me back a high ‘reading level’ for my paper. Word can’t analyze the content or tell whether I used the English language correctly. But, all other things being equal, these metrics – when combined with sound human judgement – can make our jobs easier.
If I’m brought on to a failing project as a firefighter, I want to know where the flames most likely are. I can either slowly and methodically scan over the code with my own eyes – most likely spending weeks chasing false positives or style issues (even the most disciplined of ‘real’ developers is going to confuse style with content every now and then), or I can use a host of automated metric gathering tools to give me hints on where to start.
A few object oriented metrics, like various definitions of coupling and cohesion, probably would give me a good clue on where to start refactoring for maintainability. Some good old basic metrics like line count per function or cyclomatic complexity may give me some good hints on where I ought to start tearing apart some unreadable or stagnant code. Simple metrics like test coverage and test count would give me an eyeball figure of how brittle I should expect this code to be.
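As a rough illustration of how cheap such hints can be to gather, here is a small Python sketch that reports line count per function plus a crude branch count. It is not a real cyclomatic complexity tool: the choice of which nodes count as branches is my own approximation, function_metrics and SAMPLE are made-up names, nested functions get counted toward their parents, and it assumes Python 3.8 or later for end_lineno.

```python
import ast

# Node types treated as decision points; an approximation, not a standard.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)


def function_metrics(source):
    """Return {function_name: (line_count, rough_complexity)}."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            lines = node.end_lineno - node.lineno + 1
            branches = sum(isinstance(child, BRANCH_NODES)
                           for child in ast.walk(node))
            metrics[node.name] = (lines, branches + 1)
    return metrics


SAMPLE = '''
def simple(a):
    return a + 1

def tangled(a, b):
    if a and b:
        for i in range(a):
            if i % 2:
                b += i
    return b
'''

# Worst offenders first: a hint on where to look, not a verdict.
for name, (lines, score) in sorted(function_metrics(SAMPLE).items(),
                                   key=lambda item: -item[1][1]):
    print(name, lines, score)
```

The numbers only say where to point your eyes first; deciding whether ‘tangled’ is actually a problem is still the developer’s job.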
The point is, metrics in and of themselves say nothing about software quality. When interpreted by a skillful developer, they make that developer that much more useful and productive. Since, as I’ve mentioned before, Amdahl’s law says as much about software projects as it does about software itself (namely, that any particular ‘solution’ a software product is meant to provide has an optimal number of different ‘threads’ of production, and any developers working on the project beyond that number will actually slow it down due to communication overhead), assuming we have an optimal number of developers, the only way we can speed up projects is to actually increase the speed of each individual developer rather than adding more.
But this is the rub, isn’t it? In fact, hasn’t this always been the rub? No one ever said metrics were all you needed, but it was in fact a misinterpretation by PHBs (pointy-haired bosses) that we could automate and more easily manage software development with these metrics. A hammer will allow a craftsman to do a lot more work than his hands alone, but also allow someone unskilled to do a lot more damage. While the point that software is a people problem, not an engineering problem, is beyond the scope of this blog post, perhaps DeMarco’s later focus on the people problem and now his de-emphasis of the ‘engineering’ problem will help the community lurch (ever so slowly) towards really understanding how to build software.
When do I abstract?
“Premature Flexibilization is the Root of whatever evil is left” takes a look at what I’ve also heard called ‘premature abstraction’, which, in today’s high-level languages, is probably on par with premature optimization as a cause of headaches in maintenance code.
It’s also a really funny title.
First, a note about culture. I work mostly with EEs (electrical engineers) who cut their teeth on, and after all these years still swear by, Fortran. I’ve mentioned the odd distrust between EEs and computer science majors: engineers think scientists don’t know how to get real work done, while the scientists don’t believe engineers understand any of the complexity of what they build. Ever find code that attempts to match a string by going through each character one by one against a switch statement? That’s an engineer at work. Ever work with an AbstractBuilderFactoryObserver that’s templated on five types? That’s a scientist at work. So as not to confuse these ‘scientists’ with real scientists, I’m going to call them designers. Engineers thrive on seeing things work, designers thrive on designing them. A designer’s work is done when she’s figured out how it should work; there’s no thrill in actually building it. An engineer’s work is done when it’s built and he can test it. He wants to see the capabilities of the thing when he’s done.
You can see that each of the major sins discussed so far, premature optimization and premature abstraction, also tends to be committed more by one community than the other. Code that’s prematurely optimized tends to come from the engineer’s world, while prematurely abstracted code tends to come from the designer. They each find different reasons to defend what they should know to be wrong. Frequently I’ll hear an engineer friend tell me that he has to code in such and such a way because he’s targeting an embedded system. I’ll ask him for profile evidence, or try and reason with him that he’s avoiding a temporary variable that doesn’t appear to be in a tight loop. I’ll try and say that the compiler’s optimization will catch what he’s trying to do. To no avail. For the most part, he doesn’t trust compiler optimizations.
The designer will claim that she’s building with maintainability in mind, and that the alternative is cowboy coding. She’s being rigorous, planning for change. You might ask, “What are the use cases for this abstraction?” which usually yields something like “well, maybe we want to target a completely different OS.” That’s true, maybe we DO want to target a different OS, but that’s not a change most projects take lightly. More often than not, it’s some business rule that is going to change instead. It’s an abstraction that’s going to cost developer time and energy for everyone who touches it, and most likely never have any payoff.
We all know (or we should) when to optimize. At the design phase, we should be considering and analyzing use cases, choosing efficient data structures and algorithms. We should draw on our experience in the field to predict certain hot spots, but not rely on this experience too much (in our rapidly changing field, experience has a pretty short shelf life). We should be aware of free optimizations, or those that don’t sacrifice readability, when coding. We should design with performance in mind, but realize that performance mostly comes from reducing work, rather than working more cleverly than normal. When we finally have something built, we should stress test it to find hot spots, and then we should begin working on our ‘clever’ approaches.
But it’s not widely taught when to abstract. This is because, for our happy-go-lucky computer scientists coming out of their Java schools, abstraction is still pretty fresh in everyone’s mind. It took us about fifty years of being burned by prematurely optimized code to realize we should code first for readability and optimize later. It is only now, with our gigantic frameworks of frameworks and systems of systems, that we’re realizing we’re creating huge spider web messes of code. Just look at the Eclipse framework – quite a marvel of design. Anyone can write a plug-in and it WILL work, and you CAN extend or change virtually any part of it, but I’d claim that 90% of the parts of code that CAN change DON’T in any extension. Yet each part of code that is modifiable exacts a cost on the maintenance of the system.
So when should we abstract? When should we decide to use polymorphism or some other means to put a ‘point of inflection’ in our code, a point where we can extend it? I suspect we can derive some simple rules (indeed, many already have such as KISS and YAGNI) based more or less on the lessons of premature optimization.
First, in the design phase, for optimization, we take two things into account – we pick good data structures and algorithms that fit the expected usage and we let experience be our guide (to an extent). How would this apply to abstraction? To me, it says that our first pass of abstraction, or that which takes place during the design phase, should attempt to model the points of inflection that exist in the domain model itself. If the domain model actually contains the classic ‘animal’ base class, with cats and dogs being subsets, then that polymorphism certainly belongs in the code as well. How can experience guide us? This is where patterns come in, and a designer who’s solved similar problems before may be able to offer some insight as to what he or she expects to change in the future. BUT, just as experience has a short shelf life in optimization, it does as well in abstraction. Design pattern Hell is the result of searching a little too hard for these, and ultimately, experience tells us how to build the last system, not the next one.
The second point to learn from optimization is the “free optimizations”. These include compiler optimizations, and little tips like “pass big structures by reference” and what not. These are the idioms in the language, in addition to being the more efficient way to do things. What “free abstractions” can we expect from our languages? Well, first and foremost, are the abstractions the languages can get us – these are things like libraries and frameworks that are already built. Prefer these to home-rolled solutions, as they’ve already been abstracted in many useful ways without any work on your part. Secondly are the “best practices” and idioms that improve abstraction with very little cost to readability or understandability. This would include popular techniques like dependency injection and preferring to pass by interface rather than concrete type.
The third point is that we should design with performance in mind, but the best performance boost is work avoided rather than work done cleverly. In abstraction terms, this means that small, elegant solutions tend to, ironically, be more extensible than large abstract ones. The best example of this might be the object hierarchy mentioned above, our ‘animal’ base class and ‘cat’ and ‘dog’ concrete classes. Well, to be extensible, we might need to have a ‘multithreaded_cat’ and ‘multithreaded_dog’ class as well, or maybe even a ‘distributed_cat’ and… But all of this avoids the original point. We can avoid all this work in the first place by ensuring neither cat nor dog make any assumptions on the threading scheme. That way, one hierarchy describes threading and one describes our domain and they never need meet. We’ve avoided a whole lot of abstraction (an exponential explosion of classes) by simply designing elegantly in the first place.
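A minimal Python sketch of keeping those two concerns apart, using the animal example above. The speak method and the speak_all helper are invented for illustration; the only real point is that nothing in Cat or Dog knows how it will be scheduled.

```python
from concurrent.futures import ThreadPoolExecutor


class Animal:
    def speak(self):
        raise NotImplementedError


class Cat(Animal):
    # No locks, no threads, no assumptions about who calls this or from where.
    def speak(self):
        return "meow"


class Dog(Animal):
    def speak(self):
        return "woof"


def speak_all(animals, max_workers=2):
    # The threading scheme lives here, entirely outside the domain classes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input.
        return list(pool.map(lambda animal: animal.speak(), animals))


print(speak_all([Cat(), Dog(), Cat()]))  # prints ['meow', 'woof', 'meow']
```

Swapping the thread pool for a process pool, or for a plain loop, touches speak_all and nothing else, so there is never a reason to write a multithreaded_cat.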
Finally, to optimize, we measure and begin offering up some more fine-tuned ‘clever’ code where we find our hot spots are. Similarly, in abstract code, we must refactor at our points of repetition. You can get some abstraction done at design time, but much of your abstraction is only going to become apparent as you code. You should be on constant guard for repetitive code, as this is a sure sign that the code should be encapsulated elsewhere. Similar to the optimizations that mostly sacrifice readability, the abstractions found purely by refactoring tend to be the ones that are most at risk for ‘leaking’, or sacrificing understandability for a point of inflection. But, so long as we have proof that the code is used in multiple places, it will save us work in the long run. It is these sorts of abstractions that can only be ‘discovered’, rather than designed in, because we are so poor at predicting what code is going to repeat itself as we develop our system.
Optimization and abstraction are both important parts of building performant, maintainable systems, but we poor humans have shown ourselves to be poor predictors of where the tools of optimization and abstraction are best leveraged. But we needn’t take another 50 years to learn how to best abstract.