The Skeptical Methodologist

Software, Rants and Management

Better Python: Abstract Factory in Python

Frequently, design patterns are imported between languages with little thought to how the new language would better express them.  Witness Google’s C++ unit testing framework, which rather blindly follows JUnit even though Boost’s own unit testing framework, leveraging a few naughty macros and a lot of templating, reinvents unit testing in a more succinct and more C++ way.  Or, along with testing, you can look at others’ Mock Object frameworks, which mostly follow the Java tradition, versus Mock Objects using templates as proposed in this blog.

Another example would be the Abstract Factory, one of the more intimidating patterns, allowing you to write a GUI framework that somehow runs on both Windows and Macs (at least, that was in the brochure).  It provides a single point of interface to an entire collection of factories that produce new objects.  If you want a Windows Button, you ask the Windows factory.  If you are on a Mac, you get a Mac Button.  The point is, both of these Abstract Factories meet a required interface, and each of the functions on that interface returns another abstract type a la the factory method pattern.  To put it in more modern parlance, it’s Dependency Injection, except on an insanely wide scale.

If we were trying to code Java in Python, we might come up with something like the following:

    class GUI:
        def MakeButton(self):
            pass

        def MakeWindow(self):
            pass

    class MacGUI(GUI):
        def MakeButton(self):
            return MacButton()

        def MakeWindow(self):
            return MacWindow()

And so on.  Then, to use the GUI abstract factory, we’d simply pass an instantiation of the factory itself to whoever needed to do GUI stuff.

    def CreateChart(GUIType, *args, **kwargs):
        GUIType.MakeButton()
        GUIType.MakeWindow()

You get the point.  This is how you’d have to do things in a language like Java.  In Python, however, we have a trick up our sleeve.  We already have a type that represents a class and function factory – it’s called a Module.  Modules can be passed around as first class objects in Python, removing the need for the whole heavy Abstract Factory framework.  Simply pass the module you’d like your class to use into its constructor and boom, the ultimate in dependency injection.  This provides astonishing amounts of testability, decoupling and modifiability.

In one file, say Windows.py, we declare the following:

    class Button(object):
        pass  # do button stuff

    class Window(object):
        pass  # do window stuff

Then in another file, say Mac.py, we do almost the same thing – same names, but a completely different implementation.  A minimum of repetition:

    class Button(object):
        pass  # most of your changes occur here

At runtime, we detect whether we’re on a Windows or a Mac, and then inject this into whatever other classes we want to run!

    def MakeChart(GUIType, *args, **kwargs):
        x = GUIType.Button()

    # somewhere else in code
    import Windows
    MakeChart(Windows, *args, **kwargs)

Basically, we’re taking advantage of the modules we’re already building to keep our programs separate, but explicitly telling our other functions and classes where to grab all their variable types.  It’s exactly the Abstract Factory pattern, but we leverage what we’ve already built instead of putting a whole framework on top of our modules to make them explicit.
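Here is a minimal, self-contained sketch of the idea, assuming Python 3.  To keep it runnable in a single file, the two platform modules are built in memory with types.ModuleType rather than living in Windows.py and Mac.py.  The helper make_platform_module, the label argument and the __repr__ methods are hypothetical, added only so the result is visible:

```python
import types


def make_platform_module(name, label):
    """Build an in-memory stand-in for Windows.py / Mac.py (hypothetical helper)."""
    module = types.ModuleType(name)

    class Button(object):
        def __repr__(self):
            return "%s button" % label

    class Window(object):
        def __repr__(self):
            return "%s window" % label

    # Same names in every platform module; only the implementation differs.
    module.Button = Button
    module.Window = Window
    return module


def make_chart(gui):
    """Build widgets from whichever module-shaped factory was injected."""
    return gui.Window(), gui.Button()


windows = make_platform_module("Windows", "Windows")
mac = make_platform_module("Mac", "Mac")

print(make_chart(windows))
print(make_chart(mac))
```

Nothing in make_chart knows which platform it is running on; swapping in a fake module for testing works the same way.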

I’m excited at the addition of Abstract Base Classes in Python 3.0.  Seeing the above example, I’d like to see the idea extended to making explicit interfaces on Modules themselves, fulfilling the ‘Abstract’ part of ‘Abstract Factory’.  Obviously, with Python’s duck typing today, this is not required.  But interfaces do more than simply catch interface bugs; they also communicate a lot of information to potential users.
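Until something like that exists, one rough way to approximate an interface on a module is to check it by hand before injecting it.  The sketch below is only a duck-typing check under that assumption, not a real interface mechanism.  REQUIRED_GUI_NAMES and missing_names are hypothetical names invented for this example:

```python
import types

# Hypothetical "interface": the names a GUI module is expected to provide.
REQUIRED_GUI_NAMES = ("Button", "Window")


def missing_names(module, required=REQUIRED_GUI_NAMES):
    """Return the required names the module does not provide as callables."""
    return [name for name in required
            if not callable(getattr(module, name, None))]


# In-memory stand-ins for Windows.py and an unfinished Mac.py.
complete = types.ModuleType("Windows")
complete.Button = type("Button", (object,), {})
complete.Window = type("Window", (object,), {})

partial = types.ModuleType("Mac")
partial.Button = type("Button", (object,), {})

print(missing_names(complete))
print(missing_names(partial))
```

A check like this catches a missing class early, and the list of required names doubles as documentation for anyone writing a new platform module.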

Patterns usually are a sign that we’re missing something: we want to express something that’s natural to solving a problem, but the language doesn’t support it.  Thankfully, many ‘patterns’ turn into idioms, or even better, simple one-line expressions in high level languages like Python.  This is the reason these languages are so expressive.

January 30, 2009 | Uncategorized

Procrastination and Creative Thought

Unfortunately, I can’t find the original article in regards to creative thought, but I did just read this in regards to procrastination.  It’s a study I’ve heard about a few times.  In summary, the conclusions are that people who are given an abstract task, or one without a clear-cut solution, tend to procrastinate more than people given a concrete task.  The lesson here, it’s claimed, is that simple things like setting up To-Do lists can really help you become more productive.

This is true, of course.  But I think the study missed another important question: why would humans approach abstract problems in such a non-productive way?  I’d argue that, in fact, with real abstract problems, procrastination is the best thing you can do.  That is because your brain has two modes, a focused/attentive mode, and a meandering mode.  Psychologists are just now realizing that we spend most of our time all day in this default, ‘meandering’ mode.[*citation needed]

Meandering mode is day dreaming; it’s not concentrating, it’s just kind of sitting there.  But despite what you look like you’re doing (i.e., nothing), your brain is actually incredibly active during this time.  And not just single parts, but a whole cadre of brain areas light up.  Psychologists wondered what we could be doing that required so much energy, and why our brain would so easily go into this ‘default’ mode that seemed so worthless.

The theory is that this sort of default thinking is where most of our creative thought comes from.  Creative thought requires linking together many seemingly unrelated ideas.  Indeed, creative thought is the exact opposite of clear cut, algorithmic thinking.  Just try and come up with a ‘process’ for being creative.  Process is what we’re good at in focused thinking, but process is incapable of producing genuinely new thought.  For that, you just need to sit back, clear your mind and let ideas come to you.

Isn’t this the best strategy for dealing with abstract problems?  Abstract problems generally are abstract because they require a creative solution.  They aren’t problems we’ve dealt with, in an exact way, before.  The abstract problem that slowed down the people in this study was writing a diary entry.  Most people put it off because they had yet to think of something to say.  The best thing they can do is to sit back and stop thinking about it.

So the real lesson of this study is that, while it’s true that things like ToDo lists keep us from turning concrete problems into abstract problems, we shouldn’t ignore the power of simply putting things off when it comes to solving truly abstract problems.  Problems we don’t know how to solve don’t need solutions right now, as we’re likely to implement the wrong ones.  Problems that you don’t immediately see a solution to need research, thought, and some good procrastination.

January 25, 2009 | Uncategorized

What is OO?

Reading Paul Graham’s reading of Jonathan Rees on what is ‘OO’, or object-oriented, made me wonder why we still treat this as such an ‘all-or-nothing’ affair.  First off, some personality clash.  While I don’t know Paul Graham’s opinion of OO first hand, his implied support of Jonathan Rees’ opinion, and thus the implied support of this lovely gem, seems to betray some arrogance on his part:

“This is related to Lisp being oriented to the solitary hacker and discipline-imposing languages being oriented to social packs, another point you mention. In a pack you want to restrict everyone else’s freedom as much as possible to reduce their ability to interfere with and take advantage of you, and the only way to do that is by either becoming chief (dangerous and unlikely) or by submitting to the same rules that they do. If you submit to rules, you then want the rules to be liberal so that you have a chance of doing most of what you want to do, but not so liberal that others nail you.”

Eighty percent of our time, whether corporate or open source or just our own projects, is still, in this day and age, spent on finding defects.  Why spend a single second on that horrible job more than we have to?  Static type checking is a freebie, and if you really want to see it help you, don’t keep setting up C++/Java strawmen.  Check out Haskell and its insane type system and type inference.  Being able to eliminate some types of concurrency bugs with types alone is neat.  And before anyone mentions it, don’t forget to unit test too 🙂

(Although I complain, I do have to admit when I see Scala code, ‘bondage-and-discipline’ aren’t too far from my mind.)

Anywho, the between-the-lines reading of all that is that Jonathan Rees thinks people who like anything ‘OO’ are moronic Java slingers and could never hack real code.  Whatever.

Before I go off on a rant here, what is OO then?  Before Rees goes off on his own rant about corporate drudges versus leet Lisp haxors, he makes a list of what he considers OO.  I’ve been thinking about this too, because while I believe we’ve gotten some good things out of the OO train in the past decade, there’s still quite a bit more to squeeze out before we abandon all further development.  These features appear in this language or that, but rarely appear altogether in a nice package.  Some of these features appear in all languages claiming some support for object orientation, while other features seem to just appear and then disappear again like so many ideas in computer science.  The point is, not every OO language incorporates all these features, and you don’t have to incorporate all, or even most of these features, to claim to have support for OO.

  1. Encapsulation: Private, protected, public members and scoping.  But we seem to have lost faith in the encapsulation mantra, with a more modern object oriented language like Python forgoing the whole thing.  In prototyping languages like Python, private members can be seen as somewhat constraining for sure.  I’ll agree with the BDFL that private members shouldn’t prevent a user from mucking around with them.  I’d just like a way to tell the user ‘hey, dragons be here’ without having to resort to an ugly name mangling hack.  Plus, I’m tired of exploring a new library with dir() and help() and not knowing what I’m supposed to pay attention to because every library implementer has a slightly different way to separate interface from implementation.  Encapsulation combines with another aspect of object orientation, interfaces, to produce an insanely helpful design technique.  Why stop here, though?  Encapsulation has many more rich levels to explore, similar to C++’s friend notion, or special semantics for test code (and only test code) to have access to private members.  What about ‘disallowed’ members?  In C++, you make your copy constructors private to prevent anyone from calling them if you want your class to have reference semantics.  Why not make this explicit?
  2. Everything is an object: This is just like saying “everything is first class”.  In a truly OO language, should there ever be one, functions are first class, values are first class, types are first class, classes are first class, code blocks are first class, etc.  This mindset feeds into another important aspect of the spirit of OO, higher level metaprogramming.  Since everything is an object, and every object has a type, this also allows you to make an incredibly expressive and safe type system.  Rees points out that Java, the quintessential OO language, fails at this since it leaves primitives (doubles, ints) ‘unboxed’, and not true ‘objects’.
  3. Message Passing: Commercially successful OO languages like C++/Java adopted the semantics of message passing, object.method, but missed the computational model.  This leaves their real implementations still relatively imperative looking.  Message Passing is a subtle and powerful model, also called the Actor model, that lends itself incredibly well to some forms of concurrency and distribution.  Message passing can only be simulated in commercial languages like Java, and instead its techniques have to be degraded into ‘best practices’ like ‘tell, don’t ask’.  If anything grows in OO, this needs to.  Functional programming is clearly taking over imperative programming’s turf.  A new computational model can be used to supplement Functional programming where it has hiccups, like side-effects for example, which I think message passing would be great for (although this is just a hunch.)
  4. Interfaces: What started out as simply keeping your .h files and .cpp files separate has become a full fledged design technique.  Yes, as Rees points out, interfaces and ‘discipline’ coding help tie teams together.  But interfaces also dovetail with Test-Driven-Development really well.  The methods you end up testing in TDD usually are derived from some sort of interface design, and even if an interface has just one single inheritor, in some cases it’s still worth it to break down a system into the thinnest interfaces possible.
  5. Inheritance: Rees also aptly points out that this, as we’ve learned, is less the be-all-and-end-all technique of good OO and more or less just a neat feature that’s good in some cases.  Many of its best uses would probably be better codified elsewhere, like what’s called traits in some languages, mixins, aspects, or the template-method pattern.  In other cases, specialization on a subtype really is what you need.  It’s probably overused, but inheritance still is the quintessential OO feature.
  6. Polymorphic Method Dispatch: The idea that the real function you are calling is decided upon by at least the real, run-time type of the first argument.  Animal declares method makeNoise, while cat and dog both implement it.  That means at run time, two animals calling makeNoise will call the methods declared by their real types – one a dog and one a cat.  While very useful in its own right, this has been properly criticized as being an awkward place to stop on method dispatch.  Notice I say ‘at least’.  This means that dispatch mechanisms that dispatch on more than the first type, I still consider OO.  In fact, method dispatching on the real, run-time type of any ‘polymorphic’ type, i.e., a type that inherits from some other type or interface, is OO.  Whether it’s just the first argument, in C++’s case, or all arguments, in Lisp’s generic functions, doesn’t matter.  In fact, I think it’s about time real OO languages began adopting more ways to dispatch on type without resorting to hacks like double-dispatch via visitors or something like that.
  7. Actor Model Concurrency: The other half of message passing is a type of free concurrency and distribution called the Actor model.  This will NOT solve all of our concurrency problems – but it can only complement the other forms of concurrency being pushed by functional crowds.  Basically, the Actor model says that encapsulation doesn’t just encapsulate objects from each other, but should also encapsulate threads from each other.  An object’s state should only be changed by itself, on its own thread of control.  The ‘Active Object’ pattern also captures this.  There is no reason why OO languages should still resort to imperative means of concurrency like threads and locks when we already are half way to a completely independent and powerful concurrency model.
  8. Reflection: Another feature that’s only implemented in a few OO languages, and in some of them it’s added on hackishly.  Reflection is a very powerful step towards the high level metaprogramming that OO is capable of: code operating on itself.  I say high level because Lisp has had low level means of acting on itself all along.  Reflection is not just a way to figure out types at run-time manually, but also a way to find out what methods are supported, what the signatures of those methods are, as well as anything else.  Reflection on its own is like inheritance: it’s kind of neat but not too incredibly useful other than for specializing on various attributes.  Reflection really comes into its own when it’s combined with high level metaprogramming.
  9. High Level Metaprogramming:  Lisp has code that operates on itself, and it is able to do so because its syntactical model is so simple – it’s just a list.  Many of the most popular languages today, however, are more difficult to parse since they’ve adopted Algol style syntax.  Moreover, there’s more to metaprogramming than simply operating on lists.  High Level Metaprogramming is like an API for metaprogramming.  I could be given the raw code for a class, and step through it slowly to figure out its method names.  Or I can rely on a built in .getMethods() method.  High Level Metaprogramming uses the same style, conventions and techniques as any other OO code, except you’re operating on code.  No magic, no tricks.  This is something I’ve only seen glimpses of, but sadly most metaprogramming supported in OO languages today is either not OO, so it breaks the conceptual model, or a little ad hoc.  I think C++ template magic is great, but it’s hardly OO.  We’re just going back to lists again!
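To make the reflection and metaprogramming items a bit more concrete, here is a small sketch of a .getMethods()-style query in Python, assuming Python 3 and the standard inspect module.  The Animal class and the public_methods helper are hypothetical, and the leading-underscore filter is only the usual naming convention, not enforced privacy:

```python
import inspect


class Animal(object):
    def make_noise(self, volume=1):
        return "..."

    def _digest(self):
        # Leading underscore: the conventional 'dragons be here' marker.
        return None


def public_methods(cls):
    """Map each public method name of cls to its signature, as a string."""
    return {
        name: str(inspect.signature(member))
        for name, member in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    }


print(public_methods(Animal))
```

The point is that the query is ordinary code calling an ordinary API: no parsing of source text and no special syntax.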

Some of these ideas are still being researched, while others aren’t.  While I certainly think object orientation has come a long way in encroaching on traditional procedural/imperative coding, I hope I’ve shown that there’s still much to be done.  OO isn’t just some Java drudge’s toy; it does have the potential to offer everything that the functional paradigm offers.  And most importantly, the two continue to complement each other very well.  We just can’t give up on OO yet 🙂

January 18, 2009 | Uncategorized

Conway’s Corollary

Conway’s Law states that a software’s architecture will inevitably resemble the organization that produced it.  An example being that if you have four groups of people building a compiler, you’ll get a four pass compiler.  Well, I posit that the opposite is true as well, that given any sort of software architecture, there is an optimal social organization to build it.  This seems trivially true, but at the same time, gives us some insight into the software ‘engineering’ process, or project management.

Right now, there is a strong inclination in software development organizations to have two chains of command, a ‘people’ chain, and a ‘technical’ chain.  The ‘people’ chain, commonly derided as ‘management’ in general, tends to deal with the contracts, the hiring, the firing, the personnel reviews and other ‘business’ stuff.  The technical chain decides how to architect a product, what technologies to use, how to implement it, and other more nitty gritty details.  This is not really because the separation of people management from technical management is a good idea, but because it is so incredibly hard to find talent in both fields.  Your most technical people tend to not get along with others, while your most social people tend to balk at technical work.

The problem is that work is broken down by the management side, not based on any sort of architectural guideline, but based on manpower and resources available.  The resulting organizational structure will be far more driven by office politics, historical relationships and other business partnerships than what is ideal for the product.  And due to Conway’s Law, we can suppose that the resulting organizational structure will most likely have more impact on the product and project than any other decision, technical or managerial.

Software development is, poetically, much like software itself.  The Mythical Man-Month states that one of the inherent falsehoods repeated in software engineering circles is the idea that effort equals progress, and that measuring effort in man-months wrongly implies that men can be exchanged for months.  This is altogether too similar to another problem currently faced by the computer science world: concurrent programming.  Throwing another processor at a program will not necessarily double its speed.  In fact, attempting to make some things concurrent can slow you down when the problem is inherently sequential in nature.  The trick is decoupling processes that can be effectively completed in parallel.

The same can be said of software development.  There are some inherently sequential tasks, tasks that must be done step by step, and thus really can only be done by one person.  Splitting up these tasks results in less work done, as two people can, at best, go no faster than one person (if they were both to simply solve the problem independently) and, at worst, go slower due to communication overhead.  The trick to utilizing the most manpower for any project is to find the optimal number of ‘threads’ that run concurrently throughout the project – the ones that naturally require the least amount of communications overhead.  That is the number of developers you can use.  If you try to find any more concurrency than this, you’re going to be drowning your developers in meetings and other communications overhead.  The project will cost more, and take longer.
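The analogy can be put in rough numbers with two well-known formulas, neither of which appears in the argument above: Amdahl’s law, which bounds the speedup when only a fraction of the work can run in parallel, and the n(n-1)/2 count of pairwise communication paths that Brooks uses in The Mythical Man-Month.  The 50% parallel fraction below is an arbitrary assumption chosen for illustration, not a measurement:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def communication_paths(people):
    """Number of distinct pairs of people who may need to coordinate."""
    return people * (people - 1) // 2


# Assumed: half of the project is inherently sequential.
for n in (1, 2, 4, 8, 16):
    print(n, round(amdahl_speedup(0.5, n), 2), communication_paths(n))
```

With half the work sequential, the speedup can never reach 2 no matter how many workers are added, while the number of communication paths keeps growing quadratically.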

If we take Conway’s Corollary, that the best software architecture necessarily has a best organization to develop it, then we realize that project management cannot begin until primary architecture is in place.  This architecture must be done entirely from a technical point of view, since any manpower concerns will necessarily trick us into thinking we’re optimizing for our resources.  Conway’s Corollary says we are not.  The ‘best’ architecture can be split up into so many components, and so on subdivided until we reach a point where the sub-components are too coupled with each other to yield any more real concurrent development gain.  This, then, should be used to develop a project plan, estimate manpower and resource needs, rather than the other way around.

Conway’s Law says that organization drives architecture.  If we turn it on its head, and try letting architecture drive organization, we might find out why what took the generation before us ten people and a time-share mainframe now takes us teams of dozens of developers with the best equipment.

January 17, 2009 | Social Commentary, Software Culture, Software Methodologies