The Skeptical Methodologist

Software, Rants and Management


There are two P’s of work: productivity and prioritization. All too often, we focus on productivity: how can I do more with the same amount of inputs? Or, since productivity is so closely linked with quality, we ask: am I building the thing right?

More rarely, we ask: am I building the right thing? This question is answered by how we prioritize.

Most People’s Prioritization Strategy is “Don’t”

Most people don’t prioritize. Or, if they think they do, they abide by some overly naive protestant prioritization scheme: work matters more than fun, studying comes before partying.

We’re not talking about work versus fun. Welcome to adulthood.

We’re talking about actually prioritizing the work that’s coming in, and when it comes to that most people fail to prioritize altogether.

When faced with multiple projects that have to get done, folks routinely fall into the scheme that all projects are equally important, and all projects must get done.

Usually, this is where they turn, wrongly, to questions of productivity. They need to do X, Y, and Z and can’t seem to get it done. Maybe there’s a more efficient way to do it such that they can do X, Y, and Z?

Or, far more likely, they fail to become more productive and simply work more hours.

If anyone is routinely working more than they should, they probably have a prioritization problem.

The Most Important Prioritization Question

Prioritization schemes abound, but they all attempt to answer the same question. This question causes ghastly shrieks among those naive protestant masses who simply put more and more and more effort into projects and refuse to believe a prioritization problem exists.

What can drop on the floor?

When people say “It all has to get done!” they’re refusing the difficult work of prioritizing. More importantly, they’re setting themselves up for a slow decline. By simply throwing more hours at a problem, they’re going to lower their productivity, and even worse, lower their quality. Eventually, they won’t even be able to do what they used to do, and they’re going to lose revenue because of it.

Doing everything is not an option.

“But,” you cry, exasperated, “I really do have to do everything! Everything is important! Each one of my projects is tied to a key client!”

If that’s what you’re thinking, then you’ve utterly failed to figure out what “key client” means. You can only ultimately juggle so much work. Clients taken on above that level are, by definition, not key. Revenue taken on above a level of sustainability can never be ‘key’ revenue.

If you can’t decide which clients are key and which revenue is key, you may think that you’ve rendered them all key – but in fact, none of your clients are key.

None of your clients will achieve a level of service above any other, and none of your revenue will be protected above any other. None of it is key. To say something is key, to prioritize it, means that you will sacrifice something else to save it. That’s what it means for something to be key.

Once you figure out what can be dropped on the floor? You’re ready to prioritize.

“Wait!” you cry some more, “My projects have deadlines!!!”

Then figure out what deadline you can miss. This question isn’t any different from deciding which client you can drop – you have to focus on the things you can achieve or risk losing them all. If you don’t focus, you will wear yourself out and drop all the balls. If you focus, you will continue to juggle the ones you focus on.

Two Strategic Prioritization Strategies

Okay, so you’re ready to prioritize. In fact, you’re at a strategic level, and you want the folks below you to prioritize. How do you do that?

The Embedded Customer

One means of enforcing priorities is to embed the high-priority client within a development team itself. That way, new work always comes through that customer, rendering it impossible for one customer to walk all over another customer’s project through a priority conflict.

Which teams get embedded customers can be reorganized once a quarter or once a year. If you use this approach, it’s best to build a ‘diversified portfolio’, with your portfolio ‘weightings’ determined by how important various customers are. A large customer with a lot of growth potential may get 60% of your teams, while a difficult customer may only get 10%.

The Shared Vision

Another approach is to do the difficult job of prioritizing what the company ought to work on once a quarter or once a year, then communicate these priorities out to the teams. The teams are then empowered to chase these priorities as they see fit.

Often this is called a ‘strategy’ and it’s decided at a high level. Most often what we see in terms of ‘strategy’ isn’t so much a strategy at all as much as promotional puffery and overly abstract grand statements. When you realize that a lack of a strategy, or a bad strategy, is ultimately the same problem as a lack of priorities, or bad priorities, you shall be enlightened.

Strategies ought to be concrete and simple, with no more than 3-4 priorities and their ordering. While each priority needs to be broad enough to allow for interpretation by your different teams, it also needs to be concrete enough to avoid dumb abstractions like “Our strategy is to do what it takes” or “Our strategy is to focus on business success”.

I mean, really, those aren’t strategies. Strategies are decisions in an environment of tradeoffs (sounds like prioritization to me!). If your strategy does not inform the rest of the company of the trade-offs you intend to pursue – what you intend to focus on and what you intend to let drop – then your strategy is not a strategy.

You also need to make education a part of this vision – your teams need the knowledge and the tools to know whether their ideas successfully implement your strategy or not. If you’re asking your engineers to focus on a certain kind of client but not giving them the marketing know-how to recognize who that client is, your strategy will fail.

The Combination

These two approaches can be combined, with an overarching strategy feeding into the portfolio mix – if your strategy is largely to go after one segment of customers rather than another, you can embed representatives of those customers on more of your teams.

The shared vision can then be pushed to those teams as a focus point, helping guide what projects they come up with beyond simply chasing the latest and greatest thing that client may want.

You Have to Focus and Drop Something

Making more widgets per hour isn’t going to help you if the market is no longer buying widgets. You need to step back and think about what you actually have the resources to do (namely, time) and from there, decide what steps you’re going to take.

Attempting to do everything is a cliche model-student way to react to this problem. When the teacher throws more work at you, you often learn to party less, to even sleep less, and nail those grades. Well, no one’s actually looking out for you now like a teacher might – you literally are going to have to figure out what class you plan to fail to ensure you pass your others. This isn’t a lesson many people get in school.

Enforcing prioritization on a larger scale must be done too, and often crosses over into what’s called strategy. Embedding goal makers (client representatives) into teams is one way to streamline this, and then how you weight the teams becomes your prioritization problem. Alternatively, giving your teams a short list of large goals that they can independently figure out how to achieve can also eliminate priority conflicts.

Ultimately though, priorities and strategy are about trade-offs. And to make a trade-off, you have to let something go.


January 15, 2018 Posted by | Uncategorized | 1 Comment

Two Management Patterns

These tactics have come up in the past few days. They could be called patterns, or they could be called antidotes to various antipatterns.

First, Do No Harm: The Hippocratic Pattern

Hippocrates was a Greek physician and one of the first medical thinkers. Modern-day doctors take the Hippocratic Oath, which is often summarized as “First, do no harm”.

What the oath tries to stress is that many of the options available to you can implicitly make the problem worse. By taking action, you’re taking risk. Often, due to optimism bias, wishful thinking, or plain panic, we take actions without fully thinking through the risks.

Does this mean never take action? No. If it meant that, doctors would never treat patients. What it means is that all action should be weighed against its pros and cons and that taking action often comes with risks – especially in a stable system – that go undervalued.

Loss Aversion: Or, Why the Feature You Have is Always Worth More than the Feature You Don’t

One big reason tech debt piles up is because we refuse to turn off features that don’t pay the rent. These features usually fall into one of two camps – either a feature that never really took off with heavy use in the market or a feature that was once popular but has since been eclipsed by something else.

Features are complexity. A product that supports ‘more’ features will, all other things being equal, always be more complex than a product that supports ‘fewer’ features.

Features are a source of revenue, and if that revenue doesn’t cover the costs of maintaining them, they need to be cut. This is usually a very hard decision due to a concept called ‘Loss Aversion’.

Basically, loss aversion is when it hurts you more to lose something than it would cause you pleasure to gain it. The best example can be seen in an experiment on coffee mugs. Scientists asked folks to rate how much they’d pay for a coffee mug. They, on average, put the price of the coffee mug at $2. That implies (if we’re in a perfectly rational world) that giving the coffee mug to a person would give them $2 worth of pleasure.

You can measure the same thing by giving them the coffee mug, and then offering to buy the mug from them. Again, in a rational world, they should be willing to take the $2 in exchange for the mug – the two things, $2 and a mug, should be fungible and of equal value. But they aren’t.

Folks wouldn’t give up the coffee mug, once they had it, for $2, but instead usually asked for $3 or more. The pain of losing the mug was worth more to them than the pleasure of gaining the mug.

Knowing this story can help you frame your argument, should you have to argue that a feature ought to be discontinued. Instead of saying “Feature X only gets us Z many customers and costs us Y to maintain, so we should cut it”, turn the phrase into “There’s a feature out there that’d cost Y to build and get us Z customers – should we do it?”

When folks say no, that there are better projects to chase down, then you mention: “Well, actually, this is a feature we already have. I’m wondering if it’s slowing us down from getting more valuable things.”

Avoid the idea of loss – focus on what you can gain (the opportunity costs of keeping the outdated feature are high; frame them in terms of the new things you’re passing up), and get your colleagues to commit to the rational belief that you shouldn’t pay Y dollars for Z customers before revealing that you’re already doing so.

November 16, 2017 Posted by | Uncategorized | Leave a comment

SYWTLTC: (AB) Chapter 3.5 Type Checking

This is the final chapter in our software quality series.

We’re going to draw on object-oriented techniques below, so make sure you’re this far along in Python Programming:

  1. Programming Basics (ch 1, ch 2)
  2. Beginner Data Structures (ch 5, ch 11)
  3. Structured Programming (ch 7, 8)
  4. Procedural Programming (ch 6)
  5. Algorithms and Recursion (ch 13)
  6. Object Oriented Programming and Design (ch 4, ch 10, ch 9, ch 12)
  7. Numerical computing (ch 3)

Let’s recap.

We’ve talked about testing – that’s a tool you write yourself that verifies parts of your program.

We’ve talked about linting – that’s a tool someone else has written that will analyze your code for common errors.

We’ve talked about contracts and assertions – or the idea that our functions can and should make promises to each other, and if those promises are violated, our program should crash. This, in a way, embeds a program inside your program so that it can check itself.

Finally, we’ve talked about peer review and collaboration – this is the only non-programmatic means we’ve introduced to ensure quality. Have another programmer read over your code.

There are interesting crossovers and reinforcements.

  • You shouldn’t test your tests, so all you have are linters and peer review on test code.
  • Littering assertions and contracts through your code means every test you run checks many more things, so they build on each other.
  • Assertions can document assumptions, making peer reviews easier.
  • Linting can leave code more consistent, making peer reviews easier.
  • Much more…

There’s one more technique that is not as popular in Python but very dominant in other languages and this one technique, when applied well, can prove the absence of certain kinds of errors.

Eye of Newt, Hair of dog…

What would happen if we combined our linter with our contracts? In other words, what if we could have something check that our code never violated our contracts?

x = 3
assert x > 4

We want something that will flag the assertion above as false without running the code. Something that can reason about our code statically, and discover errors automatically.
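As a concrete sketch of the idea (the `double` function here is hypothetical, not from any particular library), this is the kind of mistake such a tool can catch before the code ever runs, provided the code carries type information:

```python
def double(n: int) -> int:
    # With this annotation in place, a static checker can verify,
    # without running the program, that double is only called with ints.
    return n * 2

print(double(21))
# double("hello") would be flagged by a static checker before runtime,
# even though Python itself would happily start executing the program.
```

The comment marks the call a checker would reject; Python alone won’t complain until that line actually runs.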

Enter the Type Checker

First, a little random history.

Back in the early 1900s, a philosophy known as Logical Positivism was in its prime. Logical Positivism claimed that logical sentences – sentences constructed via a specific mathematical process – made either true or false claims about the world. Sentences that violated the mathematical formulation were dismissed as gibberish.

It was an attempt to place the entirety of human knowledge on the basis of mathematics. And at the very center was a mathematics called Set Theory.

Sets are more or less just lists or collections of things. The set of sheep, for example, or the set of prime numbers. Primarily used for number-theoretic questions, Set Theory – with enough hammering – could in and of itself define basic arithmetic (called Peano arithmetic), and thus begin to define the rest of mathematics.

There was a problem though – a huge hole in the middle of set theory that led to a paradox. Anything can be a set, after all. What about this – is this a set?

“The set of all sets that do not contain themselves”

A set is a collection – certainly, it can be a collection of collections. So that checks out, and nothing in Set Theory says the sets can’t be self-referential – either containing or not containing themselves. So that checks out too. Seems like it’s a set.

Let’s call the above set X. And we’ll ask a very easy question that blows up all of Set Theory – does X contain itself?

If X is inside X – X contains itself – then by definition, it can’t – because it belongs to the set of all things that do not contain themselves.

Of course, if X is not inside X, i.e., X does not contain itself… then it does contain itself since it’s the set of all things that do not contain themselves!

Logical Positivists wanted all logical sentences to be True or False – not paradoxical. Thus began the great quest to figure out how to strengthen set theory to once again be sound enough to be a foundation for all mathematics, and thusly all human knowledge.

Two interesting things came out of this – one, a weird offshoot called Type Theory came to be. The other interesting thing that happened was the Incompleteness Theorem, which more or less said the whole quest was doomed from the start. No matter what you do to Set Theory – if it’s powerful enough to express Peano arithmetic, it will either be inconsistent or contain true statements it can never prove.

You can think about that second part for a while and the futility of ever organizing human knowledge or thinking that mathematics was a sound and complete system of reasoning. We’re going to talk about the first more minor blip.

What type of thing is a type?

Type Theory tried to categorize ‘things’ that can be in sets into different ‘types’ which can’t be compared. A set is one type, a set of sets is another type, and a set of sets containing itself would be a violation of type theory – since from one angle you’re talking about a set, and from another, you’re talking about a set of sets.

You know what? Let’s go back to the Incompleteness Theorem, because it shows up again here, of all places. While Gödel was working on the Incompleteness Theorem, another smart dude named Alan Turing was coming up with his Halting Problem.

See, Turing wanted to see if a program could be written to determine whether another, arbitrary program would ever halt. The reasoning he used is tricky, but similar enough to the ‘sets that don’t contain themselves’ mind bender above. Basically, he proved that it was impossible. No program can determine whether another arbitrary program eventually stops or goes into an infinite loop.

Your linter will never detect whether your program crashes or not. It can detect certain kinds of crashes, based on patterns. But it can’t rule everything out. It’s mathematically impossible.

In fact, a lot of problems in computer science have been determined impossible by proxy. If you can show that by solving your problem X, you could solve the halting problem, you know that solving X must be impossible.

The idealized debugger is one of those programs. We’d love a program that could inspect ours, find all the bugs and fix them, without any human intervention. Unfortunately, one problem the idealized debugger could fix is infinite loops, and thus it’d solve the halting problem. The idealized debugger is impossible.

Back to types. In an effort to ensure types weren’t “too powerful” – powerful enough to allow paradoxes – mathematicians invented a type system that could properly constrain set theory and that a computer could implement. In other words, by reasoning about types, a computer could prove the absence of type errors – and only type errors.

How did they do this? Types aren’t as powerful as sets. They’re restrained. You cannot implement Peano Arithmetic in types, and from a computability standpoint, strict type systems aren’t ‘Turing complete’. They’re a constrained form of programming that you can put on top of your less constrained program to borrow its safety when you need it.

What’s a type error? A type is a ‘category’ of thing. So in the case of Python, trying to add a number to a string is a type error. Numbers can’t be added to strings. Taking the square root of a string is similar – square roots expect a type of number, and so, it’s impossible.
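You can watch both of these type errors happen in plain Python – at runtime, since nothing checks them earlier:

```python
import math

# Adding a number to a string violates Python's types:
try:
    3 + "cats"
except TypeError:
    print("cannot add a number to a string")

# So does taking the square root of a string:
try:
    math.sqrt("nine")
except TypeError:
    print("cannot take the square root of a string")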

Dynamic versus Static Typing

Python is a dynamic language though – in an effort to allow a little more power, Python doesn’t type check until runtime.  This makes type checking only as powerful as assertions/contracts. We know assertions and contracts are great to have in code, but they cannot guarantee the lack of certain kinds of errors – they only help to debug them when they happen in the wild.

There are other kinds of languages out there that use a ‘static’ type system – this type system is enforced and checked before the program is even run. Like a linter, the type checker is a program that runs over the code itself. More powerful than a linter, instead of merely looking for patterns in the code, the type checker actually interprets the code and builds a model of the code in memory. It then ‘proves’ various things mathematically about that model, such as the absence of type errors.

There’s further categorization – so-called strongly typed languages versus weakly typed languages. This is basically a measure of how much you’re allowed to break the type system. Languages like C are statically typed, but also weakly typed: there is a type checker, but you can break it at any time by doing what’s called a void pointer cast. You have a string? You can trick C into thinking it’s a number pretty easily.

Python is a dynamic and reasonably strongly typed language. It won’t allow you to break the type system, but it doesn’t enforce type errors until runtime.

Haskell has a static and very strong type system. Haskell gives you many ways to reason about types, such that if you model your program well via types and it compiles, you have ruled out a lot of bugs.

A New Fighter has Entered The Arena!

So if Python is dynamically typed… why talk about type checking?

Because the above is no longer the case. There are now type checkers for Python. We’re going to look at MyPy.


How to Annotate

The first thing you’d want to do on code that isn’t type annotated, or on new code, is add the types. Python already supports type annotations but simply ignores them at runtime. You can add annotations in the style shown here to your code.

The best bang for your buck will be annotating function signatures. I’ll tell you why in just a bit.
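A minimal sketch of an annotated signature (the `greet` function is made up for illustration) – the parameter and return types are declared right in the `def` line:

```python
def greet(name: str, times: int) -> str:
    # name must be a str, times must be an int,
    # and the function promises to return a str.
    return ("Hello, " + name + "! ") * times

print(greet("Ada", 2))
```

Python itself ignores these annotations at runtime; they exist for the type checker and for human readers.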

How to Run

Run mypy like a linter. Instructions are here.

How to Add New Types

We’re going to be doing basic Object Oriented programming in the code challenge, which you should be familiar with from Python Programming.

Check out how MyPy automatically turns all of your classes into types, respecting inheritance here.
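A small sketch of what that looks like (the `Animal`/`Dog` classes are hypothetical): once a class exists, it is a type you can use in annotations, and subclasses are accepted wherever the parent is expected.

```python
class Animal:
    def name(self) -> str:
        return "animal"

class Dog(Animal):
    def name(self) -> str:
        return "dog"

def feed(creature: Animal) -> str:
    return "feeding a " + creature.name()

print(feed(Dog()))   # fine: a Dog is an Animal
# feed("Rex")        # a type checker flags this - str is not an Animal
```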

Type Checking Versus Other Methods

Certainly, you can see how type checking is related to linting and assertions. It’s basically a combination of the two, solving a certain kind of problem.

It’s more powerful than a linter as it actually uses a proof engine to reason about your code and more powerful than assertions as it can rule things out statically, rather than just triggering dynamically.

It cannot find all the little things a linter can though, so the two should be combined. And most of your program needs to reason dynamically – not statically. This means that types cannot model your entire system, and you should fall back to assertions when you have to.

Types serve as a powerful form of documentation, enhancing peer reviews. They make code easier to reason about by assigning each variable a certain type. They also clean up variable names, as

def func(string_first_name):

is always going to seem less readable than this

def func(first_name: str):

How does type checking compare to testing? This is where things get interesting.

Unit Versus Integration Tests

There are two large classes of tests – what’s called unit, and what’s called integration. There are other kinds of tests, but these two are written most often.

Unit tests are supposed to test a small bit of code in isolation, quickly. Dependencies like a database or file reading are ‘mocked out’ using special code that pretends to be a database or a file.

Integration tests put multiple pieces of code together, as well as third-party dependencies like databases. They tend to be slower and exercise much more code. They are also often more difficult to write.

Unit tests often chase ‘coverage’ – trying to get each line of your code run by at least one test. When attempting to increase coverage, unit tests are usually the easiest thing to spin up and write more of. A coverage of 70% is pretty good, and 100% is the highest you can go – there’s a clear goal in mind.

Integration tests try to test integration points, which can get hairy. Let’s say you have three components you use (a website, a database, and a script). We’ll call them X, Y, and Z. You’d need to write…

…an integration test of X to Y…

…an integration test of X to Z…

…an integration test of Y to Z…

…and an integration test of X to Y and Z.

…four integration tests to test three components. Conversely, if you had three well-factored components and needed to unit test them, you’d only need to write… three unit tests. Unit tests scale linearly with the number of components you want to test, while integration tests scale with the number of subsets of two or more components – which grows much faster than linearly.

With 4 components you’d have 11 integration tests you’d need to write, but only 4 unit tests.
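The counting above can be sketched directly: every subset of two or more components is a candidate integration test, which is where the 4 and 11 come from.

```python
from math import comb

def integration_tests_needed(n: int) -> int:
    # Count every subset of 2 or more components out of n.
    return sum(comb(n, k) for k in range(2, n + 1))

print(integration_tests_needed(3))  # 4
print(integration_tests_needed(4))  # 11
```

This sum is 2^n minus the empty set and the n single-component subsets, so it grows exponentially while unit tests grow linearly.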

It gets out of hand quickly, and often, no one writes that many integration tests. Unit tests are easier to write. So there’s this ‘black hole of integration’. Most people write a few integration tests – usually never enough.

Types fill the Black Hole of Integration

Types, especially types on function signatures, are promises along a ‘boundary’. Function signatures are often the integration points between components. If you used a database, what you’d really do is use a database library, and call a function in it.

That function is where you want to put your types. This function – the gateway to the database – is the ‘boundary’.

Each integration point can be decorated easily with types. If you have component X and Y and Z, it’s a linear effort to add types to component X, then Y, then Z. You do not need to add types just for X talking to Y, or X talking to Z. It’s like unit tests.

The type checker can then generate all of your integration checks for you, ensuring that whenever X talks to Y, they’re talking in the same language. They’re using the same types.

Type checking can turn the overwhelming integration test problem into something that’s pretty easy to manage. Don’t test integration points, type check them.
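As a sketch of a typed boundary (the `UserId` wrapper and `fetch_user_name` function are hypothetical, standing in for a real database library call), the type on the signature is the integration check:

```python
class UserId:
    # A wrapper type marking the boundary into the database layer.
    def __init__(self, raw: int) -> None:
        self.raw = raw

def fetch_user_name(uid: UserId) -> str:
    # Stand-in for a real database lookup.
    return "user-" + str(uid.raw)

print(fetch_user_name(UserId(7)))
# fetch_user_name(7)  # flagged by the type checker: int is not a UserId
```

Any component that calls across this boundary with the wrong type is caught statically, with no integration test written.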

Typeful Programming

Often you’ll see detractors of type checking argue that the number of times they’ve confused a ‘float’ type for a ‘string’ type is next to none. It’s dumb to have a check for something that never happens.

And they’re right – the built-in types of the language rarely conflict. Simply decorating your code with ‘string’ and ‘integer’ and the rest isn’t going to suddenly discover a lot of bugs, nor is it going to reduce the risk of introducing new ones.

The power of types in programming is realizing they’re a tool that you can use too. Integers and Strings are the types the programming language designers wanted – you can create your own types and have the type checker enforce them.

What types do you want? This is important from a design perspective in object-oriented programming, which more or less asks – “Pretend you already have the objects (types) you need to solve the problem, then write that program.”

If your program talks in terms of temperature, you’d better not have floats running around. You should have a Fahrenheit type and a Celsius type. Those types can be implemented-in-terms-of floats but should be represented in your code as fully fledged types.

This makes it impossible to do silly things like adding a zip code to a temperature, and possible to do useful things like automatic conversions between temperature types.
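A minimal sketch of those temperature types (class names and the conversion helper are my own illustration): each unit gets its own type, and the conversion lives in one well-typed place.

```python
class Celsius:
    def __init__(self, degrees: float) -> None:
        self.degrees = degrees

class Fahrenheit:
    def __init__(self, degrees: float) -> None:
        self.degrees = degrees

def to_fahrenheit(temp: Celsius) -> Fahrenheit:
    # The type checker rejects to_fahrenheit(98.6) or passing a zip code;
    # only a Celsius value can flow in here.
    return Fahrenheit(temp.degrees * 9 / 5 + 32)

print(to_fahrenheit(Celsius(100)).degrees)  # 212.0
```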

A heuristic here – especially since it takes years to build a good intuition for object-oriented design – is to look for ‘primitive’ types and get rid of them. If you’re passing an ‘int’ or a ‘string’, ask yourself: are you really passing an int or a string? Or are you passing a count of vegetables and the name of a vegetable? If you have those things – and you don’t have a type named ‘Vegetable’ defined in your code – add it and refactor!

Let’s take the following program for example:

age = get_raw_input("Please enter your age")
print("You are {0} years old.".format(age))

The age variable above is an integer. But is it really? No. It’s AN AGE!

Ages are represented by numbers, but only certain numbers. And they’re a concept we can think a lot about.

Consider adding the following class to the above program:

class Age(object):
    def __init__(self, raw_age: int) -> None:
        assert raw_age < 124, \
            "You can't be older than the oldest person alive!"
        assert raw_age > 0, \
            "You can't be less than 0!"
        self._raw_age = raw_age

    @staticmethod
    def input_age() -> "Age":
        return Age(int(get_raw_input("Please enter your age")))

    def print_age(self) -> None:
        assert self._raw_age is not None
        print("You are {0} years old.".format(self._raw_age))

The above program is a little longer – typeful programming like the above requires a little more overhead in small programs. But you see we’ve modeled a concept – Age – in our program, and it makes the program easier to reason about. We’ve now got ideas like enforcing a range on age. Human ages don’t go to a billion, and if one age in your program was at a billion, that probably means there’s a bug somewhere.

In large programs, typeful programming is actually far shorter. This is because you’ve built a large dictionary of ideas and concepts to build more advanced concepts from. The number of assertions and tests you need to write will shrink because you’ll be reusing all the assertions and tests you’ve written on all your small classes/types. And you’ll need far less defensive coding and integration tests since you can use the type checker to enforce most of the integration points.


Live Coding Sessions to Watch

Remember, when watching coding sessions don’t just look at the main content being covered, but watch what tools the coder uses. Look for typing techniques, where they lay their screens out, what plugins they might use on their editors, and so forth. Research things you find interesting, and try to incorporate them into your workflow!

The two below are a bit more ‘produced’ than I usually prefer, but keep in mind that real programmers make mistakes all the time, have to look things up, and so on.

Below is a quick live session that uses MyPy

To round this out, here’s another coding session out of left field – an introduction to the Web Framework “Flask”. This is a 7-part series if you’re interested, but for now, just get through the first introduction.

Code Reading / Review

For the reading, let’s look at MyPy itself – you’ll be looking at its options loader here.

Practice doing code reviews, what comments would you leave for this code? Think of these following questions to discuss with your mentor:

  1. Where did comments, style, and docstrings help or hinder your reading of the code?
  2. How much of the code could you loosely understand? Why or why not?
  3. How much did types help you understand the code?
  4. What did you like about the code? How might you replicate that in your own code?
  5. What did you not like about the code? How would you avoid those practices in your own code?

Code Challenge

You’ll be refactoring some old code that handles geometry lookups, or ‘geo-fencing’. It’s a prototype to see if a cell phone’s latitude and longitude falls within some boundary. The problem is, there’s a bug in it, and no one can figure out what it is.

Your first mission is to add at least two new classes/types: Latitude and Longitude. After creating these classes and removing the old floating point numbers that represented latitude and longitude, see if that along with MyPy doesn’t help you figure out where the bug is.
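One possible starting sketch for those classes (the range assertions use the standard geographic bounds; adapt them to whatever the prototype actually needs):

```python
class Latitude:
    def __init__(self, degrees: float) -> None:
        # Latitudes run from the south pole to the north pole.
        assert -90.0 <= degrees <= 90.0, "latitude must be in [-90, 90]"
        self.degrees = degrees

class Longitude:
    def __init__(self, degrees: float) -> None:
        # Longitudes wrap the globe east to west.
        assert -180.0 <= degrees <= 180.0, "longitude must be in [-180, 180]"
        self.degrees = degrees

print(Latitude(37.77).degrees)
print(Longitude(-122.42).degrees)
```

With these in place, any function that accidentally swaps a latitude for a longitude becomes a type error MyPy can flag, and any out-of-range value trips an assertion.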

Finally, with the bug fixed, push test coverage up, clean up pylint, add assertions and prep the code for review.


When exploring code you inherited, use some of the other tools in your toolbox. We like to get tests in place first, but often it’s hard to figure out what the code is even doing, much less how to test it.

Try running the debugger on the code and stepping through it line by line – see what variables are changing. What are those variables supposed to be?

Feel free to add comments as you go – inherited code is not always well documented. You can make notes and annotations as you attempt to understand the code.

Feel free to change variable names or make formatting changes too – it may be best to get some pylint flags cleaned up first that deal with style. This can make ugly inherited code easier on the eyes.

Add assertions where they make sense – if you think the code works in a certain way, is there a way you can assert that? For instance “I think X and Y should be equal here… maybe I should add an assert that they are.” Assertions are your assumptions about code – if you add them to the code itself, you allow the program to check your assumptions.

As you understand the code more and more, you can take some larger steps to try to refactor it out into something testable.

The real lesson here is not to leave code this ugly for others to inherit! Remember to leave tests for others, maintain a sense of style with the linter, and over document! I’m giving you ugly code on purpose to help remind you why good-looking code matters. It’s a huge productivity benefit during maintenance, and maintenance is 90% of where you’ll spend your time coding.

For Mentors (And Coders Too)

Mentors – Ask your mentee about the live coding sessions and code readings. What questions did they have? What did they find interesting?

Review Checklist

  • Are types documented and MyPy clean?
  • Is test coverage at 100%?
  • Is it pylint clean 10/10?
  • Does the code use assertions?
  • Are the pylint docstring checks clean?
  • Is the documentation readable?
  • Does the code use good names?
  • Does the code make good use of white space?
  • Does the code have consistent and idiomatic style?
  • Does the code include comments?
  • Does the code use git hooks for pylint, pylint docs, and git commit lints?
  • Does the Readme explain what the code does and how to install and test the code?
  • Can the coder give a ‘guided tour’ using the debugger through one of their test cases?

May 9, 2017 | Uncategorized

Breakthrough Driven Development

Conveyor Belt Development

We all know the conveyor belt model of software development doesn’t work. Requirements don’t go in one end of the machine to be transformed into working software in a step by step fashion. This is the usual argument against most process – we’re not quite sure how we write software, so there’s no step by step process to do it right.

Still, we find ourselves moving back to the conveyor belt in one way or another. We measure effort in weeks, or SLOC or code points. We ask for status in terms of a certain percent “of the way there”.

Some argue that the only reason we do this is because pointy-haired-bosses refuse to give up on the dream of conveyor belt software, and stubbornly demand that development bend to their will and make their jobs easier.

I now believe this is false. The reason why people turn back to the conveyor belt isn’t because they believe it – it’s because they have nothing else.

We’ve taken our Luddite hammers to the conveyor belt and screamed “there is something more here!” when talking about software. “It’s not so easily captured in your charts and figures,” we argue. But when we turn back to the practicalities of working with many other teams – teams whose conveyor belts work just fine – we have no new machine to replace our broken belts with.

And so the belts are mended, and the machines turned back on.

And we dance around the point – our pointy-haired boss says “I get it, I get it, it’s not so easy as to say that you’re 50% done. But I have to tell my pointy-haired boss something. Is 50% okay?”

The more naive amongst us expect our overseers to gird their loins and go swinging into battle with their own overseers arguing against the metrics and belts and charts altogether.

They expect this, and then send their overseer into battle without any weapons.

To really destroy the belt, we have to offer up a replacement. And here is mine.

The Die is Cast

Software development is a stochastic process. Think of it like a game of D&D. The Dungeon Master tells you to defeat this enemy, you have to roll 16 or higher on a D20. That’s 16, 17, 18, 19 or 20 on a 20 sided die, giving you a 5/20 or 25% chance of victory.

Every day we come in and sit at our dev boxes, analyzing our designs and debugging our programs, we’re rolling a die. Will this be the day? Will we have a breakthrough today?

So much of software, from design to debugging as mentioned above, is pure random chance. There’s no process to assuredly find a bug, and there’s no process to assure a good design. Instead, we must try many hypotheses when finding a bug. We must try many designs in trying to find a good design. With each one, we get a certain chance of success.

This ends up looking deceptively like the conveyor belt in some ways but is distinctly different in others.

The Belt and the Die

First, let’s say we estimate 20 days for a project. The conveyor belt model says after one day, we’ll be 1/20th complete, while on day 19, we’ll be 19/20ths complete. This, we know, is false.

Under the rolling-a-die model, a 20-day estimate is the same as saying we need to roll a 20 on a 20 sided die. Given 20 days of rolling, we have a good chance of success – roughly 64%, and climbing with every extra day. So the estimate looks roughly the same. But look what happens on each day.

On the first day, we have a 1/20th chance of succeeding. On the second day, we have a 1/20th chance of succeeding. On the 19th day, we have a 1/20th chance of succeeding!

With the conveyor belt model, each day gets us closer to our goal. Under the D&D model, each day is just another chance to win. We can have early lucky breakthroughs, but we can also have projects that go on and on as the die stubbornly refuses to obey our wishes.
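The die model is easy to put into code. The 1-in-20 daily chance below is just this post’s illustration, not a measured number:

```python
def chance_of_breakthrough_by(days, daily_chance=1 / 20):
    """Probability of at least one breakthrough within `days` independent tries."""
    return 1 - (1 - daily_chance) ** days

# Any single day is still only a 5% shot...
print(round(chance_of_breakthrough_by(1), 2))   # 0.05
# ...but 20 such days accumulate to roughly a 64% chance overall.
print(round(chance_of_breakthrough_by(20), 2))  # 0.64
```

Each day is the same 5% shot; only the accumulation of chances makes a 20-day estimate sensible – and even then the project can run long if the die refuses to cooperate.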

Is all work using the D20 in software? No. Clearly breaking down projects into milestones allows us to take some conveyor belt approaches – first we need to open the portcullis (roll a 6 or higher on a D20), then we need to sneak into the castle (roll an 18 or higher on a D20), then we need to defeat the dragon (roll a 12 or higher on a D20).

With these breakdowns, we can say that someone fighting the dragon does seem in certain ways ‘closer’ than someone still outside the castle gates. But it’s not in the same way that a car with a paint job, interior work and wheels is ‘closer’ to being finished than just a frame. There’s still some chance that the fighter outside the gates gets lucky and defeats the dragon before the fighter already at the dragon does.

It’s less likely than our dragon fighter finishing first, but not impossible.
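We can even put rough numbers on those milestones. If each day is one roll at whatever stage you’re on, the expected days per stage is one divided by its chance – the per-stage odds below are just the D&D numbers above, nothing more:

```python
def stage_chance(min_roll, sides=20):
    """Chance of rolling min_roll or higher on one roll of a `sides`-sided die."""
    return (sides - min_roll + 1) / sides

def expected_days(stage_chances):
    """Expected total days when each day is one roll at the current stage."""
    return sum(1 / chance for chance in stage_chances)

# Open the portcullis (6+), sneak in (18+), defeat the dragon (12+):
quest = [stage_chance(6), stage_chance(18), stage_chance(12)]
print(round(expected_days(quest), 1))  # 10.2
```

Note where the time goes: most of the expected ten-ish days sit at the hardest stage (sneaking in at 18+), which is why a lucky early roll there can finish a “20-day” quest in a week.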

Summing Up

Creating a design is rolling a die. It has a chance of being good, and a chance of being bad. Hard projects tend to have more chances of failure than easy projects. But breakthroughs abound. Each project can be measured in some minimum number of breakthroughs it needs to succeed, and those chances of success can easily be turned into estimates. But in a breakthrough driven development model, having put 20 days into a 20-day project means nothing. It’s no closer to success than when it started. It’s still most likely 20 days away.

February 7, 2017 | Uncategorized

SYWTLTC: (AB) Chapter 3.4 Collaboration

The fourth way we maintain quality in our code is via collaboration with others.

Nose Blindness

Think back to growing up, when you visited your friends’ houses. Each of them had a particular smell, right? The kind of food cooked, any animals kept, preferred cleaning products and aromas used in candles, wall plugs or incense all gave each house a particular smell.

Except yours, right? Your house had no smell. It was always your friends’ houses that smelled like something.

Well, yeah, kind of. The problem was, you were so used to how your house smelled, you didn’t notice it.

Code Smells Too

Often there are attributes of code that aren’t outright wrong, but ‘smell’. It makes people think something ‘rotten’ is nearby. But not always – sometimes a smell is just a smell.

Linters can tackle a lot of code smells when there’s a hard and fast rule to apply to the code. For instance, mixing camelCase and snake_case for various naming schemes is a code smell that linters can catch. What’s it smell like? It smells like two people wrote code in the same module and didn’t talk to each other.

A linter might catch these things and tell you to fix them, and lo and behold, nearby the mixing of code cases, you might catch other issues due to the two coders not talking to each other. The smell leads to something rotten.

You ARE the Tool

Take note of the example above – you might think we used a linter to find a bug in code we’re maintaining, but the linter only gave us a hint as to where the bug was. It was our own eyes that found it.

In the above example, we’re maintaining code that two others wrote. By reading over the code, with some guidance provided by a tool, we spotted a bug using our own eyes and intuition. We basically collaborated with these former authors, even though we never met them, by analyzing the work they left behind and then changing it.

Because of the original authors’ nose blindness, they didn’t smell the code they were writing, and the error seemed more obvious to you. Simply putting another human in the loop found an error that neither the tests nor the linters nor the assertions found.

Another way to think about it is that human beings are error generation machines – they’ll write code and put in bugs. But they’re not very correlated with each other. In other words, the bugs I tend to write are different than the bugs you tend to write. So if we work together to spot each other’s bugs, we will only let the small minority of bugs that we both tend to write get through.

This chapter is really two parts – how do you prepare your code and designs for review, and how do you review others’ code and designs?

Part 1: The Elements of Style

“Programs must be written for people to read, and only incidentally for machines to execute.”

Harold Abelson, Structure and Interpretation of Computer Programs

Readability versus Maintainability/Extensibility

So we’re going to focus on readability here rather than other ‘ility’ statements. The list of ‘ility’s for every programmer is different, but here’s one list you can think about:

  • Readability – more on this below
  • Maintainability – how easy is the program to fix bugs in?
  • Extensibility – how easy is the program to extend and add new features to?
  • Understandability – This is basically ‘readability’ in the large. How easy is it to read one part of the code, keep that in my head, and read another part of the code? How ‘coherent’ is the design?
  • Testability – how easy is the code to test?

So Just Readability, Then

Generally, readability only applies to the code itself, while the rest can either apply to the code or wider design (how the whole system is structured).

We’re going to go with a quality and quantity approach here, in that you need to keep the ‘holistic idea’ of readability in the back of your head. “Is this code readable?” can’t be answered by a mere checklist.

That being said, we will go over a checklist. The checklist is necessary, but not sufficient for the code to be readable. In other words, code that violates statements below is most likely not readable. But code that doesn’t violate statements below is not necessarily readable – you should still consider your code holistically.

After all, the best measure of readability is to get someone else to read it. Even if you check all the boxes below, you need another person to walk into your house and tell you if they can smell the litter box.

The number one rule of readability is to write code as if you cannot comment or document anything.


Whitespace

Whitespace is literally the space between words and punctuation in your programs. Inconsistent use of whitespace can be distracting. Clumping too many characters together makes it hard to see the ‘atoms’ of a sentence.


Horizontal Whitespace

Horizontal whitespace is where you place spaces, using the space bar, on a single line of code. White space lets people see where words begin and end in writing, and works with punctuation to let you know when sentences end. In Python, we use the end of a line to say when a line ends. Other languages use semicolons.

Horizontal whitespace can be helpful in creating a sense of symmetry:

x=1
y = x+4
z =foo()
is going to be a little less readable than

x = 1
y = x+4
z = foo()


The horizontal whitespace above emphasizes the equals sign and emphasizes the structure of all three lines. The three lines are similar; however, in the first attempt, their similarity is thrown off a bit because each line ‘works’ a different way, and each line is a different length. Using white space has emphasized how they are similar – they are all assignment statements.

By emphasizing their similarity, we can very easily think about x, y and z being similar – they are all the variables being assigned to. And we can see 1, x+4 and foo() as being similar, they are all values that are being assigned.

All of this is clear from the top attempt as well, but you have to read each line and find the equals sign each time. White space allows your visual cortex to do that for you – no reading, no symbolic thinking. It’s all parsed out automatically for you to feed into your language and logic centers in your brain.

Another issue with horizontal white space is the fact that ‘nesting’ (using if statements, method definitions, and loops) tends to ‘shove you out’ four more spaces to the right with each layer.

if y:
    #first layer of nesting

if y:
    #first layer of nesting
    for x in z:
        #second layer of nesting

if y:
    #first layer of nesting
    for x in z:
        #second layer of nesting
        def func():
            #third layer of nesting

Compare the three code blobs above to see how nesting moves you to the right.

Nesting is supposed to move you to the right because nesting is textbook complexity. The more indented to the right your code is, the more complex it is. This is one major reason why whitespace can help your visual cortex identify complex code.

To reduce nesting, you can introduce helper functions:

if y:
    do_my_loop()
    create_my_function()

#somewhere else
def do_my_loop():
    for x in z:
        #what was the second layer of nesting is now only the first
        ...

def create_my_function():
    def func():
        #what was the third layer is now only the second
        ...

In the above, we were able to collapse a maximum of three layers of nesting into a maximum of two. But what else did we get?

By writing helper functions, we got to introduce a name, which means we make our code more ‘self-documenting’. Self-documenting means we use the parts of the language we define – i.e., the names of functions, variables, and classes, to refer to real English words that describe the system.

We also got to introduce a place to test, which makes our code more testable. If we’re trying to add assertions at the beginning and end of each function, we just introduced more opportunities to do that. Finally, we introduced more places to put doc strings to better document our functions without comments.

As we’ll see in the rest of this article, all of these are great things. The best things, believe me.


Vertical Whitespace

Vertical spacing is the white space introduced between lines. Generally speaking, you should only put one complete ‘thought’ per line. Python more or less forces you to do this, though other languages that use semicolons as the line ending can sometimes cram multiple things onto one line.

Using blank lines can be a powerful way to group like constructs. To use a similar example to that above, consider the following two ways of writing:

x = 1
y = x+4
z = foo()
a = bar(x, y, z)
b = baz(x, y)
c = a + b


x = 1
y = x+4
z = foo()

a = bar(x, y, z)
b = baz(x, y)

c = a + b

In the top, we have a whole bunch of statements. We can tell they’re assignments, but we’re going to have to read each one, line by line, to see what’s actually going on.

In the bottom, though, we see there are three separate steps to whatever is going on. The first step is similar assignment statements. The second step, a and b, both seem to be some derivative values of the original x, y and z statements. Finally, a third and final step combines a and b.

The addition of vertical white space allowed us to break apart the program for our reader and draw attention to bits of the program that should be thought of together – steps 1, 2 and 3, whereas the first attempt jumbled them all together.

You can also use vertical whitespace to ‘convert’ horizontal space, using the \ key. This tells python to ignore the end of the line, and assume that the line is continuing on to the next. Alternatively, anything already inside a “[” or “(” style list automatically doesn’t end until python finds the “]” or “)”.

For example, making clever use of vertical space can make horizontal complexity clearer:

def very_long_method_def(person1, person2, account1, account2, irs_rules):

converts to

def very_long_method_def(
        person1,
        person2,
        account1,
        account2,
        irs_rules):

This uses more vertical space, but now your eye is more clearly drawn to each argument to the function, rather than having them jammed altogether. We were able to do this because all the arguments are inside of a parenthesis list, so python ignores newlines until it finds the closing ).

The Limits of Space!

Often linters and text editors can set line limits on methods or character limits on lines – such as flagging any method that uses more than 20 lines or something like that. Often line limits are enforced as ‘logical lines’ – i.e., lines with white space removed. But you should think in terms of total lines on the screen, even if your linter doesn’t.

You shouldn’t have a method take up more than one screen’s vertical length. Ideally, your methods would be even smaller than a page length, because people usually like to have a terminal window and a handful of other windows open on a screen at a time.

Being able to see your entire method on the screen at one time keeps a visual exercise from turning into a mechanical one. If it takes more than one screen length, then to read the entire method, you have to use your hand to scroll up and down. You can’t easily cross reference code entirely on the screen at one time, and you end up having to keep certain facts in your head.

If you have all the code on one screen at a time, that means you can use the screen as the tool it was meant to be used for and not try to remember anything – just read the method and use what’s on the screen to figure out what it’s doing.

Likewise, we have horizontal line lengths to allow us to have more than a single window open at a time (especially useful during peer reviews). Long lines also tend to get really hairy to read and figure out what they’re doing.

Your linter should enforce these limits for you. But one thing you must not do when hitting character or line lengths is remove white space! The white space serves a very valuable purpose!

When you properly use white space to ‘expand’ your code and use a linter to ‘limit’ its expansion, you have some pretty good heuristics on when you need to refactor code to make it less complex.

Naming Things

There are two hard problems in computer science: cache invalidation, naming things, and off by one errors.


You will often get a ‘feel’ for white space rules. Again, your visual cortex is going to tell you what’s complex and what’s not.

The other weapon you have in your arsenal is your ability to name things. And this is a very, very hard problem.

Motivating Example

Compare these two code blobs:

def calc_f(t1, t2, num):
    if t1.count > num:
        t1.count = t1.count - num
        t2.count = t2.count + num
    else:
        print("Warning, not enough funds!")

def transfer_funds(transferer_account, transferee_account, amount):
    if transferer_account.value > amount:
        transferer_account.value = transferer_account.value - amount
        transferee_account.value = transferee_account.value + amount
    else:
        print("Warning, not enough funds!")

The only difference between those two blobs is names.

Nouns and Verbs

First, variables should nearly always be a noun form. That is, they should be a ‘person’, ‘place’ or ‘thing’.

Methods/functions should nearly always be a verb form. They should be an ‘action’.

Methods and variables should try to be as simple as possible – one word if possible. The more words you add to a name, the more complex it is. When we get into object orientation, we’ll find more and more tricks to ’embed’ names into classes and objects, turning code that looks like this:

def transfer_funds(transferer_account, transferee_account, amount):

into code that looks like this:

class Account:
    def transfer(self, transferee: "Account", amount: Cash):

That may not look like much now, but keep in mind there’s a lot of other code that will live in the Account class, and so on. The code will somehow be tighter, shorter, and more readable.

Your names should also be as concrete and specific as possible. Abstract names like “sensor” are almost always worse than more concrete ones like “radar”, or even “topRadar” if there are two of them.

However, to the above point, every word you use in a name expands its complexity. Each word should carry some weight. If there was only one radar, “topRadar” would be redundant, and “radar” would be a better name.

Almost Always Bad Names

Here’s a list of names that you should almost always avoid:

data, handler, handle, manager, mgr, object, obj, stuff, number, num, x,
y, z, foo, bar, baz, func, i, do, calc, calculate, perf, perform

I use these all the time in my examples explicitly for the fact that I’m talking about code structure and names don’t matter. If I had used good names, you might have gotten distracted into thinking that ‘foo’ actually did something.

Any variable name (noun) that ends in ‘er’ is also usually bad.

runner, doer, builder

These names aren’t going to carry much information to your reader, and are often signs that someone didn’t really think through the name they were using. If they didn’t think through the name, what else did they not think through?

Names that have logical words inside them, like “and” or “or” are also right out.

accountAndUser #this is a bad name and it should feel bad.

Avoid acronyms as well, as no one ever really has an acronym dictionary on hand when reading your code.

Avoid ‘Hungarian notation’, that is, using clever encoding schemes to tell you something about the variable such as “n_foo” to let you know that foo is a number. Let the language do that for you.
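In Python, type annotations – which MyPy checks – can carry the information Hungarian prefixes try to smuggle into names. A quick sketch (the names here are invented examples):

```python
# Hungarian notation hand-encodes the type into the name:
#     n_retries = 3
#     s_greeting = "hello"

# Letting the language do it: annotations that MyPy can actually verify.
retries: int = 3
greeting: str = "hello"

def repeat(message: str, times: int) -> str:
    """MyPy flags repeat(3, "hello") as a type error; a name prefix never could."""
    return message * times
```

The prefix is a promise no tool enforces; the annotation is a promise MyPy checks on every run.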

Named Parameters

Python is a language that allows named parameters, which really help with readability.

Let’s take our transfer_funds function above, and call it with named parameters.

transfer_funds(transferer_account=bob, transferee_account=sam, amount=500)

Named parameters allow a reader in the future to not need to look up the definition of a function to have a loose understanding of what’s going on. Let’s say you haven’t seen the transfer_funds definition in a few months, and you happen upon:

transfer_funds(bob, sam, 500)

So… which is it? Did bob transfer 500 to sam? Or did sam transfer 500 to bob?

The Ubiquitous Language

This is an idea from a design methodology called Domain Driven Design. We’ll cover a little bit of it later, but the basic idea is this: if you are having a conversation with your client or colleague and you keep describing your problem by using certain words, those words should be in your system somewhere.

If you’re doing business software, and you talk to your clients about ‘accounts’, are there ‘account’ variables and classes in your code? If you’re doing audits, does the word ‘audit’ have meaning in the code? Is it a function you can run on an account?

Borrowing language from your business domain is a great way to get inspiration for names. It also serves as a design check – if you can’t think of a name for the variable or object you just made that uses something from your domain… maybe it’s a bad variable or object?

The name is telling you that what you’ve created is intrinsically confusing. It’s a good heuristic to use to take a step back and see if you can’t come up with a less confusing way to solve the same problem.
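As a purely hypothetical sketch, if ‘account’ and ‘audit’ keep coming up in client conversations, those exact words can live in the code:

```python
class Account:
    """Named with the same word the client uses in conversation."""

    def __init__(self, transactions):
        self.transactions = transactions

    def audit(self):
        """'Audit', straight from the domain: flag unusually large withdrawals."""
        # The 10000 threshold is an invented example, not a real business rule.
        return [t for t in self.transactions if t < -10000]
```

Now a sentence like “audit Bob’s account” translates almost word for word into `bobs_account.audit()` – the code teaches the reader the domain as they go.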

Back to Readability

What’s left after the above checklist?

Readable code should be self-documenting. While we’ll get into documentation in a bit, you should always write code as if you could not comment, and could not write doc strings. How do you embed what problem you’re trying to solve in the names you get to chose in your code?

Readable code should teach the reader something about the domain. This ‘domain’ is back from the Ubiquitous Language idea – basically, reading well-written code should be (ideally) the most efficient way for a reader to understand the domain. If someone wants to know how you calculate taxes, showing them your algorithm should be the most effective way to do it.

To this second point, often you don’t even need to write code to run. But you can write it to get an idea down, make it rigorous, and then show someone else. Think about proposing a new process at your job – you can model the problem in code, write some examples as test cases, and see whether your code spits out the output you want. If it does, the code itself should be the best way to show someone else your new ideal process.

Idiomatic and Consistent Style Aids Peer Reviews

Above all, when writing readable code, you should write it consistently. Inconsistency is a red flag that forces a peer reviewer to read more closely – it slows them down and makes them consciously parse things their pattern recognition would otherwise handle for free.

A peer reviewer only has so much mental ‘gas’ before they move on and do other things. Depending on your environment, they may say that you can’t submit your code because they didn’t have the energy to review it, or worse, approve your code without really reading it because they didn’t have the energy to do so.

If they just give you a rubber stamp on your peer review, then what’s the point?

Using consistency helps your reviewer be as efficient as possible with their mental gas. Using idioms in the language (like named parameters in python) that a python programmer would expect helps people identify patterns. Once they see patterns, they know what kind of aberrations to look for. If your code doesn’t fall into a set pattern, then they have to read it slowly, line by line, trying to keep it all in their head.

They’ll give up.

What is idiomatic, by the way, depends on the language, but you can Google ‘idiomatic javascript’ or ‘idiomatic C++’ to start getting some ideas.

Style Guides – we’re using a linter here

One final note that we’re mostly talking around is the idea of a style guide. This is a written document some teams have that defines how they’re going to use white space, naming schemes, and other rules that really don’t help or hurt readability unless they’re inconsistent.

We’re using PEP8 here, as enforced by Pylint. You can skim over PEP8 here, but we aren’t going to enforce things that Pylint won’t.

Part 2: Documentation

The other side of making code readable is just smacking on some English that goes along with the code. This is called ‘documentation’.

There’s a lot of different kinds of documentation that might go along with software. Design documentation notably will be absent in this discussion. We’re talking about code documentation, of which there are two main kinds – explicit documentation (doc strings, commit messages and readmes) and implicit documentation (comments).

The number one rule of documentation is: Document as if no one has access to your code.


Doc Strings

The first kind of documentation we’re going to talk about is Python’s support for doc strings. Doc strings are pretty intuitive in terms of documentation and offer a few benefits.

First, they’re embedded with the code they document. The first line of any function or module you write can be nothing but a string – Python automatically interprets this string to be the ‘doc string’ of the function. This means the documentation is right by the code, making it easy to cross-reference the two.

def function(x, y, z):
    """An example docstring."""

Second, doc strings are what Python uses to resolve the ‘help’ command in the REPL. Calling help on any object or function will, in turn, give you the doc string for that object or function. So this is a very convenient way to access doc strings on the fly while trying to prototype stuff.

>>> help(function) #to call the help function
# displays "An example docstring." in the help pager

What goes in a doc string?

The first line of a doc string is usually a one line brief explanation of what an object represents, or what a function does.

You can add more detail in lines to follow.

Finally, you can give more documentation about the arguments themselves and the return value of a function.

def function(x, y, z):
    """An example docstring.

    I might add some more detail here.

    Args:
        x: (int) What X is, including type
        y: (float) What y is, including type
        z: (string) What z is, including type

    Returns:
        float: Description and type of the return value
    """
The full standard can be found here for your reference.

Pylint Docstring Checker

Pylint actually comes with a linter for your doc strings to ensure they follow a format similar to the above. You’ll need to add this extension to this and all future projects.


Doctests

Doctesting is a pretty neat idea seen predominantly in Python and another language called Elixir. The idea is that you can embed tests into your documentation. The doctest python module can read your doc strings, pull the tests out, and execute them as unit tests.

The benefit of embedding tests in documentation is two-fold.

First, it ensures documentation doesn’t get out of sync with the code. If the documented test fails, it means it doesn’t reflect the code properly and needs to be updated.

Second, well-written tests are often some of the best forms of documentation – they serve as examples on how to run your code. People unfamiliar with your library will often cut and paste tests similar to what they want to accomplish, and then change them until they do what they want.
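A minimal doctest looks like this – the interest function is an invented example, and the >>> line in the doc string doubles as a unit test:

```python
def add_interest(balance, rate):
    """Return the balance after applying an interest rate.

    >>> add_interest(100.0, 0.5)
    150.0
    """
    return balance * (1 + rate)

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # reads the doc string above and runs the >>> example
```

If someone later changes the function and the documented example no longer prints 150.0, the doctest fails, forcing the documentation and code back into sync.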

If you’re interested, check out the module and examples here. It’s not required, though.

GitHub Commits

A form of documentation that actually happens outside of Python is what you actually put in your Github commit messages. Every chance you have to put a message out is a chance to communicate intent to some future maintainer. Github commit messages are a fantastic pure English decoration to the patches and diffs that go along with the commits.

A GitHub commit message that is either too convoluted or too wordy means you probably did too many things in your commit at once, and if you can, it’d be better if you figured out a way to break down the commits.

Github commit messages ought to be in the imperative, i.e., “Fix web page bug” rather than the past tense, “Fixed web page bug”. This makes reading git logs – which print out all the commit messages – a little more intuitive.

Github commit messages should be short and to the point. You can make longer ones, but make sure the first line of the message – like a doc string – is a brief and to the point description of what kind of work was done.

You’ll be installing this git commit linter to help you police your commit messages.

Another big issue to keep in mind in terms of GitHub commits and keeping things readable is that the smaller commits are, the more readable the changes tend to be. Keep your commits small, so that a reader can read through them in order if they like.


A README.md file is important in a GitHub project. This is because GitHub will display that file as the text on the web page when exploring a repo. Every directory you move into can have its own README.md that GitHub will display.

We’ll only require a README at the root, but an effective readme should mention a few things:

  • What the code does – what problem does it solve
  • How to install the code
  • How to run the code’s tests
  • Links to any docs, issue trackers or mailing lists

In addition, a section can be added on design rationale if warranted.

The md extension indicates the file uses Markdown syntax, which GitHub supports.

Inline Comments

As opposed to explicit documentation – doc strings, GitHub messages and readme files – inline comments exist entirely within the code and are considered implicit documentation.

Comments in python are anything after a # symbol.

If doc strings are for the help function and let readers know what your function can and cannot do, and commits are to help someone follow along with the changes to the code base, and Readmes are the highest level documentation you write for your project, what are comments for?

Docstrings are for the “what” something is and does. 

Code is for the “how” something does it.

That means that comments are for the “why”.

What do you mean, why?

There are a few reasons why you might want to put in a few comments on why the code is written the way it is:

  1. If there’s a known issue or bug – you can comment near the issue or bug, and describe what the workaround is.
  2. If there’s a more obvious way to write some code, but you had to write it in a way that was more performance oriented and unreadable. Explaining why the code is unreadable, and what it does, then falls to the comments.
  3. If there’s an interesting design tidbit on why the code is written the way it is, it can be in the comments.
  4. If there’s a requirement that’s unintuitive that changes the way code might work, put it in the comments near that code.

Notably, comments are not to describe what code is doing – unless the code itself can’t describe it. Then comments can describe what the code is doing, but only because it needs to explain why the code is not ‘self-documenting’.
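For instance, a comment like the one below explains a “why” that the code alone can’t (the function and its approach are made up for illustration):

```python
def total_cents(prices):
    # Why integer cents? Summing the floats directly is the more obvious
    # approach, but floats accumulate rounding error on money values,
    # so we convert each price to integer cents before adding.
    return sum(round(price * 100) for price in prices)

print(total_cents([0.10, 0.20, 4.70]))  # 500 cents, i.e. 5 dollars
```

Without the comment, a reader might “simplify” this back to `sum(prices)` and reintroduce the rounding bug.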

Dangers of Documentation

That’s quite a bit of information about documentation, and it makes it sound much more rigorous than it is. What I haven’t talked about is the downsides of documentation, and there’s one major one:

Nothing can make sure that the documentation is in line with the code.

This is a ‘hard’ problem – and it requires peer reviewing humans to do it. And even those humans usually hate it. Finding documentation that’s out of date with the code is one of the most notorious problems in software development.

Tools like doctests and peer review can help. But the main way to make sure documentation doesn’t get out of date with the code is to not write it at all. The more intuitive your code itself is, the less documentation you need to write. That’s not permission to skip writing docs.

Instead, what I’m saying is that if you write your code to be as self-documenting as possible, then when you’re trying to write docs, you should have to struggle a bit. You should have trouble finding ways to describe what the system does that don’t just more or less restate the code itself.

Documentation Generators

In addition to being used by the help function, doc strings can also be gobbled up by documentation generators. These programs run over your entire code base and compile all the doc strings into a single HTML listing – automatically creating cross references and other supporting material for you.
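For example, one such generator, Sphinx, can read field-style doc strings like the one below and render them as formatted, cross-referenced HTML. The `send_email` function itself is hypothetical – only the doc string format matters here:

```python
def send_email(recipient, subject, body=""):
    """Send an email to a single recipient.

    :param recipient: the address to deliver to
    :param subject: the subject line of the message
    :param body: an optional plain-text message body
    :returns: True if the message was queued for delivery
    """
    # Stub body for illustration only - a real implementation would
    # talk to a mail server here.
    return True
```

A generator would turn each `:param:` field into a formatted argument table in the HTML output.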

Often, if you’re creating some open source software, you might host some website that allows quick perusing of this automatic documentation for people using your software. For example, most of these generators create indexes that allow easy searching of the docs, so that users of your software can quickly find some function they might need.

We won’t be using these for now, but it’s worth knowing they exist.

The Three Forms of External Docs

In the above, we described what documentation you might be expected to produce. But what kinds of docs are most worthwhile to you to read?

Documentation generated by third parties comes in all shapes and sizes. But currently, there are only three ‘best’ forms of documentation that you’re going to run across. They tie in tightly with some of the forms you’ll be asked to produce.

Note, below, we don’t actually mention the code, but the code is also one of the better places to look. If it’s readable and well commented, the code itself can often tell you exactly how something works. It is not a very efficient way to get started, as tutorials and reference guides listed below give a much higher level view. Sometimes, though, all you have is the code.

The “How To” Guide / Tutorial / Cookbook

How-tos are basically bits of code other people write to show you how to use a library to do something. This is the exact same thing as a well-documented test.

A test, like a doctest, in your code shows an example of how to get something done. Amending this test with some explanatory English about why things work the way they do finishes out a well-documented doctest.
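A minimal well-documented doctest might look like this (`slugify` is a made-up helper):

```python
def slugify(title):
    """Turn a post title into a URL slug.

    We lowercase first so slugs stay canonical, and use hyphens because
    they're the conventional word separator in URLs:

    >>> slugify("Fix Web Page Bug")
    'fix-web-page-bug'
    """
    return title.lower().replace(" ", "-")

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # runs the example in the doc string as a test
```

The example doubles as a how-to: a reader sees an input, the output, and a sentence of “why” all in one place.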

How-to guides on the internet can come in blog form, but you can also find them in generated docs. Finally, you can find a form of how-to in a third-party library’s unit and integration tests. Read their tests to see how to set the code up to do something, and then, from that initial how-to, keep amending the example code until it does what you want it to.

The “Reference” Guide

These are docs like the Python Reference: often large, autogenerated web pages that draw heavily on doc strings as well as other hand-written and hand-edited language.

The reference guide is used to show you the potential of functions and objects you have to play with. It’s not meant to be read from snout to tail, but instead, to be drilled into deeply for one or two subjects and jumped around in.

While an example or tutorial might tell you about the existence of the ‘send_email’ function, it’s the reference guide that will tell you about all of its arguments and assumptions.

Speaking of which, the reference guide makes great use of your well-written doc strings. It expounds on every argument, giving details on the type and assumptions (assertions).

If you start exploring a library by copy-pasting one of the examples you find online, your explorations at that point will be diving into the reference guide to get details on all functions and objects used in your copy-pasted example.

The “Mailing List” / Design Discussion / FAQ

Finally, you have forums like Discourse, mailing lists, and Stack Overflow. You’ll find your Google searches leading you to these places quite a bit, and that’s for two reasons.

One, documenting every part of a system is incredibly expensive. And since the system is always evolving and changing, documentation can get out of date very quickly.

Mailing lists and forums allow a ‘profile guided’ approach. That means that since we don’t know what to document, and we can’t document everything, it’s best that we just document the things people have questions about. So mailing lists usually wait for the questions to come in.

The second major reason you might use mailing lists is that they’re the only way to get at the most common form of design documentation. Design docs – that is, the discussions the implementors had amongst themselves – are rarely formalized. And when they are formalized, they’re often so artificially created that you don’t really see how the sausage is made.

By going back into the mailing lists, you can see the many arguments had over every tiny detail, and it gives a background and context to every line of code. If you’re stuck maintaining something, you have only a few sources of information to go off of – the comments, if they added any, the code itself and any associated docs, and the mailing list.

Peer Reviews

Back to the example at the beginning: a peer review is when, once you believe some code is done, you intentionally bring someone else in to read through it, attempt to understand it, and find issues with it.

Benefits of Peer Reviews

Peer reviews have many benefits beyond mere quality, although that’s one of them: per hour invested, peer reviews are the most effective means of reducing defects. They are about four times faster at finding bugs than testing and can find about twice as many bugs overall.

Peer reviews also spread knowledge in two ways. First, a junior engineer having her code reviewed by a more senior engineer will learn new techniques and things to watch out for from the peer review. Going the other way, a senior engineer having her code reviewed by a junior engineer will help the junior engineer learn about what’s considered idiomatic and sound in that shop.

Peer reviews spread knowledge about requirements and design themselves, rather than just knowledge of how to build things. You may not know anything about what another team is doing, but if you review their code, you’re given some insight into what kinds of things they’re up to. In this way, peer review can help knock down silos.

Comparisons to Other Tools

Compared to most other quality tools, peer reviews are the best at finding logical bugs – i.e., issues where the design doesn’t reflect the requirements (the code works, but it doesn’t do what it’s supposed to). They’re also the best at finding issues with the maintainability and readability of code. If you want to know how hard your code is to read, ask someone to read it.

Linting can be thought of as an automatic peer review by a very rigorous, yet dumb, junior engineer. It’ll spot every violation of a rule set but won’t spot anything outside of that rule set.
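As a toy illustration (the code below is deliberately flawed), pylint would flag the unused variable, but only a human reviewer can notice that the math itself is wrong:

```python
def mean(values):
    x = 0  # a linter flags this unused variable...
    # ...but it has no idea the denominator below is off by one.
    return sum(values) / (len(values) - 1)
```

The linter checks rules; the reviewer checks meaning.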

It actually helps a human peer reviewer to lint a code base first to remove the obvious issues. This is for two reasons. Firstly, a human reviewer can be confident that certain stylistic issues don’t exist at all, and can spend their mental energy elsewhere. Secondly, code that is somewhat consistent and idiomatic beforehand (following rules a linter can enforce) allows the reader to exercise their visual cortex in the reviews. Does certain code simply ‘look’ wrong?

For example, if you’re using consistent style everywhere, and one part of the code just ‘looks’ complex, it means the reviewer might spend more of her time there. Complex code tends to have more issues than simple code. Consistent styling usually helps make complex code ‘look’ complex, and simple code ‘look’ simple. This helps the reviewer know where to spend her time.

Testing and test driven development can help ensure that your code does what the tests say it should do. As we mentioned in other sections, tests can’t do anything to guarantee that the code doesn’t do what it shouldn’t. But assertions and types will help us there.

The problem with testing that peer review can help with is two-fold. First, do the tests actually reflect the requirements? Are there tests that should be in your test set that you missed, or are tests you wrote actually out of line with what the software is ‘supposed’ to do? Did you accidentally write the wrong software because you had the wrong tests?

Secondly, is the code being reviewed ‘testable’? This is one of those ‘ilities’ (testability, maintainability, readability, etc…) that peer review attempts to measure. A peer reviewer can tell if the code is testable by taking an internal measurement of their own “Rage Factor” when they ask themselves the question “if I had to test this code, how would I do it?”

With assertions, similar to testing, a peer reviewer can judge both when an assertion makes sense and is in line with expectations, and when an assertion may need to be added. Assertions also make code more readable by making assumptions explicit, so that a reviewer can assume certain things are impossible.
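A sketch of what that looks like in practice (`apply_discount` is invented for illustration):

```python
def apply_discount(price, percent):
    # These assertions make the function's assumptions explicit: a
    # reviewer reading past them can take for granted that negative
    # prices and out-of-range discounts are impossible below this line.
    assert price >= 0, "price must be non-negative"
    assert 0 <= percent <= 100, "percent must be between 0 and 100"
    return price * (100 - percent) / 100

print(apply_discount(200, 25))  # 150.0
```

In a review, a missing assertion here (say, on `percent`) is exactly the kind of gap a reviewer should call out.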

Finally, there are types, which we’ll get into next section. As an addendum to tests and assertions above: when any of these techniques are used for design, peer review – and only peer review – can effectively say whether the code’s design as a whole is maintainable and extensible at a higher level.

Effective Peer Reviews

There are a few ways to make peer reviews more successful.

First, be familiar with the requirements. Knowing what problem the developer whose code you are reviewing is trying to solve really helps you figure out whether they’re actually solving it.

Second, like requirements, be familiar with the design. Design, in this case, means a broad overview of the overall approach the developer took in solving her problem. Design also means that you need to be familiar with the systems the code changes touch – not just the lines of code that are changed themselves, but also, lines of code near them. If a function changes, who calls that function, and how is that caller impacted by the change?

Third, recognize that many people take great pride in their work, and a peer review brings a bit of a feeling of nakedness. Emotions will run high. Give critical feedback while being as generous to the developer as possible. Understand they want high-quality code that gets the job done, too, and that the reason they don’t see issues isn’t because they’re a hack, or rushed, or a bad programmer – it’s because of nose blindness. Give others feedback the way you’d want to be given feedback.

Fourth, give concrete and specific advice. You can recommend solutions or workarounds, but only recommend these to provide clarification on what problem you see. For instance, instead of saying “Rewrite X as Y”, you should say “X seems to run afoul of [some problem]… if you were to rewrite it as Y, you’d avoid this problem.”

That phrasing leaves things open-ended so the developer can fix things in a way she sees fit, and folks don’t get bogged down comparing different solutions when they should just be agreeing on whether or not problems exist.

Fifth, similar to the third point, don’t be offended by others’ critical feedback. Ask questions, and in general, assume they’re right – it buys a lot of capital to implement others’ suggestions, and it’s good practice in ego-less coding. If you disagree with them but recognize that what they’re suggesting won’t actually break anything, offer to do the work anyway. Separate yourself from the product of the work not because you aren’t part of the work, but in recognition that the final product is a group effort.

Sixth, track requested changes. Bugs might get spotted early, but if no one makes sure they’re fixed, the peer review was worthless.

Seventh, offer stylistic and design feedback. Don’t just look for bugs. Obvious bugs are usually hard to find by the time you get to peer review – instead, look for things that irk a human (something you should be good at) and give feedback on that. Testing, linting, typing and assertions all can’t fix maintainability and readability issues. Humans pointing them out can.

Eighth, seek feedback in your own peer reviews from multiple sources. Generally speaking, the more familiar people are with the requirements or design, the better the peer review will be. Likewise, someone who’s a language expert might offer good peer review comments from a different angle.

Ninth, focus on things that other tools can’t find. Peer review the tests, as the tests can’t be tested. Peer review the assertions. Peer review the documentation and comments – are they readable, is there enough of them, or too much? Are test cases or assertions missing?

Tenth, think about instituting a coding standard. Having a checklist of things that a linter can’t find, on top of your normal toolchain, can help focus your efforts and structure your thoughts during peer reviews. You can amend and remove rules from this checklist as you see patterns emerge in your code base.

Pair Programming

We won’t really get a chance to do much of this during the modules; however, pair programming takes peer reviews “to the extreme”. In a pair programming situation, you and your partner are given a single computer, and you both design and code together.

The peer review happens ‘live’, as you’re coding. Similarly with the design review.

Pair programming can really do wonders in some of peer review’s strengths – knocking down domain silos, and improving quality even more than traditional peer reviews can. Pairing juniors with seniors tends to work well, as does pairing people from different technology backgrounds. Diversity is better: pairing two nearly identical people won’t bring much extra perspective into the design process.

Pair programming can also be pretty expensive, since you don’t really double the output of a single coder by pairing with her. But output is not the only measure of success – if your long-term productivity goes up, or at least stays flat as your system grows, then you’ve avoided the pain of silos, and that alone can be worth the short-term productivity hit.

Code Challenge

For Coders

Live Coding Sessions

From here on out, I’m going to try and find a live coding session (or at least a video of one) for you to watch of another coder. Don’t feel bad if you can’t keep up, and don’t feel like you need to watch and rewatch until you understand every little bit.

The point of watching others code is to get a feel for different work styles, see what tools others use, and understand their mental process. It’s to feel okay Googling things and hitting issues, because that’s what happens to everyone. It’s as good a substitute for pairing with a senior as you’re going to get outside of actually pairing with a senior.

Try to keep a notepad nearby and write down questions you have in the notepad. Then after the session, try to answer each of your questions through good Googling.

Check out this one here.

You do not need to know everything that’s going on, or fully understand. Use it as food for thought on things to Google, and follow along as close as you can.

Try to think of a few questions or comments on the live code challenge for your mentor.

Try to peer review others’ code

The other thing that’s changing after this chapter is that you’re the mentor now. You need to go out and find a mentee and help them get through Code Combat and these Chapters. Be their peer reviewer, let them ask you questions, and try to teach them. This will help you play the role of a senior engineer to someone else, and help you realize how dumb you are when you think you know something and then try to explain it to someone else.

Recognizing you’re dumb is the first step to getting smarter.

If your own mentor has code available that can be peer reviewed (be careful about cheating and seeing solutions to future chapters), read that too. Alternatively, you can find someone’s open source project and help them with pull request reviews, or perform an audit and simply read their code for errors and suggest corrections.

We’re going to try and give you some code reading to do each chapter from here on out as well, in addition to the code writing challenge.

For this time around, familiarize yourself with the howdoi open source project. Specifically, read and try to understand how the main module works.

Expect to spend about one hour per 100 lines of code. Feel free to download the project and tinker with it. Doing a ‘guided tour’ of the code – i.e., running the debugger and just stepping through each thing – is a great way to familiarize yourself with a new code base.

Have some questions and comments ready for your mentor. What did you like about the code base, and what didn’t you like?

Movie Ticketing System

You’re inheriting some old code that helps manage an electronic ticketing system for movies. The person who began work on the code wasn’t very good at writing readable code – he claimed it gave him ‘job security’. Unfortunately, it didn’t work and he was laid off.

The code needs to provide four main functions:

  • Tickets need to be purchased. This should debit the cost of the movie from a moviegoer’s account in the moviegoer database, add a ticket to that moviegoer’s account, and remove a ticket from the tickets database. This should return false if the moviegoer doesn’t have enough funds.
  • Tickets need to be refunded. This should credit the cost of the movie to a moviegoer’s account in the moviegoer database, remove the ticket from the moviegoer’s account, and add that ticket back into the tickets database. Basically, the opposite of purchasing.
  • Tickets need to be consumed – basically, a ticket needs to be removed from a moviegoer’s account. This will be automatically called by the system, so you only have to write the function that removes the ticket and not worry about when it’s called.
  • Finally, an auditing function needs to be written that will pretty-print both the moviegoer and tickets databases.

The cost of a movie is 5 dollars.
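The code you inherit will define its own names and data structures, but a purchase function satisfying the first requirement might be shaped roughly like this sketch (every name and the dict-based database layout below are invented, not the exercise’s actual code):

```python
MOVIE_COST = 5  # dollars, per the requirements

def purchase_ticket(name, moviegoers, tickets):
    """Debit MOVIE_COST from the moviegoer and move one ticket to them."""
    account = moviegoers[name]
    if account["funds"] < MOVIE_COST:
        return False  # the requirements say to fail on insufficient funds
    account["funds"] -= MOVIE_COST
    account["tickets"] += 1
    tickets["available"] -= 1
    return True

moviegoers = {"alice": {"funds": 12, "tickets": 0}}
tickets = {"available": 3}
print(purchase_ticket("alice", moviegoers, tickets))  # True
print(moviegoers["alice"])  # {'funds': 7, 'tickets': 1}
```

Refunding is the mirror image: credit the funds, decrement the moviegoer’s tickets, and increment the available count.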

You can find the code you’re inheriting here. The original author said he was in the middle of writing the purchasing function. You should use the following process:

  1. The original author has a test that he couldn’t get to work – diagnose the problem and get the test passing. Commit with a good commit comment.
  2. Pylint up the code base, and commit this with a good commit comment.
  3. Make the existing code more readable based on the principles above, ensure things are pylint clean and tests run. Commit with a good commit comment.
  4. Write a test for one of the other four functions. Make sure all other tests are successful, pylint is clean, and your new test is readable. Commit with a good commit comment.
  5. Write the function that satisfies the test. Make sure all tests are successful, pylint is clean, and your new function is readable. Commit with a good commit comment.
  6. Repeat 4-5 for the other functions.
  7. Submit your code for peer review from your mentor.

In addition to the above, you’ll be using pylint’s documentation checker and will need to show your mentor how you set that up, as well as the git commit linter.

For Mentors (And Coders Too)

Talk to your mentee about the live code session here and about the code reading they had to do. They should have a question or comment on each of them.

Ensure your mentee has the pylint documentation linter and git commit linter hooked up.

Use the following checklist on the final code; in addition, cross-check that the process the mentee used (based on commit history) was the one outlined above.

Review Checklist

  • Is test coverage at 100%?
  • Is it pylint clean 10/10?
  • Does the code use assertions?
  • Is the pylint doc-string check clean?
  • Is the documentation readable?
  • Does the code use good names?
  • Does the code make good use of white space?
  • Does the code have consistent and idiomatic style?
  • Does the code include comments?
  • Does the code use git hooks for pylint, pylint docs, and git commit lints?
  • Does the Readme explain what the code does and how to install and test the code?
  • Can the coder give a ‘guided tour’ using the debugger through one of their test cases?


January 24, 2017