Conveyor Belt Development
We all know the conveyor belt model of software development doesn’t work. Requirements don’t go in one end of the machine to be transformed into working software in a step by step fashion. This is the usual argument against most process – we’re not quite sure how we write software, so there’s no step by step process to do it right.
Still, we find ourselves moving back to the conveyor belt in one way or another. We measure effort in weeks, or SLOC or code points. We ask for status in terms of a certain percent “of the way there”.
Some argue that the only reason we do this is that pointy-haired bosses refuse to give up on the dream of conveyor belt software, and stubbornly demand that development bend to their will and make their jobs easier.
I now believe this is false. The reason why people turn back to the conveyor belt isn’t because they believe it – it’s because they have nothing else.
We’ve taken our Luddite hammers to the conveyor belt and screamed “there is something more here!” when talking about software. “It’s not so easily captured in your charts and figures,” we argue. But when we have to turn back to the practicalities of working with many other teams – many of which do have conveyor belts that work just fine – we have no new machine to replace our broken belts with.
And so the belts are mended, and the machines turned back on.
And we dance around the point – our pointy-haired boss says “I get it, I get it, it’s not as easy as saying you’re 50% done. But I have to tell my pointy-haired boss something. Is 50% okay?”
The more naive amongst us expect our overseers to gird their loins and go swinging into battle with their own overseers, arguing against the metrics and belts and charts altogether.
They expect this, and then send their overseer into battle without any weapons.
To really destroy the belt, we have to offer up a replacement. And here is mine.
The Die is Cast
Software development is a stochastic process. Think of it like a game of D&D. The Dungeon Master tells you that to defeat this enemy, you have to roll 16 or higher on a D20. That’s 16, 17, 18, 19 or 20 on a 20-sided die, giving you a 5/20, or 25%, chance of victory.
Every day we come in and sit at our dev boxes, analyzing our designs and debugging our programs, we’re rolling a die. Will this be the day? Will we have a breakthrough today?
So much of software, from design to debugging as mentioned above, is pure random chance. There’s no process to assuredly find a bug, and there’s no process to assure a good design. Instead, we must try many hypotheses when finding a bug. We must try many designs in trying to find a good design. With each one, we get a certain chance of success.
This ends up looking deceptively like the conveyor belt in some ways but is distinctly different in others.
The Belt and the Die
First, let’s say we estimate 20 days for a project. The conveyor belt model says after one day, we’ll be 1/20th complete, while on day 19, we’ll be 19/20ths complete. This, we know, is false.
Under the rolling-a-die model, a 20-day estimate is the same as saying we need to roll a 20 on a 20-sided die each day. With a 1/20 chance per day, the expected time to success works out to exactly 20 days, so the estimate looks the same. But look what happens on each day.
On the first day, we have a 1/20th chance of succeeding. On the second day, we have a 1/20th chance of succeeding. On the 19th day, we have a 1/20th chance of succeeding!
With the conveyor belt model, each day gets us closer to our goal. Under the D&D model, each day is just another chance to win. We can have early lucky breakthroughs, but we can also have projects that go on and on as the die stubbornly refuses to obey our wishes.
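The arithmetic here is easy to sketch. Assuming each day is an independent 1-in-20 roll (a simplifying assumption), the chance of having succeeded by day N is one minus the chance of failing every single day:

```python
def chance_of_success_by(day, sides=20, target=20):
    """Probability of at least one winning roll in `day` independent tries."""
    per_day = (sides - target + 1) / sides   # e.g. 1/20 for a natural 20
    return 1 - (1 - per_day) ** day

print(round(chance_of_success_by(1), 3))    # 0.05  -- day one
print(round(chance_of_success_by(20), 3))   # 0.642 -- hitting a 20-day estimate is no sure thing
print(round(chance_of_success_by(60), 3))   # 0.954 -- near-certainty takes three times the estimate
```

The expected time to success is still 20 days, which is why the estimate itself looks the same as the conveyor belt’s – but the spread around that estimate is enormous.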
Is all software work a single roll of the D20? No. Clearly breaking down projects into milestones lets us take some conveyor belt approaches – first we need to open the portcullis (roll a 6 or higher on a D20), then we need to sneak into the castle (roll an 18 or higher), then we need to defeat the dragon (roll a 12 or higher).
With these breakdowns, we can say that someone fighting the dragon does seem, in certain ways, ‘closer’ than someone still outside the castle gates. But it’s not in the same way that a car with a paint job, interior work and wheels is ‘closer’ to being finished than just a frame. There’s still some chance that the guy outside the gates gets lucky and defeats the dragon before the guy already fighting it.
It’s less likely than our dragon fighter finishing first, but not impossible.
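We can put a number on “less likely, but not impossible” with a quick Monte Carlo sketch. The stage thresholds and the one-roll-per-day rule are carried over from the example above; everything else is an assumption for illustration:

```python
import random

STAGES = [6, 18, 12]   # minimum d20 roll needed: portcullis, sneak in, dragon

def days_to_finish(start_stage, rng):
    """Roll a d20 each day; each success advances one stage."""
    stage, days = start_stage, 0
    while stage < len(STAGES):
        days += 1
        if rng.randint(1, 20) >= STAGES[stage]:
            stage += 1
    return days

rng = random.Random(0)
trials = 100_000
# A is still outside the gates (stage 0); B is already at the dragon (stage 2).
upsets = sum(days_to_finish(0, rng) < days_to_finish(2, rng)
             for _ in range(trials))
print(f"A finishes first in about {upsets / trials:.0%} of runs")
```

In runs like this, A overtakes B only a few percent of the time – unlikely, but a long way from the zero percent a conveyor belt would predict.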
Creating a design is rolling a die. It has a chance of being good, and a chance of being bad. Hard projects tend to have more chances of failure than easy projects, but chances to break through abound. Each project can be measured in the minimum number of breakthroughs it needs to succeed, and those chances of success can easily be turned into estimates. But in a breakthrough-driven development model, having put 20 days into a 20-day project means nothing. You are no closer to success than when you started. It’s still, most likely, 20 days away.
The fourth way we maintain quality in our code is via collaboration with others.
Think back to growing up, when you visited your friends’ houses. Each of them had a particular smell, right? The kind of food cooked, any animals kept, the preferred cleaning products and the aromas used in candles, wall plugs or incense all gave each house a particular smell.
Except yours, right? Your house had no smell. It was always your friends’ houses that smelled like something.
Well, yeah, kind of. The problem was, you were so used to how your house smelled, you didn’t notice it.
Code Smells Too
Often there are attributes of code that aren’t outright wrong, but that ‘smell’. They make people think something ‘rotten’ is nearby. But not always – sometimes a smell is just a smell.
Linters can tackle a lot of code smells when there’s a hard and fast rule to apply to the code. For instance, mixing camelCase and snake_case for various naming schemes is a code smell that linters can catch. What’s it smell like? It smells like two people wrote code in the same module and didn’t talk to each other.
A linter might catch these things and tell you to fix them, and lo and behold, nearby the mixing of code cases, you might catch other issues due to the two coders not talking to each other. The smell leads to something rotten.
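A hypothetical module with that smell might look like this – a linter enforcing PEP8 naming (pylint’s invalid-name check, for instance) would flag the camelCase:

```python
# Two naming schemes meet inside one function -- the smell of two authors
# who never talked to each other.
def count_seats(rooms):
    seat_total = 0                  # snake_case, per PEP8
    for room in rooms:
        seatCount = room["seats"]   # camelCase -- the smell a linter flags
        seat_total += seatCount
    return seat_total

print(count_seats([{"seats": 4}, {"seats": 6}]))  # 10
```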
You ARE the Tool
Take note of the example above – you might have thought we used a linter to find a bug in code we’re maintaining, but the linter actually just gave us a hint about where the bug was. It was our own eyes that found it.
In the above example, we’re maintaining code that two others wrote. By reading over the code, with some guidance provided by a tool, we spotted a bug using our own eyes and intuition. We basically collaborated with these former authors, even though we never met them, by analyzing the work they left behind and then changing it.
Because of the original authors’ nose blindness, they didn’t smell the code they were writing, and the error seemed more obvious to you. Simply putting another human in the loop found an error that neither tests nor linters nor assertions found.
Another way to think about it is that human beings are error generation machines – they’ll write code and put in bugs. But they’re not very correlated with each other. In other words, the bugs I tend to write are different than the bugs you tend to write. So if we work together to spot each other’s bugs, we will only let the small minority of bugs that we both tend to write get through.
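As a back-of-the-envelope sketch – the miss rates below are invented numbers, and the model assumes our blind spots are completely independent:

```python
p_i_miss = 0.3                       # assumed: I overlook 30% of bugs on my own
p_you_miss = 0.3                     # assumed: so do you
p_both_miss = p_i_miss * p_you_miss  # independence: our blind spots barely overlap
print(f"{p_both_miss:.0%} of bugs slip past both of us")  # 9%
```

The product is the whole argument: two mediocre, uncorrelated filters in series make one very good filter.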
This chapter is really two parts – how do you prepare your code and designs for review, and how do you review others’ code and designs?
Part 1: The Elements of Style
“Programs must be written for people to read, and only incidentally for machines to execute.”
– Harold Abelson and Gerald Jay Sussman, Structure and Interpretation of Computer Programs
Readability versus Maintainability/Extensibility
So we’re going to focus on readability here rather than the other ‘ilities’. Every programmer’s list of ‘ilities’ is different, but here’s one you can think about:
- Readability – more on this below
- Maintainability – how easy is the program to fix bugs in?
- Extensibility – how easy is the program to extend and add new features to?
- Understandability – This is basically ‘readability’ in the large. How easy is it to read one part of the code, keep it in your head, and read another part of the code? How ‘coherent’ is the design?
- Testability – how easy is the code to test?
So Just Readability, Then
Generally, readability only applies to the code itself, while the rest can either apply to the code or wider design (how the whole system is structured).
We’re going to take a quality-and-quantity approach here: you need to keep the ‘holistic idea’ of readability in the back of your head. “Is this code readable?” can’t be answered by a mere checklist.
That being said, we will go over a checklist. The checklist is necessary, but not sufficient, for the code to be readable. In other words, code that violates the items below is most likely not readable. But code that doesn’t violate them is not necessarily readable – you should still consider your code holistically.
After all, the best measure of readability is to get someone else to read it. Even if you check all the boxes below, you need another person to walk into your house and tell you if they can smell the litter box.
The number one rule of readability is to write code as if you cannot comment or document anything.
Whitespace is literally the space between words and punctuation in your programs. Inconsistent use of whitespace can be distracting. Clumping too many characters together makes it hard to see the ‘atoms’ of a sentence.
Horizontal whitespace is where you place spaces, using the space bar, on a single line of code. White space lets people see where words begin and end in writing, and works with punctuation to let you know when sentences end. In Python, the end of a line marks the end of a statement; other languages use semicolons.
Horizontal whitespace can be helpful in creating a sense of symmetry:
x=1
y=x+4
z=foo()
is going to be a little less readable than
x = 1
y = x+4
z = foo()
The horizontal whitespace above emphasizes the equals sign and the structure of all three lines. The three lines are similar; in the first attempt, however, their similarity is thrown off a bit because each line ‘works’ a different way, and each line is a different length. Using white space emphasizes how they are similar – they are all assignment statements.
By emphasizing their similarity, we can very easily think about x, y and z being similar – they are all the variables being assigned to. And we can see 1, x+4 and foo() as being similar, they are all values that are being assigned.
All of this is clear from the top attempt as well, but you have to read each line and find the equals sign each time. White space allows your visual cortex to do that for you – no reading, no symbolic thinking. It’s all parsed out automatically for you to feed into your language and logic centers in your brain.
Another issue with horizontal white space is the fact that ‘nesting’ (using if statements, method definitions, and loops) tends to ‘shove you out’ four more spaces to the right with each layer.
if y: #first layer of nesting

if y: #first layer of nesting
    for x in z: #second layer of nesting

if y: #first layer of nesting
    for x in z: #second layer of nesting
        def func(): #third layer of nesting
Compare the three code blobs above to see how nesting moves you to the right.
Nesting is supposed to move you to the right because nesting is textbook complexity. The more indented to the right your code is, the more complex it is. This is one major reason why whitespace can help your visual cortex identify complex code.
To reduce nesting, you can introduce helper functions:
if y:
    do_my_loop()

#somewhere else
def do_my_loop():
    for x in z:
        create_my_function()

def create_my_function():
    def func():
        ...
In the above, we were able to collapse a maximum of three layers of nesting into a maximum of two. But what else did we get?
By writing helper functions, we got to introduce names, which makes our code more ‘self-documenting’. Self-documenting means we use the parts of the language we define – the names of functions, variables, and classes – as real English words that describe the system.
We also got to introduce places to test, which makes our code more testable. If we’re trying to add assertions at the beginning and end of each function, we just introduced more opportunities to do that. Finally, we introduced more places to put doc strings to better document our functions without comments.
As we’ll see in the rest of this article, all of these are great things. The best things, believe me.
Vertical spacing is the white space introduced between lines. Generally speaking, you should only put one complete ‘thought’ per line. Python more or less forces you to do this, though languages that use semicolons as the statement ending can allow multiple things on one line.
Using blank lines can be a powerful way to group like constructs. To use a similar example to that above, consider the following two ways of writing:
x = 1
y = x+4
z = foo()
a = bar(x, y, z)
b = baz(x, y)
c = a + b

versus, with blank lines added:

x = 1
y = x+4
z = foo()

a = bar(x, y, z)
b = baz(x, y)

c = a + b
In the top, we have a whole bunch of statements. We can tell they’re assignments, but we’re going to have to read each one, line by line, to see what’s actually going on.
In the bottom, though, we see there are three separate steps to whatever is going on. The first step is similar assignment statements. The second step, a and b, both seem to be some derivative values of the original x, y and z statements. Finally, a third and final step combines a and b.
The addition of vertical white space allowed us to break apart the program for our reader and draw attention to bits of the program that should be thought of together – steps 1, 2 and 3, whereas the first attempt jumbled them all together.
You can also use vertical whitespace to ‘convert’ horizontal space, using the \ character. This tells Python to ignore the end of the line and assume the line continues on to the next. Alternatively, anything already inside a “[” or “(” style list automatically doesn’t end until Python finds the “]” or “)”.
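The backslash form, with some made-up variables for illustration, looks like this:

```python
first_number = 2
second_number = 40
# The backslash tells Python the statement continues on the next line.
total = first_number + \
        second_number
print(total)  # 42
```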
For example, making clever use of vertical space can make horizontal complexity clearer:
def very_long_method_def(person1, person2, account1, account2, irs_rules):

def very_long_method_def(
        person1,
        person2,
        account1,
        account2,
        irs_rules
):
This uses more vertical space, but now your eye is more clearly drawn to each argument to the function, rather than having them jammed all together. We were able to do this because all the arguments are inside a parenthesized list, so Python ignores newlines until it finds the closing ).
The Limits of Space!
Often linters and text editors can set line limits on methods or character limits on lines – such as flagging any method longer than 20 lines. Often line limits are enforced as ‘logical lines’ – i.e., counting only non-blank lines. But you should think in terms of total lines on the screen, even if your linter doesn’t.
You shouldn’t have a method take up more than one screen’s vertical length. Ideally, your methods would be even smaller than a page length, because people usually like to have a terminal window and a handful of other windows open on a screen at a time.
Being able to see your entire method on the screen at one time keeps a visual exercise from turning into a mechanical one. If it takes more than one screen length, then to read the entire method, you have to use your hand to scroll up and down. You can’t easily cross reference code entirely on the screen at one time, and you end up having to keep certain facts in your head.
If you have all the code on one screen at a time, that means you can use the screen as the tool it was meant to be used for and not try to remember anything – just read the method and use what’s on the screen to figure out what it’s doing.
Likewise, we have horizontal line lengths to allow us to have more than a single window open at a time (especially useful during peer reviews). Long lines also tend to get really hairy to read and figure out what they’re doing.
Your linter should enforce these limits for you. But one thing you must not do when hitting character or line lengths is remove white space! The white space serves a very valuable purpose!
When you properly use white space to ‘expand’ your code and use a linter to ‘limit’ its expansion, you have some pretty good heuristics on when you need to refactor code to make it less complex.
You will often get a ‘feel’ for white space rules. Again, your visual cortex is going to tell you what’s complex and what’s not.
There are two hard problems in computer science: cache invalidation, naming things, and off by one errors.
The other weapon you have in your arsenal is your ability to name things. And this is a very, very hard problem.
Compare these two code blobs:
def calc_f(t1, t2, num):
    if t1.count > num:
        t1.count = t1.count - num
        t2.count = t2.count + num
    else:
        print("Warning, not enough funds!")

def transfer_funds(transferer_account, transferee_account, amount):
    if transferer_account.value > amount:
        transferer_account.value = transferer_account.value - amount
        transferee_account.value = transferee_account.value + amount
    else:
        print("Warning, not enough funds!")
The only difference between those two blobs is names.
Nouns and Verbs
First, variables should nearly always be a noun form. That is, they should be a ‘person’, ‘place’ or ‘thing’.
Methods/functions should nearly always be a verb form. They should be an ‘action’.
Methods and variables should try to be as simple as possible – one word if possible. The more words you add to a name, the more complex it is. When we get into object orientation, we’ll find more and more tricks to ’embed’ names into classes and objects, turning code that looks like this:
def transfer_funds(transferer_account, transferee_account, amount):
into code that looks like this:
class Account:
    def transfer(self, transferee: "Account", amount: Cash):
        ...
That may not look like much now, but keep in mind there’s a lot of other code that will live in the Account class, and so on. The code will somehow be tighter, shorter, and more readable.
Your names should also be as concrete and specific as possible. Abstract names like “sensor” are almost always worse than more concrete ones like “radar”, or even “topRadar” if there are two of them.
However, to the above point, every word you use in a name expands its complexity. Each word should carry some weight. If there was only one radar, “topRadar” would be redundant, and “radar” would be a better name.
Almost Always Bad Names
Here’s a list of names that you should almost always avoid:
data, handler, handle, manager, mgr, object, obj, stuff, number, num, x, y, z, foo, bar, baz, func, i, do, calc, calculate, perf, perform
I use these all the time in my examples precisely because I’m talking about code structure and the names don’t matter. If I had used good names, you might have gotten distracted into thinking that ‘foo’ actually did something.
Any variable name (noun) that ends in ‘er’ is also usually bad.
runner, doer, builder
These names aren’t going to carry much information to your reader, and are often signs that someone didn’t really think through the name they were using. If they didn’t think through the name, what else did they not think through?
Names that have logical words inside them, like “and” or “or” are also right out.
accountAndUser #this is a bad name and it should feel bad.
Avoid acronyms as well, as no one ever really has an acronym dictionary on hand when reading your code.
Avoid ‘Hungarian notation’, that is, using clever encoding schemes to tell you something about the variable such as “n_foo” to let you know that foo is a number. Let the language do that for you.
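For example, a type annotation carries the same information a Hungarian prefix tries to smuggle into the name (the variables here are made up):

```python
n_retries = 3      # Hungarian: "n_" encodes "this is a number" into the name
retries: int = 3   # better: let the annotation (and the language) say it
```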
Python is a language that allows named parameters, which really help with readability.
Let’s take our transfer_funds function above, and call it with named parameters.
transfer_funds(transferer_account=bob, transferee_account=sam, amount=500)
Named parameters allow a reader in the future to not need to look up the definition of a function to have a loose understanding of what’s going on. Let’s say you haven’t seen the transfer_funds definition in a few months, and you happen upon:
transfer_funds(bob, sam, 500)
So… which is it? Did bob transfer 500 to sam? Or did sam transfer 500 to bob?
The Ubiquitous Language
This is an idea from a design methodology called Domain Driven Design. We’ll cover a little bit of it later, but the basic idea is this: if you are having a conversation with your client or colleague and you keep describing your problem by using certain words, those words should be in your system somewhere.
If you’re doing business software, and you talk to your clients about ‘accounts’, are there ‘account’ variables and classes in your code? If you’re doing audits, does the word ‘audit’ have meaning in the code? Is it a function you can run on an account?
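A hypothetical sketch of what that might look like – the client’s words, ‘account’ and ‘audit’, appear directly in the code (the solvency rule is invented for illustration):

```python
class Account:
    """A customer account, named after the word the client uses."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def audit(self):
        """Run the 'audit' the client talks about -- here, just a solvency check."""
        return self.balance >= 0

print(Account("bob", 250).audit())   # True
print(Account("sam", -40).audit())   # False
```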
Borrowing language from your business domain is a great way to get inspiration for names. It also serves as a design check – if you can’t think of a name for the variable or object you just made that uses something from your domain… maybe it’s a bad variable or object?
The name is telling you that what you’ve created is intrinsically confusing. It’s a good heuristic to use to take a step back and see if you can’t come up with a less confusing way to solve the same problem.
Back to Readability
What’s left after the above checklist?
Readable code should be self-documenting. While we’ll get into documentation in a bit, you should always write code as if you could not comment, and could not write doc strings. How do you embed what problem you’re trying to solve in the names you get to chose in your code?
Readable code should teach the reader something about the domain. This ‘domain’ is back from the Ubiquitous Language idea – basically, reading well-written code should be (ideally) the most efficient way for a reader to understand the domain. If someone wants to know how you calculate taxes, showing them your algorithm should be the most effective way to do it.
To this second point, often the code you write doesn’t even need to run in production. You can write it to get an idea down, make it rigorous, and then show someone else. Think about proposing a new process at your job – you can model the problem in code, write some examples as test cases, and see whether your code spits out the output you want. If it does, the code itself should be the best way to show someone else your new ideal process.
Idiomatic and Consistent Style Aids Peer Reviews
Above all, when writing readable code, you should write it consistently. Inconsistency is a red flag telling a peer reviewer to read more carefully – it slows them down and forces them to parse things they can’t pattern-match.
A peer reviewer only has so much mental ‘gas’ before they move on and do other things. Depending on your environment, they may say that you can’t submit your code because they didn’t have the energy to review it, or worse, that you can submit it because they didn’t have the energy to review it.
If they just give you a rubber stamp on your peer review, then what’s the point?
Using consistency helps your reviewer be as efficient as possible with their mental gas. Using idioms in the language (like named parameters in python) that a python programmer would expect helps people identify patterns. Once they see patterns, they know what kind of aberrations to look for. If your code doesn’t fall into a set pattern, then they have to read it slowly, line by line, trying to keep it all in their head.
They’ll give up.
Style Guides – we’re using a linter here
One final note that we’re mostly talking around is the idea of a style guide. This is a written document some teams have that defines how they’re going to use white space, naming schemes, and other rules that really don’t help or hurt readability unless they’re inconsistent.
We’re using PEP8 here, as enforced by Pylint. You can skim over PEP8 here, but we aren’t going to enforce things that Pylint won’t.
Part 2: Documentation
The other side of making code readable is just smacking on some English that goes along with the code. This is called ‘documentation’.
There’s a lot of different kinds of documentation that might go along with software. Design documentation notably will be absent in this discussion. We’re talking about code documentation, of which there are two main kinds – explicit documentation (doc strings, commit messages and readmes) and implicit documentation (comments).
The number one rule of documentation is: Document as if no one has access to your code.
The first kind of documentation we’re going to talk about is Python’s support for doc strings. Doc strings are pretty intuitive in terms of documentation and offer a few benefits.
First, they’re embedded with the code they document. The first line of any function or module you write can be nothing but a string – Python automatically interprets this string as the ‘doc string’ of the function. This means the documentation is right by the code, making it easy to cross-reference the two.
def function(x, y, z):
    """An example docstring."""
Second, doc strings are what Python uses to resolve the ‘help’ command in the REPL. Calling help on any object or function will, in turn, give you the doc string for that object or function. So this is a very convenient way to access doc strings on the fly while trying to prototype stuff.
>>> help(function) #to call the help function
# brings up a help view that prints out "An example docstring."
What goes in a doc string?
The first line of a doc string is usually a one line brief explanation of what an object represents, or what a function does.
You can add more detail in lines to follow.
Finally, you can give more documentation about the arguments themselves and the return value of a function.
def function(x, y, z):
    """An example docstring.

    I might add some more detail here.

    Args:
        x: (int) What x is, including type
        y: (float) What y is, including type
        z: (string) What z is, including type

    Returns:
        float: Description and type of the return value
    """
The full standard can be found here for your reference.
Pylint Docstring Checker
Pylint actually comes with a linter for your doc strings to ensure they follow a format similar to the above. You’ll need to add this extension to this and all future projects.
Doctests are a pretty neat idea, seen predominantly in Python and another language called Elixir. The idea is that you can embed tests in your documentation. The doctest Python module can read your doc strings, pull tests out, and execute them as unit tests.
The benefit of embedding tests in documentation is two-fold.
First, it ensures documentation doesn’t get out of sync with the code. If the documented test fails, it means it doesn’t reflect the code properly and needs to be updated.
Second, well-written tests are often some of the best forms of documentation – they serve as examples on how to run your code. People unfamiliar with your library will often cut and paste tests similar to what they want to accomplish, and then change them until they do what they want.
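A minimal sketch, using a made-up function:

```python
def add(x, y):
    """Return the sum of x and y.

    >>> add(2, 3)
    5
    """
    return x + y

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # runs the >>> example above as a test; silent when it passes
```

Running `python -m doctest` on the file executes the `>>> add(2, 3)` line and fails loudly if the documented answer ever drifts from what the code returns.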
If you’re interested, check out the module and examples here. It’s not required, though.
A form of documentation that happens outside of Python is what you put in your GitHub commit messages. Every chance you have to put a message out is a chance to communicate intent to some future maintainer. GitHub commit messages are a fantastic pure-English decoration to the patches and diffs that go along with the commits.
A GitHub commit message that is either too convoluted or too wordy probably means you did too many things in your commit at once, and if you can, it’d be better to figure out a way to break down the commits.
GitHub commit messages ought to be in the imperative, i.e., “Fix web page bug”, rather than the past tense, “Fixed web page bug”. This makes reading git logs – which print out all the commit messages – a little more intuitive.
GitHub commit messages should be short and to the point. You can make longer ones, but make sure the first line of the message – like a doc string – is a brief and to-the-point description of what kind of work was done.
You’ll be installing this git commit linter to help you police your commit messages.
Another big issue to keep in mind in terms of GitHub commits and keeping things readable is that the smaller commits are, the more readable the changes tend to be. Keep your commits small, so that a reader can read through them in order if they like.
A README.md file is important in a GitHub project. This is because GitHub will display that file as the text on the web page when exploring a repo. Every directory you move into can have its own README.md that GitHub will display.
We’ll only require a README at the root, but an effective readme should mention a few things:
- What the code does – what problem does it solve
- How to install the code
- How to run the code’s tests
- Links to any docs, issue trackers or mailing lists
In addition, a section can be added on design rationale if warranted.
The md extension indicates the file uses Markdown syntax, which GitHub supports.
As opposed to explicit documentation – doc strings, GitHub messages and readme files – inline comments exist entirely within the code and are considered implicit documentation.
Comments in python are anything after a # symbol.
If doc strings are for the help function and let readers know what your function can and cannot do, and commits are to help someone follow along with the changes to the code base, and Readmes are the highest level documentation you write for your project, what are comments for?
Docstrings are for the “what” something is and does.
Code is for the “how” something does it.
Comments are for the “why” it’s done that way.
What do you mean, why?
There are a few reasons why you might want to put in a few comments on why the code is written the way it is:
- If there’s a known issue or bug – you can comment near the issue or bug, and describe what the workaround is.
- If there’s a more obvious way to write some code, but you had to write it in a way that was more performance oriented and unreadable. Explaining why the code is unreadable, and what it does, then falls to the comments.
- If there’s an interesting design tidbit on why the code is written the way it is, it can be in the comments.
- If there’s a requirement that’s unintuitive that changes the way code might work, put it in the comments near that code.
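A sketch of a ‘why’ comment in the wild – the retry scenario and the rate-limiting backstory are invented for illustration:

```python
import time

def fetch_with_retry(fetch, retries=3):
    """Call fetch(), retrying when the connection drops."""
    for attempt in range(retries):
        try:
            return fetch()
        except ConnectionError:
            # Why the backoff: the upstream service rate-limits rapid
            # reconnects, so an immediate retry always fails (known issue).
            time.sleep(0.01 * 2 ** attempt)
    raise ConnectionError("all retries failed")
```

The code alone says *what* happens (sleep, retry); only the comment preserves *why* an immediate retry would be wrong.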
Notably, comments are not to describe what code is doing – unless the code itself can’t describe it. Then comments can describe what the code is doing, but only because it needs to explain why the code is not ‘self-documenting’.
Dangers of Documentation
That’s quite a bit of information about documentation, and it makes it sound much more rigorous than it is. What I haven’t talked about is the downsides of documentation, and there’s one major one:
Nothing can make sure that the documentation is in line with the code.
This is a ‘hard’ problem – and it requires peer reviewing humans to do it. And even those humans usually hate it. Finding documentation that’s out of date with the code is one of the most notorious problems in software development.
Tools like doctests and peer review can help. But the main way to make sure documentation doesn’t get out of date with the code is to not write it at all. The more intuitive your code itself is, the less documentation you need to write. That’s not permission to skip writing docs.
Instead, what I’m saying is that if you write your code to be as self-documenting as possible, then when you’re trying to write docs, you should have to struggle a bit. You should have trouble finding ways to describe what the system does that don’t just more or less restate the code itself.
In addition to being used by the help command, doc strings can also be gobbled up by documentation generators. These programs run over your entire code base and compile all the doc strings into a single HTML listing – automatically creating cross references and other supporting material for you.
Often, if you’re creating some open source software, you might host some website that allows quick perusing of this automatic documentation for people using your software. For example, most of these generators create indexes that allow easy searching of the docs, so that users of your software can quickly find some function they might need.
We won’t be using these for now, but we’re letting you know they exist.
The Three Forms of External Docs
In the above, we described what documentation you might be expected to produce. But what kinds of docs are most worthwhile to you to read?
Documentation generated by third parties comes in all shapes and sizes. But currently, there are only three ‘best’ forms of documentation that you’re going to run across. They tie in tightly with some of the forms you’ll be asked to produce.
Note, below, we don’t actually mention the code, but the code is also one of the better places to look. If it’s readable and well commented, the code itself can often tell you exactly how something works. It is not a very efficient way to get started, as tutorials and reference guides listed below give a much higher level view. Sometimes, though, all you have is the code.
The “How To” Guide / Tutorial / Cookbook
How-tos are basically bits of code other people write to show you how to use a library to do something. This is the exact same thing as a well-documented test.
A test, like a doctest, in your code, shows an example of how to get something done. Amending this test with some decorative English explaining why things work the way they do finishes out a well-documented doctest.
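As a sketch, such a well-documented doctest might look like this (the conversion function is invented for illustration):

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from Celsius to Fahrenheit.

    The examples below double as a how-to: they show a reader exactly
    how to call the function and what to expect back.

    >>> celsius_to_fahrenheit(0)    # water freezes
    32.0
    >>> celsius_to_fahrenheit(100)  # water boils
    212.0
    """
    return c * 9 / 5 + 32
```

The “decorative English” – the comments on each example – is what turns a bare test into a tutorial.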
How-to guides on the internet can come in blog forms, but you can also find them in generated docs. Finally, you can find a form of how-tos in third party library’s unit and integration tests. Read their tests to see how to set code up to do something, and then from that initial how-to, keep amending the example code until it does what you want it to.
The “Reference” Guide
This is docs like the Python Reference. These are often large, autogenerated web pages that draw heavily on doc strings as well as other hand written and hand edited language.
The reference guide is used to show you the potential of functions and objects you have to play with. It’s not meant to be read from snout to tail, but instead, to be drilled into deeply for one or two subjects and jumped around in.
While an example or tutorial might tell you about the existence of the ‘send_email’ function, it’s the reference guide that will tell you about all of its arguments and assumptions.
Speaking of which, the reference guide makes great use of your well-written doc strings. It expounds on every argument, giving details on the type and assumptions (assertions).
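A docstring written with the reference guide in mind might look like this (a hypothetical send_email, stubbed out so it runs):

```python
def send_email(to, subject, body, retries=3):
    """Send a plain-text email to a single recipient.

    Arguments:
        to (str): recipient address; assumed already validated.
        subject (str): subject line; must be non-empty.
        body (str): plain-text message body.
        retries (int): how many delivery attempts to make (default 3).

    Returns:
        bool: True if the message was accepted for delivery.
    """
    assert subject, "subject must be non-empty"
    # Stub: a real implementation would talk to a mail server here.
    return True
```

A documentation generator can turn a docstring like this straight into a reference page that expounds on every argument and its assumptions.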
If you start exploring a library by copy-pasting one of the examples you find online, your explorations at that point will be diving into the reference guide to get details on all functions and objects used in your copy-pasted example.
The “Mailing List” / Design Discussion / FAQ
Finally, you have forums like discourse, mailing lists, and Stack Overflow. You’ll find your google searching leading you to these places quite a bit, and that’s for two reasons.
One, documenting every part of a system is incredibly expensive. And since the system is always evolving and changing, documentation can get out of date very quickly.
Mailing lists and forums allow a ‘profile guided’ approach. That means that since we don’t know what to document, and we can’t document everything, it’s best that we just document the things people have questions about. So mailing lists usually wait for the questions to come in.
The second major reason you might use mailing lists is that it’s the only way to get at the most common form of design documentation. Design docs, that is, discussions the implementors had amongst themselves, are rarely formalized. And when they are formalized, they’re almost so artificially created that you don’t really see how the sausage is made.
By going back into the mailing lists, you can see the many arguments had over every tiny detail, and it gives a background and context to every line of code. If you’re stuck maintaining something, you have only a few sources of information to go off of – the comments, if they added any, the code itself and any associated docs, and the mailing list.
Back to the example at the beginning, a peer review is when you intentionally bring someone else on to some code when you believe it’s done to read through it, attempt to understand it and find issues with it.
Benefits of Peer Reviews
Peer reviews have many benefits beyond mere quality, although that’s one of them – per hour invested, peer reviews are the most effective means of reducing defects. They are about four times faster at finding bugs than testing and can find about twice as many bugs overall.
Peer reviews also spread knowledge in two ways. First, a junior engineer having her code reviewed by a more senior engineer will learn new techniques and things to watch out for from the peer review. Going the other way, a senior engineer having her code reviewed by a junior engineer will help the junior engineer learn about what’s considered idiomatic and sound in that shop.
Peer reviews spread knowledge about requirements and design themselves, rather than just knowledge of how to build things. You may not know anything about what another team is doing, but if you review their code, you’re given some insight into what kinds of things they’re up to. In this way, peer review can help knock down silos.
Comparisons to Other Tools
Compared to most other quality tools, peer reviews are the best at finding logical bugs – i.e., issues where the design doesn’t reflect the requirements (the code works, but it doesn’t do what it’s supposed to). They’re also the best at finding issues with maintainability and readability of code. If you want to know how hard your code is to read, ask someone to read it.
Linting can be thought of as an automatic peer review by a very rigorous, yet dumb, junior engineer. They’ll spot every violation of a rule set but won’t spot anything that’s outside of that rule set.
It actually helps a human peer reviewer to lint a code base first to remove the obvious issues. This is for two reasons. Firstly, a human reviewer can be confident that certain stylistic issues don’t exist at all, and can spend their mental energy elsewhere. Secondly, code that is somewhat consistent and idiomatic beforehand (using rules a linter can enforce) allows the reader to exercise their visual cortex during review. Does certain code simply ‘look’ wrong?
For example, if you’re using consistent style everywhere, and one part of the code just ‘looks’ complex, it means the reviewer might spend more of her time there. Complex code tends to have more issues than simple code. Consistent styling usually helps make complex code ‘look’ complex, and simple code ‘look’ simple. This helps the reviewer know where to spend her time.
Testing and test driven development can help ensure that your code does what the tests say it should do. As we mentioned in other sections, tests can’t do anything to guarantee that the code doesn’t do what it shouldn’t. But assertions and types will help us there.
The problem with testing that peer review can help with is two-fold. First, do the tests actually reflect the requirements? Are there tests that should be in your test set that you missed, or are tests you wrote actually out of line with what the software is ‘supposed’ to do? Did you accidentally write the wrong software because you had the wrong tests?
Secondly, is the code being reviewed ‘testable’? This is one of those ‘ilities’ (testability, maintainability, readability, etc…) that peer review attempts to measure. A peer reviewer can tell if the code is testable by taking an internal measurement of their own “Rage Factor” when they ask themselves the question “if I had to test this code, how would I do it?”
With assertions, similar to testing, a peer reviewer can judge both when an assertion makes sense and is in line with expectations, and when an assertion may need to be added. Assertions also make code more readable by making assumptions explicit, so that a reviewer can assume certain things are impossible.
Finally, with types, which we’ll get into next section, as well as an addendum to tests and assertions above – when any of these techniques are used for design, peer review and only peer review can effectively say whether the code’s design as a whole is maintainable and extensible at a higher level.
Effective Peer Reviews
There are a few ways to make peer reviews more successful.
First, be familiar with the requirements. Knowing what problem the developer whose code you are reviewing is trying to solve really helps you figure out whether they’re actually solving it.
Second, like requirements, be familiar with the design. Design, in this case, means a broad overview of the overall approach the developer took in solving her problem. Design also means that you need to be familiar with the systems the code changes touch – not just the lines of code that are changed themselves, but also, lines of code near them. If a function changes, who calls that function, and how is that caller impacted by the change?
Third, recognize that many people take great pride in their work, and a peer review brings a bit of a feeling of nakedness. Emotions will run high. Give critical feedback while being as generous to the developer as possible. Understand they want high-quality code that gets the job done, too, and that the reason they don’t see issues isn’t because they’re a hack, or rushed, or a bad programmer – it’s because of nose blindness. Give feedback the way you’d want to receive it.
Fourth, give concrete and specific advice. You can recommend solutions or workarounds, but only recommend these to provide clarification on what problem you see. For instance, instead of saying “Rewrite X as Y”, you should say “X seems to run afoul of [some problem]… if you were to rewrite it as Y, you’d avoid this problem.”
That phrasing leaves things open-ended so the developer can fix things in a way she sees fit, and folks don’t get bogged down comparing different solutions when they should just be agreeing on whether or not problems exist.
Fifth, similar to three, don’t be offended at other’s critical feedback. Ask questions, and in general, assume they’re right – it buys a lot of capital to implement other’s suggestions, and it’s good practice in ego-less coding. If you disagree with them but recognize how they are suggesting things won’t actually break anything, offer to do the work. Separate yourself from the product of the work not because you aren’t part of the work, but in recognition that the final product is a group effort.
Sixth, track requested changes. Bugs might get spotted early, but if no one makes sure they’re fixed, the peer review was worthless.
Seventh, offer stylistic and design feedback. Don’t just look for bugs. Obvious bugs are usually hard to find by the time you get to peer review – instead, look for things that irk a human (something you should be good at) and give feedback on that. Testing, linting, typing and assertions all can’t fix maintainability and readability issues. Humans pointing them out can.
Eighth, seek feedback in your own peer reviews from multiple sources. Generally speaking, the more familiar people are with the requirements or design, the better the peer review will be. Likewise, someone who’s a language expert might offer good peer review comments from a different angle.
Ninth, focus on things that other tools can’t find. Peer review the tests, as the tests can’t be tested. Peer review the assertions. Peer review the documentation and comments – are they readable, is there enough of them, or too much? Are test cases or assertions missing?
Tenth, think about instituting a coding standard. Having a checklist of things that a linter can’t find on top of your normal toolchain can help focus your efforts and structure your thought on peer reviews. You can amend and remove rules from this checklist as you see patterns emerge in your code base.
Pair programming takes peer reviews “to the extreme”. We won’t really get a chance to do much of it during the modules, but in a pair programming situation, you and your partner are given a single computer and you both design and code together.
The peer review happens ‘live’, as you’re coding. Similarly with the design review.
Pair programming can really do wonders for some of peer review’s strengths – knocking down domain silos, and improving quality even more than traditional peer reviews can. Pairing juniors with seniors tends to be a good way to pair, as does pairing people from different technology backgrounds. Diversity is better – pairing two copies of the same person together won’t bring much new perspective into the design process.
Pair programming can also be pretty expensive since you don’t really double the output of a single coder by pairing with her. But output is not the only measure of success – if your long term productivity goes up or at least stays flat as your system grows, then you’ve avoided the pain of silos and that alone can be worth the short term productivity hit.
Live Coding Sessions
From here on out, I’m going to try and find a live coding session (or at least a video of one) for you to watch of another coder. Don’t feel bad if you can’t keep up, and don’t feel like you need to watch and rewatch until you understand every little bit.
The point of watching others code is to get a feel for different work styles, see what tools others use, and understand their mental process. It’s to feel okay about googling things and hitting issues, because that’s what happens to everyone. It’s as good a practice for pairing with a senior as you’re going to get outside of actually pairing with one.
Try to keep a notepad nearby and write down questions you have in the notepad. Then after the session, try to answer each of your questions through good Googling.
Check out this one here.
You do not need to know everything that’s going on, or fully understand. Use it as food for thought on things to Google, and follow along as close as you can.
Try to think of a few questions or comments on the live code challenge for your mentor.
Try to peer review others’ code
The other thing that’s changing after this chapter is that you’re the mentor now. You need to go out and find a mentee and help them get through Code Combat and these Chapters. Be their peer reviewer, let them ask you questions, and try to teach them. This will help you play the role of a senior engineer to someone else, and help you realize how dumb you are when you think you know something and then try to explain it to someone else.
Recognizing you’re dumb is the first step to getting smarter.
If your own mentor has code available that can be peer reviewed (be careful about cheating and seeing solutions to future chapters), read that too. Alternatively, you can find someone’s open source project and help them with pull request reviews, or perform an audit and simply read their code for errors and suggest corrections.
We’re going to try and give you some code reading to do each chapter from here on out as well, in addition to the code writing challenge.
Expect to spend about one hour per 100 lines of code. Feel free to download the project and tinker with it. Doing a ‘guided tour’ of the code – i.e., running the debugger and just stepping through each thing – is a great way to familiarize yourself with a new code base.
Have some questions and comments ready for your mentor. What did you like about the code base, and what didn’t you like?
Movie Ticketing System
You’re inheriting some old code that helps manage an electronic ticketing system for movies. The person who began work on the code wasn’t very good at writing readable code – he claimed it gave him ‘job security’. Unfortunately, it didn’t work and he was laid off.
The code needs to provide four main functions:
- Tickets need to be purchased. This should debit the cost of the movie to a moviegoers’ account in the moviegoer database, add a ticket to that moviegoer’s account as well, and remove a ticket from the tickets database. This should return false if the moviegoer doesn’t have enough funds.
- Tickets need to be refunded. This should credit the cost of the movie to a moviegoer’s account in the moviegoer database, remove the ticket from the moviegoer’s account, and add that ticket back into the tickets database. Basically, the opposite of purchased.
- Tickets need to be consumed – basically, a ticket needs to be removed from a moviegoer’s account. This will be automatically called by the system, so you only have to write the function that removes the ticket and not worry about when it’s called.
- Finally, an auditing function needs to be written that will pretty print both the moviegoer database and tickets databases.
- The cost of a movie is 5 dollars.
You can find the code you’re inheriting here. The original author said he was in the middle of writing the purchasing function. You should use the following process:
- The original author has a test that he couldn’t get to work – diagnose the problem and get the test passing. Commit with a good commit comment.
- Pylint up the code base, and commit this with a good commit comment.
- Make the existing code more readable based on the principles above, ensure things are pylint clean and tests run. Commit with a good commit comment.
- Write a test for one of the other four functions. Make sure all other tests are successful, pylint is clean, and your new test is readable. Commit with a good commit comment.
- Write the function that satisfies the test. Make sure all tests are successful, pylint is clean, and your new function is readable. Commit with a good commit comment.
- Repeat 4-5 for the other functions.
- Submit your code for peer review from your mentor.
For Mentors (And Coders Too)
Talk to your mentee about the live code session here and about the code reading they had to do. They should have a question or comment on each of them.
Ensure your mentee has the pylint documentation linter and git commit linter hooked up.
Use the following checklist on the final code, in addition to cross-checking that the process the mentee used (based on commit history) matched what was outlined above.
- Is test coverage at 100%?
- Is it pylint clean 10/10?
- Does the code use assertions?
- Is pylint doc strings clean?
- Is the documentation readable?
- Does the code use good names?
- Does the code make good use of white space?
- Does the code have consistent and idiomatic style?
- Does the code include comments?
- Does the code use git hooks for pylint, pylint docs, and git commit lints?
- Does the Readme explain what the code does and how to install and test the code?
- Can the coder give a ‘guided tour’ using the debugger through one of their test cases?
Sources / Extra Reading if Interested:
- How to do a Github pull request
- Self Documenting Code via C2
- More readability tips and techniques
- Examples of Good and Bad Comments from Coding Horror
- Why Coding Style Matters
“There are no Black Swans”
How do you ensure the above in a computer program?
Right now, we have two methods to ensure certain quality propositions: tests and linting. Tests, if you recall, set up a scenario and check that things work exactly like you expect. To ensure that there are no black swans in our program, our tests would end up looking like…
def test_scenario_1():
    go_to_north_america()
    for swan in get_all_swans():
        assert swan.color != "black"

def test_scenario_2():
    go_to_south_america()
    for swan in get_all_swans():
        assert swan.color != "black"
Testing this way can become onerous. You have to think of all possible situations (or places) where black swans may occur. These tests can also take a long time to run. And what do you get in the end? Well, it turns out there was an enclave of black swans living on the dark side of the Moon that you didn’t anticipate. For all that testing, you didn’t ensure there were no black swans.
What about linting? Linting, unlike testing, can guarantee the absence of a certain subset of errors in our code. But these are largely stylistic errors – linting cannot understand our code’s logic, it only processes it as text.
Syntax vs Logic Errors
There are two kinds of errors we might run into while programming – syntax errors and logic errors. Syntax errors are errors in the way the program is actually encoded. They happen when we’ve literally failed to write a program. They’re usually easy to find, and linters expand the universe of things we can consider syntax errors.
Logic errors, however, are harder to find. For instance, can you think of a linter that would catch the error below:
# This program prints "hello coit"
print("Hello Eric!")
What linter could catch the error above? The comment is out of line with the behavior, and it leaves us wondering what exactly the program is supposed to do. Maybe the comment is wrong, maybe the program is wrong. Maybe both are. A linter able to spot this error would have to know what the requirements are (what the program was supposed to do), as well as be able to parse and understand English well enough to recognize that the comment is out of line with the program.
Suffice it to say, such linters don’t exist. And many attempts to create programs that can understand requirements and English have been made – perhaps in the future, we’ll get programs smart enough to find the error above. But we don’t have them yet.
Back to Black Swans
Our black swans are encoded into the logic of our program. Testing can’t find them, and linters can’t rule them out.
We have two remaining arrows in our quiver – assertions and types. Types we’ll get to in the next module, and they are the only thing that can actually guarantee we don’t have black swans. For now, though, let’s look at assertions.
What’s an assertion
Assertions and the ‘assert’ keyword should already seem familiar to you. You use the assert keyword in your unit tests.
assert x == 1, "X should equal 1"
In Python, the above assertion implements the following behavior: it checks to see if the variable x equals the number 1. If it does, then we keep running our program. If it doesn’t, the program halts and the message “X should equal 1” is printed. That’s it. Our program is toast.
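Roughly speaking, that assert behaves like the following hand-rolled check (one caveat I’m glossing over: running Python with the -O flag strips assert statements entirely, so never put side effects in them):

```python
x = 1

# A plain `assert x == 1, "X should equal 1"` behaves roughly like:
if not (x == 1):
    raise AssertionError("X should equal 1")

print("still running")  # only reached if the check passed
```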
In our tests, our assert keyword is actually rewritten by the test framework to also keep track of some test code stuff. However, it works more or less the same – check something, if it’s false, report an error and crash.
Assertions and Black Swans
How do assertions help us catch black swans?
Well, in the example above, we were told that they found an enclave of black swans on the dark side of the moon. Who’s they? Our clients, unfortunately. And they’re pissed because we said there were no black swans. That’s the reason they went with us rather than our competitor.
Assertions help us make promises like this. If instead of all the testing above, which again, ultimately didn’t turn up any black swans, we wrote code like this:
def handle_swan(swan):
    assert swan.color != "black", "We can't handle black swans!"
    # ...do other stuff...
Then we’d guarantee not so much that we won’t have black swans, but rather that the program will do something predictable and safe if we do – that is, crash in a safe way, and let us know why. This ends up being incredibly valuable.
Some real life examples
Let’s say you’re writing a script that parses a text file. You glance at the file and it looks like the format is pretty regular, so you write some stuff and assume the rest of the file looks the same.
This program reads in some data, does some transformations on it, and writes it back out to the same file. It’s a big file, and it has to be fast, so you can’t keep a lot of stuff in memory.
Only, someone accidentally put an error halfway through the file. The next day, after your script runs, you realize you deleted half the file because you were off by one character in your parsing.
In this case, you could have asserted that things lined up in the file like you expect. If they ever deviated, you’d crash the program, immediately, and leave the rest of the file unchanged. Then the next day you can debug what’s going on, and pat yourself on the back for not accidentally deleting your project.
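A sketch of that idea (the file format and field names are invented for illustration):

```python
def transform_line(line):
    # We assume every line looks like "name,score". If a line ever
    # deviates, crash immediately rather than silently corrupt the file.
    parts = line.rstrip("\n").split(",")
    assert len(parts) == 2, f"Malformed line: {line!r}"
    name, score = parts
    assert score.isdigit(), f"Expected a numeric score: {line!r}"
    return f"{name},{int(score) * 2}\n"
```

Feed it a malformed line and it raises AssertionError on the spot, before anything gets written back out.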
Let’s say you’re writing some embedded code for an X-Ray machine. Your machine takes X-Rays of small children and finds things that might be cancer. To stay safe, the machine has to administer a dose dependent on the size and weight of the child.
However, that idiot Josh made some changes last minute and turns out, if you set the machine up just right, it thinks the kid is about the size of three elephants.
Ooops, you just killed a kid.
You could have asserted that the dosage never exceeds a certain amount, and now little Sally’s parents wouldn’t be casket shopping.
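That safety check can be a single line (the dose formula and the limit here are invented for illustration):

```python
MAX_SAFE_DOSE = 5.0  # hypothetical safety ceiling, in arbitrary units

def compute_dose(weight_kg, height_cm):
    # Hypothetical stand-in formula; the real calculation doesn't matter here.
    return weight_kg * 0.05

def administer_dose(weight_kg, height_cm):
    dose = compute_dose(weight_kg, height_cm)
    # Belt and suspenders: whatever upstream code computed, never
    # let the dose exceed the safety ceiling.
    assert 0 < dose <= MAX_SAFE_DOSE, f"Unsafe dose computed: {dose}"
    return dose
```

Even if Josh’s buggy sizing code runs, the machine halts instead of delivering three elephants’ worth of X-rays.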
These two examples illustrate two strengths of assertions, discussed below.
First, Assertions Document Your Assumptions
When you assert in your text file that you should find the letter ‘c’ about 4 characters into each line, you’re basically saying to the reader “If C isn’t 4 letters in, then something is dreadfully wrong. I have no idea what’s going on and I should stop writing to this file”.
This ends up being a very valuable tool for two reasons.
First, any assumptions you are making in your code are now assumptions that every other reader and maintainer of the code (i.e., you in six months) is aware of. This is invaluable communication, as assumptions like these are so often either never written down, or written down only in comments.
Comments are better than nothing, but comments aren’t executable. Assertions are. Assertions will crash your program if the assumptions change, and require you to rewrite bits. That’s okay and is often the desired behavior – wouldn’t you want to know when your assumptions need to be updated?
Second, it allows you to make more assumptions. Often you try and code for corner cases, errors that may or may not ever happen, or weird things that you’re not sure are impossible and thus want to handle. This makes code really complex. You could be lazy and just assume none of these things ever happen – but when they do, your stuff will break in unexpected ways and be very hard to debug.
Or you can just assert that they don’t happen. Then you’ve let the future maintainer know you didn’t handle that corner case, you fail in a known good way if it does happen, and it allows your code to cleanly do the thing it should do, and assert that all the other stuff never happens.
Basically, if you’re reading code and you are thinking “this should never happen” or “this is impossible” – then assert it. If you’re thinking “this must be the case”, then assert it! Some people say “that’s impossible, that assertion will never trigger, and it’s a waste of time” – that’s precisely why you write the assert! Because you believe it’s impossible, you believe the assertion will never fire, so you should check that belief and write the assert.
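In code, asserting the ‘impossible’ case can look like this (the statuses are invented for illustration):

```python
def describe(status):
    if status == "open":
        return "still running"
    if status == "closed":
        return "finished"
    # We "know" these are the only two statuses. Assert it, so if a
    # third one ever shows up, we hear about it immediately.
    assert False, f"Impossible status: {status!r}"
```

If a new status gets added years later, this crashes with a clear message instead of silently returning None and breaking something three functions away.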
You’ll be surprised how often it fires.
Second, Fail Fast and Fail Often
By littering your code with assertions, you can begin to adopt a ‘fail fast and often’ design mentality.
Often, when trying to make our code robust to violations of our assumptions, we try to think of every possible thing and handle it. This is called defensive coding. It makes the code very complicated, burying the ‘essential’ nature of what you’re doing under lines and lines of error handling.
Erlang is a programming language invented in the late ’80s by Ericsson (the mobile phone company). They wanted something that would keep their phone switches up with better reliability than what they had been using. When phone switches go down, people can’t talk. So they started trying to figure out how to make their switches never go down, or at least have very high uptime.
The standard advice of the day was defensive coding – make sure the switches never go down in the first place. Ericsson, through Erlang, actually tried a different approach – instead of trying to limit failure, they just made sure their programs were really good and really fast at coming back up.
Erlang programs don’t try to handle errors. They just crash as soon as they can with something informative and then restart. This has led to switches with very high uptime – while they can and do crash all the time, there’s always a backup running, and the programs themselves restart in milliseconds.
What we can learn from this is that code becomes greatly simplified if, instead of trying to handle errors, we just crash in a known safe way when we encounter them. Assertions allow us to do that, and they tell us two things. First, when code fails, they tell us what went wrong. They give us something much more informative than code normally does when it fails – instead of a backtrace showing only where the program (already in an error state) finally did something the OS killed it for, leaving us theorizing about what might have led the program into its bad state, we get a crash and a message stating exactly which assumption was violated, exactly where it first occurred.
The second thing they tell us is that if our program doesn’t crash, then all of our assumptions were true for that run of the program! This gives you confidence that your program works. So often, programs appear to work, but later we ask ‘how could this have ever worked?’ Assertions get rid of those sorts of errors. If the program worked, it worked in precisely the way you intended it to work.
Not when there’s nothing left to add, but rather when there’s nothing left to take away
Art, Antoine de Saint-Exupéry tells us, isn’t done when there’s ‘nothing left to add’, but rather when there’s ‘nothing left to take away’. This is important in software design too.
Often people are enticed by languages that make it easy to do ‘a lot of things’. They’re considered powerful languages – it’s easy to do anything in them, and with very little work. C is considered powerful for this reason. But all that power can actually be very limiting. In contrast to the French writer quoted above, the great philosopher Spiderman has taught us that “With Great Power comes Great Responsibility”. Sometimes this responsibility is too much.
How do we make our languages less powerful? How do we make our programs capable of less rather than more? With assertions. Assertions tell us that the program is now incapable of a whole way of working. If we assert X > 0, that means the program is incapable of doing anything if X is less than or equal to 0.
Theoretically, when we’re done, we have a program that’s capable of only one thing, and that one thing is what it was designed to do.
If poetry and comics don’t convince you, perhaps science will. This study, done by Microsoft in 2006, shows that as assertion density went up (assertions per 1000 lines of code), defect density went down (defects per 1000 lines of code).
It also showed that many of the defects that were eventually found in the bug database for these projects were found via the use of assertions.
Debugging can take up the lion’s share of your time. Hopefully, you’ve already found through code combat or the simple code exercises in the chapters so far that your initial coding doesn’t take too long. What takes a long time is when something doesn’t go as planned and you have to figure out why.
Every bug is different, and so often your mentor might be powerless to help you. Instead, she probably has to sit down, step through your code, and try and replicate the error herself. How to write a for loop? She knows off the top of her head. Why your for loop doesn’t work? That she doesn’t know, and she won’t until she spends a while debugging it. Debugging takes a long time, and it’s not fun.
Assertions reduce the number of bugs you generate and reduce the amount of time it takes to debug the ones you do generate. They’re very valuable.
Code contracts as an idiom. More on this later…
We’ll get into ‘design by contract’ later, but a quick introduction is in order to help you think of ways to introduce assertions into your code.
First, the precondition. Preconditions are things that should be true of your program’s state at the ‘beginning’ of a function. If I have a function that takes in two arguments and one needs to be larger than the other, I can assert that with a precondition. Preconditions most closely model ‘assumptions’ in code.
def foo(x, y):
    assert x > y, "X should be larger than y!"
    # ...rest of foo...
Second, the postcondition is similar, but makes promises about the return value of functions rather than arguments of a function. For example, maybe foo has to return an integer larger than 10. Post-conditions most closely model “promises” you can make to other parts of code.
def foo(x, y):
    assert x > y, "X should be larger than y!"
    # ...other parts of foo...
    assert return_value > 10, "Return of foo needs to be larger than 10!"
    return return_value
Try to think in terms of preconditions and postconditions – what should you assume of the arguments of every function you write? Better yet, what can you assume to make writing the function easier? Assert it!
What should you promise? Can you promise more? If so, do it!
Another pattern for adding assertions is the ‘sanity check’. This idea weakens the idea of something that ‘must be’ true, or ‘should be’ true to something that ‘really ought to be true, I think’.
If you’re a bathtub, it really ought to be the case that the temperature can’t be set to above scalding. If you’re a microwave, nothing really ought to be in there for 99 hours. These kinds of assertions may end up firing more often than others and need some ‘tailoring’ to work. But they also can serve as great ‘canaries in the coal mine’ – they’ll almost always fire first, before anything is ‘technically wrong’, and give you a situation that really ought to be looked at to see if it’s a problem.
The downside of sanity checks is often we might have tests that test corner conditions to ensure things work – these corner case tests are more likely to trigger sanity checks. It’s a design trade-off on whether you want to ban some corner conditions outright or make sure you work properly through them.
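A sanity check might look like this; the thresholds are invented for illustration and would need tailoring to a real product:

```python
MAX_SAFE_TEMP_F = 110  # hypothetical scald threshold
MIN_SANE_TEMP_F = 60   # hypothetical "why would you want this?" floor

def set_bathtub_temperature(temp_f):
    # Not hard physical limits, just things that "really ought to be true";
    # these canaries fire before anything is technically wrong
    assert temp_f <= MAX_SAFE_TEMP_F, f"scalding temperature requested: {temp_f}F"
    assert temp_f >= MIN_SANE_TEMP_F, f"suspiciously cold bath: {temp_f}F"
    return temp_f
```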
No side effects
Side effects are when code actually ‘does’ something, like print to the screen, write to a file, or send a message over the network. They should never occur in assertions!
This is because, in some languages, assertions can be turned off. Moreover, it usually is more readable when assertions are simply true or false, or functions that can return true or false. Assertions shouldn’t ‘do’ anything in and of themselves.
A common idiom is when a function returns a value saying whether it worked or not. The wrong way to check the return value would be to assert directly.
assert foo_prints_to_screen(1,2), "Foo should return true!"
The right way is to pull off the return value and assert just on that.
val = foo_prints_to_screen(1, 2)
assert val, "Foo should return True!"
While debugging, someone should be able to skip your assertions, or even comment them out, without breaking any code – an assertion should never be the only place a file gets read or a message gets sent.
A special kind of function that does nothing and just returns true or false is called a predicate and is a-ok to put in an assertion. Many times these predicate helpers make code more readable.
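For example, a hypothetical `is_sorted` predicate reads well inside an assertion and does nothing on its own:

```python
def is_sorted(seq):
    # A predicate: no side effects, just returns True or False
    return all(a <= b for a, b in zip(seq, seq[1:]))

scores = [70, 85, 85, 92]
assert is_sorted(scores), "scores must be sorted before a binary search"
```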
Nothing that takes a long time
Likewise, assertions are just supposed to represent quick promises about the code. They shouldn’t take too long themselves, as that would screw up performance numbers between when assertions are on and when they are off.
So only check variables or run predicates that you believe run quickly. I wouldn’t invert a thousand matrices to check an assertion. Using assertions to check long-running behavior is probably something better done as a test.
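One sketch of the trade-off, with hypothetical names: assert the constant-time properties, and leave the expensive check to a test.

```python
def invert(matrix):
    # Cheap, constant-time structural checks are fine in assertions
    assert len(matrix) > 0, "matrix must be non-empty"
    assert len(matrix) == len(matrix[0]), "matrix must be square"
    # Something like "assert is_invertible(matrix)" could cost O(n^3);
    # that check belongs in a test, not an assertion
    ...  # actual inversion would go here
```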
Consider writing predicate library helpers
Alluded to above, predicates can make assertions easier to sprinkle throughout your code as well as more readable. Don’t shy away from writing and using functions that are only used in assertions.
For example, comparing two floating point numbers directly like 3.14 and 3.15 is notoriously dangerous to do. The Numpy numerical computing library for Python has a function to do so – compare two floating point numbers and return true or false. This reads very well inside assertions.
def foo(x, y): assert numpy.isclose(x * 3.14, y * 3.15), "X and Y should be close after modifiers" ...other code...
Don’t be afraid to use others’ predicates or write your own!
What about exceptions or ‘known failures’?
What if your program is ‘supposed’ to tell the user their input was bad and ask them for more input? What if, on disconnect from the server, it’s not supposed to crash but reconnect? What if something happens that you can actually recover from?
We will talk about this – error and exception handling – later. For now, just assert that these things don’t happen.
Why? Shouldn’t we learn how to ‘do it right’ first?
There are a few reasons.
First, error handling is notoriously difficult to do. There are a few patterns I’ll introduce that already rely on a good understanding of assertions.
Second, you will handle precious few errors in your code well. This goes with the above but extends it. Error handling is hard, and unless you test your error handling code it’s almost certainly wrong; beyond that, there are simply too many possible errors to handle them all.
In C, printf can fail. Hardly anyone knows this, and even fewer know what it means when it fails or how to recover from it. Instead, they write code as if printf couldn’t fail. What asserts allow you to do is find a middle ground. They allow you to say “I don’t know how to handle this error, but I don’t want it to happen, and I’m not going to pretend it never happens.”
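In Python terms, that middle ground might look like this (the function is hypothetical):

```python
def save_record(path, data):
    with open(path, "wb") as f:
        written = f.write(data)
    # We don't know how to recover from a short write, but we refuse
    # to pretend it never happens
    assert written == len(data), f"short write: {written} of {len(data)} bytes"
```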
Error handling code, if taken to the extreme, can dominate your code base and make it incredibly difficult to read. It’s surprisingly tricky to get right. And most errors won’t be handled anyway, even if you try. You’ll produce higher quality code if you learn to assert as much as you can and convert those assertions to error handling code on a case by case basis.
We’ll discuss that later.
Relationship with TDD
Finally, assertion density has a very synergistic relationship with test driven development. Likewise, as we’ll get into later when we focus more on design, test driven design is also very synergistic with design by contract!
There are two things to think about when working with assertions and tests.
First, each assertion is only worth the number of times it’s executed. If I have 100 assertions in my code, and 10 in my test, but I only have one test, then I’ll execute 110 assertions over my code.
However, if I have 100 assertions in my code and 20 assertions split between two tests, then the 100 assertions in my code are exercised twice, potentially with different values. That results in 100 + 10 for test 1, plus 100 + 10 for test 2, for 220 assertions exercised.
Thus, the higher your assertion density, the more valuable writing another test is, and the higher number of tests you have, the more valuable it is to add assertions in the code. They go hand in hand!
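The arithmetic above generalizes; a quick sketch:

```python
def assertions_exercised(code_asserts, asserts_per_test, num_tests):
    # Every test run exercises all in-code assertions plus its own
    return num_tests * (code_asserts + asserts_per_test)

print(assertions_exercised(100, 10, 1))  # 110
print(assertions_exercised(100, 10, 2))  # 220
```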
Second, related to the above, assertions help squash the ‘exponential integration test’ problem.
Unit tests are the tests you’re most familiar with writing. They call a single function with known inputs and check that the outputs are what you expect.
Integration tests are tests which call code you wrote which call other code you wrote. It tests that all the code that you or your team has written works together, or integrates.
The problem is maybe you worked on unit 1, and your colleague has worked on unit 2. We then have to write one integration test that tests the connection between unit 1 and 2.
That’s pretty easy.
But when a third unit is written, now we have to write two more tests – one to test units 1 and 3, and another to test units 2 and 3. When a fourth unit is added, three more tests need to be added – units 1 and 4, 2 and 4, and 3 and 4. And pairs are only the beginning; combinations of three or more units could be tested too, multiplying the work further.
This quickly gets out of hand, ensuring that you can’t really have complete ‘integration coverage’ of tests like you can have line coverage from unit tests. There’s no way you can write a test for all possible combinations of units for any system of any level of complexity.
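A quick way to see the blow-up: pairwise integration tests alone grow as “n choose 2”, before even considering triples of units.

```python
from math import comb

# Pairwise integration tests needed as units are added
for n in range(2, 7):
    print(f"{n} units -> {comb(n, 2)} pairwise tests")
# 2 -> 1, 3 -> 3, 4 -> 6, 5 -> 10, 6 -> 15
```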
There’s a third level of testing we’ll call system testing – that’s when you test everything together. You don’t mock, stub, or fake out anything. So you test all units, 1, 2, 3 and 4. System level testing is pretty easy to set up, but hard to increase coverage on. Usually, I’d advise you write one or two high-level system tests, and then the rest as units to up your coverage.
A funny thing happens, though, when you have system tests and assertions. Let’s say unit 1 has a few preconditions and postconditions, unit 2 has the same, and so on. Our unit tests all exercise these preconditions and postconditions just fine. But every system test we write suddenly benefits from all of the assertions throughout our code. So we get a lot more assertions checked (as we stated above).
But there’s more! All of these assertions sit along unit ‘boundaries’. That is to say, we’re writing assertions at the beginnings and ends of all of our units (the ‘preconditions’ and ‘postconditions’), which is precisely where the units talk to each other (often called the boundary). This is what integration tests are supposed to check.
To sum up, assertions provide a basic level of integration testing ‘built in’.
To apply what I just said, if you find yourself struggling to figure out how to test something – often this happens when your code is complex enough that you’re trying to set up an integration level test of a few moving parts and it’s getting hairy – think about some set of assertions you can embed in the code instead. Often assertions are much easier to add than integration tests are to write!
Don’t type check… too much
We’re going to get into types (which eliminate black swans) in the next module. You can do some dynamic type checking with python using assertions by insisting that variables are certain types, or inherit from certain types. These are often good ideas for assertions at this stage, but don’t go overboard.
One of the strengths of python is that its code can be called on a lot of different ‘stuff’. In other words, if you have code that reverses a string, and you call it on a list, you probably have reversed your list. This is a good thing and is called ‘polymorphism’.
If you had done an assertion in there that the code only works on strings, you would have made the code needlessly dedicated to strings.
While we haven’t gotten into object orientation yet, or typeful programming, suffice it to say you should only assert on things that are ‘reversible’ – basically, things on which your algorithm would work. So beware making things too concrete.
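A sketch of asserting on capability rather than on one concrete type (the function is hypothetical):

```python
from collections.abc import Sequence

def reverse(seq):
    # Assert the capability the algorithm needs, not one concrete type;
    # asserting isinstance(seq, str) here would needlessly ban lists and tuples
    assert isinstance(seq, Sequence), "reverse needs an indexable sequence"
    return seq[::-1]

print(reverse("abc"))      # 'cba'
print(reverse([1, 2, 3]))  # [3, 2, 1]
```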
Moreover, it’s considered ‘pythonic’ to ask forgiveness rather than permission. That means code should try, from a typing perspective, to get as far as it can before crashing. This is not the same as failing fast on values, which is still encouraged.
(Types are ints, strings, floats, lists, dictionaries – the ‘category of container’ of a variable – while a value is the actual data the variable holds. For example, you can check the type of something in python with the ‘type’ function:
>>> x = 1
>>> type(x)
<class 'int'>
In the above, x has a type of int and a value of 1. Asserting on values is usually better than types, though some type-checking is fine!)
Don’t test your assertions
Writing a test to make sure assertions fire is a waste of time. It’s also impossible in some languages. In python, assertions are ‘exceptions’, which can be ‘caught’ and handled. In C++, assertions crash the program. Testing assertions is a great way to double the amount of work you have to do. Plus, assertions deeply embedded in some nasty complex code are hard to test, despite being some of the most valuable assertions!
From a psychological perspective, you’re going to find yourself eventually finding reasons not to test your code because it’s too complex. Testing can be, at times, painful. If you make assertions that painful, you’ll find excuses not to assert. Make assertions as easy as possible.
(TDD, by the way, also makes tests easier to write by doing them first. Unless a developer is particularly skilled in writing testable code, testing after development is done can be painful indeed.)
Instead, assertion ‘quality’ is most easily measured through peer review and even static analysis tools. There’s not a linter written right now for this, but it’d be trivial to write a linter that checked that there were assertions at the beginning and end of functions, as well as reporting the assertion density.
Note how interesting that is! We said above that linters only work on syntax issues – not logic. Assertions can work on logic. But how do we know if we’ve asserted well? That’s a syntax issue! We’ve ‘bootstrapped’ ourselves by combining tools together to make a formerly very hard problem only slightly hard.
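To make the ‘trivial to write’ claim concrete, here is a toy sketch of the density half using Python’s ast module; a real linter would need much more care.

```python
import ast

def assertion_density(source):
    # Assertions per 1000 lines of code, the metric from the study cited earlier
    tree = ast.parse(source)
    num_asserts = sum(isinstance(node, ast.Assert) for node in ast.walk(tree))
    num_lines = max(len(source.splitlines()), 1)
    return 1000 * num_asserts / num_lines
```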
Read this for more examples of how to use assertions in python.
While not required, this is sometimes a useful syntax construct to keep assertions as ‘one-liners’.
Clone this repo. It has the beginnings of some simple statistics functions. Use the following development process:
- Pick one of the functions (mean, median, range or standard deviation) and write a test for it.
- Ensure your test fails, and your code is pylint clean.
- Think of assertions you can run as a precondition for your function. Add those.
- Ensure your test still fails, and your code is pylint clean.
- Implement the function.
- Ensure your test passes, the assertion doesn’t fire, and your code is pylint clean.
- Pick another function and go back to step 1 until you’re done with all four.
- Open up a pull request and ask your mentor for peer review
Definitions of the functions can be found here as well as in the code.
Check out the ‘answer_key’ branch of the repo above to see some examples of good assertions. For students, don’t look at that branch – you’ll have more or less wasted your time reading this if you just cheat and look at the branch. You’ve already gotten this far, might as well give it a shot without looking at the answers right?
Also, don’t read ahead unless you’re a mentor – again, it just ruins things for you.
- Did your mentee use the Python statistics module? If so, ask them to reimplement the algorithms from requirements. Let them know that they did it absolutely right and in the real world, you would always use libraries. Commend them for doing research. Let them know that this challenge is also to practice implementing an algorithm from a definition, and that’s why we’re doing it that way.
- If they did not use the statistics module, ask why not? Good research at the beginning of any project to see what’s already been solved is valuable.
- Did your mentee implement the population standard deviation or the sample standard deviation? Whichever one they did, ask them to reimplement using the other way. Talk about how errors and defects can be in the requirements themselves – in this case, the requirements weren’t detailed enough for them to know which one to use.
- Discuss whether the tests and assertions helped in their initial development, and whether it helped in any ensuing refactors? Did the tests and assertions help them understand the problem, as well as shrink the problem space? Did they help ensure that everything still worked after refactors?
Peer reviews accomplish a number of things:
- They are one of the most cost-effective means of ensuring quality
- They spread general system knowledge across a team
- They spread best practices
The ideal peer review size in terms of ensuring quality alone is one person. The marginal benefit of adding more than one person to a review, at least in terms of quality, is low. However, there are effective ways to increase peer review team size if you have different reviewers focus on different things.
You ensure that reviewers have different foci by ensuring they come from different perspectives. I’ve seen three general patterns in perspective.
The Architect
The Architect, in this peer review role, will be a senior or very senior engineer who’s had her hands in a lot of parts of the system. She’ll tend to be a jill of all trades, master of none, and be most interested in larger issues.
They’ll best satisfy the “most cost-effective means of ensuring quality”.
Architects will focus on:
- Whether or not the code’s external API matches other project’s expectations
- Whether or not the code abides by a coding and design standard, for instance covering requirements such as
Architects also, to a degree, help ‘spread general system knowledge across a team’. They can bring other projects knowledge into a peer review, as well as export project specific knowledge out of this project into other peer reviews.
The Junior Developer
Junior-Senior pairs, which are one pairing pattern that seems to work very well (and which I have begun calling ‘Haseltine Pairs’), also provide another perspective in peer review.
They best satisfy “spreading general knowledge about the system” in peer reviews.
Specifically, having a junior peer review code will
- Help them better understand the larger project
- If they’re part of a Haseltine Pair on the project, help them understand how to debug, test, and document the project
Also important is the Junior Developer’s influence on the process itself. Having juniors on a review, and ensuring they’re confident in asking ‘dumb’ questions, helps code become more maintainable and readable. The Senior now knows she’s writing for an ‘audience’, the junior dev, who may or may not understand everything.
It’s important to remember that we’re all juniors when we first set foot in legacy code, even our own if it has been a few months! Having code written with a junior in mind very much helps keep things maintainable.
The Language Lawyer
The final person on a great peer review would be what’s been called the ‘Language Lawyer’. Language lawyers are your subject matter experts in the programming languages and technologies being used.
An important part of a good peer review is access to supporting documentation such as designs and requirements that help the reviewer understand “what problem is this trying to solve?” and “what are the broad swaths of the solution I should be looking for?” This can come from written documentation; however, ‘lore’ (informal, baked-in knowledge in a team that isn’t documented) is much stronger with the two people listed above. The architect knows the broader problem the code tries to solve, whereas the embedded junior / Haseltine junior has a rough idea of how the design is supposed to work.
What’s left for the language lawyer?
Well, the language lawyer ends up being effective precisely because he doesn’t need formal knowledge of the project to have an impact. Language lawyers are good ‘in the small’ – they know about the APIs, the libraries, the language rules, and idiomatic code. Thus, many of their comments won’t be on why a certain design was used versus something else, but rather, why the actual code was written the way it was.
The Language Lawyer best solves the “spreading best practices” problem by using Peer Reviews as a way to help the author become a better programmer.
They’ll look for:
- Nitty gritty bugs such as null pointer dereferences
- Small performance gains that don’t hurt readability
- Idiomatic code structures and improvements to code’s modularity that take advantage of a specific language
They also help improve quality, especially in error-prone languages.
Putting it all together
Small teams don’t always have many people to draw on to build the ideal peer review. And even in large teams, you don’t always have the make-up required to get your junior, your architect and your lawyer all in a peer review.
But working with this pattern can help you make quick decisions about who’s needed. If someone is a great programmer in Ruby, then the value added by a Haseltine Junior or Architect is higher than that of a language lawyer.
If your most big-picture senior person is pushing out some code, getting the big picture perspective probably isn’t going to get you as far as ensuring a language expert is involved.
And if someone’s spitting out code without help, getting a dedicated peer reviewer to play the part of a junior to ask the ‘dumb’ questions is going to help more than an architect or lawyer.
If you can build your ideal team, that’s great. If you can’t, look for what weaknesses remain and cover those to get the most out of your peer reviews.
Per the 5 pillars of quality, up next is static analysis. As always, treat all links as required reading unless stated otherwise.
What is Static Analysis?
Static analysis is a broad term used to categorize all tools that you can run on code to tell you whether it’s correct or not. It’s a program that you run on your program which tells you whether you’re making mistakes.
It’s called a static analyzer because it doesn’t run your code to figure out what’s wrong – it analyzes the code ‘statically’, while the code sits unchanged.
What kinds of Static Analysis are there?
There are three broad categories of static analysis tools out there. Linters, static analyzers proper, and model checkers / theorem provers.
The most prevalent, and the one you’ll be using from here on out, is called a linter. Linters “remove lint” from programs. They operate primarily on the text of the program itself, looking for simple stylistic mistakes. Think of them like spellcheckers. They more or less look at your program line by line and give you warnings if, for example, you use a variable name that is hard to understand, or if you switch between spaces and tabs.
The other categories (static analyzers, model checkers, theorem provers) can all eliminate harder and harder bugs to suss out, but require substantially more work. Python is a ‘dynamic language’, which means the entire program isn’t really defined until it’s running, and so ‘static’ analysis of the code itself tends to have too many unknowns to be worthwhile.
We’ll be investigating Pylint in particular which is primarily a linter but also does some more difficult checks as well by attempting to interpret your python without actually running it.
Benefits of Static Analysis
There are a number of benefits of static analysis as well as drawbacks, but most important to note is that many of these benefits and drawbacks tend to be complementary to testing, peer review, types / contracts and design (our other pillars). Static analysis in and of itself is of limited power, but combined with the other pillars of quality can be very powerful.
First off, stylistic checking ensures a code base has a single, consistent style. This helps maintainers as they can expect certain patterns in the whitespace, variable names and other parts of the code to read it more easily.
It also helps peer reviewers since, again, a single way to use whitespace, variable names, and other stylistic concerns make code easier to read than many different styles.
Absolute removal of certain kinds of bugs
Some bugs, such as variable misspellings, which would end up crashing your program at runtime, can be absolutely eliminated from your code base.
This is in contrast to testing. Testing can only show that the one path through the code that the test executes does not fail in any way that the test doesn’t expect. In other words, you can never really prove your program works via tests alone, since each test only proves that that one, single scenario worked.
Linters can prove that your program is free of certain kinds of bugs, completely and absolutely.
Very low cost in terms of time; quick turn around
Compared to contracts, peer reviews or tests, linting takes nearly no time at all to run. Tests take a lot of time to write, and later, to maintain. Peer reviews can involve multiple person hours as other developers look at your code.
Linting takes, usually, on the order of seconds. This is great for two reasons.
First, it means that the level of effort you have to get a clean lint is minimal compared to testing. You can squash a lot of bugs very quickly with linting, a lot more than you would via testing.
Second, it means you can lint often. In the previous chapter, I showed you how to automatically run your tests as files change. This is a great productivity tool as you can find out if you broke a test very early.
Trying to make sure that “bad thing” gets feedback ASAP is a key to learning, and it’s also a key to fixing “bad thing” fast. The mistake you just made is still fresh in your mind, so getting feedback on it means you don’t have to go looking for the bug – it’s right there, right where you were already working.
Tests and testing still require some care – tests can easily take minutes or hours, which means you have to start splitting up what tests run when. Usually, we like ‘unit’ tests to be our fast tests, the ones we can run automatically on changes, whereas other tests we may run nightly.
Linters, however, are super fast. They can be run faster than unit tests even. Many linters are actually built into text editors and IDEs so that when you save your file, the linter automatically runs and tells you what errors it has found (again, like spellcheck).
For static languages like C++ or Java, it’s often said that just getting your program to compile is like one big test. We don’t get that luxury in python – however, we can get most of it back by linting early and often. A clean lint is like a version of a test that runs quickly and eradicates many kinds of errors.
Can’t test your tests
Speaking of testing, it’s hard to test your tests, and it’s not always value-added to do so. TDD ensures you do some minimal testing of your tests – this is why you make sure the test fails first and then passes when you make code changes. All too often I’ve written tests after the fact, only to realize when a bug crept in that the way I wrote my test would never have found it, because I screwed up writing the test.
How do you ensure your test code is high quality, then? With the other pillars – static analysis in particular. Ensuring your test code has a clean lint gives some assurance that your tests are maintainable and readable, as well as free of certain kinds of errors. This, in turn, makes your tests easier to peer review for other issues.
It’s a virtuous cycle!
There are some downsides to expect from linting.
High false positive rate
Linters are going to find a lot of issues that just aren’t that important. Whitespace issues may hurt readability, but they’ll never crash a program. Variable names are nice to get consistent, but the interpreter doesn’t care.
Most of what you’ll be fixing will be things that may have never ended up crashing your program.
Fixing them, though, is often very simple. And you’ll get into a habit of breathing a sigh of relief when the linter runs and finds no issues in your code. You’ll become more confident as a coder, and be much more willing to take risks.
Types of bugs found usually aren’t that nefarious
Along with the above, the worst bugs are often those hardest to catch via linters. If you’re handling credit cards, making sure you debit the right account isn’t going to be something a linter can help you with. Making sure you don’t leak personally identifiable information is something linters would struggle to help you with too.
Often the bugs found are simpler readability and maintenance errors, as well as some actual defects that are pretty quick to learn how to avoid. On the other hand, linters prepare the code for people who can find those bugs in peer review and can give more assurance to test code that it’s correctly exercising your credit card and PII functionality.
Hard to do in dynamic languages
One final drawback is that linting is hard to do in dynamic languages, as discussed above. Things that some languages can spot via static analysis alone, like resource leaks (you grabbed memory from the operating system and forgot to give it back), aren’t going to be things Pylint can find.
On the other hand, linters end up being of about equivalent power to the compiler in dynamic languages – which is a great first step towards ensuring your program works. If another program reads it and says “I don’t see anything obviously wrong with this”, that’s some assurance.
Despite the drawbacks mentioned above, often we go along with fixing all the false positives as you don’t really know whether or not something is wrong until you try to fix it.
Code with lots of Pylint errors can be said to be ‘smelly’ code – we don’t know something is wrong for sure, but we need to check it out. Check out the write up here, and then skim a few of the code smells classified on C2.
Often you might fix one or two Pylint errors and three more will pop up. This is a sign that there’s actually a fundamental design flaw that leads the code to be brittle and hard to understand – even if on the surface it just seems like a few small warnings from Pylint.
If we keep the code squeaky clean, we’ll avoid any smell.
Pylint is pretty much the industry standard linter for python. It does a lot of stylistic checking based on what’s considered idiomatic python (codified in PEP 8) as well as some deeper analysis.
We’re going to loosely follow its tutorial, which involves the Caesar cipher, so you’ll want to read up on that.
First, fork this repo. Then create a branch in your forked repo where we’re going to do some work.
Then, go ahead and clean up the ceaser_script.py file using Pylint according to this tutorial.
When it’s clean, commit.
That is a workflow you might use if you inherit some code and want to clean it up – often running a linter on inherited code is a good way to both improve its readability as well as get familiar with it.
Next, we’ll work on a workflow that combines both linting and test driven development. In the future, you’ll be required to use the workflow practiced below!
The next step will be a little more difficult – create a new file, ceaser_test.py. We’re going to refactor the script you worked on before to be more reusable.
Write a test for a function you haven’t written yet in ceaser_test.py. The function will have the following signature: encode(message, offset), so you can call it like this: encode("beware the ides of march", 3), and get back the message shifted by a Caesar cipher with offset 3. (You’ll probably have to create the expected message by hand.)
Ensure this tests fails. You may have to put an empty function in
ceaser.pythat does nothing.
- Ensure pylint is clean.
ceaser_script.py, copy and paste some of the functionality into your encode function in
- Debug it until your test passes.
- Ensure pylint is clean.
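To make the shape of the first step concrete, here’s a sketch of what the first test in ceaser_test.py might look like. The expected ciphertext was worked out by hand; a throwaway reference encode is included inline purely so the snippet is self-contained – in the exercise, encode lives in ceaser.py and starts out empty:

```python
"""Sketch of ceaser_test.py, with a throwaway encode inlined for demonstration."""
import string
import unittest


def encode(message, offset):
    """Throwaway reference: shift lowercase letters by offset, keep the rest."""
    letters = string.ascii_lowercase
    result = ""
    for char in message:
        if char in letters:
            result += letters[(letters.index(char) + offset) % 26]
        else:
            result += char  # spaces and punctuation pass through unchanged
    return result


class EncodeTest(unittest.TestCase):
    """The first test you would write, before encode exists."""

    def test_offset_three(self):
        # "beware the ides of march" shifted by 3, worked out by hand
        self.assertEqual(encode("beware the ides of march", 3),
                         "ehzduh wkh lghv ri pdufk")
```

Run with python -m unittest ceaser_test; with only an empty encode stub in ceaser.py, this is the failing test the workflow asks for.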
- Write a test in ceaser_test.py for another function you haven’t written yet. It will have the signature decode(encoded_message, offset), so you can call it like decode("jewlrp ajk ippf kl aqjrk", 9) and get back a decoded English message using the Caesar cipher. (Again, you’ll have to create a message by hand – the text above is just random letters I made up, not an actual encoded message.)
- Ensure this test fails.
- Ensure pylint is clean.
- From ceaser_script.py, copy and paste some of the functionality into your decode function in ceaser.py.
- Debug until your test passes.
- Ensure pylint is clean.
- Open a pull request on your branch.
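One observation that can guide the decode step: decoding a Caesar cipher is just encoding with the opposite shift, so decode can be a one-liner once encode exists. A sketch below – the inline encode is again only there to make the snippet self-contained:

```python
"""Sketch: decode as the inverse of encode."""
import string


def encode(message, offset):
    """Shift lowercase letters by offset, wrapping; keep everything else."""
    letters = string.ascii_lowercase
    return "".join(
        letters[(letters.index(char) + offset) % 26] if char in letters else char
        for char in message
    )


def decode(encoded_message, offset):
    """Undo a Caesar cipher: encoding by -offset reverses encoding by +offset."""
    return encode(encoded_message, -offset)


print(decode(encode("et tu brute", 9), 9))  # prints "et tu brute"
```

A round-trip test like decode(encode(msg, n), n) == msg makes a nice extra assertion alongside the hand-built message.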
The above illustrates a pattern – in Test Driven Development with Static Analysis, every commit either adds a test or adds code, and every commit scores 10/10 on Pylint with 100% test coverage.
When a test fails and you can’t figure out why, break out the debugger. Stepping through new code in the debugger the first time you run it can also be a good practice.
Hook up Pylint to your Text Editor
Fixing things as soon as they happen creates a tight feedback loop that both makes you more productive and accelerates learning. It’s easiest to see during testing.
If you make a change to your code, and your tests fail, you know what you just changed. All the context is still in your head and you’re much more quickly able to debug code and get the test passing again. Moreover, you know that the changes you made in the code ended up affecting tests that you may have not predicted. You learned something about the code.
Compare that to making a lot of code changes, then days later, running the tests. A few fail. You have no idea what changes are tied to which failures. You can try taking a debugging approach, and you can look at your git diffs to see what’s changed, but this is a much more complex problem than above. You’ve already moved on, mentally, to other things. Debugging the same issues could take two to ten times longer.
The lesson? Debug as close as possible to when you introduced the bug.
Linters work like fast unit tests – something that can run in the background of your editor and flag issues as soon as possible. Again, since they act like a compiler for a dynamic language, they’re a single large global check for misspellings, syntax errors and other problems you’d otherwise have to wait for your tests to catch. Catching these immediately speeds you up, and lets you focus your testing effort on things the linter can’t catch – actual logic errors, rather than merely running the code to look for syntax problems.
Go ahead and use the instructions below to hook up Pylint to your editor of choice:
Hook up Pylint to Git
Another approach is to have git automatically reject any commit that doesn’t score a 10/10 from Pylint.
When running inside a text editor, Pylint decorates the current file. If you make changes to that file and Pylint gives you a clean bill of health, that doesn’t mean your changes haven’t broken other files.
For example, you may rename a function and forget to update the other places it’s used. Pylint would flag your current file as clean, but the other files that still use the old name would have errors.
Putting a Pylint check on commit lets you lint the whole project at the last moment, preventing erroneous code from entering the repo.
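A minimal sketch of such a hook, written in Python since git hooks can be any executable (the file name .git/hooks/pre-commit is git’s convention; everything else here is an assumption, not taken from the linked resources). It relies on Pylint’s --fail-under flag (available since Pylint 2.5), which makes Pylint exit non-zero below the given score:

```python
#!/usr/bin/env python3
"""Sketch of a .git/hooks/pre-commit hook that rejects commits below 10/10."""
import subprocess


def pylint_clean():
    """Return True when every tracked Python file gets a perfect Pylint score."""
    files = subprocess.run(
        ["git", "ls-files", "*.py"],
        capture_output=True, text=True, check=False,
    ).stdout.split()
    if not files:
        return True  # nothing to lint, nothing to reject
    # --fail-under=10 makes pylint exit non-zero unless the score is 10/10
    result = subprocess.run(["pylint", "--fail-under=10", *files], check=False)
    return result.returncode == 0


# In the real hook, finish with:
#     import sys
#     sys.exit(0 if pylint_clean() else 1)
# Git aborts the commit whenever the hook exits non-zero.
```

Remember to mark the hook executable (chmod +x .git/hooks/pre-commit), or git will silently skip it.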
Some additional resources can be found here.
What about false positives?
For the duration of these chapters, we’ll treat every pylint error as a real error. You’ll be expected to fix every one, whether you agree with it or not, unless your mentor explicitly tells you to ignore it.
That being said, in the real world you often have to make compromises. For that purpose, there are configuration files to turn off whole families of checks, suppression files to suppress warnings line by line, and inline suppressions. No task is tied to these, but go ahead and skim the links so you have a cursory understanding of how to squelch a Pylint error.
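For reference, the inline form looks like this. The pragma syntax and the message names (multiple-imports, missing-function-docstring) are standard Pylint; the code being suppressed is contrived for illustration:

```python
"""Illustrative Pylint suppressions; the flagged patterns here are contrived."""
# Line-level: silence one message on this line only
import os, sys  # pylint: disable=multiple-imports

# Block-level: disable a check from here to the end of the scope (or file)
# pylint: disable=missing-function-docstring


def helper():
    return os.path.basename(sys.executable)
```

Family-level suppression lives in a .pylintrc configuration file instead, under its disable= option.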
To move on…
When you’re done, you’ll need to provide your mentor with the following…
- show your mentor a 100% coverage report
- show your mentor a 10/10 Pylint report
- open a pull request on your code, and address any comments your mentor makes.
- show your mentor that you have pylint installed in your text editor
- show your mentor that you have a pylint hook in your git repo
- In addition to the above, check out each commit and ensure that each one is pylint clean.