The Skeptical Methodologist

Software, Rants and Management

SYWTLTC: (AB) Chapter 3.3: Assertions

“There are no Black Swans”

How do you ensure the above in a computer program?

Right now, we have two methods to ensure certain quality prepositions, tests: and linting. Tests, if you recall, set up a scenario and check that things work exactly like you expect. To ensure that there are no black swans in our program, our tests would end up looking like…

def test_scenario_1():
    go_to_north_america()
    for swan in get_all_swans():
        assert swan.color != "black"

def test_scenario_2():
    go_to_south_america()
    for swan in get_all_swans():
        assert swan.color != "black"

Testing this way can become onerous. You have to think of all possible situations (or places) where black swans may occur. These tests can also take a long time to run. And what do you get in the end? Well, it turns out there was an enclave of black swans living on the dark side of the Moon that you didn’t anticipate. For all that testing, you didn’t ensure there were no black swans.

What about linting? Linting, unlike testing, can guarantee the absence of a certain subset of errors in our code. But these are largely stylistic errors – linting cannot understand our code’s logic, it only processes it as text.

Syntax vs Logic Errors

There are two kinds of errors we might run into while programming – syntax errors and logic errors. Syntax errors are errors in the way the program is actually encoded. They happen when we’ve literally failed to write a program. They’re usually easy to find, and linters expand the universe of things we can consider syntax errors.

Logic errors, however, are harder to find. For instance, can you think of a linter that would catch the error below:

#This program prints "hello coit"
print("Hello Eric!")

What linter could catch the error above – that is, the comments are out of line with the behavior, and it leaves us wondering what exactly the program is supposed to do. Maybe the comments are wrong, maybe the program is wrong. Maybe both are wrong. A linter that would be able to spot the error above would have to know what the requirements are (what the program was supposed to do) as well as be able to parse and understand English to recognize the comments are out of line with the program.

Suffice it to say, such linters don’t exist. And many attempts to create programs that can understand requirements and English have been made – perhaps in the future, we’ll get programs smart enough to find the error above. But we don’t have them yet.

Back to Black Swans

Our black swans are encoded into the logic of our program. Testing can’t find them, and linters can’t rule them out.

We have two remaining arrows in our quiver – assertions and types. Types we’ll get to in the next module, and they are the only thing that can actually guarantee we don’t have black swans. For now, though, let’s look at assertions.

What’s an assertion

Assertions and the ‘assert’ keyword are things that should seem familiar to you. You use the assert keyword in your unit tests.

assert x == 1, "X should equal 1"

In Python, the above assertion implements the following behavior: it checks to see if the variable one equals the number 1. If it does, then we keep running our program. If it doesn’t, the program halts and the message “X should equal 1” is printed. That’s it. Our program is toast.

In our tests, our assert keyword is actually rewritten by the test framework to also keep track of some test code stuff. However, it works more or less the same – check something, if it’s false, report an error and crash.

Assertions and Black Swans

How do assertions help us catch black swans?

Well, in the above code we were told that they found an enclave of black swans on the dark side of the moon. Who’s they? Our clients, unfortunately. And they’re pissed because we said there were no black swans. That’s the reason they went with us rather than their competitor.

Assertions help us make promises like this. If instead of all the testing above, which again, ultimately didn’t turn up any black swans, we wrote code like this:

def handle_swan(swan):
    assert swan.color != "black", "We can't handle black swans!"
    ...do other stuff...

Then we’d guarantee not so much that we won’t have black swans, but instead, the program will do something predictable and safe if we do. That is, crash in a safe way, and let us know why. This ends up being incredibly valuable.

Some real life examples

Let’s say you’re writing a script that parses a text file. You glance at the file and it looks like the format is pretty regular, so you write some stuff and assume the rest of the file looks the same.

This program reads in some data does some transformations on it and writes it back out to the same file. It’s a big file, and it has to be fast, so you can’t keep a lot of stuff in memory.
Only someone accidentally put an error half way through the file. The next day, after your script runs, you realize you deleted half the file because you were off by one character in your parsing.

Oops.

In this case, you could have asserted that things lined up in the file like you expect. If they ever deviated, you’d crash the program, immediately, and leave the rest of the file unchanged. Then the next day you can debug what’s going on, and pat yourself on the back for not accidentally deleting your project.

Let’s say you’re writing some embedded code for an X-Ray machine. Your machine takes X-Rays of small children and finds things that might be cancer. To stay safe, the machine has to administer a dose dependent on the size and weight of the child.

However, that idiot Josh made some changes last minute and turns out, if you set the machine up just right, it thinks the kid is about the size of three elephants.

Ooops, you just killed a kid.

You could have asserted that the dosage never exceeds a certain amount, and now little Sally’s parents wouldn’t be casket shopping.

These two examples illustrate two strengths of assertions, discussed below.

First, Assertions Document Your Assumptions

When you assert in your text file that you should find the letter ‘c’ about 4 characters into each line, you’re basically saying to the reader “If C isn’t 4 letters in, then something is dreadfully wrong. I have no idea what’s going on and I should stop writing to this file”.
This ends up being a very valuable tool for two reasons.

First, any assumptions you are making in your code are now assumptions that every other reader and maintainer of the code (i.e., you in six months) now are aware of. This is invaluable communication, as many assumptions like these are so often either never written down, or written down in comments.

Comments are better than nothing, but comments aren’t executable. Assertions are. Assertions will crash your program if the assumptions change, and require you to rewrite bits. That’s okay and is often the desired behavior – wouldn’t you want to know when your assumptions need to be updated?

Second, it allows you to make more assumptions. Often you try and code for corner cases, errors that may or may not ever happen, or weird things that you’re not sure are impossible and thus want to handle. This makes code really complex. You could be lazy and just assume none of these things ever happen – but when they do, your stuff will break in unexpected ways and be very hard to debug.

Or you can just assert that they don’t happen. Then you’ve let the future maintainer know you didn’t handle that corner case, you fail in a known good way if it does happen, and it allows your code to cleanly do the thing it should do, and assert that all the other stuff never happens.

Basically, if you’re reading code and you are thinking “this should never happen” or “this is impossible” – then assert it. If you’re thinking “this must be the case”, then assert it! Some people say “that’s impossible to do, that assertion will never trigger and its a waste of time” – that’s precisely why you write the assert! Because you believe it’s impossible, you believe the assertion will never fire, so you should check that belief and write the assert.

You’ll be surprised how often it fires.

Second, Fail Fast and Fail Often

By littering your code with assertions, you can begin to adopt a ‘fail fast and often’ design mentality.

Often, when trying to make our code robust to violations of our assumptions, we try and think of every possible thing and handle it. This is called defensive coding. This makes the code very complicated, and make the ‘essential’ nature of what you’re doing buried under lines and lines of error handling.

Erlang is a programming language invented in the late ‘80’s by Ericsson (the mobile phone company). They wanted something that would keep their phone switches up with better reliability than what they had been using. When phone switches go down, people can’t talk. So they started trying to figure out how to make their switches never go down, or at least have very high uptime.

The standard advice of the day was defensive coding – make sure the switches never go down in the first place. Ericson, through Erlang, actually tried a different approach – instead of trying to limit failure, they just made sure their programs were really good and really fast at coming back up.

Erlang programs don’t try to handle errors. They just crash as soon as they can with something informative and then restart. This has lead to switches that have very high uptime, because while they can and do crash all the time, there’s always a backup running and the programs themselves start in milliseconds.

What we can learn from this is that code becomes greatly simplified if instead of trying to handle errors, we just crash in a known safe way when we encounter them. Assertions allow us to do that. Assertions tell us two things. First, they tell us when code fails, what went wrong. They give us something much more informative than code normally does when it fails – instead of a backtrace where we have to start theorizing what might

Assertions tell us two things. First, they tell us when code fails, what went wrong. They give us something much more informative than code normally does when it fails – instead of a backtrace that ultimately is where the program (already in an error state) finally did something that the OS killed it for, a backtrace where we have to start theorizing what might have lead the program to its bad state, we get a crash and a message of what exactly violated some assumption exactly where it first occurred.

The second thing they tell us is that if our program doesn’t crash, then all of our assumptions were true for that run of the program! This gives you confidence that your program works. So often, programs appear to work, but later we ask ‘how could this have ever worked?’ Assertions get rid of those sorts of errors. If the program worked, it worked in precisely the way you intended it to work.

Not when there’s nothing left to add, but rather when there’s nothing left to take away

Art, Antoine de Saint Exupéry tells us, isn’t done when there’s ‘nothing left to add’, but rather when there’s ‘nothing left to take away’. This is important with software design too.

Often people are enticed by languages that make it easy to do ‘a lot of things’. They’re considered powerful languages – it’s easy to do anything in them, and with very little work. C is considered powerful for this reason.But all that power can actually be very limiting. In contrast to the French Poet quoted above, the great Philosopher, Spiderman, has taught us that “With Great Power comes Great Responsibility”. Sometimes this responsibility is too much.

How do we make our languages less powerful? How do we make our programs capable of less rather than more? With assertions. Assertions tell us that the program is now incapable of a whole way of working. If we assert X > 0, that means the program is incapable of doing anything if X is less than or equal to 0.

Theoretically, when we’re done, we have a program that’s capable of only one thing, and that one thing is what it was designed to do.

Assertion Density

If poetry and comics don’t convince you, perhaps science will. This study, done by Microsoft in 2006, shows that as assertion density went up (assertions per 1000 lines of code), defect density went down (defects per 1000 lines of code).

It also showed that many of the defects that were eventually found in the bug database for these projects were found via the use of assertions.

Debugging can take up the lion’s share of your time. Hopefully, you’ve already found through code combat or the simple code exercises in the chapters so far that your initial coding doesn’t take too long. What takes a long time is when something doesn’t go as planned and you have to figure out what.

Every bug is different, and so often your mentor might be powerless to help you. Instead, she probably has to sit down, step through your code, and try and replicate the error herself. How to write a for loop? She knows off the top of her head. Why your for loop doesn’t work? That she doesn’t know, and she won’t until she spends a while debugging it.  Debugging takes a long time, and it’s not fun.

Assertions reduce the number of bugs you generate and reduce the amount of time it takes to debug the ones you do generate. They’re very valuable.

Best Practices

Code contracts as an idiom. More on this later…

We’ll get into ‘design by contract’ later, but a quick introduction is in order to help you think of ways to introduce assertions into your code.

First, the precondition. Preconditions are things that should be true of your program’s state at the ‘beginning’ of a function. If I have a function that takes in two arguments and one needs to be larger than the other, I can assert that with a precondition. Preconditions most closely model ‘assumptions’ in code.

def foo(x, y):
    assert x > y, "X should be larger than y!"
    ...rest of foo...

Second, the postcondition is similar, but makes promises about the return value of functions rather than arguments of a function. For example, maybe foo has to return an integer larger than 10. Post-conditions most closely model “promises” you can make to other parts of code.

def foo(x, y):
    assert x > y, "X should be larger than y!"
    ....other parts of foo...
    assert return_value > 10, "Return of foo needs to be larger than 10!"
    return return_value

Try to think in terms of preconditions and postconditions – what should you assume of the arguments of every function you write? Better yet, what can you assume to make writing the function easier? Assert it!

What should you promise? Can you promise more? If so, do it!

Sanity Checks

Another pattern for adding assertions is the ‘sanity check’. This idea weakens the idea of something that ‘must be’ true, or ‘should be’ true to something that ‘really ought to be true, I think’.

If you’re a bathtub, it really ought to be the case that the temperature can’t be set to above scalding. If you’re a microwave, nothing really ought to be in there for 99 hours. These kinds of assertions may end up firing more often than others and need some ‘tailoring’ to work. But they also can serve as great ‘canaries in the coal mine’ – they’ll almost always fire first, before anything is ‘technically wrong’, and give you a situation that really ought to be looked at to see if it’s a problem.

The downside of sanity checks is often we might have tests that test corner conditions to ensure things work – these corner case tests are more likely to trigger sanity checks. It’s a design trade-off on whether you want to ban some corner conditions outright or make sure you work properly through them.

No side effects

Side effects are when code actually ‘does’ something, like print to the screen, write to a file, or send a message over the network. They should never occur in assertions!

This is because, in some languages, assertions can be turned off. Moreover, it usually is more readable when assertions are simply true or false, or functions that can return true or false. Assertions shouldn’t ‘do’ anything in and of themselves.A common idiom is when a function returns a value saying whether it worked or not. The wrong way to check the return value would be to assert directly.

A common idiom is when a function returns a value saying whether it worked or not. The wrong way to check the return value would be to assert directly.

assert foo_prints_to_screen(1,2), "Foo should return true!"

The right way is to pull off the return value and assert just on that.

val = foo_prints_to_screen(1,2)
assert val, "Foo should return True!"

If debugging, someone should be able to skip your assertions or even comment them out and not have any code no longer work because you no longer read correctly from a file or something.

A special kind of function that does nothing and just returns true or false is called a predicate and is a-ok to put in an assertion. Many times these predicate helpers make code more readable.

Nothing that takes a long time

Likewise, assertions are just supposed to represent quick promises about the code. They shouldn’t take too long themselves, as that would screw up performance numbers between when assertions are on and when they are off.

So only check variables or run predicates that you believe run quickly. I wouldn’t invert a thousand matrices to check an assertion. Using assertions to check long-running behavior is probably something better done as a test.

Consider writing predicate library helpers

Alluded to above, predicates can make assertions easier to sprinkle throughout your code as well as more readable. Don’t shy away from writing and using functions that are only used in assertions.

For example, comparing two floating point numbers directly like 3.14 and 3.15 is notoriously dangerous to do. The Numpy numerical computing library for Python has a function to do so – compare two floating point numbers and return true or false. This reads very well inside assertions.

def foo(x, y):
    assert numpy.isclose(x * 3.14, y * 3.15), "X and Y should be close after modifiers"
    ...other code...

Don’t be afraid to use other’s predicates or write your own!

What about exceptions or ‘known failures’?

What if your program is ‘supposed’ to tell the user their input was bad and ask them for more input? What if, on disconnect from the server, it’s not supposed to crash but reconnect? What if something happens that you can actually recover from?

We will talk about this – error and exception handling – later. For now, just assert that these things don’t happen.

Why? Shouldn’t we learn how to ‘do it right’ first?

There are a few reasons.

First, error handling is notoriously difficult to do. There are a few patterns I’ll introduce that already rely on a good understanding of assertions.

Second, you will handle preciously few errors in your code well. This goes with the above but extends it. Not only is error handling hard, and unless you test your error handling code it’s almost certainly wrong, but there’re just too many possible errors to handle.

In C, the printf can fail. Hardly anyone knows this. And even fewer know what it means when it fails or how to recover from it. Instead, they write code as if printf didn’t fail. What asserts allow you to do is find a middle ground. They allow you to say “I don’t know how to handle this error, but I don’t want it to happen, and I’m not going to pretend it never happens.”

Error handling code, if taken to the extreme, can dominate your code base and make it incredibly difficult to read. It’s surprisingly tricky to get right. And most errors won’t be handled anyway, even if you try. You’ll produce higher quality code if you learn to assert as much as you can and convert those assertions to error handling code on a case by case basis.

We’ll discuss that later.

Relationship with TDD

Finally, assertion density has a very synergistic relationship with test driven development. Likewise, as we’ll get into later when we focus more on design, test driven design is also very synergistic with design by contract!

There are two things to think about when working with assertions and tests.
First, each assertion is only worth the number of times it’s executed. If I have 100 assertions in my code, and 10 in my test, but I only have one test, then I’ll execute 110 assertions over my code.

However, if I have 100 assertions in my code, and have 20 split between two tests… That means the 100 assertions in my code are exercised twice, with potentially different values. That results in 100 + 10 for test 1 and  100 + 10 for test 2 = 220 assertions exercised.

Thus, the higher your assertion density, the more valuable writing another test is, and the higher number of tests you have, the more valuable it is to add assertions in the code. They go hand in hand!

Second, related to the above, assertions help squash the ‘exponential integration test’ problem.

Unit tests are the tests you’re most familiar with writing. They call a single function with known inputs and check that the outputs are what you expect.

Integration tests are tests which call code you wrote which call other code you wrote. It tests that all the code that you or your team has written works together, or integrates.
The problem is maybe you worked on unit 1, and your colleague has worked on unit 2. We then have to write one integration test that tests the connection between unit 1 and 2.

That’s pretty easy.

But when a third unit is written, now we have to write two more tests – one to test unit 1 and 3, and another to test unit 2 and 3. When a fourth unit is added, three more tests need to be added – units 1 and 4, 2 and 4 and 3 and 4. We can actually test a lot more than this for every unit added to the system.

This quickly gets out of hand, ensuring that you can’t really have complete ‘integration coverage’ of tests like you can have line coverage from unit tests. There’s no way you can write a test for all possible combinations of units for any system of any level of complexity.

There’s a third level of testing we’ll call system testing – that’s when you test everything together. You don’t mock, stub, or fake out anything. So you test all units, 1, 2, 3 and 4. System level testing is pretty easy to set up, but hard to increase coverage on. Usually, I’d advise you write one or two high-level system tests, and then the rest as units to up your coverage.

A funny thing happens, though, when you have system tests and assertions. Let’s say unit 1 has a few preconditions and postconditions, unit 2 as the same, and so on. Our unit tests all exercise these preconditions and post conditions just fine. But every system test we write suddenly benefits from all of the assertions we have through our code. So we get a lot more assertions checked (as we stated above).

But there’s more! All of these assertions are along unit ‘boundaries’. That is to say, we’re writing assertions at the beginning and ends of all of our units (the ‘preconditions’ and ‘postconditions’, which is precisely where they talk to each other (often called the boundary). This is what integration tests are supposed to check.

To sum up, assertions basically provide a basic level of integration testing ‘built in’.

To apply what I just said, if you find yourself struggling to figure out how to test something – often this happens when your code is complex enough that you’re trying to set up an integration level test of a few moving parts and it’s getting hairy – think about some set of assertions you can embed in the code instead. Often assertions are much easier to add than integration tests are to write!

Don’t type check… too much

We’re going to get into types (which eliminate black swans) in the next module. You can do some dynamic type checking with python using assertions by insisting that variables are certain types, or inherit from certain types. These are often good ideas for assertions at this stage, but don’t go overboard.

One of the strengths of python is that its code can be called on a lot of different ‘stuff’. In other words, if you have code that reverses a string, and you call it on a list, you probably have reversed your list. This is a good thing and is called ‘polymorphism’.If you had done an assertion in there that the code only works on strings, you would have made the code needlessly dedicated to strings.

If you had done an assertion in there that the code only works on strings, you would have made the code needlessly dedicated to strings.

While we haven’t gotten into object orientation yet, or typeful programming, suffice it to say you should only assert on things that are ‘reversible’ – basically, things on which your algorithm would work. So beware making things too concrete.

Moreover, it’s considered ‘pythonic’ to ask forgiveness rather than permission. That means code should try (from typing perspective) to get as far as it can before crashing. This is not the same as failing fast on a value perspective.

(Types are ints, strings, floats, lists, dictionaries or the ‘category of container’ of a variable, while values are what the value actually is. For example, you can check the type of something in python with the ‘type’ function:

>>> x =  1
>>> type(x)
<type 'int'>

In the above, x has a type of int and a value of 1. Asserting on values is usually better than types, though some type-checking is fine!)

Don’t test your assertions

Writing a test to make sure assertions fire is a waste of time. It’s also impossible in some languages. In python, assertions are ‘exceptions’, which can be ‘caught’ and handled. In C++, assertions crash the program. Testing assertions is a great way to double the amount of work you have to do. Plus, assertions deeply embedded in some nasty complex code are hard to test, despite being some of the most valuable assertions!

From a psychological perspective, you’re going to find yourself eventually finding reasons not to test your code because it’s too complex. Testing can be, at times, painful. If you make assertions that painful, you’ll find excuses not to assert. Make assertions as easy as possible.

(TDD, by the way, also makes tests easier to write by doing them first. Unless a developer is particularly skilled in writing testable code, testing after development is done can be painful indeed.)

Instead, assertion ‘quality’ is most easily measured through peer review and even static analysis tools. There’s not a linter written right now for this, but it’d be trivial to write a linter that checked that there were assertions at the beginning and end of functions, as well as reporting the assertion density.

Note how interesting that is! We said above that linters only work on syntax issues – not logic. Assertions can work on logic. But how do we know if we’ve asserted well? That’s a syntax issue! We’ve ‘bootstrapped’ ourselves by combining tools together to make a formerly very hard problem only slightly hard.

Code Challenge

Read this for more examples of how to use assertions in python.

For some ideas on how to do useful type checks in python, check out the third option on the first answer here and the whole of the first answer here.

While not required, this is sometimes a useful syntax construct to keep assertions as ‘one-liners’.

Clone this repo. It has the beginnings of some simple statistics functions. Use the following development process:

  1. Pick one of the functions (mean, median, range or standard deviation) and write a test for it.
  2. Ensure your test fails, and your code is pylint clean.
  3. Commit.
  4. Think of assertions you can run as a precondition for your function. Add those.
  5. Ensure your test still fails, and your code is pylint clean.
  6. Commit.
  7. Implement the function.
  8. Ensure your test passes, the assertion doesn’t fire, and your code is pylint clean.
  9. Commit
  10. Pick another function and go back to step 1 until you’re done with all four.
  11. Open up a pull request and ask your mentor for peer review

Definitions of the functions can be found here as well as in the code.

Mean
Median
Standard Deviation
Range

For Mentors

Check out the ‘answer_key’ branch of the repo above to see some examples of good assertions. For students, don’t look at that branch – you’ll have more or less wasted your time reading this if you just cheat and look at the branch. You’ve already gotten this far, might as well give it a shot without looking at the answers right?

Also, don’t read ahead unless you’re a mentor – again, it just ruins things for you.

  • Did your mentee use the Python statistics module? If so, ask them to reimplement the algorithms from requirements. Let them know that they did it absolutely right and in the real world, you would always use libraries. Commend them for doing research. Let them know that this challenge is also to practice implementing an algorithm from a definition, and that’s why we’re doing it that way.
  • If they did not use the statistics module, ask why not? Good research at the beginning of any project to see what’s already been solved is valuable.
  • Did your mentee implement the population standard deviation or the sample standard deviation? Whichever one they did, ask them to reimplement using the other way. Talk about how errors and defects can be in the requirements themselves – in this case, the requirements weren’t detailed enough for them to know which one to use.
  • Discuss whether the tests and assertions helped in their initial development, and whether it helped in any ensuing refactors? Did the tests and assertions help them understand the problem, as well as shrink the problem space? Did they help ensure that everything still worked after refactors?
Advertisements

December 19, 2016 - Posted by | Uncategorized

No comments yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: