The Skeptical Methodologist

Software, Rants and Management

SYWTLTC: (AB) Chapter 3.2 Quality : Static Analysis

Go here if you want the prologue and table of contents to the SYWTLTC series!

Per the 5 pillars of quality, up next is static analysis. As always, treat all links as required reading unless stated otherwise.

What is Static Analysis?

Static analysis is a broad term for any tool that you run on your code to check it for mistakes. It’s a program that you run on your program to tell you where you’ve probably gone wrong.

It’s called a static analyzer because it doesn’t run your code to figure out what’s wrong. Instead, it analyzes the code “statically”: as it is written, not as it behaves while running.

What kinds of Static Analysis are there?

There are three broad categories of static analysis tools out there: linters, static analyzers proper, and model checkers / theorem provers.

The most prevalent, and the one you’ll be using from here on out, is called a linter. Linters “remove lint” from programs. They operate primarily on the text of the program itself, looking for simple stylistic mistakes. Think of them like spellcheckers. They more or less look at your program line by line and give you warnings if, for example, you use a variable name that is hard to understand, or if you switch between spaces and tabs.
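To make the spellchecker comparison concrete, here is a small sketch. Both functions are made up for illustration and compute the same thing; only the second is written the way a linter such as Pylint would typically accept. Which messages you actually see depends on the linter and its configuration, so treat the comments as indicative rather than as exact output.

```python
# Hypothetical example: two functions with identical behavior.

# A linter would typically complain here: the name is not snake_case,
# there is no docstring, and the body is crammed onto the def line.
def F(a, b): return a + b


# The same logic, written in the style a linter nudges you towards.
def shift_position(position, offset):
    """Return the position moved forward by offset."""
    return position + offset
```

Neither version is "more correct" as far as the interpreter is concerned, which is exactly why this kind of check is closer to spellchecking than to testing.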

The other categories (static analyzers, model checkers, theorem provers) can each root out progressively harder-to-find bugs, but require substantially more work. Python is a ‘dynamic language’, which means much of the program isn’t really defined until it’s running, and so deeper ‘static’ analysis of the code itself tends to have too many unknowns to be worthwhile.

We’ll be investigating Pylint in particular, which is primarily a linter but also does some more difficult checks by attempting to interpret your Python without actually running it.

Benefits of Static Analysis

There are a number of benefits of static analysis as well as drawbacks, but most important to note is that many of these benefits and drawbacks tend to be complementary to testing, peer review, types / contracts and design (our other pillars). Static analysis in and of itself is of limited power, but combined with the other pillars of quality can be very powerful.

Consistent style

First off, stylistic checking ensures a code base has a single, consistent style. This helps maintainers as they can expect certain patterns in the whitespace, variable names and other parts of the code to read it more easily.

It also helps peer reviewers since, again, a single way to handle whitespace, variable names, and other stylistic concerns makes code easier to read than many different styles.

Absolute removal of certain kinds of bugs

Some bugs, such as variable misspellings, would end up crashing your program at runtime. These can be absolutely eliminated from your code base.

This is in contrast to testing. Testing can only show that the one path through the code that the test executes does not fail in any way that the test doesn’t expect. In other words, you can never really prove your program works via tests alone, since each test only proves that that one, single scenario worked.

Linters can prove that your program is free of certain kinds of bugs, completely and absolutely.
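As a rough illustration of the difference, consider the sketch below. The function and its names are hypothetical. It contains a misspelled variable on a branch that a casual test never reaches; Pylint reports this kind of mistake as an undefined variable without running the code at all.

```python
def describe_order(quantity):
    """Return a short description of an order (hypothetical example)."""
    if quantity > 100:
        # Typo: 'quantty' is never defined. Python only notices when this
        # branch actually runs; a linter can flag the name up front.
        return "bulk order of " + str(quantty)
    return "order of " + str(quantity)


# A test that only exercises the common path passes, so the typo stays hidden
# until some unlucky caller asks for more than 100 items.
assert describe_order(5) == "order of 5"
```

Calling describe_order(500) raises a NameError at runtime. A test suite finds that only if someone thought to write a test for the bulk branch, while the linter checks every branch every time.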

Very low cost in terms of time; quick turn around

Compared to contracts, peer reviews or tests, linting takes nearly no time at all to run. Tests take a lot of time to write, and later, to maintain. Peer reviews can involve multiple person hours as other developers look at your code.

Linting takes, usually, on the order of seconds. This is great for two reasons.

First, it means that the effort it takes to get a clean lint is minimal compared to testing. You can squash a lot of bugs very quickly with linting, a lot more than you would via testing.

Second, it means you can lint often. In the previous chapter, I showed you how to automatically run your tests as files change. This is a great productivity tool as you can find out if you broke a test very early.

Trying to make sure that “bad thing” gets feedback ASAP is a key to learning, and it’s also a key to fixing “bad thing” fast. The mistake you just made is still fresh in your mind, so getting feedback on it means you don’t have to go looking for the bug – it’s right there, right where you were already working.

Tests and testing still require some care – tests can easily take minutes or hours, which means you have to start splitting up what tests run when. Usually, we like ‘unit’ tests to be our fast tests, the ones we can run automatically on changes, whereas other tests we may run nightly.

Linters, however, are super fast. They can run even faster than unit tests. Many linters are actually built into text editors and IDEs so that when you save your file, the linter automatically runs and tells you what errors it has found (again, like spellcheck).

For static languages like C++ or Java, it’s often said that just getting your program to compile is like one big test. We don’t get that luxury in Python. However, we can get most of it back by linting early and often. A clean lint is like a version of a test that runs quickly and eradicates many kinds of errors.

Can’t test your tests

Speaking of testing, it’s hard to test your tests, and doing so doesn’t always add value. TDD ensures you do some minimal testing of your tests: this is why you make sure the test fails first and then passes once you change the code. All too often I have written tests after the fact and then, when a bug crept in, realized that my test would never have found it because I screwed up writing the test.

How do you ensure your test code is high quality then? With the other pillars, static analysis in particular. Ensuring your test code has a clean lint gives some assurance that your tests are maintainable and readable, as well as free of certain kinds of errors. This, in turn, makes your tests easier to peer review for other issues.

It’s a virtuous cycle!

Drawbacks

There are some downsides to expect from linting.

High false positive rate

Linters are going to find a lot of issues that just aren’t that important. Whitespace issues may hurt readability, but they’ll never crash a program. Variable names are nice to get consistent, but the interpreter doesn’t care.

Most of what you’ll be fixing will be things that may have never ended up crashing your program.

Fixing them, though, is often very simple. And you’ll get into a habit of breathing a sigh of relief when the linter runs and finds no issues in your code. You’ll become more confident as a coder, and be much more willing to take risks.

Types of bugs found usually aren’t that nefarious

Along with the above, the worst bugs are often those hardest to catch via linters. If you’re handling credit cards, making sure you debit the right account isn’t going to be something a linter can help you with. Making sure you don’t leak personally identifiable information is something linters would struggle to help you with too.

Often the bugs found are simpler readability and maintenance errors, as well as some actual defects that are pretty quick to learn how to avoid. On the other hand, linters prepare the code for people who can find those bugs in peer review and can give more assurance to test code that it’s correctly exercising your credit card and PII functionality.

Hard to do in dynamic languages

One final drawback is that linting is hard to do in dynamic languages, as discussed above. This means that things some languages can spot via static analysis alone, like resource leaks (you grabbed memory from the operating system and forgot to give it back), aren’t going to be things Pylint can find.

On the other hand, linters end up being of about equivalent power to the compiler in dynamic languages – which is a great first step towards ensuring your program works. If another program reads it and says “I don’t see anything obviously wrong with this”, that’s some assurance.

Smelly Code

Despite the drawbacks mentioned above, we often go along with fixing all the false positives, since we don’t really know whether or not something is wrong until we try to fix it.

Code with lots of Pylint errors can be said to be ‘smelly’ code: we don’t know something is wrong for sure, but we need to check it out. Check out the write-up here, and then skim a few of the code smells classified on C2.

Often you might fix one or two Pylint errors and three more will pop up. This is a sign that there’s actually a fundamental design flaw that leads the code to be brittle and hard to understand – even if on the surface it just seems like a few small warnings from Pylint.

If we keep the code squeaky clean, we’ll avoid any smell.

Pylint

Pylint is pretty much the industry standard linter for Python. It does a lot of stylistic checking based on PEP 8, the style guide that describes what’s considered idiomatic Python, as well as some deeper analysis.

Go download it here.

Challenge

We’re going to loosely follow its tutorial, which involves the Caesar cipher, so you’ll want to read up on that.

First, fork this repo. Then create a branch in your forked repo where we’re going to do some work.

Then, go ahead and clean up the ceaser_script.py file using Pylint according to this tutorial.

 

When it’s clean, commit.

That is a workflow you might use if you inherit some code and want to clean it up – often running a linter on inherited code is a good way to both improve its readability as well as get familiar with it.

Next, we’ll work on a workflow that combines both linting and test driven development. In the future, you’ll be required to use the workflow practiced below!

The next step will be a little more difficult. Create two new files, ceaser.py and ceaser_test.py. We’re going to refactor the script you worked on before to be more reusable.

 

  1. Write a test in ceaser_test.py for a function you haven’t written yet. The function will have the signature encode(message, offset), so you can call it like this: encode("beware the ides of march", 3), and get back the message encoded with a Caesar cipher of offset 3. (You’ll probably have to create the expected message by hand.)

  2. Ensure this test fails. You may have to put an empty function in ceaser.py that does nothing.

  3. Ensure pylint is clean.
  4. Commit
  5. Using ceaser_script.py, copy and paste some of the functionality into your encode function in ceaser.py

  6. Debug it until your test passes.
  7. Ensure pylint is clean.
  8. Commit
  9. Write a test in ceaser_test.py for a function you haven’t written yet. The function will have the signature decode(encoded_message, offset), so you can call it like this: decode("jewlrp ajk ippf kl aqjrk", 9), and get back a decoded English message using the Caesar cipher. (Again, you’ll have to create a message by hand; the one above is just random letters I made up, not an actual message.)

  10. Ensure pylint is clean.
  11. Commit
  12. Using ceaser_script.py, copy and paste some of the functionality into your decode function.

  13. Debug until your test passes.
  14. Ensure pylint is clean.
  15. Commit
  16. Open a pull request on your branch.

The above illustrates a pattern – in Test Driven Development with Static Analysis, every commit should either be adding a test or code. Every commit has 10/10 on pylint and 100% coverage.

When a test fails and you can’t figure out why, break out the debugger. It can also be good practice to step through new code in the debugger the first time you run it.

Hook up Pylint to your Text Editor

Fixing things as soon as they happen creates a tight feedback loop that both makes you more productive and accelerates learning. It’s easiest to see during testing.

If you make a change to your code, and your tests fail, you know what you just changed. All the context is still in your head and you’re much more quickly able to debug code and get the test passing again. Moreover, you know that the changes you made in the code ended up affecting tests that you may have not predicted. You learned something about the code.

Compare that to making a lot of code changes, then days later, running the tests. A few fail. You have no idea what changes are tied to which failures. You can try taking a debugging approach, and you can look at your git diffs to see what’s changed, but this is a much more complex problem than above. You’ve already moved on, mentally, to other things. Debugging the same issues could take two to ten times longer.

The lesson? Debug as close as possible to when you added the bug.

Linters work like fast unit tests: something that can run in the background of your editor and let you know about issues as soon as possible. Again, since they work like a compiler for a dynamic language, they’re a single large global test for things like misspellings, syntax errors and other things you’d otherwise have to wait for your tests to catch. Catching them as soon as possible speeds you up, and allows you to focus your testing efforts on things the linter can’t catch, like actual logic errors, rather than merely running the code looking for syntax problems.

Go ahead and use the instructions below to hook up Pylint to your editor of choice:

Pylint for Sublime

Pylint for Vim

Pylint for PyCharm

Hook up Pylint to Git

Another approach is to have git automatically reject any commit that doesn’t have a 10/10 out of Pylint.

When running from inside a text editor, Pylint decorates the current file. If you make changes to that file, and Pylint gives you a clean bill of health, that doesn’t mean that your changes didn’t suddenly break other files.

For example, you may rename a function and forget to rename the other places where it is used. Pylint would flag your current file as clean, but the other files where that function is still used under its old name would now have errors.

Adding a Pylint check on commit lets you lint the whole project at the last moment, which prevents erroneous code from being added to the repo.
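To give a feel for how such a check can work, here is a minimal sketch of a pre-commit hook written in Python. It is not the implementation from the repo linked below; the function names are hypothetical, and it assumes Pylint is installed for the same interpreter that runs the hook. It relies on Pylint exiting with a non-zero status whenever it reports any message, and on git aborting a commit when the pre-commit hook exits non-zero.

```python
#!/usr/bin/env python3
"""Hypothetical .git/hooks/pre-commit: block the commit unless Pylint is clean."""
import subprocess
import sys


def tracked_python_files(run=subprocess.run):
    """Return every tracked *.py path, so the whole project gets linted."""
    result = run(
        ["git", "ls-files", "*.py"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def pylint_is_clean(paths, run=subprocess.run):
    """Return True when Pylint reports no messages for the given paths."""
    if not paths:
        return True  # nothing to lint
    # Pylint's exit status is 0 only when it emitted no messages at all.
    return run([sys.executable, "-m", "pylint", *paths]).returncode == 0


def main():
    """Return the status the hook should exit with: 0 allows the commit."""
    return 0 if pylint_is_clean(tracked_python_files()) else 1


# In the actual hook file, finish with:  sys.exit(main())
```

The run parameter exists only so the two helpers can be exercised without a real repository. Linting every tracked file is slower than linting just the changed ones, but it is what catches the rename problem described above.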

Check out this repo and add it to your own fork of the Caesar project above.

Some additional resources can be found here.

What about false positives?

For the duration of these chapters, we’ll treat every Pylint error as a real error. You’ll be expected to fix every one, whether you agree or not, unless your mentor explicitly tells you to ignore it.

That being said, in the real world, you often have to make compromises. For that purpose, there are configuration files to turn off families of checks, suppression files to suppress warnings line by line, as well as inline suppressions. No task is tied to these, but go ahead and skim these links so you have a cursory understanding of how to squelch a Pylint error.

Example of inline suppressions

How to add suppressions to the config file

Example config file
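For a quick taste before you follow those links, here is a small sketch of an inline suppression. The callback and the framework it mentions are hypothetical; the message name unused-argument is a real Pylint check, chosen here because it is a common case where the warning is technically right but the code cannot change.

```python
# Suppose a (hypothetical) GUI framework always passes an event object to
# click handlers. We must accept the argument even though we never use it,
# so we silence that one warning on that one line.
def on_button_click(event):  # pylint: disable=unused-argument
    """Handle a click without needing any details from the event."""
    return "clicked"


# The project-wide equivalent lives in the Pylint config file, for example:
#   [MESSAGES CONTROL]
#   disable=unused-argument
```

The inline form keeps the exception visible right where it applies, while the config file form turns the check off everywhere, so it is worth reaching for the narrowest one that does the job.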

To move on…

When you’re done, you’ll need to provide your mentor with the following…

  • show your mentor a 100% coverage report
  • show your mentor a 10/10 Pylint report
  • open a pull request on your code, and clean up any comments your mentor has.
  • show your mentor that you have pylint installed in your text editor
  • show your mentor that you have a pylint hook in your git repo

For Mentors…

  • In addition to the above, check out each commit and ensure that each one is pylint clean.

November 11, 2016 - Posted by | Uncategorized
