The Skeptical Methodologist

Software, Rants and Management

SYWTLTC: (AB) Chapter 3.2 Quality : Static Analysis

Go here if you want the prolog and table of contents to the SYWTLTC series!

Per the 5 pillars of quality, up next is static analysis. As always, treat all links as required reading unless stated otherwise.

What is Static Analysis?

Static analysis is a broad term used to categorize all tools that you can run on code to tell you whether it’s correct or not. It’s a program that you run on your program which tells you whether you’re making mistakes.

It’s called a static analyzer because it doesn’t run your code to figure out what’s wrong – it analyzes the source “statically”, as written, without executing it.
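To make “without executing it” a little more concrete, here is a minimal sketch using Python’s standard `ast` module. The sample source and the `undefined_names` helper are invented for illustration, and the check deliberately ignores scoping, imports and classes, all of which a real tool has to handle.

```python
import ast
import builtins

# Sample program to inspect. It is only ever parsed, never executed.
SOURCE = """\
def total_price(prices):
    total = 0
    for price in prices:
        total += price
    return totl
"""


def undefined_names(source):
    """Return (name, line) pairs for names that are read but never defined.

    Toy check: it treats the whole file as one scope.
    """
    tree = ast.parse(source)
    defined = set(dir(builtins))
    used = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            defined.add(node.name)
            defined.update(arg.arg for arg in node.args.args)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.append((node.id, node.lineno))
    return [(name, line) for name, line in used if name not in defined]


print(undefined_names(SOURCE))
```

The misspelled `totl` is reported even though `total_price` was never called, which is the whole idea of static analysis.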

What kinds of Static Analysis are there?

There are three broad categories of static analysis tools out there: linters, static analyzers proper, and model checkers / theorem provers.

The most prevalent, and the one you’ll be using from here on out, is called a linter. Linters “remove lint” from programs. They operate primarily on the text of the program itself, looking for simple stylistic mistakes. Think of them like spellcheckers. They more or less look at your program line by line and give you warnings if for example, you use a variable name that is hard to understand, or if you switch between spaces and tabs.

The other categories (static analyzers, model checkers, theorem provers) can eliminate bugs that are harder and harder to suss out, but require substantially more work. Python is a ‘dynamic language’, which means the entire program isn’t really defined until it’s running, and so deeper ‘static’ analysis of the code itself tends to have too many unknowns to be worthwhile.

We’ll be investigating Pylint in particular, which is primarily a linter but also does some more difficult checks by attempting to interpret your Python without actually running it.

Benefits of Static Analysis

There are a number of benefits of static analysis as well as drawbacks, but most important to note is that many of these benefits and drawbacks tend to be complementary to testing, peer review, types / contracts and design (our other pillars). Static analysis in and of itself is of limited power, but combined with the other pillars of quality can be very powerful.

Consistent style

First off, stylistic checking ensures a code base has a single, consistent style. This helps maintainers as they can expect certain patterns in the whitespace, variable names and other parts of the code to read it more easily.

It also helps peer reviewers since, again, a single way to use whitespace, variable names, and other stylistic concerns makes code easier to read than many different styles.

Absolute removal of certain kinds of bugs

Some bugs, such as variable misspellings, which would end up crashing your program at runtime can be absolutely eliminated from your code base.

This is in contrast to testing. Testing can only show that the one path through the code that the test executes does not fail in any way that the test doesn’t expect. In other words, you can never really prove your program works via tests alone, since each test only proves that that one, single scenario worked.

Linters can prove that your program is free of certain kinds of bugs, completely and absolutely.
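As a small illustration of that contrast, consider the sketch below. The function, its rates and the test are all invented for this example. The test passes because it only walks one path, while the typo sits on the other path waiting to crash at runtime. A linter reads every line, so Pylint would report the misspelled name as an `undefined-variable` error without needing a test for that branch.

```python
def shipping_cost(weight_kg, express=False):
    """Return a shipping cost; the rates are made up for the example."""
    base_cost = 5.0 + 1.5 * weight_kg
    if express:
        # Typo: 'base_cots' is never defined, so this line raises NameError,
        # but only when express=True is actually exercised.
        return base_cots * 2
    return base_cost


def test_standard_shipping():
    # This test passes, yet it never touches the broken branch.
    assert shipping_cost(2) == 8.0
```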

Very low cost in terms of time; quick turn around

Compared to contracts, peer reviews or tests, linting takes nearly no time at all to run. Tests take a lot of time to write, and later, to maintain. Peer reviews can involve multiple person hours as other developers look at your code.

Linting takes, usually, on the order of seconds. This is great for two reasons.

First, it means that the effort needed to get a clean lint is minimal compared to testing. You can squash a lot of bugs very quickly with linting, a lot more than you would via testing.

Second, it means you can lint often. In the previous chapter, I showed you how to automatically run your tests as files change. This is a great productivity tool as you can find out if you broke a test very early.

Trying to make sure that “bad thing” gets feedback ASAP is a key to learning, and it’s also a key to fixing “bad thing” fast. The mistake you just made is still fresh in your mind, so getting feedback on it means you don’t have to go looking for the bug – it’s right there, right where you were already working.

Tests and testing still require some care – tests can easily take minutes or hours, which means you have to start splitting up what tests run when. Usually, we like ‘unit’ tests to be our fast tests, the ones we can run automatically on changes, whereas other tests we may run nightly.

Linters, however, are super fast. They can be run faster than unit tests even. Many linters are actually built into text editors and IDEs so that when you save your file, the linter automatically runs and tells you what errors it has found (again, like spellcheck).

For static languages like C++ or Java, it’s often said that just getting your program to compile is like one big test. We don’t get that luxury in python – however, we can get most of it back by linting early and often. A clean lint is like a version of a test that runs quickly and eradicates many kinds of errors.

Can’t test your tests

Speaking of testing, it’s hard to test your tests, and not always value added to do so. TDD ensures you do some minimal testing of your tests – this is why you make sure the test fails first and then passes when you do code changes. All too often I have written tests after the fact, and then, when a bug crept in, realized that the way I wrote my test would never have found it because I screwed up writing the test.

How do you ensure your test code is high quality then? With the other pillars – static analysis in particular. Ensuring your test code has a clean lint gives some assurance that your tests are maintainable and readable, as well as free of certain kinds of errors. This, in turn, makes your tests easier to peer review for other issues.

It’s a virtuous cycle!

Drawbacks

There are some downsides to expect from linting.

High false positive rate

Linters are going to find a lot of issues that just aren’t that important. Whitespace issues may hurt readability, but they’ll never crash a program. Variable names are nice to get consistent, but the interpreter doesn’t care.

Most of what you’ll be fixing will be things that may have never ended up crashing your program.

Fixing them, though, is often very simple. And you’ll get into a habit of breathing a sigh of relief when the linter runs and finds no issues in your code. You’ll become more confident as a coder, and be much more willing to take risks.

Types of bugs found usually aren’t that nefarious

Along with the above, the worst bugs are often those hardest to catch via linters. If you’re handling credit cards, making sure you debit the right account isn’t going to be something a linter can help you with. Making sure you don’t leak personally identifiable information is something linters would struggle to help you with too.

Often the bugs found are simpler readability and maintenance errors, as well as some actual defects that are pretty quick to learn how to avoid. On the other hand, linters prepare the code for people who can find those bugs in peer review and can give more assurance to test code that it’s correctly exercising your credit card and PII functionality.

Hard to do in dynamic languages

One final drawback is that linting is hard to do in dynamic languages, as discussed above. This means that things some languages can spot via static analysis alone, like resource leaks (you grabbed memory from the operating system and forgot to give it back), aren’t going to be things Pylint can find.

On the other hand, linters end up being of about equivalent power to the compiler in dynamic languages – which is a great first step towards ensuring your program works. If another program reads it and says “I don’t see anything obviously wrong with this”, that’s some assurance.

Smelly Code

Despite the drawbacks mentioned above, we often go along with fixing all the false positives, since you don’t really know whether or not something is wrong until you try to fix it.

Code with lots of Pylint errors can be said to be ‘smelly’ code – we don’t know something is wrong for sure, but we need to check it out. Check out the write up here, and then skim a few of the code smells classified on C2.

Often you might fix one or two Pylint errors and three more will pop up. This is a sign that there’s actually a fundamental design flaw that leads the code to be brittle and hard to understand – even if on the surface it just seems like a few small warnings from Pylint.

If we keep the code squeaky clean, we’ll avoid any smell.

Pylint

Pylint is pretty much the industry standard linter for Python. It does a lot of stylistic checking based on PEP 8, the style guide that defines what’s considered idiomatic Python, as well as some deeper analysis.

Go download it here.

Challenge

We’re going to loosely follow its tutorial, which involves the Caesar cipher, which you’ll want to read up on.

First, fork this repo. Then create a branch in your forked repo where we’re going to do some work.

Then, go ahead and clean up the ceaser_script.py file using Pylint according to this tutorial.

 

When it’s clean, commit.

That is a workflow you might use if you inherit some code and want to clean it up – often running a linter on inherited code is a good way to both improve its readability as well as get familiar with it.

Next, we’ll work on a workflow that combines both linting and test driven development. In the future, you’ll be required to use the workflow practiced below!

The next step will be a little more difficult – create a new file, ceaser.py and ceaser_test.py – we’re going to refactor or change the script you worked on before to be more reusable.

 

  1. Write a test for a function you haven’t written yet in ceaser_test.py. The function will have the signature encode(message, offset), so you can call it like encode("beware the ides of march", 3) and get back the message encoded with the Caesar cipher at offset 3. (You’ll probably have to create a test message by hand.)

  2. Ensure this test fails. You may have to put an empty function in ceaser.py that does nothing.

  3. Ensure pylint is clean.
  4. Commit
  5. Using ceaser_script.py, copy and paste some of the functionality into your encode function in ceaser.py

  6. Debug it until your test passes.
  7. Ensure pylint is clean.
  8. Commit
  9. Write a test for a function you haven’t written yet in ceaser_test.py. The function will have the signature decode(encoded_message, offset), so you can call it like decode("jewlrp ajk ippf kl aqjrk", 9) and get back a decoded, English message using the Caesar cipher. (Again, you’ll have to create a message by hand. The above was just random letters I made up; it’s not an actual message.)

  10. Ensure pylint is clean.
  11. Commit
  12. Using ceaser_script.py, copy and paste some of the functionality into your decode function.

  13. Debug until your test passes.
  14. Ensure pylint is clean.
  15. Commit
  16. Open a pull request on your branch.

The above illustrates a pattern – in Test Driven Development with Static Analysis, every commit should add either a test or code. Every commit has 10/10 on pylint and 100% coverage.

When a test fails and you can’t figure out why, break out the debugger. Stepping through your code in the debugger the first time you run it can also be a good practice.

Hook up Pylint to your Text Editor

Fixing things as soon as they happen creates a tight feedback loop that both makes you more productive and accelerates learning. It’s easiest to see during testing.

If you make a change to your code, and your tests fail, you know what you just changed. All the context is still in your head and you’re much more quickly able to debug code and get the test passing again. Moreover, you know that the changes you made in the code ended up affecting tests that you may have not predicted. You learned something about the code.

Compare that to making a lot of code changes, then days later, running the tests. A few fail. You have no idea what changes are tied to which failures. You can try taking a debugging approach, and you can look at your git diffs to see what’s changed, but this is a much more complex problem than above. You’ve already moved on, mentally, to other things. Debugging the same issues could take two to ten times longer.

The lesson? Debug as close as possible to when you added the bug.

Linters work like fast unit tests – something that can run in the background of your editor and let you know about issues as soon as possible. Again, since they work like a compiler for a dynamic language, they’re a single large global test for things like misspellings, syntax errors and other things you’d otherwise have to wait for your tests to run. Catching them as soon as possible speeds you up, and allows you to focus your testing efforts on things the linter can’t catch, like actual logic errors rather than merely running the code looking for syntax problems.

Go ahead and use the instructions below to hook up Pylint to your editor of choice:

Pylint for Sublime

Pylint for Vim

Pylint for PyCharm

Hook up Pylint to Git

Another approach is to have git automatically reject any commit that doesn’t have a 10/10 out of Pylint.

When running from inside a text editor, Pylint decorates the current file. If you make changes to that file, and Pylint gives you a clean bill of health, that doesn’t mean that your changes didn’t suddenly break other files.

For example, you may rename a function and forget to rename it in the other places it is used. Pylint would flag your current file as clean, but the other files where that function is being used would have errors.

Putting a Pylint check on commit allows you to run Pylint over the whole project at the last moment to prevent adding any erroneous code to the repo.

Check out this repo and add it to your own fork of the Ceaser project above.

Some additional resources can be found here.
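If you’re curious what such a hook boils down to, here is a minimal sketch, not the code from the repo linked above. It assumes the script is saved as an executable `.git/hooks/pre-commit`, that Pylint is installed for the same Python interpreter that runs the hook, and that linting every tracked `.py` file is what you want. The function names are my own.

```python
#!/usr/bin/env python3
"""Hypothetical pre-commit hook: lint every tracked Python file."""
import subprocess
import sys


def python_files(paths):
    """Keep only the paths that look like Python source files."""
    return [path for path in paths if path.endswith(".py")]


def tracked_files():
    """Ask git for every file it tracks in this repository."""
    result = subprocess.run(
        ["git", "ls-files"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def main():
    """Run Pylint over the whole project and return its exit code."""
    targets = python_files(tracked_files())
    if not targets:
        return 0
    # Pylint normally exits non-zero when it reports problems, and git
    # aborts the commit when a pre-commit hook exits non-zero.
    completed = subprocess.run(
        [sys.executable, "-m", "pylint", *targets], check=False
    )
    return completed.returncode


# In the hook file itself, finish with:  sys.exit(main())
```

Because it lints the whole project rather than just the file you have open, this catches the renamed-function case described above.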

What about false positives?

For the duration of these chapters, we’ll treat every pylint error as a real error. You’ll be expected to fix every one, whether you agree or not, unless your mentor explicitly tells you to ignore them.

That being said, in the real world, often you have to make compromises. For that purpose, there are configuration files to turn off families of checks, suppression files to suppress warnings line by line, as well as inline suppressions. No task is tied to these, but go ahead and skim these links so you have a cursory understanding of how to squelch a Pylint error.

Example of inline suppressions

How to add suppressions to the config file

Example config file
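As a quick taste of the inline form, here is a small sketch. The callback and its framework are hypothetical. The only real piece is the `# pylint: disable=...` comment, which tells Pylint to skip the named check at that spot.

```python
def on_message(message, sender):  # pylint: disable=unused-argument
    """Hypothetical callback whose signature is dictated by a framework.

    We never use 'sender', but we can't drop it from the signature, so the
    warning is suppressed here rather than turned off for the whole project.
    """
    return message.upper()


print(on_message("hello", "alice"))
```

Keeping the suppression next to the code it applies to means a reviewer can see exactly which warning was silenced and judge whether that was reasonable.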

To move on…

When you’re done, you’ll need to provide your mentor with the following…

  • show your mentor a 100% coverage report
  • show your mentor a 10/10 Pylint report
  • open a pull request on your code, and clean up any comments your mentor has.
  • show your mentor that you have pylint installed in your text editor
  • show your mentor that you have a pylint hook in your git repo

For Mentors…

  • In addition to the above, check out each commit and ensure that each one is pylint clean.

November 11, 2016 Posted by | Uncategorized | 1 Comment

SYWTLTC: Novice Chapter 2: Effective Hacking

Go here if you want the prolog and table of contents to the SYWTLTC series!

The Novice section of SYWTLTC is intentionally pretty sparse – Chapter 1 gives you all the tools you need to get started in Code Combat.

However, there are often meta-lessons to be learned even as early as Code Combat. We’ll go over one of those today.

The recommended time to read this post is after finishing Kithgard Dungeon.

Hacking?

So I’m using this term loosely. But often, we hear the term ‘hack’ in the context of programming meaning when someone doesn’t fully understand what they’re doing and they’re just trying to ‘make it work’.

The usual workflow here is to make a sometimes not-so-educated guess about what might be wrong, change that thing, then run your program and see if it works.

Senior coders tend to have many tools in their toolbelt; however, we never completely drop hacking as a means to understand things. There will always be times when you inherit code, use a library you don’t understand, or come back to code you wrote yourself and no longer remember how it works. At times like these, all you can do is ‘fiddle with it’ until it does what you want it to do.

Still, there are tips for more rigorous hacking:

1. Scientific Hacking

Change one variable at a time

I don’t mean actual variables in a program, although that may be the case as well. What I mean here that’s scientifically inspired is that we try to isolate only one ‘theory’ of why it’s not working at a time.

If it may be X, Y or Z, you don’t change X, Y and Z all at the same time. Change one and see if it worked, back that change out, change the next and see if it worked, and so on.

This may feel like you’re going slower, but you’re actually going faster. This is because your ability to mentally understand what’s changing in the system goes out the window after a certain (very small) level of complexity. So you may be able to “change all the things!” once or twice on toy programs you’re working on, and things will appear to work.

But in larger programs, many bad things happen when you do this.

  1. Your program can suddenly appear to work. But it’s all in appearances.
  2. You may fix your thing and break something else.
  3. You may not even fix your thing, break something else, and not understand what you changed well enough to unbreak it.

Number three is usually the most common.

There is actually an advanced way to change “all the things” though, and you’ll need to combine it with tip 7 at the bottom – always leave breadcrumbs (i.e., lots of git commits for every change you can back out, or comments in code that you can back out, ways to easily undo what you’ve done.)

Backwards Science

This is to basically do science in reverse – change all the things. Then run your program – does it work? If so, undo half of the things. Does it still work? Then you know it wasn’t that half. Undo half of what’s remaining – does it still work? If it doesn’t, you’ll want to turn those fixes back on and turn the other half off.

This is akin to a ‘binary search’ algorithm, which you may become more familiar with later. And it’s a good complement to the traditional ‘turn one on, leave the rest off’ technique described above. This is because some issues may be interactions of multiple fixes, i.e., you may need to make more than one fix to the code to get things in ship-shape. The turn-it-all-on and then binary search downwards approach can find this more easily than the turn-one-on-at-a-time approach. The turn-one-on-at-a-time approach, though, is usually faster since it requires less work to set up and back out.
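The halving procedure can be sketched in a few lines of Python. This is a simplified illustration, not a tool you’d use as-is. It assumes the program works with all changes applied and that exactly one change is the real fix, so it does not handle the interacting-fixes case mentioned above. The function name, the change descriptions and the `works_with` callback are all made up.

```python
def find_needed_change(changes, works_with):
    """Return the single change that makes the program work.

    Assumes works_with(changes) is True and that exactly one change is
    responsible; interactions between fixes are not handled.
    """
    candidates = list(changes)
    while len(candidates) > 1:
        half = len(candidates) // 2
        kept, undone = candidates[:half], candidates[half:]
        if works_with(kept):
            candidates = kept    # still works, so the fix is in the kept half
        else:
            candidates = undone  # undoing broke it, so the fix was undone
    return candidates[0]


# Hypothetical experiment: pretend only "fix off-by-one" actually matters.
changes = ["rename variable", "fix off-by-one", "add logging", "reorder imports"]
needed = find_needed_change(changes, lambda applied: "fix off-by-one" in applied)
print(needed)  # prints: fix off-by-one
```

In real life `works_with` is you, applying that subset of changes and rerunning the program, which is why leaving yourself an easy way to back changes out matters so much.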

Keep a Journal

You can do this in a documentation tool, in comments in the code, or just in a paper spiral at your desk. Often it’s good to write down what you’re doing, and what the results were, again in a scientific manner. Each change you make to the code is a little ‘experiment’, and you need to write down what you did and what the results were for each experiment.

This helps with number 7 below – keeping a journal complements other techniques of ‘backing things out’. It also prevents trying the same experiment twice – which may happen if you’re struggling with a bug for months at a time. When you start forgetting what you’ve already tried, that’s when you truly begin to spin in place and become completely unproductive.

Finally, a journal can help with hypothesis-generation. As I stated above, each fix is an experiment. Your mind’s ability to come up with a hypothesis for any given event is nearly infinite (given enough time). But you’ll come up with better hypotheses the more information you have.

A hypothesis is valuable insofar as it explains the given data. Your initial bug is one data point – the program currently does X when it should do Y. Many hypotheses can fit this, and your job is to methodically step through them one by one until you find the one that’s correct.

However, each time you do an experiment, you narrow the solution space. If your program prints “Hello WOrld!” when it should print “Hello World!” and you perform an experiment to lowercase all O’s in the program and it fails… your real problem just got constrained. Now your problem is:

  1. Program prints “Hello WOrld!” when it should print “Hello World!” AND
  2. When lower casing all O’s at line 13, the program continued to malfunction.

A journal helps keep these thoughts all in order and allows each of your experiments to gather more data.

2. 90% of Programming is Knowing What to Google

Most of coding is research.

But what to google and what sites to go to first is something you learn over time. This series will have a particular module dedicated to research, but until then, understand that if you have to search for something on the internet, that doesn’t mean you aren’t coding right.

Most of coding is googling for APIs, code snippets, blog opinions about tool X versus tool Y, and looking for others who have had your same problem and fixed it.

3. Don’t Grind

There’s a lot of times when you’ll be struggling with making your program work and you’ll choose to … struggle more.

Bayesian reasoning is a kind of statistical reasoning that asks “What should we expect given what we’ve seen?” It says: take all the data into account, including new data, and ask what we should expect going forward.

In other words, given that you’ve already struggled for 3 days with this bug, how likely are you going to solve it by struggling for 3 more days?

Not very likely.

This is called grinding. And it’s a technique that may leave you with the answer, after maybe thirty more days, or may drive you to completely change your design (which is bad – if you wanted to design it in a certain way, it’s probably because that certain way was good. Changing to another way means you’re sacrificing quality because you couldn’t make it work.)

Or it may leave you quitting your job. I’ve seen all three happen.

When you find yourself grinding, your hypothesis generating engine slows down, and you have trouble coming up with new ideas for why your bug is occurring. You either rehash old ones – which is a waste of time if you’ve kept a journal – or you come up with increasingly bizarre theories on why your program may not be working, which isn’t the best use of your time.

The best thing to do when you realize you’re grinding is give up and work on something else. Your subconscious will be busy grinding away at the problem for you, and you’ll be greeted with an especially good idea right when you’re falling asleep, or when you’re showering, or otherwise occupied. These are the insightful ideas that have lots of promise, whereas the bizarre ideas you come up with staring at the code are almost always bad.

Walk away, play a game, read a book, talk about your problem with someone else, or talk about anything but your problem with someone else. Insight will strike.

4. Play

When you’re coding, you’re not always stuck on something. Sometimes, things are going just fine, swimmingly actually. This is when you should try to make your own problems to get stuck on.

If you’re trying some tutorial and you can get a button to show up on your screen where you want it – what happens when you move it? What happens when you set certain things to negative numbers? What happens when you try and push it off the screen?

These are experiments, like the above, but rather than experiments trying to prove or disprove a theory about how something is causing a failure, they’re still adding data to “how buttons work” or “how strings work” or just about anything else. They’re a form of play – exploration for its own sake – and they’re incredibly valuable forms of “hacking”.

Again, as with tip 7, leave yourself a way to back out. But rather than trying things to fix your program, you’re more or less trying ways to break it – or maybe not. You’re just trying things on a completely fine program, and checking what you think will happen against what actually does.

Along with tip 5 below, playing is the best way to get the most out of something you’ve already done – if you already implemented some widget, what are a few ways to change it that you don’t know what they do? That ensures you get the most out of every project and exercise.

5. Make it Work; Then, Make it Pretty

Before we get further into this tip, let me make one thing clear –

You are not done with your code until it works and is pretty.

There’s nothing more demoralizing than sitting in a peer review with some recalcitrant coder who refuses to change what they have done because “it works, doesn’t it?”.

Working code is the bare minimum of what you’re expected to produce.

However, when trying to prioritize what to do first – getting things working is often the hardest part. Finding one solution to your problem is hard – there’s an infinite variety of solutions, but a much larger infinity of non-solutions. It’s ‘sparse’.

However, once you do have a working solution, it’s usually far easier to make slow incremental changes to that working solution to make it more pretty.

What I’m not saying here is that you should code a large project together with no regard to making things readable, and then do it later as an afterthought. What I am saying is that sometimes, you’ll get stuck – it’s these times when it’s okay to get some sawdust in various places, so long as you can follow tip 6 below and keep it isolated to a certain area.

The fact is, writing a test for each solution is going to be cumbersome if 99 of your potential solutions don’t work and the 100th does. Sometimes you get the benefit of a single test telling you whether or not your solution works at all – this is when you’re lucky. But when you’re designing a new feature and you don’t know how it should work yet – you want to play in the design space and see what feels right – letting things get slightly dirty in isolated parts of code is fine, so long as you follow through and get them cleaned up before any peer review.

6. The Surgical Curtain

In surgery, surgeons often lay down cloth around the incision site to block out everything except the area that they’re going to be working with. This is to more or less shrink the problem size and focus all attention only on the surgical area.

Similarly, when trying to ‘hack’, you want to shrink the problem by as much as possible, and only work on the area that is problematic.

Remember in scientific hacking, we talked about ‘backwards science’, where you change everything to see if your issue is still there?

There’s a similar technique to shrink the problem space, where you try to turn off (by removing or commenting out code) large swaths at a time and see if the problem is still there. As you turn things off and the problem remains, you can be confident (not sure, but confident) that your problem is not in that area of code.

Often you can shrink things down into a small toy program where your problem lies, and it becomes much easier and faster to try different experiments out on it.

This is one benefit of well factored / well-designed code, it’s usually easy to isolate parts of the code and write small ‘unit’ tests around where your issue lies, rather than having to run your entire program to see if it works or not. The curtain is easy to lay down in well-designed code.

If you’re stuck, and there are lines you can comment out without affecting your problem, do so – this reduces the chances of accidentally breaking other things or introducing interactions, and keeps the problem small enough that you can hold it all in your head.

7. Leave, and use Breadcrumbs

Finally, leave yourself a way out.

Hacking can often mean many changes to your code – if you’re making them methodically as illustrated in tip 1, you also need a methodical way of backing them all out. This is what source control is often used for – try an experiment and commit it to the repo. If that experiment doesn’t fix your problem, roll back the commit and your code will be as it was before you did anything.

Often, even with the best rigor, we find the code base to be an unintelligible mess after some hacking around. It’s best at some times to start all the way over, and leaving yourself breadcrumbs allows you to do that.

You really don’t want to find yourself trying to fix a problem where you have a code base that is so heavily hacked that it’s unrecognizable compared to how you found it. It means that you’ll basically have to debug your way out, which is never fun.

Get in, change only what you need to in a methodical fashion, and get out, leaving the code as clean as possible.

Leaving breadcrumbs like git commits that are very granular also allows you to easily back out scaffolding code like print statements and other things that help you debug.

Finally, backing out fixes that don’t work is incredibly important. If we write for readability first, and performance second (which you should), then you should assume that the code base is as readable as it can be. Any change you make that’s not a refactoring to make it more so must by default make it less so. In other words, any change you make that’s not explicitly made to improve readability is most likely harming it. No change should be left in that doesn’t do something – like fix a bug. If it doesn’t fix your bug, you need to take it out.

There are often thousands of ways to code something. Your fix may not have fixed your original problem, but it may also not have introduced new ones, meaning you could potentially leave it in and the code base would work as it always did. Don’t do this – you’ve harmed readability by letting in code that had no reason to be there. Back that code out and start anew on a new experiment.

8. Get some Theoretical Perspective

This is also the time to introduce one of our first adjunct books, Python Programming: An Introduction to Computer Science.

Often, when stuck, you just need some time to ‘sit’ on the problem. Robert Pirsig, in his book Zen and the Art of Motorcycle Maintenance, commented that ‘stuckness’ is one of the best places to be because it’s almost inevitable at that point that you’ll solve the problem.

It’s uncomfortable to be stuck – but we also want to avoid grinding. We don’t want to just keep hacking and hoping it works, we want to go back to our problem with good ideas.

Why do we grind? We grind to feel like we’re making progress. So if there were a way to continue making forward progress without grinding, that would be ideal – that’s where this textbook comes in.

The book provides a completely different perspective on the problem you’re trying to solve – different perspectives are more effective at breaking stuckness than more of the same. And it also provides a different means of moving forward by doing a lot of reading. The exercises in the book often take a different tack than those in Code Combat – fewer ‘tricks’ and more simply applying principles from the chapters.

I’m not a huge fan of how the chapters in Python Programming are laid out, so I suggest the following progression rather than the chapters as laid out:

  1. Programming Basics (ch 1, ch 2)
  2. Beginner Data Structures (ch 5, ch 11)
  3. Structured Programming (ch 7, 8)
  4. Procedural Programming (ch 6)
  5. Algorithms and Recursion (ch 13)
  6. Object Oriented Programming and Design (ch 4, ch 10, ch 9, ch 12)
  7. Numerical computing (ch 3)

Do the exercises. Without application, the theory means nothing. And these exercises will often push you in a different direction than Code Combat.

Conclusion

Hopefully these tips and early insights into how coders code are good to have. I know that a few things, like knowing that people spend most of their time debugging and googling, have helped people feel like they aren’t utter failures when they’re working through Code Combat.

It’s okay to hack, it’s okay to research.

But it’s also good to practice hacking and researching the right way so that you can speed yourself up and be ready for some more robust tools to put in your tool chest.

November 1, 2016 Posted by | Uncategorized | 1 Comment