The Skeptical Methodologist

Software, Rants and Management

Stop with the Big O Problems: You Are Not a Leet Coder

Big O (sometimes called Big Oh notation) is a mathematical analysis of algorithmic performance as the size of the dataset increases. It’s taught in undergrad Computer Science courses, usually while comparing the performance of different sorting algorithms or different tree-like data structures.
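
As a rough textbook-style illustration of what that analysis buys you: a naive bubble sort does on the order of n² comparisons, while a merge-sort-style algorithm does on the order of n·log2(n). For a million items that’s roughly 10^12 operations versus about 2×10^7, which is exactly why the notation gets taught alongside sorting.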

It’s also commonly used as the underlying ‘test’ in many companies’ code challenges. The excuse for focusing on Big O performance is that these companies believe they write high-performance code, that the foundation of high-performance code is Big O notation, and that, ergo, they need to test all of their candidates for it.

We Want the Best Coders

The real reason companies test for Big O notation, however, is twofold. One, they have no idea what to hire for, and they’ve found that segmenting their candidates by how well they know Big O notation seems as good a way to segment as anything else. Two, they like saying they’ll hire anyone while secretly only hiring college grads.

If an algorithmic challenge is part of your hiring process, you’re almost certainly biasing your results toward college grads, and let me remind you, having a college degree is NOT PREDICTIVE OF PROGRAMMING ABILITY OR SKILL.

So, you’re wasting effort applying a segmentation to your candidate stream that provides NO PREDICTIVE VALUE. Why would you do all that work? And why are you asking your candidates to do all that work, given that these challenges are often hard?

Well, that’s kind of the second thing you’re selecting for: how hard did they study for the challenge? The unspoken rule with any algorithmic challenge is that it can be trained and ‘grinded’ for using tools like HackerRank and LeetCode. There you can get the challenges themselves, attempt your own answers, and see others’ ‘right’ answers so you can memorize them.

But We Get the People Most Willing to Mindlessly Grind Without Complaint

This makes any hiring process that uses these kinds of candidate challenges an implicit “how much do you want it?” sort of affair. Candidate selection becomes far less about coding ability and far more about “how much did you grind?” Unfortunately, by only letting in candidates who were ‘hungry’ enough to get past your hiring process, all you’ve hired is a bunch of folks who already thought you were the best place to work before they even got in.

Note, generally, you want to be the best place to work because of your culture, practices, and compensation. But yeah, you can totally cheat and only hire people who’ve already shown they were willing to work for free, for weeks and weeks, doing arbitrary ‘programming-like’ exercises just to get in your doors.

So you may think you’ve set out to get good coders – not only that, good coders who know about performance, because you’ve told yourself and anyone who will listen that at your company you solve “hard problems” that are very “performance intensive”. But what you’ve actually found is just a bunch of sycophants who aren’t going to challenge anything your company says or does. In fact, you may be one of those sycophants.

This is going to seriously undermine any attempts at innovation at your company.

What Real Performance Looks Like

I work in the programmatic advertising industry, which is often compared to high-frequency trading. Do we use Big O notation to write highly performant code?

Yes. Once. In the last five years.

Anyone who actually works in the high-performance computing space knows that Big O as a concept isn’t something you’re going to need every day as a developer. This is for two reasons.

One, most algorithms have already been written with these best practices in mind. std::sort has already taken into account the best sorting algorithm for the general case, so eschewing std::sort and using something else is often a fool’s errand. Just use the built-in tools that you have – in fact, the more you use built-in and reusable tools, the more likely it is you’re already using many of the ‘best algorithms’ for the job in a Big O sense.
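
As a minimal sketch of what “just use the built-in tools” looks like in C++ (the data here is made up for illustration):

    #include <algorithm>
    #include <iostream>
    #include <vector>

    int main() {
        // Made-up data; any comparable type works the same way.
        std::vector<int> latencies = {42, 7, 19, 3, 88};

        // std::sort is already a heavily tuned O(n log n) sort for the
        // general case, so hand-rolling your own rarely pays off.
        std::sort(latencies.begin(), latencies.end());

        for (int x : latencies) std::cout << x << ' ';
        std::cout << '\n';
    }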

Second, and more importantly: Big O necessarily leaves out the constants. The only thing Big O captures is asymptotic performance – how the cost scales as the data set grows large. That is only one factor in performance, and in many cases the constants you hit wipe out any gains you get from an algorithm that scales better with the size of the data set.
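
To make that concrete with deliberately made-up constants: suppose algorithm A costs 2·n² cheap, cache-friendly operations and algorithm B costs 400·n·log2(n) expensive ones. At n = 1,000, A is about 2 million units of work and B is about 4 million, so the “worse” O(n²) algorithm wins, and it keeps winning until n climbs past roughly two thousand. Big O tells you what happens eventually; the constants decide what happens at the sizes you actually run.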

High-performance computing cares about cache locality, pipelining, and branch mispredictions as much as it cares about Big O performance. It is often far more sensitive to sneaky system calls than anything else – meaning your fancy algorithm is shot because you accidentally did something that allocated memory.
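
A hedged sketch of the kind of sneaky cost I mean – the function names and the “-decorated” work are invented for illustration, but the hidden heap allocation per iteration is the real point:

    #include <string>
    #include <vector>

    // Bad: builds a fresh string every iteration, so each pass through the
    // loop pays for a heap allocation on top of the actual work.
    void process_slow(const std::vector<std::string>& requests) {
        for (const auto& r : requests) {
            std::string scratch = r + "-decorated";  // hidden allocation per request
            // ... do work with scratch ...
            (void)scratch;
        }
    }

    // Better: reuse one buffer. The Big O is identical, but the
    // per-iteration allocation that dominated the runtime is gone.
    void process_fast(const std::vector<std::string>& requests) {
        std::string scratch;
        for (const auto& r : requests) {
            scratch.assign(r);
            scratch += "-decorated";  // usually reuses existing capacity
            // ... do work with scratch ...
        }
    }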

What data structure should I use for my problem? A std::vector, in 99% of cases. “But wouldn’t a tree be more performant for my use case?” No. Measure it. Seriously. Trees don’t outperform vectors until you’re into the hundreds of thousands of entries, because vectors have been so heavily optimized for cache locality and other factors.
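
If you want to check that for your own sizes, a rough measurement sketch (the sizes and data here are arbitrary placeholders) might look like:

    #include <algorithm>
    #include <chrono>
    #include <cstdio>
    #include <set>
    #include <vector>

    // Compare lookups in a sorted std::vector (binary search) against a
    // std::set (a tree). The point is to measure, not to guess.
    int main() {
        const int n = 100000;  // made-up size; use your real one
        std::vector<int> vec;
        std::set<int> tree;
        for (int i = 0; i < n; ++i) {
            vec.push_back(i);
            tree.insert(i);
        }

        auto time_it = [](auto&& f) {
            auto start = std::chrono::steady_clock::now();
            f();
            auto stop = std::chrono::steady_clock::now();
            return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
        };

        long long hits = 0;
        auto vec_us = time_it([&] {
            for (int i = 0; i < n; ++i)
                hits += std::binary_search(vec.begin(), vec.end(), i);
        });
        auto set_us = time_it([&] {
            for (int i = 0; i < n; ++i)
                hits += tree.count(i);
        });

        std::printf("vector: %lld us, set: %lld us (hits=%lld)\n",
                    (long long)vec_us, (long long)set_us, hits);
    }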

So Never Use Big O?

No, of course not. Big O is important. It’s just not important in most cases. What is the size of the data you are dealing with? If it’s not that large, it’s better to just build your algorithm out of off-the-shelf reusable parts, since those parts are heavily tested and optimized to work pretty well in the general case.

Performance questions don’t start with a mathematical proof of asymptotic performance; they start with the profiler.
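
On Linux that usually means a real profiler (for example, perf record ./your_app followed by perf report, where ./your_app stands in for your binary), but even a crude scoped timer, as in this sketch, beats guessing:

    #include <chrono>
    #include <cstdio>

    // Crude scoped timer for narrowing down a suspect section before
    // reaching for a real profiler. The name and usage are illustrative.
    struct ScopedTimer {
        const char* label;
        std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
        explicit ScopedTimer(const char* l) : label(l) {}
        ~ScopedTimer() {
            auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                          std::chrono::steady_clock::now() - start).count();
            std::printf("%s took %lld us\n", label, (long long)us);
        }
    };

    int main() {
        ScopedTimer t("whole computation");  // hypothetical section being timed
        volatile long long acc = 0;
        for (int i = 0; i < 1000000; ++i) acc = acc + i;  // stand-in for real work
    }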

Find your bottlenecks. In most cases, your bottlenecks are places in the code where very simple restructurings will work wonders. You may even find outright bugs where you’re sorting the same list again and again by accident.
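
A sketch of that kind of bug, with made-up names and data: the list never changes inside the loop, yet it gets sorted on every pass.

    #include <algorithm>
    #include <vector>

    // Bad: identical sorting work is repeated for every single query.
    double sum_of_medians_slow(std::vector<double> samples, int queries) {
        if (samples.empty()) return 0.0;
        double total = 0.0;
        for (int q = 0; q < queries; ++q) {
            std::sort(samples.begin(), samples.end());  // same work, every pass
            total += samples[samples.size() / 2];
        }
        return total;
    }

    // Better: hoist the sort out of the loop. The sort's Big O is unchanged,
    // but the profiler will show a very different picture.
    double sum_of_medians_fast(std::vector<double> samples, int queries) {
        if (samples.empty()) return 0.0;
        std::sort(samples.begin(), samples.end());
        double total = 0.0;
        for (int q = 0; q < queries; ++q)
            total += samples[samples.size() / 2];
        return total;
    }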

Once you’ve done that so much that you’re genuinely hitting algorithms as your bottleneck, your next step is to see if there are any off-the-shelf open-source algorithms that do better than std::sort for your problem.
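
One everyday version of “a better-fitting algorithm for your problem” doesn’t even require leaving the standard library: if you only need the top K items, a partial sort is roughly O(n log K) instead of O(n log n). A sketch, with made-up data:

    #include <algorithm>
    #include <cstdio>
    #include <functional>
    #include <vector>

    int main() {
        // Made-up data: pretend these are bid prices and we only need the top 3.
        std::vector<int> bids = {12, 99, 5, 42, 87, 63, 7, 31};

        // std::partial_sort only orders the first k elements.
        const std::size_t k = 3;
        std::partial_sort(bids.begin(), bids.begin() + k, bids.end(),
                          std::greater<int>());

        for (std::size_t i = 0; i < k; ++i) std::printf("%d ", bids[i]);
        std::printf("\n");
    }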

If you’ve exhausted that, and that’s going to happen very rarely, then and only then are you going to start parsing through white papers looking for better asymptotic performance.

Notice, I didn’t say you’re going to replace your vector with a tree, or anything else so basic. All those easy Big O optimizations have already been done for you. Only when you’ve reached the end of what you can easily find in the open source world are you going to really start looking at the most recent research and possibly implementing something from scratch.

And then you’re going to profile and measure again, just as you do after every single optimization you make, because you can never guess well enough whether the constants are going to eat your shiny new Big O algorithm’s lunch. In our case, it was a wash – the fancy algorithm wasn’t measurably faster than what we had. And since we were going to have to maintain the fancy algorithm, we dropped it and put the older, simpler algorithm back in.

We have one of the fastest systems on the planet in terms of what we do, and if we haven’t had to pay attention to Big O to write it, most people won’t either.

I guarantee you, most algorithms written with Big O in mind have a print statement in there somewhere for “logging, you know, just in case” that ends up hitting either the screen or the disk and shooting your performance in the foot. If you don’t understand how a single huge constant – writing to a database too frequently, reading or writing to disk, or basically any I/O or memory-allocation call – can dwarf whatever your asymptotic performance is, then you’re not writing highly performant code. You’re just doing coding riddles and filling yourself with an unearned sense that you are a ‘leet coder’.
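
A sketch of what that looks like in practice, with invented names: the lookup is O(log n), but the “just in case” log line goes to stderr, which is typically unbuffered, so every call can turn into a write system call that costs far more than the search itself.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    // Bad: the innocent-looking fprintf dominates the cost of the lookup
    // when this runs in a hot path.
    bool contains_slow(const std::vector<int>& sorted_keys, int key) {
        std::fprintf(stderr, "looking up %d\n", key);  // "logging, just in case"
        return std::binary_search(sorted_keys.begin(), sorted_keys.end(), key);
    }

    // Same asymptotic complexity, wildly different real-world performance.
    bool contains_fast(const std::vector<int>& sorted_keys, int key) {
        return std::binary_search(sorted_keys.begin(), sorted_keys.end(), key);
    }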

September 29, 2020