Breaking News

Big O Notation, algorithm efficiency: Internet vs. Pigeon transfer

Today we're going to cover one of my favourite topics, which are big O,  algorithmic efficiency. I'm gonna start off with a story.

Table Of Content 

Case Study of a pigeon vs slow internet transferring data

Back in 2009 there was this company in South Africa, and they had a really slow internet they were really really frustrated by, they have actually two offices about 50 miles away, so they set up this little test to see if it would be faster transfer big chunk of data over the internet with their real slow internet service provider or via carrier pigeon literally.  They took a bunch of USB sticks or probably actually one giant one strapped into a pigeon and flew it from one office to the next, and they got a ton of press over this test and media kinda of love this stuff right. And of course, the pigeon won all right it wouldn't be a funny story otherwise. And so they got a whole bunch of press over that and, people were like oh my god I can't believe that a pigeon beat the Internet.

But the problem was really actually kind of a ridiculous test because of the thing. Transfering data over the internet takes longer and longer and longer with how much data there is with certain simplifications and assumptions, that's not really the case with pigeons. Pigeons might be really slow or fast but it always basically takes the same amount of time for a pigeon to transfer one, transfer some chunk of data from one office to the next. It just has to fly 50 miles. It's not going to take longer because that USB stick contains more data, so of course, for a sufficiently large file, the pigeons gonna win.

Pigeon transfer data (constant time)

1gb = 60 minutes 
2gb = 60 minutes
3gb = 60 minutes 
1000 gb = 60 minutes 

So in big O, the way we talk about this is we describe the pigeon transfer speed as O of 1 or O(1), it's constant time concerning the size of the input. It doesn't take long with more input. Again with certain simplifications and assumptions about the pigeon's ability to carry a USB stick. 

Internet transfer data (Linear Time) 

1gb = 30 minutes 
2gb = 60 minutes
3gb = 90 minutes 
1000 gb = 500 hours

But the Internet time is. Internet transfer time is O of n or O(n). It scales linearly concerning the amount of input. Twice the amount of data is gonna take roughly twice as much time. So that's what Big O is. 

How the runtime scales concerning some input variables

Big O is essentially an equation that describes how the runtime scales with respect to some input variables. So I'm going to talk about four specific roles but first let me show you what this might look like in code. 

So let's consider this simple function that just walks through an array and checks if it contains a particular value so you would describe this as O of n, where n is the size of the array. What about this method that talks through an array and prints all pairs of values in the array note that this will print if it has the elements 5 and 2 in the array it'll print both 2, 5 and 5, 2, So you would describe this probably as O of n squared  O(n2) where n is the size of the array. You can see that because it has n elements in the array,  so it has n squared pairs, so the amount of time it's going to take you to run that function is going to increase concerning n squared, Ok so that's kind of the basics of Big O

How would you describe the runtime to mow a square plot of land?

Let's take another real-world example. How would you describe the runtime to mow a square plot of land? So your square plot of grass so you have this plot of grass, and you need to harvest all of it. What's the runtime of this? Well, it's kind of an interesting question, but you could describe it one of two ways. One of which, well one way you can explain it is O of a where a is the amount of area in that plot of land. Remember that this is just an equation that expresses how the time increases, so you don't have to use n in your equation, you can use other variables and often it makes sense to do that.

So you can just describe this as O of a where a is the amount of area there. You could also describe this as O of s squared where s is the amount of,  is this length of one side, because it is a square plot of land so the amount of area is s squared. So it's important that you realize is O of a or O(a), and this O of s O(s) squared are obviously saying essentially the same thing, just like when you described the area of a square. Well you could describe it as 25 or you describe it as 5 squared,  just because it has a square and it doesn't make it bigger so both of ways of describing the runtime. It just depends on what might be clearer for that problem.

Four Important rules to know of Big O

(1) Different steps get added

The first one is if you have two different steps in your algorithm, you add up those steps, so if you have a first step that takes you know O of a time and a second step that takes O of b you add up those runtimes and you get over a plus b. So for example you have the runtime that, you have this algorithm, that first walks through one array, and then it walks through this other array. It's gonna be the amount of time it takes to walk through each of them. So you'll add up their runtimes.

(2) Drop Contants

The second way, in the second rule is that you drop constants. So let's look at these two different ways that you can compress out the min and Max element in your array. One finds the min element and then finds the max element, the other one finds them in the max simultaneously. Now these are fundamentally doing the same thing, they're doing essentially the exact same operations just doing them in slightly different orders.

Both of these get described as O of n, where n is the size of the array. Now it's really tempting for people to sometimes see two different loops and describes as O of 2n or O(2n) and so it's important that you remember that you drop constants and Big O. You do not describe them as O of 2n or O of 3n because you're just looking for how things scale roughly. is it a linear relationship, is it a quadratic relationship, so you always drop constants.

(3) Different input -> different variables

The third rule is that if you have different inputs you're usually going to use different variables to represent them. So let's take this example where we have two arrays and we're walking through them to figure out the number of elements in common between the two arrays. Some people will mistakenly call this as call this O of n squared or O(n2) but that's not correct . When you just talk about runtime if you describe things as O of n  or O of n squared or O of n log, n must have a meaning, it's not like things are inherently data or other. So when you describe this as O of n squared it really doesn't, make sense because it doesn't what does n mean? n is not the size of the array, if there's two different arrays. What you want to describe instead is O of a times b. Again fundamentally, Big O is an equation that expresses how the runtime changes, how it scales, so you described as O of a times b or O(a*b).

(4) Drop non-dominant terms

 The fourth rule is that you drop non dominant terms. So let's suppose you have this code here that walks through an array and prints out the biggest element and then for some reason it goes and prints all pairs of values in the array. So that first step take O of n O(n) time, the second step takes O of n squared time O(n2), so you could first get as a you know less simplifiedform,  you can describe to the as O of n+ n squared O(n+n2), but note that, you compare this two O of n and the runtime, or the compare this to the run time O of n squared and the runtime O of n squared plus n squared. Both of those two runtimes reduce down to n squared and what's interesting here is that O of n plus n squared, should logically be between them. So and this n squared thing kind of dominates this O of n thing, and so you drop that non dominant term. It's the n squared that's really gonna drive how the runtime changes.

And if you want you can look up a more academic explanation for when exactly you drop things and what exactly you can't, but this is kind of layman's explanation for it. We've now covered the very basic pieces of Big O. This is a incredibly important concept to master for your interviews so don't be lazy, here make sure you really really understand this and try a whole bunch of exercises to solidify your understanding. Good luck

Steps To Become the person you want to be in life is in your hands, Business, Education Money, Career

Follow us
Twitter: @StepsToBecome
Facebook: @StepsToBecome
Google+: +Stepstobecome

Enter your email address:

Delivered by FeedBurner

Powered by Blogger.