highlights from algorithms to live by

A route is just an ordering of the towns, so trying them all by brute force is the dreaded O(n!) “factorial time”—the computational equivalent of sorting a deck of cards by throwing them in the air until they happen to land in order.
When computer scientists are up against a formidable challenge, their minds also turn to relaxation, as they pass around books like An Introduction to Relaxation Methods or Discrete Relaxation Techniques. But they don’t relax themselves; they relax the problem.
For instance, you can relax the traveling salesman problem by letting the salesman visit the same town more than once, and letting him retrace his steps for free. Finding the shortest route under these looser rules produces what’s called the “minimum spanning tree.” (If you prefer, you can also think of the minimum spanning tree as the fewest miles of road needed to connect every town to at least one other town. The shortest traveling salesman route and the minimum spanning tree for Lincoln’s judicial circuit are shown below.) As it turns out, solving this looser problem takes a computer essentially no time at all. And while the minimum spanning tree doesn’t necessarily lead straight to the solution of the real problem, it is quite useful all the same. For one thing, the spanning tree, with its free backtracking, will never be any longer than the real solution, which has to follow all the rules. Therefore, we can use the relaxed problem—the fantasy—as a lower bound on the reality.
As we noted, discrete optimization’s commitment to whole numbers—a fire department can have one engine in the garage, or two, or three, but not two and a half fire trucks, or π of them—is what makes discrete optimization problems so hard to solve. In fact, both the fire truck problem and the party invitation problem are intractable: no general efficient solution for them exists. But, as it turns out, there do exist a number of efficient strategies for solving the continuous versions of these problems, where any fraction or decimal is a possible solution. Researchers confronted with a discrete optimization problem might gaze at those strategies enviously—but they also can do more than that. They can try to relax their discrete problem into a continuous one and see what happens.
a powerful computational technique called Lagrangian Relaxation. The idea behind Lagrangian Relaxation is simple. An optimization problem has two parts: the rules and the scorekeeping. In Lagrangian Relaxation, we take some of the problem’s constraints and bake them into the scoring system instead. That is, we take the impossible and downgrade it to costly.
In 1777, George-Louis Leclerc, Comte de Buffon, published the results of an interesting probabilistic analysis. If we drop a needle onto a lined piece of paper, he asked, how likely is it to cross one of the lines? Buffon’s work showed that if the needle is shorter than the gap between the blines, the answer is 2⁄π times the needle’s length divided by the length of the gap. For Buffon, deriving this formula was enough. But in 1812, Pierre-Simon Laplace, one of the heroes of chapter 6, pointed out that this result has another implication: one could estimate the value of π simply by dropping needles onto paper.
just play the game. I noticed that it may be much more practical to [try] … laying down the cards, or experimenting with the process and merely noticing what proportion comes out successfully, rather than to try to compute all the combinatorial possibilities which are an exponentially increasing number so great that, except in very elementary cases, there is no way to estimate it. This is intellectually surprising, and if not exactly humiliating, it gives one a feeling of modesty about the limits of rational or traditional thinking. In a sufficiently complicated problem, actual sampling is better than an examination of all the chains of possibilities.
Miller had found a set of equations (expressed in terms of two numbers, n and x) that are always true if n is prime, regardless of what values you plug in for x. If they come out false even for a single value of x, then there’s no way n can be prime—in these cases, x is called a “witness” against primality. The problem, though, is false positives: even when n is not prime, the equations will still come out true some of the time. This seemed to leave Miller’s approach hanging. Rabin realized that this was a place where a step outside the usually deterministic world of computer science might be valuable. If the number n is actually nonprime, how many possible values of x would give a false positive and declare it a prime number? The answer, Rabin showed, is no more than one-quarter. So for a random value of x, if Miller’s equations come out true, there’s only a one-in-four chance that n isn’t actually prime. And crucially, each time we sample a new random x and Miller’s equations check out, the probability that n only seems prime, but isn’t really, drops by another multiple of four. Repeat the procedure ten times, and the probability of a false positive is one in four to the tenth power—less than one in a million. Still not enough certainty? Check another five times and you’re down to one in a billion.
How certain is certain enough? In practice, modern cryptographic systems, the ones that encrypt Internet connections and digital transactions, are tuned for a false positive rate of less than one in a million billion billion. In other words, that’s a decimal that begins with twenty-four zeros—less than one false prime for the number of grains of sand on Earth. This standard comes after a mere forty applications of the Miller-Rabin test. It’s true that you are never fully certain—but you can get awfully close, awfully quick.
One curious example from mathematics is known as “polynomial identity testing.” If you have two polynomial expressions, such as 2x3 + 13x2 + 22x + 8 and (2x + 1) × (x + 2) × (x + 4), working out whether those expressions are in fact the same function—by doing all the multiplication, then comparing the results—can be incredibly time-consuming, especially as the number of variables increases. Here again randomness offers a way forward: just generate some random xs and plug them in. If the two expressions are not the same, it would be a big coincidence if they gave the same answer for some randomly generated input.
Computer scientists know this concept as the “Byzantine generals problem.” Imagine two generals, on opposite sides of a valley that contains their common enemy, attempting to coordinate an attack. Only by perfect synchronization will they succeed; for either to attack alone is suicide. What’s worse, any messages from one general to the other must be delivered by hand across the very terrain that contains the enemy, meaning there’s a chance that any given message will never arrive. The first general, say, suggests a time for the attack, but won’t dare go for it unless he knows for sure that his comrade is moving, too. The second general receives the orders and sends back a confirmation—but won’t dare attack unless he knows that the first general received that confirmation (since otherwise the first general won’t be going). The first general receives the confirmation—but won’t attack until he’s certain that the second general knows he did. Following this chain of logic requires an infinite series of messages, and obviously that won’t do. Communication is one of those delightful things that work only in practice; in theory it’s impossible.
The breakthrough turned out to be increasing the average delay after every successive failure—specifically, doubling the potential delay before trying to transmit again. So after an initial failure, a sender would randomly retransmit either one or two turns later; after a second failure, it would try again anywhere from one to four turns later; a third failure in a row would mean waiting somewhere between one and eight turns, and so on. This elegant approach allows the network to accommodate potentially any number of competing signals. Since the maximum delay length (2, 4, 8, 16…) forms an exponential progression, it’s become known as Exponential Backoff. Exponential Backoff was a huge part of the successful functioning of the ALOHAnet beginning in 1971, and in the 1980s it was baked into TCP, becoming a critical part of the Internet. All these decades later, it still is.
The most prevalent critique of modern communications is that we are “always connected.” But the problem isn’t that we’re always connected; we’re not. The problem is that we’re always buffered. The difference is enormous. The feeling that one needs to look at everything on the Internet, or read all possible books, or see all possible shows, is bufferbloat. You miss an episode of your favorite series and watch it an hour, a day, a decade later. You go on vacation and come home to a mountain of correspondence. It used to be that people knocked on your door, got no response, and went away. Now they’re effectively waiting in line when you come home.
In poker, recursion is a dangerous game. You don’t want to get caught one step behind your opponent, of course—but there’s also an imperative not to get too far ahead of them either. “There’s a rule that you really only want to play one level above your opponent,” explains poker professional Vanessa Rousso. “If you play too far above your opponent, you’re going to think they have information that they don’t actually have—[and] they won’t be able to glean the information that you want them to glean from your actions.” Sometimes poker pros will deliberately bait their opponent into a convoluted recursion, meanwhile playing completely by-the-book, unpsychological poker themselves. This is known as luring them into “a leveling war against themselves.”
Why not simply allow them unlimited vacation? Anecdotal reports thus far are mixed—but from a game-theoretic perspective, this approach is a nightmare. All employees want, in theory, to take as much vacation as possible. But they also all want to take just slightly less vacation than each other, to be perceived as more loyal, more committed, and more dedicated (hence more promotion-worthy). Everyone looks to the others for a baseline, and will take just slightly less than that. The Nash equilibrium of this game is zero. As the CEO of software company Travis CI, Mathias Meyer, writes, “People will hesitate to take a vacation as they don’t want to seem like that person who’s taking the most vacation days. It’s a race to the bottom.”
Let’s return you and your bank-robbing co-conspirator to the jail cell for another go at the prisoner’s dilemma, with one crucial addition: the Godfather. Now you and your fellow thief are members of a crime syndicate, and the don has made it, shall we say, all too clear that any informants will sleep with the fishes. This alteration of the game’s payoffs has the effect of limiting the actions you can take, yet ironically makes it far more likely that things will end well, both for you and your partner. Since defection is now less attractive (to put it mildly), both prisoners are induced to cooperate, and both will confidently walk away half a million dollars richer. Minus, of course, a nominal tithe to the don. The counterintuitive and powerful thing here is we can worsen every outcome—death on the one hand, taxes on the other—yet make everyone’s lives better by shifting the equilibrium.