Algorithms to Live By

With its talk of algorithms, theorems, proofs and abstract terminology, it’s hard to imagine any book on computer science being accessible or interesting to the average reader. Even less so when you think about applying it to the realm of human decision-making, which seems to operate in a completely separate dimension – that of intuition, emotion and heuristics. However, this book managed to exceed my expectations entirely. Other books and articles on AI tend to paint it as an abstract, complex and powerful thing, offering little in the way of solutions beyond reducing bias, promoting diversity or strengthening other social institutions in response. In contrast, this is one of the few pieces that actually puts you in a position to understand how algorithms may influence (and in many ways already are influencing) our lives. Rich with references and further reading, it is also a springboard for delving deeper into many central concepts in computer science should you be so inclined.

What I really love about this book is that it is grounded in practicality. We often think of computer science and mathematics as disciplines which are deterministic and infallible. Every answer must be correct and every theorem is unshakable. Yet, reality often forces us to deal with the constraints of limited space and time, missing or unreliable information, energy use and limited computational (or thinking) power. By taking all these factors into account, the authors make a strong case that, in fact, many of the heuristics, shortcuts or seemingly irrational decisions we make as humans are the best we can do under our circumstances. Hopefully, we can come to a better appreciation of the limits of computation and have a more balanced view of how algorithms and AI may affect our lives in the future.

Outline

  • Optimal stopping problems deal with scenarios where we must make the best choice out of a pool of options. However, we do not know the value of each option until we encounter it. This presents a dilemma that extends to scenarios such as hiring employees, dating, selling your house or finding a parking spot: how do we know when we should stop, and when we should keep looking? While there are different optimal algorithms depending on the contextual information available, none of them are guaranteed to find the best solution. Indeed, when faced with limited information in everyday choices, the best we can hope for is to maximise our chances of success.
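The best-known instance is the “look-then-leap” 37% rule: review the first 37% of options without committing, then take the first option that beats everything seen so far. A small simulation (a sketch of my own, not code from the book) shows why this is the optimal stopping point:

```python
import random

def secretary_simulation(n=50, trials=4000, cutoff_frac=0.37):
    """Estimate how often the look-then-leap rule finds the very best option."""
    cutoff = int(n * cutoff_frac)
    wins = 0
    for _ in range(trials):
        ranks = random.sample(range(n), n)        # options revealed in random order
        best_seen = max(ranks[:cutoff], default=-1)  # phase 1: look, never commit
        # phase 2: leap at the first option better than anything seen so far
        chosen = next((r for r in ranks[cutoff:] if r > best_seen), ranks[-1])
        wins += chosen == n - 1                   # did we land the overall best?
    return wins / trials
```

With 50 options the success rate hovers around 0.37 — which echoes the chapter’s point that even the optimal rule fails most of the time.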
  • Explore/Exploit is a classic dilemma that we face all the time: should we stick to the things we know and love, or risk exploring something new? This is framed in mathematics as the multi-armed bandit problem: deciding which slot machines to play to maximise your payoff over time. Win-stay, lose-shift algorithms only solve part of the problem, since we shouldn’t let just one bad experience dictate our preferences for something. The Gittins Index offers a way of quantifying relative payoff rates as a function of wins and losses (or good and bad experiences). But more than anything, the answer to this problem is framed by the constraint of time. Exploration gradually gives way to exploitation as we gain more experience and information. This can offer us insights as to why things such as our personal preferences, hobbies and social circles become more fixed as we age.
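The Gittins Index itself is laborious to compute, so as a stand-in, here is a decaying epsilon-greedy bandit (a common textbook strategy, not the book’s own method): exploration tapers off as experience accumulates, mirroring the chapter’s point that exploring should gradually give way to exploiting.

```python
import random

def decaying_epsilon_greedy(true_probs, pulls=5000, seed=0):
    """Play a multi-armed bandit, exploring less as experience accumulates."""
    rng = random.Random(seed)
    n = len(true_probs)
    counts = [0] * n      # times each arm has been pulled
    values = [0.0] * n    # running average payoff per arm
    total = 0
    for t in range(1, pulls + 1):
        eps = 1.0 / t ** 0.5  # exploration rate decays with experience
        if rng.random() < eps:
            arm = rng.randrange(n)                            # explore
        else:
            arm = max(range(n), key=lambda a: values[a])      # exploit
        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]   # update average
        total += reward
    return total / pulls, counts
```

Run against two arms paying off 20% and 80% of the time, play concentrates on the better arm as the exploration rate decays — the bandit’s preferences “set” with age.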
  • Sorting is a problem that both computers and humans deal with on an everyday basis. Because it is required so often, finding more efficient sorting algorithms stands to drastically improve overall efficiency. But knowing how to sort is hardly as useful as knowing when to sort. In scenarios where the cost of searching is low, it is much better not to sort at all. When sorting, we often care only about speed, but robustness is also an important factor. This comes up in scenarios where comparisons may not always yield consistent results (such as when two teams face off in sports) or where measurements are noisy (like sensor readings). In these cases, seemingly inefficient algorithms which make more comparisons (such as bubble sort) may actually perform better.
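To experiment with that claim, here is a sketch of my own (the helper names are mine, not the book’s) that runs bubble sort and merge sort through a comparator that lies with some probability, and measures how far elements land from their true positions:

```python
import random

def noisy_less(a, b, p, rng):
    """A comparison that gives the wrong answer with probability p."""
    return (a < b) if rng.random() > p else not (a < b)

def bubble_sort(xs, less):
    """Many redundant comparisons; later passes can repair earlier mistakes."""
    xs = list(xs)
    n = len(xs)
    for i in range(n):
        for j in range(n - 1 - i):
            if less(xs[j + 1], xs[j]):
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def merge_sort(xs, less):
    """Near-minimal comparisons; one bad comparison is never revisited."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid], less), merge_sort(xs[mid:], less)
    out = []
    while left and right:
        out.append(left.pop(0) if not less(right[0], left[0]) else right.pop(0))
    return out + left + right

def displacement(xs):
    """Total distance of each element from its correct slot
    (assumes xs is a permutation of 0..n-1)."""
    return sum(abs(i - x) for i, x in enumerate(xs))
```

With an exact comparator both produce identical, correct results; with a noisy one (say `p = 0.1`), bubble sort’s redundant comparisons tend to keep elements near their correct positions, while a single early mistake in merge sort can carry an element far from home.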
  • Caching is the process of making important resources more accessible to improve the rate of work. We can think of our desk as a cache; anything on our desk is kept within easy reach and is (ideally) closely related to the task at hand. The problem, of course, is having limited space and deciding how best to use it. Computers use various replacement policies to decide what to evict from the cache in order to make space for new items, but unless we can predict the future, caching algorithms can never be perfectly optimal. However, the least recently used (LRU) policy comes surprisingly close in terms of performance. Finally, when building a cache, bigger always means slower: the bigger your trove of knowledge, the harder it will be to recall something. This offers us a novel way to perceive the problems with memory and recall that come with old age – it might be just an issue of too much knowledge.
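The LRU policy is simple enough to sketch in a few lines (a minimal version built on an ordered dictionary; real caches add sizing, expiry and concurrency concerns):

```python
from collections import OrderedDict

class LRUCache:
    """Evict the least recently used item once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # touching an item makes it most recent
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used
```

The desk analogy maps directly: whatever you touch goes back on top of the pile, and when the desk is full, the thing untouched longest is the first to be filed away.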
  • Scheduling concerns itself with the ordering and managing of tasks. How should we best spend our time in a day, month, year or lifetime? Indeed, personal time management is a perennial problem that self-help gurus and productivity systems have all been trying to tackle. This chapter delves into the many perspectives that scheduling theory has to offer. There is the problem of precedence constraints and priority inversion, where we might hold off on an important task because something trivial is not yet done, and hold off on the trivial task because there are more important things to do. When faced with uncertainty in our daily schedules, context switching is a way for us to be more responsive to new tasks or deadlines that come our way. On the flip side, thrashing occurs when we are interrupted too often, and spend more time and effort in the process of switching between tasks instead of working on the task itself. To avoid this, it is important to set boundaries on interruptions: decide which interruptions are vitally important to you, which ones can be coalesced (to be done all at once), and which to ignore completely.
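The thrashing point reduces to a piece of arithmetic: if every slice of real work drags a fixed context-switch cost along with it, the useful fraction of your time collapses as the slices shrink. A toy formula (my framing, not the book’s):

```python
def productive_fraction(slice_len, switch_cost):
    """Share of time spent on real work when every slice of work
    incurs one fixed-cost context switch before it starts."""
    return slice_len / (slice_len + switch_cost)
```

Hour-long blocks with a five-minute switchover keep you above 90% productive; chop the same day into five-minute blocks and you fall to 50% — the point at which switching costs rival the work itself is thrashing.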
  • Bayes’ Rule: this chapter focuses on how our past experience informs our future predictions. We are always extrapolating trends and making educated guesses in realms such as business, dating, and gambling, and we build computer models to chart future trends based on current information and assumptions. Having a prior belief (such as the average age of a person when predicting life expectancy) goes a long way in helping us make educated guesses, but where no priors exist, we can apply the Copernican Principle as well: the longer something has lasted, the longer you can expect it to last. Our predictions also depend on the underlying distribution – normal, power-law and Erlang distributions are all common in the real world. Finally, it is also important to note that our predictions are only as good as our priors – ironically, a diet of breaking news and exceptional stories from all over the world might hurt rather than help our understanding of it.
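The book distils these distribution families into simple prediction rules; sketched here as one-liners (the constants you would plug in are illustrative assumptions, not values from the book):

```python
def copernican(age):
    """No prior: assume we arrive at a uniformly random moment in a
    thing's lifespan, so expect it to last as long again."""
    return 2 * age

def average_rule(age, mean):
    """Normal prior (e.g. human lifespans): predict the distribution's
    mean, or just a little longer once the mean has been exceeded."""
    return max(mean, age)

def additive_rule(age, k):
    """Erlang prior (memoryless waits): always expect a constant
    amount k more, no matter how long it has already been."""
    return age + k
```

A power-law prior yields a multiplicative rule instead — scale the observed age by a constant factor — of which the Copernican doubling is the special case when you know nothing else.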
  • Overfitting is what happens when we create overly complex models to explain our data and observations. When we focus too much on optimising our model to fit the data, we lose sight of the big picture, which often leads to undesired consequences. Overfitting food to taste good results in unhealthy, calorically dense meals. Overfitting schools to maximise the proxy metric of grades causes them to prioritise rote learning and scoring well on the test. The examples are endless; when we don’t have complete information (and we never do), a better fit for the data can actually create a worse model overall. We can mitigate overfitting by penalising complexity, meaning that more complex models must offer improvements that outweigh the penalty they incur. Cross-validation is a process by which we test our models against data they were not fitted on, and this can help detect overfitting as well. Overall, this chapter shows that keeping things simple can be the rational thing to do, that our heuristics can help us make better decisions, and that cultural resistance to short-term changes can be a good thing.
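One way to see the trap concretely: fit noisy, basically-linear data once with a straight line and once with a polynomial that passes through every point exactly. The sketch below (my own construction, not the book’s) shows the complex model achieving zero training error — the hallmark of overfitting, since it has memorised the noise along with the signal:

```python
import random

def lagrange_interpolate(pts, x):
    """Evaluate the unique polynomial passing exactly through all pts."""
    total = 0.0
    for i, (xi, yi) in enumerate(pts):
        term = yi
        for j, (xj, _) in enumerate(pts):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def line_fit(pts):
    """Least-squares straight line: the simple model."""
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return lambda x: slope * x + intercept

rng = random.Random(0)
train = [(x, 2 * x + rng.gauss(0, 1)) for x in range(6)]  # truth: y = 2x, plus noise
line = line_fit(train)
```

The interpolating polynomial reproduces every training point perfectly, noise included; evaluated between or beyond the training points, it typically wanders far further from the true line than the humble least-squares fit — the better fit is the worse model.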
  • Relaxation is a handy tool that can aid us when facing the most intractable problems. In this context, it implies loosening the constraints and boundaries of a problem, trying to solve an easier version which can then inform a viable solution to the original one. In human decision making, relaxation can be construed as “what if” scenarios – what would you do if you could not fail? what if all jobs paid the same? In some cases, implementing a solution that breaks some constraints can result in better outcomes with relatively trivial consequences. We may even consider it as thinking outside the box.
  • Randomness is another seemingly counterintuitive tool that can often give us a foothold when dealing with complex problems. As it turns out, knowing when to use randomness and probabilistic answers can help us arrive at viable answers without too much computation. Sampling is a simple way of using randomness to gain insight into unknown or complex phenomena without the hassle of computing an entire set of outcomes. The Miller–Rabin primality test is a nondeterministic (or probabilistic) test that tells us whether a given number is prime, but each round has at most a 25% chance of a false positive (a composite number testing as prime). Repeating the test 10 times, however, leaves less than a one-in-a-million chance of wrongly declaring a composite number prime, which can make us fairly certain in our convictions. This brings up a crucial point: that algorithms should not simply be judged by space and time efficiency, but also by error probability. Hill climbing and simulated annealing algorithms likewise use randomness to explore various “local maximum” solutions over a large space of possibilities. This is likened to exploring and finding the highest hills among different areas, then comparing their heights to hopefully find the highest hill among all the areas as a “global maximum”. Surprisingly, the more complex our problem, the more reasonable it is to turn to chance to help solve it.
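A compact version of the Miller–Rabin test (the standard textbook formulation, not code from the book) makes the trade clear: each random witness either proves the number composite or raises our confidence that it is prime.

```python
import random

def miller_rabin(n, rounds=10, seed=None):
    """Probabilistic primality test. Each round catches a composite with
    probability at least 3/4, so `rounds` rounds leave at most a
    (1/4)**rounds chance of wrongly calling a composite number prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # write n - 1 as d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    rng = random.Random(seed)
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)   # random witness candidate
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                  # inconclusive round
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False              # a is a witness: n is definitely composite
    return True                       # no witness found: almost certainly prime
```

Ten rounds give a false-positive bound of (1/4)¹⁰ ≈ one in a million — usually far too conservative, since most composites are exposed by almost every witness.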
  • Networking deals with the transfer and routing of information, and it offers surprisingly valuable lessons for human interactions as well. One of the central problems is knowing whether the opposite party has actually received your transmission, and acknowledgements play a central role in this. Machines use ACK signals; humans use phrases such as “mmm”, “uh-huh” and “oh?”. In fact, good communication between humans is as much the listener’s job as the speaker’s. Exponential backoff is an algorithm that helps machines overcome connection failures: wait an exponentially longer period of time after every failure. As it turns out, this may be a viable method for dealing with shaky human relationships, relapses in addiction recovery and more. Additive increase, multiplicative decrease (AIMD) is an algorithm that helps to manage decentralised network traffic and avoid congestion. This sawtooth-like process may offer us an alternative way to look at career progression, offering a more dynamic form of hierarchy which consistently brings organisational output to its maximum. Bufferbloat happens when queues in a network become too long and are never zeroed out; we end up with slow speeds and long queues. This highlights something important: queues are only useful when zeroed out. A perpetual queue is pointless because it signals that the rate of work cannot match the rate of new requests. This has parallels to the unread messages and emails that we are accustomed to letting pile up. In these cases, “never” is better than “later”.
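Exponential backoff is a one-loop idea. The sketch below (mine, not the book’s) also adds randomized “jitter”, a standard networking refinement the chapter doesn’t cover, which keeps many failed senders from all retrying in lockstep:

```python
import random

def backoff_delays(failures, base=1.0, cap=60.0, seed=0):
    """Retry delays that double after each failure, up to a cap.
    Each delay is drawn uniformly from [0, ceiling] ("full jitter")
    so that independent retriers don't synchronise."""
    rng = random.Random(seed)
    delays = []
    for attempt in range(failures):
        ceiling = min(cap, base * 2 ** attempt)  # 1, 2, 4, 8, ... capped
        delays.append(rng.uniform(0, ceiling))
    return delays
```

The human reading is the same as the machine one: each failure earns the other party another chance, but at an ever-longer interval, rather than a fixed number of strikes followed by “never again”.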
  • Game theory deals with interactions between separate entities – the way two or more people interact in any context with varying outcomes. The first problem that strikes us when examining optimal strategies is the problem of recursion. Your strategy might pre-empt your opponent’s strategy, but your opponent might be pre-empting the fact that you pre-empted their strategy, and this can go on ad infinitum – an unbounded regress that computer science relates to the halting problem. To combat this, mathematicians often look for a Nash equilibrium: a set of strategies from which no player can benefit by unilaterally deviating, which often serves as the best strategy in the long run. However, there are several complications with this. For one, proving the existence of an equilibrium does not necessarily imply that we can actually compute it. Furthermore, there are cases where dominant strategies based on an equilibrium do not lead to optimal outcomes (as illustrated by the Prisoner’s Dilemma). In these cases, mechanism design is an ingenious solution: changing the rules of the game to fit the desired behaviour, instead of the other way around. In effect, our emotions often serve this very role by making us care about the well-being of others and act beyond our self-interest. A good life, then, is less about picking the right strategy than it is about playing the right game.
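The Prisoner’s Dilemma makes the gap between dominant strategies and good outcomes concrete. A sketch with the classic illustrative payoffs (years in prison, lower is better; the exact numbers are a conventional choice, not from the book):

```python
# Sentence (mine, theirs) for each pair of moves.
PAYOFF = {
    ("defect", "defect"): (2, 2),
    ("defect", "cooperate"): (0, 3),
    ("cooperate", "defect"): (3, 0),
    ("cooperate", "cooperate"): (1, 1),
}

def best_response(their_move):
    """My dominant strategy: the move that minimises my sentence
    against a fixed move by the other player."""
    return min(("defect", "cooperate"),
               key=lambda m: PAYOFF[(m, their_move)][0])
```

Whatever the other player does, `best_response` says to defect, so the equilibrium is mutual defection at (2, 2) — even though mutual cooperation at (1, 1) is better for both. Mechanism design changes the payoff table itself (a penalty for betrayal, a conscience, a reputation) so that the dominant strategy and the good outcome coincide.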