History really matters. It lets you see that a huge chunk of what you are operating with – all that is received – consists of previous distinctions made by *someone* to *someone*. Knowing the history gives you a huge advantage: you can do things you couldn’t do before and see things you haven’t seen before. Part of this power is that you can see what mistakes were made and avoid them.

I’m disturbed by how Less Wrong claims to be about rationality while taking *some* results from the rationality literature, *absolutely ignoring the history of the field*, *dismissing some modern currents of research*, and *neglecting really well-established results*, with *no apparent justification*.

I’ll go through this at length at some point; for now I just want to share some (lengthy) quotes about 1) the history of rationality, 2) the animating metaphor of rationality, 3) why optimizing fails, and 4) what doesn’t.

**History & Metaphor**

“Disputes about the nature of human rationality are as old as the concept of rationality itself, which emerged during the Enlightenment (Daston, 1988). These controversies are about norms, that is, the evaluation of moral, social, and intellectual judgment (e.g., Cohen, 1981; Lopes, 1991). The most recent debate involves four sets of scholars, who think that one can understand the nature of sapiens by (a) constructing as-if theories of unbounded rationality, by (b) constructing as-if theories of optimization under constraints, by (c) demonstrating irrational cognitive illusions, or by (d) studying ecological rationality.” (1)

“The heavenly ideal of perfect knowledge, impossible on earth, provides the gold standard for many ideals of rationality. From antiquity to the Enlightenment, knowledge—as opposed to opinion—was thought to require certainty. Such certainty was promised by Christianity but began to be eroded by events surrounding the Reformation and Counter-Reformation. The French astronomer and physicist Pierre-Simon Laplace (1749–1827), who made seminal contributions to probability theory and was one of the most influential scientists ever, created a fictional being known as Laplace’s superintelligence or demon. The demon, a secularized version of God, knows everything about the past and present and can deduce the future with certitude. This ideal underlies the first three of the four positions on rationality, even though they seem to be directly opposed to one another. The first two picture human behavior as an approximation to the demon, while the third blames humans for failing to reach this ideal. I will use the term omniscience to refer to this ideal of perfect knowledge (of past and present, not future). The mental ability to deduce the future from perfect knowledge requires omnipotence, or unlimited computational power. To be able to deduce the future with certainty implies that the structure of the world is deterministic. Omniscience, omnipotence, and determinism are ideals that have shaped many theories of rationality. Laplace’s demon is fascinating precisely because he is so unlike us. Yet as the Bible tells us, God created humans in his own image. In my opinion, social science took this story too literally and, in many a theory, re-created us in proximity to that image.” (1)

#### Programs of rationality

| Program | Context-independent norms? | Behavior can conform to those norms? | Perfect strategy achievable in principle? | Perfect strategy achievable in practice? |
| --- | --- | --- | --- | --- |
| Unbounded | Y | Y | Y | Y |
| Optimization under constraints | Y | Y | Y | N |
| Heuristics and Biases | Y | Y | N | N |
| Fast and Frugal Heuristics | N | N | N | N |

(This is mostly correct and reflects how each program assesses itself. I’d argue that the F&F program *does accept* normative ideals.)

#### Programs of rationality as they relate to the metaphor

The metaphor of the demon illustrates the programs and their differences.

In the original thought experiment the demon is omnipotent and omniscient and the universe is deterministic (which is how it can perfectly predict everything). Unbounded rationality lets go of determinism but keeps optimization. Optimization under constraints lets go of omnipotence but keeps optimization. (All of the above as descriptive models.) The heuristics and biases program uses the demon as an ideal of behavior and then finds instances of failures. The fast and frugal program claims to do away with the demon (and normative ideals) completely. I think that is wrong, but I’ll leave that for a future post.

**Why you can’t optimize anything**

“

#### Optimization Is Intractable in Most Natural Situations

The ideal of as-if optimization is obviously limited because, in most natural situations, optimization is computationally intractable in any implementation, whether machine or neural. In computer science, these situations are called NP-hard or NP-complete; that is, the solution cannot be computed in polynomial time. “Almost every problem we look at in AI is NP-complete” (Reddy, 1988: 15). For instance, no mind or computer can apply Bayes’s rule to a large number of variables that are mutually dependent because the number of computations increases exponentially with the number of variables.

Probabilistic inference using Bayesian belief networks, for example, is intractable (Cooper, 1990; Dagum & Luby, 1993). In such situations, a fully rational Bayesian mind cannot exist. Even for games with simple and well-defined rules, such as chess and Go, we do not know the optimal strategy. Nevertheless, we do know what a good outcome is. In these situations, as-if optimization can only be achieved once the real situation is changed and simplified in a mathematically convenient way so that optimization is possible. Thus, the choice is between finding a good heuristic solution for a game where no optimal one is known and finding an optimal solution for a game with modified rules. That may mean abandoning our study of chess in favor of tic-tac-toe.
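A quick aside from me, not part of the quote: the combinatorial point is easy to make concrete. In the worst case, exact inference over n mutually dependent binary variables has to work with the full joint distribution, whose size doubles with every added variable:

```python
# Worst-case cost of exact Bayesian inference over n mutually
# dependent binary variables: the full joint distribution has
# 2**n entries, so the work grows exponentially, not polynomially.
def joint_states(n_vars: int) -> int:
    return 2 ** n_vars

for n in (10, 20, 30):
    print(f"{n} variables -> {joint_states(n):,} joint states")
# 10 variables -> 1,024 joint states
# 20 variables -> 1,048,576 joint states
# 30 variables -> 1,073,741,824 joint states
```

Thirty dependent variables already means over a billion states; real problems have far more.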

#### Optimization Is Not an Option with Multiple Goals

If optimization of a criterion is tractable, multiple goals nevertheless pose a difficulty. Consider the traveling salesman problem, where a salesman has to find the shortest route to visit N cities. This problem is intractable for large Ns but can be solved if the number of cities is small, say if N = 10, which results in only some 181,000 different routes the salesman has to compare (Gigerenzer, 2007). However, if there is more than one goal, such as the fastest, the most scenic, and the shortest route, then there is no way to determine the single best route. In this case there are three best routes. One might try to rescue optimization by weighting the three goals, such as 3x + 2y + z, but this in turn means determining what the best weights would be, which is unclear for most interesting problems.
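Another aside from me, not part of the quote: the route count and the brute-force approach are easy to check in a few lines. The city coordinates below are made up for illustration; the point is that the number of routes to compare is (N−1)!/2, which matches the quote’s figure at N = 10 and explodes soon after.

```python
from itertools import permutations
from math import dist, factorial

def shortest_route(cities):
    """Brute-force TSP: fix the first city and try every ordering of
    the rest; each closed tour appears twice (once per direction), so
    there are (n-1)!/2 distinct routes to compare."""
    start, *rest = cities
    best_len, best_tour = float("inf"), None
    for perm in permutations(rest):
        tour = (start, *perm, start)
        length = sum(dist(a, b) for a, b in zip(tour, tour[1:]))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

print(factorial(10 - 1) // 2)  # 181440 routes for N = 10, as in the quote
best_len, _ = shortest_route([(0, 0), (0, 1), (1, 1), (1, 0)])
print(best_len)  # 4.0: the perimeter of the unit square
```

At N = 20 the same count is about 6 × 10^16 routes, which is why brute force stops being an option almost immediately.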

#### Optimization Is Not an Option with Imprecise, Subjective Criteria

Unlike physical distance, subjective criteria are often imprecise. Happiness, which means many things, is an obvious example. But the issue is more general. A colleague of mine is an engineer who builds concert halls. He informed me that even if he had unlimited amounts of money at his disposal, he would never succeed in maximizing acoustical quality. The reason is that although experts agree on the definition of bad acoustics, there is no consensus as to what defines the best. Hence he could try to maximize acoustical quality according to a single expert’s judgment but not to unanimous agreement.

#### Optimization Is Unfeasible When Problems Are Unfamiliar and Time Is Scarce

In situations where optimization is in principle possible (unlike those under the first three points), a practical issue remains. Selten (2001) distinguishes between familiar and unfamiliar problems. In the case of a familiar problem, the decision maker knows the optimal solution. This may be due to prior training or because the problem is simple enough. In the case of an unfamiliar problem, however, the decision maker cannot simply employ a known method because the method that leads to the best result must first be discovered. In other words, the agent has to solve two tasks: level 1, employing a method that leads to a solution, and level 2, finding this method. Thus, two questions arise. What is the optimal method to be chosen? And what is the optimal approach to discovering that method? (There may be an infinite regress: level 3, finding a method for level 2, and so on.) At each level, time must be spent in deciding. Although Selten’s argument concerning unfamiliar problems has not yet been cast into mathematical form, as the issue of combinatorial explosion has been, it strongly suggests that an optimizing approach to unfamiliar problems is rarely feasible when decision time is scarce.

#### Optimization Does Not Imply an Optimal Outcome

Some economists, biologists, and cognitive scientists seem to believe that a theory of bounded rationality must rely on optimization in order to promise optimal decisions. No optimization, no good decision. But this does not follow. Optimization needs to be distinguished from an optimal outcome. Note that the term optimization refers to a mathematical process—computing the maximum or minimum of a function—which does not guarantee optimal outcomes in the real world. The reason is that one has to make assumptions about the world in order to be able to optimize. These assumptions are typically selected by mathematical convenience, based on simplifications, and rarely grounded in psychological reality. If they are wrong, one has built the optimization castle on sand and optimization will not necessarily lead to optimal results. This is one reason why models of bounded rationality that do not involve optimization can often make predictions as good as those made by models that involve optimization (Gigerenzer & Selten, 2001b; Selten, 1998; March, 1978). A second reason is robustness. As described in chapter 1, asset allocation models that rely on optimization, Bayesian or otherwise, can perform systematically worse than a simple heuristic because parameter estimates are not robust.” (1)


**If no optimization then what?**

Although I can’t find it now, I seem to recall that Chalmers had an argument up that was something like this:

- If not-P, then what?
- Therefore, P [1].

And it’s funny because it’s true, and I totally want to avoid the trap of taking something away without giving something back, when there is something to give. And it has been there for *60 years*.


### Satisficing

“Throughout the early history of decision making research, researchers devised mathematical algorithms to predict purely rational, or optimal, decision making (Tyszka, 1989). Simon (1955, 1956) rejected the idea of optimal choice, proposing the theory of “bounded rationality.” He argued that due to time constraints and cognitive limitations, it is not possible for humans to consider all existing decision outcomes and then make fully reasoned, purely rational, choices. He suggested that humans operate rationally within practical boundaries, or within the limits of bounded rationality. For example, classic rational choice models would indicate that a man who wanted to sell his house would perform a series of mathematical manipulations for each offer he would receive. These mathematical operations would identify the probable range of likely offers, the probability of receiving a better offer within an acceptable time period, the relative diminished value of a higher offer at some future time, etc. Such calculations are beyond the computational capacity of most decision makers and are prohibitively labor intensive for nearly all decision makers. Instead, Simon (1955) predicted that the man in this example would simplify the decision making process by Assum[ing] a price at which he can certainly sell and will be willing to sell in the nth time period. Second, he will set his initial acceptance price quite high, watch the distribution of offers he receives, and gradually and approximately adjust his acceptance price downward or upward until he receives an offer he accepts—without ever making probability calculations. (Simon, 1955, pp. 117–118) The idea of simplification methods underlies all of Simon’s work in human cognition. The most important simplification mechanism is “satisficing,” or choosing decision outcomes that are good enough to suit decision makers’ purposes, but that are not necessarily optimal outcomes. 
Satisficing involves “setting an acceptable level or aspiration level as a final criterion and simply taking the first acceptable [option]” (Newell & Simon, 1972, p. 681). Gigerenzer and Goldstein (1996) explained the origin of this term: “Satisficing, a blend of sufficing and satisfying, is a word of Scottish origin, which Simon uses to characterize algorithms that successfully deal with conditions of limited time, knowledge, or computational capacities” (p. 651). Satisficing acts as a “stop rule” (Simon, 1979, p. 4)— once an acceptable alternative is found, the decision maker concludes the decision process. Nonetheless, satisficing does not limit the decision maker to one deciding factor: When the criterion of problem solution or action has more than one dimension, there is the matter of calculating the relative merits of several alternatives, one of which may be preferred along one dimension, another along another . . . The satisficing rule . . . stipulates that search stops when a solution has been found that is good enough along all dimensions. (Simon, 1979, p. 3) Nor does satisficing lock the decision maker into searching for an unrealistically superior option or a needlessly inferior option. Decision makers adjust “aspirations upward or downward in the face of benign or harsh circumstances, respectively” (Simon, 1979, p. 3) as they work through the decision-making process. Simon contended that satisficing generally leads to choices roughly equal in quality to the choices predicted by optimizing algorithms. Gigerenzer and Goldstein (1996) showed that a simple “Take the Best” option algorithm (the electronic equivalent of human satisficing) matched or outperformed an optimizing algorithm on accuracy and speed. Because decisions based on satisficing behaviors are much easier to make than decisions based on optimizing algorithms, requiring less time and less cognitive exertion, satisficing is a highly rational, efficient decision-making behavior.” (2)
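Simon’s house-selling example can be sketched in a few lines. This is my own toy rendering of the aspiration-level idea (the numbers and the fixed downward step are made-up simplifications), not Simon’s actual model:

```python
def sell_house(offers, start_aspiration, step):
    """Satisficing with an adjustable aspiration level: start with a
    high acceptance price, lower it each period, and take the first
    offer that meets the current level -- no probability calculations.
    Returns (period, accepted_offer), or None if no offer is accepted."""
    aspiration = start_aspiration
    for period, offer in enumerate(offers):
        if offer >= aspiration:        # stop rule: good enough
            return period, offer
        aspiration -= step             # harsh circumstances: aim lower
    return None

print(sell_house([200, 240, 265, 230], start_aspiration=280, step=10))
# -> (2, 265): the third offer clears the (by then lowered) aspiration level
```

Note that there is no model of the offer distribution anywhere in the code; the stop rule plus aspiration adjustment does all the work.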


**Conclusion**

Scott Alexander recently published an apologia of the rationalist community. The very first comment says “In the interest of steelmanning, perhaps you should consider it a critique of the proclaimed values of rationalism vs the enacted ones. You do address this to some degree, but on some points you don’t address it.”

The fact that established results that are 60 years old are mostly ignored, and that valid mainstream and contemporary currents of research (not only Fast & Frugal heuristics, but also Naturalistic Decision Making) are dismissed, means that the community is, to put it mildly, failing at the virtue of scholarship.

(I don’t particularly care about Less Wrong; I think it is wrong about a bunch of things, and that there are a bunch of things wrong with it. *But* it was my first scaffold for understanding a lot of things, and my understanding develops through the rejection and refinement of some earlier views that came from it.)


1. Gigerenzer, G. (2008). *Rationality for mortals: How people cope with uncertainty*. Oxford University Press.
2. Agosto, D. E. (2002). Bounded rationality and satisficing in young people’s Web-based decision making. *Journal of the American Society for Information Science and Technology*, *53*(1), 16–27.

Future:

- Go through power of metaphor & analogy (Lakoff, Hofstadter)
- epistemic humility and responsibility
- importance of history
- Go into why Gigerenzer is awesome even if wrong about a bunch of things, and how Stanovich is totally awesome as well
