Discourse: Legibility

Mulling over the past posts I see a trend of caring about discourse. I was not sure why that was happening, but I believe I have some insight now. Inferential distances seem like a terrible problem. I commend Eliezer on structuring the problem, but I think we stopped too soon in trying to find a solution.

I *really* want a domain-general way of reducing inferential distances. I want to be able to talk to hedgehogs, to have interdisciplinary discourse happen instead of being a buzzword, and for people to understand why randomising over pure strategies can be a Nash equilibrium without my having to explain all of game theory and lose my interlocutor’s goodwill mid-twelfth sentence.
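To make the game-theory aside concrete, here is a minimal sketch using the textbook matching-pennies game (the game and its numbers are the standard illustration, not anything from a specific source), checking that a 50/50 randomisation over pure strategies is a best response, which is what makes it part of a Nash equilibrium:

```python
# A hedged sketch of why randomising over pure strategies can be a Nash
# equilibrium, using the textbook matching-pennies game.
from itertools import product

# Row player's payoffs; the column player gets the negative (zero-sum).
payoff = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}

def row_expected(row_mix, col_mix):
    """Expected payoff to the row player given two mixed strategies."""
    return sum(row_mix[r] * col_mix[c] * payoff[(r, c)]
               for r, c in product("HT", "HT"))

mix = {"H": 0.5, "T": 0.5}
equilibrium_value = row_expected(mix, mix)  # 0.0

# No pure-strategy deviation improves on the 50/50 mix, so randomising
# is a best response -- exactly the equilibrium property in question.
for pure in "HT":
    deviation = {s: 1.0 if s == pure else 0.0 for s in "HT"}
    assert row_expected(deviation, mix) <= equilibrium_value + 1e-12
```

By symmetry the same check holds for the column player, so the pair of 50/50 mixes is a Nash equilibrium: neither player can gain by deviating unilaterally.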

Everything that I care about depends on other humans in some form, and especially on communication lines not breaking down. Safe, ethical, reasonable, alive discourse is really important, and I am thus grasping at ways to make discourse easier.

I think that legibility provides a way forward from stating the problem of inferential distances. I think it reduces a very broad area of potential inferential distances. If you get legibility, then stuff like the invisible hand of the market, or how competing companies end up solving the needs of the population, or how cities are built without central planning, or how startups figure out secrets, becomes much easier to understand. Grokking legibility lets you bypass a particular failure of folk epistemology: not grasping how there can be non-predictive control.

This seems, at the moment, to be the right way to approach the “problem of inferential distances”: find, and share, concepts that explain an area of potential inferential distances. With that in mind, over the next two sections I first describe legibility, using an essay by Venkatesh Rao as scaffold, and then add my commentary in the second section.


Venkatesh has a great introduction to legibility. He describes a failure mode in which you project “your subjective lack of comprehension onto the object you are looking at, as ‘irrationality.’” The failure mode is caused by a desire for legibility.

The idea comes from the book “Seeing Like a State”, in which James C. Scott uses it as an illustration of what states do and how they fail.

“The book begins with an early example, “scientific” forestry (illustrated in the picture above). The early modern state, Germany in this case, was only interested in maximizing tax revenues from forestry. This meant that the acreage, yield and market value of a forest had to be measured, and only these obviously relevant variables were comprehended by the statist mental model. Traditional wild and unruly forests were literally illegible to the state surveyor’s eyes, and this gave birth to “scientific” forestry: the gradual transformation of forests with a rich diversity of species growing wildly and randomly into orderly stands of the highest-yielding varieties. The resulting catastrophes — better recognized these days as the problems of monoculture — were inevitable.

High-modernist (think Bauhaus and Le Corbusier) aesthetics necessarily lead to simplification, since a reality that serves many purposes presents itself as illegible to a vision informed by a singular purpose. Any elements that are non-functional with respect to the singular purpose tend to confuse, and are therefore eliminated during the attempt to “rationalize.” The deep failure in thinking lies in the mistaken assumption that thriving, successful and functional realities must necessarily be legible. Or at least more legible to the all-seeing statist eye in the sky (many of the pictures in the book are literally aerial views) than to the local, embedded, eye on the ground.

Complex realities turn this logic on its head; it is easier to comprehend the whole by walking among the trees, absorbing the gestalt, and becoming a holographic/fractal part of the forest, than by hovering above it.

This imposed simplification, in service of legibility to the state’s eye, makes the rich reality brittle, and failure follows. The imagined improvements are not realized. The metaphors of killing the golden goose, and the Procrustean bed come to mind.”

“The picture is not an exception, and the word “legibility” is not a metaphor; the actual visual/textual sense of the word (as in “readability”) is what is meant.”

Venkatesh further proposes that legibility is desired because it serves a very specific psychological function. Legibility “quells the anxieties evoked by apparent chaos.”



I have found Venkatesh’s analysis to be both good and incomplete. In what follows I try to augment the analysis with commentary on what I think Venkatesh missed.

Legibility is an interactional property

Presumably “这是废话” is gibberish to you, whilst the rest of the essay is not. “The picture is not an exception, and the word “legibility” is not a metaphor; the actual visual/textual sense of the word (as in “readability”) is what is meant.” Readability is not an object property but an interactional property: the readability of the Chinese characters is as dependent on their proper formulation as on your ability to read Chinese.

This means that a particular object is not legible or illegible in itself, but legible or illegible to someone.

Legibility serves a social justification purpose

If you are the one being read, the incentives may be stacked in such a way that you really wish to be legible. You can imagine various times in history in which not being legible meant death.
This explains why the same social regularities get explained in wildly different ways – there is an incentive to be legible which means slightly confabulating the original phenomenon being experienced.

Different societal configurations accept different explanations and this change in incentives has wide-reaching consequences. An example is the relatively innocent modern idea that art is self-expression.

“We have an idea that art is self-expression—which historically is weird. An artist used to be seen as a medium through which something else operated. He was a servant of the God. Maybe a mask-maker would have fasted and prayed for a week before he had a vision of the Mask he was to carve, because no one wanted to see his Mask, they wanted to see the God’s. When Eskimos believed that each piece of bone only had one shape inside it, then the artist didn’t have to ‘think up’ an idea. He had to wait until he knew what was in there — and this is crucial. When he’d finished carving his friends couldn’t say ‘I’m a bit worried about that Nanook at the third igloo’, but only, ‘He made a mess getting that out!’ or ‘There are some very odd bits of bone about these days.’ These days of course the Eskimos get booklets giving illustrations of what will sell, but before we infected them, they were in contact with a source of inspiration that we are not. It’s no wonder that our artists are aberrant characters. It’s not surprising that great African sculptors end up carving coffee tables, or that the talent of our children dies the moment we expect them to become adult. Once we believe that art is self-expression, then the individual can be criticised not only for his skill or lack of skill, but simply for being what he is.” (1)

It should be easy to see how this will shift the type of creative work that can be done.

Legibility eases the anxieties of chaos through control

According to Venkatesh’s analysis, readers want legibility because it quells the anxieties of chaos. I think this is true and incomplete. There is a question left open: why is it that legibility quells the anxiety? Why is it that chaos causes anxiety?

Chaos is the absence of pattern. No pattern means that prediction is impossible. And under a certain mindset, power and control can only be gained through predictive knowledge.

What the fans of predictive control, of legibility-through-rationalization miss is that not all control entails a better outcome. It seems like there are particular cases in which letting go of (predictive) control leads to a better equilibrium.



Legibility is (1) interactional, (2) social-justificatory, and (3) provides a sort of predictive control. The first thing to understand is that there is non-predictive control. By definition such things will not be legible, but they can still be controllable. Apparent chaos does not mean uncontrollability or powerlessness. The second is that legibility is situated and relational: what is legible at a certain time and place may not be in another, and thus one need not see lack of intelligibility as a failing of the object of study. Chesterton said this best: “In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.””


  1. – Keith Johnstone, Impro, p. 78-79.









Transformative practices and isolation

“How to get people in meditation?” Mark answers at 3:04: “The long game is sorta of… just to be like… so like fucking awesome and clearly so much fun and having such a great life… I think the best motivators are just sort of become one of the shiny happy people, become ridiculous successful so that people do whatever you do.”

I think meditation is awesome and totally incredible in a million ways. Same with therapy.

I have managed to get exactly 0 people into meditation and about 6 into therapy.

Mark’s answer is about letting them come, letting them see how much insanely better life can be, and reason themselves into doing what is unblocking that. Foot-in-the-door, over door-in-the-face.

I *fear* that this approach might fail. I hold this fear for three reasons:

  1. I don’t believe you can shift the position of many people through reason.
  2. Both of these proposals lead to a better life only *in expectation*.
  3. 2nd and further-order effects.

With regards to the first one, part of the blog is arguing for that, and I touched on it here.

The second is that at any particular point in time, and for any particular person, they might actually get worse, or be going through a valley in search of the next local optimum, and thus it is not clear to outsiders how this is not worse. (Related: the dark night.)

To talk about the third one I want to cite a study from memory, with the promise I’ll find the original source later. The study had three people watching a movie at the same time, and alone. Two watched a shitty movie, one watched a good movie. Afterwards they got to sit around a coffee table and talk. Then they had their happiness measured. Take a moment to guess the outcomes and why.

The people who saw the shitty movie were happier – presumably because they could connect over the shitty movie. People bond over negative experiences. This explains a bunch of social practices. (Like hazing, which, as I recall, Cialdini in Influence attributed solely to commitment-and-consistency effects; I think bonding over shared negative experience is what is actually going on.)

It is hard to take 2nd and further-order effects into account and not just quit. Take being healthy, for example. A noble enough goal: eat healthy, exercise. And so you figure out a healthy diet and an exercise routine and focus on those and keep trying to stay on the mark.

And a year goes by and you realize that eating healthy is almost impossible because no one even knows what that is, and that focusing too much on it and on exercise damaged your social support network, which happens to have an even bigger effect on your health. And you played by the rules and did the best you could and feel cheated and torn and angry and rebellious.

It is not clear to me that just engaging in these practices – whilst everything around stays the same – ends up being that beneficial in the long run. Part of it – for me – has to do with loneliness and connection.

Loneliness and Connection

I have, in the past, used two metaphors to explain what the “bad side” of therapy and meditation (and a bunch more things) feels like. One is that I feel like I’m climbing a mountain, and I have to climb, and that the worst thing in the world would be to one day be on my deathbed thinking about how I could’ve climbed further and didn’t. The problem is that as I climb there are fewer and fewer fellow climbers.

The second one is through Venn diagrams. The more I climb, the more circles delimit my area, and the lonelier it gets. And I don’t want to be lonely, showing only one part of myself at a time. I want to show all, at all times, and to have it belong and connect and to connect through it and to touch and feel touched in all the facets, sequentially and in parallel.

I suspect that the partial emphasis I’ve had thus far on bridging communities and discourses and communicating is an attempt to work through this dialectic.

And I really care about all this, like, really care about it – it cuts to the bone. And I have no solution and no way to negotiate between these drives yet and it hurts. Like an impossible trade-off. How do you choose between your two children?

Epistemic Virtues?

In this essay I explore how I conceptualise going about knowing, and the strange interactions between attempts to gain knowledge and their impacts on ourselves and others. I frame this in terms of epistemic virtues, since these are a helpful frame for designating something as desirable.

Inter-community communication

I’ve mentioned this quote before:

“Ok, my question might’ve been easy to misunderstand. My point was that it seems to me that you’re not familiar with the general culture in which MacIntyre writes, and so you don’t even get what he’s saying and what narratives he’s responding to. It’s like reading Nietzsche when you don’t know what Christianity is.

So your confusions aren’t about what MacIntyre is in fact saying (some of which I think has merit, some doesn’t), but it just fails to connect at all.

And while I overall like MacIntyre, I’m not enough of a fan to try to bridge that gap for him, and unless I did this full-time for a year or so, I don’t think I could come up with something better than “well, read these dozens of old books that might not seem relevant to you now, and some of which are bad but you won’t understand the later reactions otherwise, and also learn these languages because you can’t translate this stuff”. Which is a horrible answer.

Worse, it doesn’t even tell you why you should care to begin with. I think part of that is that, besides the meta-point that MacIntyre makes about narratives in general, it seems to me that the concrete construction and discourse he uses is deeply *European*, and unless you are reasonably familiar with that, it will seem like one theologian advocating Calvinism instead of Lutheranism when you’re Shinto and wonder why you should care about Jesus at all. (This is a general problem for non-continental readings of continental philosophy, I think – it’s deeply rooted in European drama. One reason Aristotle is so attractive is that all European drama theory derives from him and even someone as clever as Brecht couldn’t break it, so he’s an obvious attractor. I, and I suspect many continentals, came to philosophy essentially through drama, and that makes communication with outsiders difficult. Not enough shared language and sometimes very different goals.)

So I’ll save that goodwill for some later (and more fruitful) topic, if you don’t mind.

As to MacIntyre’s meta-point of “use the community-negotiated tools and narratives you already have” instead of “look for elegant theories no one actually uses anyway”, well, I *wanted* to write a different explanation of that, but then Vladimir did it already in his comment below, and I couldn’t do a *better* job right now, but he still failed, so…”

I love it all the way to hell and want to talk more about it.

Another way in which I love it is because it’s about the principle of charity and epistemic humility.


Epistemic arrogance

It seems that the main way the educated people I know form opinions, when in doubt, is to do a literature search, discover the existing positions, read up on them, and decide for one or another. What the actual hell?

This behaviour entails that these people alief something like this: “My reasoning methods are such that I can go into a field, understand the positions held by experts, understand their disagreements and how to resolve them, and execute this solution. Also, I can do it in a fraction of the time it took the experts.” (Since they’ve been going back and forth about this for years.)

This strikes me as a most interesting form of insanity. (Now, if what was happening was “I have no reason to, but I’m gonna take this particular expert’s word for it, and hold the meta-belief that I’m probably wrong and keep on the lookout for evidence for me being wrong” then that is fine and in fact the whole blog is something of the sort because I believe you need some opinion to update on, else your brain will just rationalize whatever happens into I-knew-it-all-along and no model will ever be developed.)

If you have a strong opinion on any issue that the experts are divided on (like all of the PhilPapers survey questions), then you need a really strong argument as to why you are not just extremely overconfident, but actually have considerably better reasoning mechanisms, or access to evidence or reasons not being taken into account by the experts.

One big offender is Less Wrong: a philosophical community that takes all its positions to be factual, since it thinks that philosophy is useless, and that lacks the historical and philosophical awareness to recognise itself as a philosophical community.

Another offender is everyone ever, pretty much. We are all born into naive realism, into using our reasoning mechanisms and taking their outputs for the truth. It takes a lot to get to the point where you consider that they might be wrong (nine Eliezer-essays, roughly). This leads to all kinds of weird effects (some of which I described in the context of Social Descriptive Epistemology).

I’ve met exactly one person that has a claim to be reasoning from first principles. His reasoning processes are really different from what I see others doing.

Epistemic humility

Know who else was all about questioning his beliefs, and rebuilding them bottom-up from first principles? Descartes.

Hear me out a second. Maybe Descartes screwed up in some areas, sure. But I respect him:

“Meditation I. Of the things which may be brought within the sphere of the doubtful.
It is now some years since I detected how many were the false beliefs that I had from my earliest youth admitted as true, and how doubtful was everything I had since constructed on this basis; and from that time I was convinced that I must once for all seriously undertake to rid myself of all the opinions which I had formerly accepted, and commence to build anew from the foundation, if I wanted to establish any firm and permanent structure in the sciences.” (1)

This is a very radical goal. This is the goal of someone who has been proven terribly wrong, someone who had to deal with real Lovecraftian monsters, and at that level I get him and his desire for certainty.

And yes, maybe building a philosophical system up from pieces that you can be absolutely certain of was not the best option, but it was a brilliant option at the time. (2)

The thrust was coming from the right place, a deep felt understanding of what it is like to be horribly wrong.

A modern piece of thinking that I assess as coming from the same place-of-thinking is that of moral uncertainty. Moral uncertainty admits that you are uncertain between moral theories, “about which moral theory is right”, and from that point on asks questions about what to do. If you assign the most probability to one theory and it weakly suggests one action, while all other theories strongly suggest other actions, what should you do? How do you compare value between theories? Are there actions that all theories recommend?

I don’t care particularly about the details, but I do care that it takes as a starting point a state of uncertainty, of lack of knowledge. I find this to be epistemically responsible.
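As a toy illustration of the setup above (credences over theories, each theory scoring each action), here is a hedged sketch of one common proposal, maximizing expected choiceworthiness; the theory names, credences, and scores are all invented for illustration:

```python
# A toy sketch of reasoning under moral uncertainty ("maximize expected
# choiceworthiness"). All theories, credences, and scores are invented.
credence = {"theory_1": 0.6, "theory_2": 0.3, "theory_3": 0.1}

# How choiceworthy each theory finds each action (higher is better).
scores = {
    "theory_1": {"A": 1.0, "B": 0.9},  # most probable theory weakly prefers A
    "theory_2": {"A": 0.0, "B": 1.0},  # strongly prefers B
    "theory_3": {"A": 0.2, "B": 1.0},  # strongly prefers B
}

def expected_choiceworthiness(action):
    return sum(credence[t] * scores[t][action] for t in credence)

best = max(("A", "B"), key=expected_choiceworthiness)
# "B" wins in expectation even though the single most probable theory
# (weakly) prefers "A" -- the situation described in the text.
```

Whether choiceworthiness can be compared across theories at all is exactly the open question the text raises; this sketch simply assumes it can.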

Epistemic responsibility

So, I’ve talked before about how definitions matter. Given that frame it is not surprising to me that you can get amazing life transformations through acceptance and mindfulness. (That is, through observing without judgement.)

Judgement needs a normative frame from which to judge. In psychotherapy, when this normative frame is given by society and the approach focuses on this normative aspect, it is called a superego-based approach: “Ego analysis, although itself emerging out of psychoanalytic thinking, is practically unrecognizable as such. Apfelbaum describes it as “analyzing without psychoanalyzing.” He has made what seems on the face of it a simple shift – replacing the id with the superego as the major pathological force – but the result has been the total transformation of the nature of therapy.

The replacement of the id with the superego as the principal pathological force shifts the client-therapist relationship from one that is intrinsically adversarial (i.e., therapists’ seeing themselves as dealing with resistant clients) to one that is intrinsically collaborative. The core of the problem is now seen as punitive superego injunctions – the feeling of unentitlement to what clients are experiencing (their sense of shame or self-blame) that prevents them from getting sufficiently on their own sides to think and talk effectively about it. Therapists become advocates for clients, in order to get the clients on their own sides.”  (3)

Superego therapies, in the language of the former post, get back to seeing that the mountains are mountains, or that X was never about Y. You don’t need to be violent to yourself because you are lazy, because you were never lazy to start with. “Lazy” doesn’t obtain: it is (a) folk psychology, (b) not a good category, and (c) a thin story that damages the one accepting it.

I think people going normative over themselves is a huge source of self-violence. I think this is precisely what Marshall Rosenberg got to with NVC and jackal language: the language of “should”, “ought”, “must”, “have to.”

Epistemic responsibility is thus a shift from judging (which can only be done according to your normative system) to distrust about your normative systems and acceptance and observation. You really don’t want to kill yourself trying to attain whatever inconsistent, not-actualizable, wrong normative system you currently hold. (Like, RCT for example.)

Mark described a very similar effect (you can easily substitute S1 and S2 for Id and Superego and it applies):

“System one is extremely smart in some ways and extremely stupid in other ways. System 2 is extremely smart in some ways and extremely stupid in other ways.

It’s usually much smarter to use System 2 to prevent these sorts of System 1 override situations from even coming up. (Like, don’t even buy the cookies.) You only get like one or two big System 1 overrides per day, though you can build up this muscle.

And, geez, I say this all the time, but with great power comes great responsibility. You should be really, really, really, really sure that this isn’t one of those times that System 1 is being brilliant, not stupid. Otherwise, you hurt yourself if you override repeatedly. (Like, maybe you should eat those mixed nuts, because maybe you need that selenium–and people typically lose weight eating nuts, anyway–but, yeah, that selenium. System 1 is brilliant, in its own way.)

And there are certain games where System 1 always wins in the end, like with sexuality. You use System 2 to constructively engage with System 1. Otherwise, System 1 will eat you alive. It’s like the alcoholic who somehow convinces themselves that walking into the bar is precisely what they need to do to keep from having a drink.

For some things, System 1 always wins in the end, if you fight it head on–a very dangerous long-term game to play.”

And, of course, observation is really difficult. Good luck telling a passerby to sit down and meditate without getting caught up in their thoughts.

Observing is a skill, and it is not easy to acquire. Chapman talks about observation in a great essay here, two points of which relate to this emphasis on epistemic responsibility:

  • Selecting and formulating problems is as important as solving them; these each require different cognitive skills.
  • Problem formulation (vocabulary selection) requires careful, non-formal observation of the real world.

Sensitivity to other perspectives

A virtue that follows from epistemic responsibility is sensitivity to other perspectives. Normativity blocks this. If you have your normative system in place you are seeing through its eyes and judging stuff as “correct” “bad” and so on. If you remove it, you can actually see what is happening.

A beautiful essay to gain part of this skill is the one on the typical mind fallacy. (BTW, amazing party trick: ask people to describe in which modality they think. Most of the time one person will say images and most will say voice and wonder ensues).

Another useful concept is that of inferential distance.

But there is more to people than minds; and you can have emotional empathy, besides mental empathy, or perspective-taking.

  1. – Descartes, R. (2013). Meditations on First Philosophy. Broadview Press.
  2. – Descartes is the first modern philosopher. He breaks away from the scholastic tradition of (for the most part) merely analysing and commenting on other works, and goes “Screw you guys, I’m figuring it out all alone.” He defined the course of philosophy after him, to this day, becoming an axis: you either side with him, or you don’t; but you are always defined in relation to his positions. The whole discourse after him is defined by him, for or against, with no third option.
  3. – Collaborative Couple Therapy. Good Ideas, 2007.


  • Knightian uncertainty
  • Signalling is a problem: of course I (at some level) want to signal being humble and honest. I speculate about stuff all the time, only because I think some model + the meta-belief that the model is wrong, so look out for errors, is better than no model, since no model does nothing (because data/frame theory) -> coherence theory, Neurath’s ship.
  • Why I like Taleb. Taleb is obsessed with the invisible, the unseen, what he does not know, and what he does not know he doesn’t know. I am too. I’ve been painfully wrong a hundred too many times.
  • Epistemic responsibility places a really high barrier to acting on something major, and vibes really well with cluster thinking.
  • Of course I’m framing all of these as desirable things.
    • This might be the wrong framing. This is not about knowledge and not about morality, but about how to go about knowing as a situated agent, without destroying yourself or others.
  • Of course I need to go reflective and see how my arguments about certainty in morality apply here, to epistemic virtues.
  • Ernst von Glasersfeld on epistemology and knowing
  • http://www.newscientist.com/article/mg22429973.000-most-violence-arises-from-morality-not-the-lack-of-it.html

Imprudent formalisation

I have heard scores of people talking about their utility functions. This strikes me as ridiculous on its face – as if people were speaking about their gills: the concept doesn’t apply; it’s a category error.

So why do people do it? In what follows I first expand on the issue, show several reasons why it is problematic in the first place, and end my analysis with a discussion of why it is done, and why it can be done.


“My utility function”

The problem I want to treat is first described here: “If I ever say “my utility function”, you could reasonably accuse me of cargo-cult rationality; trying to become more rational by superficially imitating the abstract rationalists we study makes about as much sense as building an air traffic control station out of grass to summon cargo planes.

There are two ways an agent could be said to have a utility function:

  1. It could behave in accordance with the VNM axioms; always choosing in a sane and consistent manner, such that “there exists a U”. The agent need not have an explicit representation of U.
  2. It could have an explicit utility function that it tries to expected-maximize. The agent need not perfectly follow the VNM axioms all the time. (Real bounded decision systems will take shortcuts for efficiency and may not achieve perfect rationality, like how real floating point arithmetic isn’t associative).

Neither of these is true of humans. Our behaviour and preferences are not consistent and sane enough to be VNM, and we are generally quite confused about what we even want, never mind having reduced it to a utility function. Nevertheless, you still see the occasional reference to “my utility function”.”
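The first sense above, behaving as if “there exists a U”, can be made concrete with a small sketch: a real-valued utility representation of pairwise preferences exists only if the preference relation is acyclic, so a single preference cycle rules out any utility function. The preference data here are invented for illustration.

```python
# A sketch of the "as if there exists a U" sense of a utility function:
# pairwise choices are representable by some U only if they contain no
# cycle. The preference data below are invented for illustration.
from itertools import permutations

prefers = {("A", "B"), ("B", "C"), ("C", "A")}  # a cycle: A > B > C > A

def admits_utility_function(prefs, items):
    """True iff some ranking of items is consistent with every observed
    pairwise preference (brute force -- fine for tiny examples)."""
    for order in permutations(items):
        rank = {x: i for i, x in enumerate(order)}
        if all(rank[a] < rank[b] for a, b in prefs):
            return True
    return False

# The cycle makes U(A) > U(B) > U(C) > U(A) unsatisfiable, so no U exists.
assert not admits_utility_function(prefers, "ABC")
```

Human choices routinely exhibit such cycles (framing effects, preference reversals), which is one concrete sense in which neither reading of “my utility function” applies to us.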

All of the above is clear to me. What is not clear to me is why, despite that, I’ve heard scores of people talking about their utility function. In what follows I tease out an incomplete list of the problems of engaging in imprudent formalisation and then, given the number of problems, try to understand why it is done. (I’m using “imprudent” to mean both premature and plain wrong.)


Consequences of imprudent formalisation

Formality is awesome. I’m not debating this. Mathematics is insanely awesome. “Sit and think about stuff and sketch it out with a pencil and the results will be infinitely precise and it’s way cheaper than going around trying stuff out.” Yes, the most amazing free lunch ever. Mathematics is awesome, logic is awesome, formality is awesome.

Except it isn’t a free lunch. Formality comes at a price: “Formality has a cognitive overhead, it does not deal well with tacit or evolving knowledge, it cannot represent the situated nature of knowledge” (1). Using technical terms prematurely leads you to be precisely wrong instead of roughly correct.

Besides this, formalisation makes communication difficult and can create pain. I discuss these consequences below.


Firstly, premature formalisation leads to greater intellectual separation: being formal rather than fuzzy extends your inferential distance. Secondly, premature formalisation leads to greater emotional separation: maybe you have noticed that as you speak about what is more unique to you, more personal, deeper, your ability to connect increases. Your experience resonates at an analogue level. Imagine speaking to someone about your utility function and how you would get 56 utilons by getting chocolate, versus how you “really, really want and need chocolate right now and your day was terrible and you are really sad and you love chocolate because it makes you happier on rainy days”. It’s a no-brainer, really.



Perception is both bottom-up and top-down. “The ‘top-down’ processing refers to a person’s concept and expectations (knowledge), and selective mechanisms (attention) that influence perception. Perception depends on complex functions of the nervous system, but subjectively seems mostly effortless because this processing happens outside conscious awareness.”

You have to make sense of what you perceive, somehow. (Sensemaking, something I touched upon here.) And you will use the concepts at hand. You want your concepts to cut reality correctly, to package in a way that makes sense to you. If they don’t you will end up sandpapering against reality and that really sucks and hurts and if I can push you away from it my day has been a success.
Just try to get a sense of how life is experienced by someone going with the concepts of “monkey tribal evolutionary psychology, status seeking and signaling, freeriding, deception, self-deception, game theory, general awfulness, and everything ever from Overcoming Bias” versus someone thinking from a place of dreaming “about massively flexible and malleable interaction spaces. What if game breaks were a regular occurrence and could be used as jumping-off points to carve unique paths through interaction, play, collaboration, and financial/resource-independence spaces?”. And yes, a million times yes, both, and it’s a continuum, but some concepts suck and they don’t come with warning labels.


I’ve been talking about utility functions thus far because I’m detailing my analysis of that piece. I want to understand how and why that came about and, in the future, generalise lessons where possible. But bad concepts (which include imprudently formalised concepts) go beyond that, and I want to do a proof of concept: akrasia.
Mark says “First, I want to say that akrasia, by itself, is a functionally meaningless concept. I put it in the same category as depression, cancer, epilepsy, etc. Applying the label doesn’t tell you what to *do*. I don’t ever use the label “akrasia” in my own thinking.”
“Akrasia” is insane. You blackbox it and then treat it as useful? Like your body suddenly starts shaking, and someone goes “Yep, tremors” and you are like “Oh, ok”. NO! You want the fix, the origin, not a synonym that doesn’t point at anything.



Why is it done?

In the previous section we saw that there are a lot of downsides to imprudent formalization. Still, people engage in it. Why? I suspect that there are significant upsides. I describe them below, after explaining exosemantics:

exosemantic – the part of a word or statement that isn’t its strict entailments, but which are extremely common implicatures– specifically, these shouldn’t be contextual or Gricean implicatures, but socially bound ones, which have been formed by continued use of the word in particular contexts, or by particular speakers. The exosemantics of a word may eventually become incorporated into the defining entailments.


It is commonly known that words carry meaning on two levels: denotation, or strict, dictionary-level meaning, and connotation, or emotional association; but there is a third, exosemantic level. The word “eldritch”, for example, denotes otherworldliness and connotes a feeling of cosmic horror toward its referent; but it also exosemantically implies that its user has read Lovecraft. The word “liberty” is no different from the word “freedom” except in its exosemantic layer. The word “praxis” is no different from a certain definition of the word “practice” except in its exosemantic layer: “praxis” is heavy; “praxis” implies familiarity with—association with—the academic tradition that uses the word “praxis”.”

Two most powerful forces in the universe

I have talked about justification and how reasons are for justification. The further step in that theory is to consider cultures as justification systems, justification systems being “the interlocking networks of language-based beliefs and values that function to legitimize a particular worldview.”
The proposal is that formal systems originate from a desire to, a need for, justification. If that is right, then formal means better with regards to justifying yourself to others, and you would expect to see imprudent formalization. (In science this is called physics envy. In AI this was the GOFAI against which Dreyfus argued – this partially maps onto neats vs. scruffies, though imperfectly. In economics this is the current debate about the use of mathematics, and the old debate.)
And, of course, formal systems and their technical notions carry the ultimate justificatory power. (If this seems difficult to buy, consider that normative systems in fact make claims about how one ought to behave, and ought is more powerful than want. There are only normative and applied ethics – the construction of systems of how one should behave, and the study of how to obtain that.)

With regards to the second force, Robin Hanson has said all there is to be said about signalling. It seems that in this regard imprecise formalisms are being used like shibboleths: they allow one to signal being part of the in-group.

Why can it be done?

Robin Hanson has again a spot-on observation: “For subjects where there is little social monitoring and strong personal penalties for incorrect beliefs, we expect the functional role of beliefs to dominate. Beliefs about military missions or engineering projects come to mind. But for subjects with high social interest and little personal penalty for mistakes, we expect the social role of beliefs to dominate. Consider beliefs about large elections or beliefs addressing abstract philosophical, religious, or scientific questions.”

In this particular case the fact is that the penalty for being wrong is apparently small – because the belief is largely invisible. For one to be penalized, at the very least, the interlocutor has to understand what a utility function is and why it doesn’t make sense to apply the concept. Compare this to someone going around talking about their gills. Gills are really visible, and the social penalty would be immediate and harsh.

This makes copying the community shibboleths very cheap, with the upsides of justification and group membership. That is why it can be, and is, done.




  1. – Shipman, F. M., & Marshall, C. C. (1993). Formality considered harmful: Experiences, emerging themes, and directions. University of Colorado, Boulder, Department of Computer Science.





  • I suggested that there are tons of consequences. What is going on? The consequences are long term and fuzzy and the benefits immediate and clear. Go into that.

You can’t optimize anything, literally

So like, history really matters. It lets you see that a huge chunk of what you are operating with – all that is received – are previous distinctions made by someone to someone. And knowing the history gives you a huge advantage and you can do stuff you couldn’t do before and see stuff you haven’t seen before. And part of this power is that you can see what mistakes were made and avoid them.

I’m disturbed as to how Less Wrong is making claims to be about rationality, and then taking some results from the rationality literature and absolutely ignoring the history of the field, and dismissing some modern currents of research, and neglecting really well established results, with no apparent justification.

I’ll go through this at length at some point; for now I just want to share some (lengthy) quotes about 1) the history of rationality, 2) the animating metaphor of rationality, 3) why optimizing fails, 4) what doesn’t.


History & Metaphor

“Disputes about the nature of human rationality are as old as the concept of rationality itself, which emerged during the Enlightenment (Daston, 1988). These controversies are about norms, that is, the evaluation of moral, social, and intellectual judgment (e.g., Cohen, 1981; Lopes, 1991). The most recent debate involves four sets of scholars, who think that one can understand the nature of sapiens by (a) constructing as-if theories of unbounded rationality, by (b) constructing as-if theories of optimization under constraints, by (c) demonstrating irrational cognitive illusions, or by (d) studying ecological rationality.” (1)

“The heavenly ideal of perfect knowledge, impossible on earth, provides the gold standard for many ideals of rationality. From antiquity to the Enlightenment, knowledge—as opposed to opinion—was thought to require certainty. Such certainty was promised by Christianity but began to be eroded by events surrounding the Reformation and Counter-Reformation. The French astronomer and physicist Pierre-Simon Laplace (1749–1827), who made seminal contributions to probability theory and was one of the most influential scientists ever, created a fictional being known as Laplace’s superintelligence or demon. The demon, a secularized version of God, knows everything about the past and present and can deduce the future with certitude. This ideal underlies the first three of the four positions on rationality, even though they seem to be directly opposed to one another. The first two picture human behavior as an approximation to the demon, while the third blames humans for failing to reach this ideal. I will use the term omniscience to refer to this ideal of perfect knowledge (of past and present, not future). The mental ability to deduce the future from perfect knowledge requires omnipotence, or unlimited computational power. To be able to deduce the future with certainty implies that the structure of the world is deterministic. Omniscience, omnipotence, and determinism are ideals that have shaped many theories of rationality. Laplace’s demon is fascinating precisely because he is so unlike us. Yet as the Bible tells us, God created humans in his own image. In my opinion, social science took this story too literally and, in many a theory, re-created us in proximity to that image.” (1)

Programs of rationality

Four questions distinguish the programs: Are there context-independent norms? Can behavior conform to those norms? Is the perfect strategy achievable in principle? Is it achievable in practice?

  Unbounded:                      Y Y Y Y
  Optimization under constraints: Y Y Y N
  Heuristics and Biases:          Y Y N N
  Fast and Frugal Heuristics:     N N N N

(This is mostly correct and how each program assesses itself. I’d argue that the F&F program does accept normative ideals.)

Programs of rationality as they relate to the metaphor

The metaphor of the demon illustrates the programs and their differences.

The demon is omnipotent and omniscient, and the universe is deterministic in the original thought experiment (which is how it can perfectly predict everything). Unbounded rationality lets go of determinism but keeps optimization. Optimization under constraints lets go of omnipotence but keeps optimization. (All of the above as descriptive models.) The heuristics and biases program uses the demon as an ideal of behavior and then finds instances of failures. The fast and frugal program claims to do away with the demon (and normative ideals) completely. I think that is wrong, but I’ll leave that for a future post.


Why you can’t optimize anything

Optimization Is Intractable in Most Natural Situations

The ideal of as-if optimization is obviously limited because, in most natural situations, optimization is computationally intractable in any implementation, whether machine or neural. In computer science, these situations are called NP-hard or NP-complete; that is, the solution cannot be computed in polynomial time. “Almost every problem we look at in AI is NP-complete” (Reddy, 1988: 15). For instance, no mind or computer can apply Bayes’s rule to a large number of variables that are mutually dependent because the number of computations increases exponentially with the number of variables.

Probabilistic inference using Bayesian belief networks, for example, is intractable (Cooper, 1990; Dagum & Luby, 1993). In such situations, a fully rational Bayesian mind cannot exist. Even for games with simple and well-defined rules, such as chess and Go, we do not know the optimal strategy. Nevertheless, we do know what a good outcome is. In these situations, as-if optimization can only be achieved once the real situation is changed and simplified in a mathematically convenient way so that optimization is possible. Thus, the choice is between finding a good heuristic solution for a game where no optimal one is known and finding an optimal solution for a game with modified rules. That may mean abandoning our study of chess in favor of tic-tac-toe.
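To make the combinatorial explosion concrete, here is a small sketch of my own (not from the quoted text): a full joint distribution over n mutually dependent binary variables has 2^n outcomes, so its table doubles with every added variable.

```python
# Size of the full joint probability table over n mutually dependent
# binary variables: 2**n outcomes, hence 2**n - 1 free parameters.
# Exact Bayesian inference must, in the general case, sum over this table.
def joint_table_size(n):
    return 2 ** n - 1

for n in (10, 20, 30):
    print(n, joint_table_size(n))
```

Ten variables are manageable; thirty already mean summing over roughly a billion entries, which is why exact inference gets abandoned in favor of approximations or heuristics.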

Optimization Is Not an Option with Multiple Goals

If optimization of a criterion is tractable, multiple goals nevertheless pose a difficulty. Consider the traveling salesman problem, where a salesman has to find the shortest route to visit N cities. This problem is intractable for large Ns but can be solved if the number of cities is small, say if N = 10, which results only in some 181,000 different routes the salesman has to compare (Gigerenzer, 2007). However, if there is more than one goal, such as the fastest, the most scenic, and the shortest route, then there is no way to determine the single best route. In this case there are three best routes. One might try to rescue optimization by weighting the three goals, such as 3x + 2y + z, but this in turn means determining what the best weights would be, which is unclear for most interesting problems.
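The route count in the quote can be checked directly. A brute-force sketch (toy coordinates and function names are my own illustration, not Gigerenzer's):

```python
from itertools import permutations
import math

def tour_count(n):
    # Distinct round trips over n cities: fix the start city and treat a
    # route and its reverse as the same tour, giving (n-1)! / 2 tours.
    return math.factorial(n - 1) // 2

def shortest_tour(coords):
    # Exhaustive search: only feasible for small n, since the number of
    # tours to compare grows factorially.
    start, rest = coords[0], coords[1:]
    def length(order):
        tour = [start] + list(order) + [start]
        return sum(math.dist(a, b) for a, b in zip(tour, tour[1:]))
    return min(permutations(rest), key=length)

print(tour_count(10))  # 181440, the "some 181,000 routes" in the text
```

Swap `length` for a "most scenic" or "fastest" score and `min` can return a different winner for each criterion, which is the multiple-goals problem in miniature.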

Optimization Is Not an Option with Imprecise, Subjective Criteria

Unlike physical distance, subjective criteria are often imprecise. Happiness, which means many things, is an obvious example. But the issue is more general. A colleague of mine is an engineer who builds concert halls. He informed me that even if he had unlimited amounts of money at his disposal, he would never succeed in maximizing acoustical quality. The reason is that although experts agree on the definition of bad acoustics, there is no consensus as to what defines the best. Hence he could try to maximize acoustical quality according to a single expert’s judgment but not to unanimous agreement.

Optimization Is Unfeasible When Problems Are Unfamiliar and Time Is Scarce

In situations where optimization is in principle possible (unlike those under the first three points), a practical issue remains. Selten (2001) distinguishes between familiar and unfamiliar problems. In the case of a familiar problem, the decision maker knows the optimal solution. This may be due to prior training or because the problem is simple enough. In the case of an unfamiliar problem, however, the decision maker cannot simply employ a known method because the method that leads to the best result must first be discovered. In other words, the agent has to solve two tasks: level 1, employing a method that leads to a solution, and level 2, finding this method. Thus, two questions arise. What is the optimal method to be chosen? And what is the optimal approach to discovering that method? (There may be an infinite regress: level 3, finding a method for level 2, and so on.) At each level, time must be spent in deciding. Although Selten’s argument concerning unfamiliar problems has not yet been cast into mathematical form, as the issue of combinatorial explosion has been, it strongly suggests that an optimizing approach to unfamiliar problems is rarely feasible when decision time is scarce.

Optimization Does Not Imply an Optimal Outcome

Some economists, biologists, and cognitive scientists seem to believe that a theory of bounded rationality must rely on optimization in order to promise optimal decisions. No optimization, no good decision. But this does not follow. Optimization needs to be distinguished from an optimal outcome. Note that the term optimization refers to a mathematical process—computing the maximum or minimum of a function—which does not guarantee optimal outcomes in the real world. The reason is that one has to make assumptions about the world in order to be able to optimize. These assumptions are typically selected by mathematical convenience, based on simplifications, and rarely grounded in psychological reality. If they are wrong, one has built the optimization castle on sand and optimization will not necessarily lead to optimal results. This is one reason why models of bounded rationality that do not involve optimization can often make predictions as good as those made by models that involve optimization (Gigerenzer & Selten, 2001b; Selten, 1998; March, 1978). A second reason is robustness. As described in chapter 1, asset allocation models that rely on optimization, Bayesian or otherwise, can perform systematically worse than a simple heuristic because parameter estimates are not robust.” (1)


If no optimization then what?

Although I can’t find it now, I seem to recall that Chalmers had an argument up that was something like this:

  1. If not-P, then what?
  2. Therefore, P [1].

And it’s funny because it’s true, and I totally want to avoid the trap of trying to take something away without giving something back, when there is something to be given. And it has been there for 60 years.



“Throughout the early history of decision making research, researchers devised mathematical algorithms to predict purely rational, or optimal, decision making (Tyszka, 1989). Simon (1955, 1956) rejected the idea of optimal choice, proposing the theory of “bounded rationality.” He argued that due to time constraints and cognitive limitations, it is not possible for humans to consider all existing decision outcomes and then make fully reasoned, purely rational, choices. He suggested that humans operate rationally within practical boundaries, or within the limits of bounded rationality. For example, classic rational choice models would indicate that a man who wanted to sell his house would perform a series of mathematical manipulations for each offer he would receive. These mathematical operations would identify the probable range of likely offers, the probability of receiving a better offer within an acceptable time period, the relative diminished value of a higher offer at some future time, etc. Such calculations are beyond the computational capacity of most decision makers and are prohibitively labor intensive for nearly all decision makers. Instead, Simon (1955) predicted that the man in this example would simplify the decision making process by Assum[ing] a price at which he can certainly sell and will be willing to sell in the nth time period. Second, he will set his initial acceptance price quite high, watch the distribution of offers he receives, and gradually and approximately adjust his acceptance price downward or upward until he receives an offer he accepts—without ever making probability calculations. (Simon, 1955, pp. 117–118) The idea of simplification methods underlies all of Simon’s work in human cognition. The most important simplification mechanism is “satisficing,” or choosing decision outcomes that are good enough to suit decision makers’ purposes, but that are not necessarily optimal outcomes. 
Satisficing involves “setting an acceptable level or aspiration level as a final criterion and simply taking the first acceptable [option]” (Newell & Simon, 1972, p. 681). Gigerenzer and Goldstein (1996) explained the origin of this term: “Satisficing, a blend of sufficing and satisfying, is a word of Scottish origin, which Simon uses to characterize algorithms that successfully deal with conditions of limited time, knowledge, or computational capacities” (p. 651). Satisficing acts as a “stop rule” (Simon, 1979, p. 4)— once an acceptable alternative is found, the decision maker concludes the decision process. Nonetheless, satisficing does not limit the decision maker to one deciding factor: When the criterion of problem solution or action has more than one dimension, there is the matter of calculating the relative merits of several alternatives, one of which may be preferred along one dimension, another along another . . . The satisficing rule . . . stipulates that search stops when a solution has been found that is good enough along all dimensions. (Simon, 1979, p. 3) Nor does satisficing lock the decision maker into searching for an unrealistically superior option or a needlessly inferior option. Decision makers adjust “aspirations upward or downward in the face of benign or harsh circumstances, respectively” (Simon, 1979, p. 3) as they work through the decision-making process. Simon contended that satisficing generally leads to choices roughly equal in quality to the choices predicted by optimizing algorithms. Gigerenzer and Goldstein (1996) showed that a simple “Take the Best” option algorithm (the electronic equivalent of human satisficing) matched or outperformed an optimizing algorithm on accuracy and speed. Because decisions based on satisficing behaviors are much easier to make than decisions based on optimizing algorithms, requiring less time and less cognitive exertion, satisficing is a highly rational, efficient decision-making behavior.” (2)
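Simon's stop rule is simple enough to sketch directly. A minimal illustration (the function names and the aspiration-adjustment schedule are my own assumptions, not Simon's formalism):

```python
def satisfice(options, score, aspiration, step=1.0):
    """Take the first option whose score meets the aspiration level.

    If nothing qualifies, lower the aspiration (Simon's adjustment "in
    the face of harsh circumstances") and search again.
    """
    if not options:
        raise ValueError("no options to choose from")
    while True:
        for option in options:       # stop rule: first good-enough option wins
            if score(option) >= aspiration:
                return option, aspiration
        aspiration -= step           # relax the aspiration and retry

# Simon's house seller: take offers in the order they arrive, accept the
# first one at or above the current acceptance price.
offers = [90, 120, 95, 130]
choice, final = satisfice(offers, score=lambda x: x, aspiration=125)
print(choice)  # 130: the first offer meeting the aspiration level
```

Note that no probability calculations appear anywhere: the whole decision procedure is a threshold plus a scan, which is exactly the contrast with the optimizing model the quote draws.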



Scott Alexander recently published an apologia of the rationalist community. The very first comment says “In the interest of steelmanning, perhaps you should consider it a critique of the proclaimed values of rationalism vs the enacted ones. You do address this to some degree, but on some points you don’t address it.”

The fact that established results that are 60 years old are mostly ignored, and that valid mainstream and contemporary currents of research are dismissed (not only Fast & Frugal heuristics, but also Naturalistic Decision Making), means that the community is, to put it mildly, failing at the virtue of scholarship.

(I don’t particularly care about Less Wrong and think it is wrong about a bunch of things, and that there are a bunch of things wrong with it. But it was my first scaffold to understand a lot of things and my understanding develops in the rejection and refinement of some earlier views coming from it.)


  1. – Gigerenzer, G. (2008). Rationality for mortals: How people cope with uncertainty. Oxford University Press.
  2. – Agosto, D. E. (2002). Bounded rationality and satisficing in young people’s Web‐based decision making. Journal of the American Society for Information Science and Technology, 53(1), 16-27.




  • Go through power of metaphor & analogy (Lakoff, Hofstadter)
  • epistemic humility and responsibility
  • importance of history
  • Go into why gigerenzer is awesome even if wrong on a bunch of things, and how stanovich is totally awesome as well

Overanalyzing shock

I once shocked a friend to the depth of his soul by saying “Maybe you’re right, let’s check.”

He and I were going to pick up someone from the airport. As I was leaving the house he said “I bet the plane is going to be late”. I replied “Maybe you are right, let’s check the estimated arrival time”. What happened next stunned me.

He was in total shock at my answer, unable to grasp what was happening. Kinda like in a movie when something comes out of left field and the hero is actually the villain, and for a moment you are lost and disconnected, trying to reorient yourself.

I was confused and stunned in response to his reaction, and it took me a year but I think I now have the pieces in place to understand it.

What follows is an analysis of the event, me trying to make sense of that experience.



Communities of discourse

I have in the past lightly circled the issue of communities of discourse. So let us dive in: “[a] discourse community is a group of people who share a set of discourses, understood as basic values and assumptions, and ways of communicating about those goals. Linguist John Swales defined discourse communities as “groups that have goals or purposes, and use communication to achieve these goals.”

Some examples of a discourse community might be those who read and/or contribute to a particular academic journal, or members of an email list for Madonna fans. Each discourse community has its own unwritten rules about what can be said and how it can be said: for instance, the journal will not accept an article with the claim that “Discourse is the coolest concept”; on the other hand, members of the email list may or may not appreciate a Freudian analysis of Madonna’s latest single. Most people move within and between different discourse communities every day.

Since the discourse community itself is intangible, it is easier to imagine discourse communities in terms of the fora in which they operate. The hypothetical journal and email list can each be seen as an example of a forum, or a “concrete, local manifestation of the operation of the discourse community”.


A discourse community:

  1. has a broadly agreed set of common public goals.
  2. has mechanisms of intercommunication among its members.
  3. uses its participatory mechanisms primarily to provide information and feedback.
  4. utilizes and hence possesses one or more genres in the communicative furtherance of its aims.
  5. in addition to owning genres, it has acquired some specific lexis.
  6. has a threshold level of members with a suitable degree of relevant content and discoursal expertise.”


Further, “[a]ll language is the language of community, be this a community bound by biological ties, or by the practice of a common discipline or technique. The terms used, their meaning, their definition, can only be understood in the context of the habits, ways of thought, methods, external circumstances, and tradition known to the users of those terms. A deviation from usage requires justification …”

That is, a community of discourse is a community operating through shared frames of sensemaking.



In the data/frame theory (which I discussed before) it is posited that everything that is made sense of is made sense of from a particular point of view. Thus, the sentence “I bet the plane is going to be late” means different things depending on the frame from which it is being analysed. And if communities operate through different frames, it means different things depending on the community in which it is being uttered.

The event I mentioned happened when I was in my home country. (Not the USA.) There, “I bet that X” is not an expression of an empirical statement, but something else. Something that you do so that if X does indeed come to pass you can go “I told you so” and keep face.
The answer I gave entailed the LessWrong/San Francisco/etc. understanding of the statement: an empirical prediction to be argued over or bet on by epistemic agents, with the shared goal of improving models.



The OODA loop is a model of a decision-making cycle developed by US Air Force Colonel John Boyd based on observing jet fighter battles.

OODA stands for Observation, Orientation, Decision, Action. A pilot is constantly going through these loops or cycles in a dogfight: he tries to observe the enemy as best he can, this observation being somewhat fluid, since nothing is standing still and all of this is happening at great speed. With a lightning-quick observation, he then must orient to this movement of the enemy: what it means, what the enemy’s intentions are, how it fits into the overall battle. This is the critical part of the cycle. Based on this orientation, he makes a decision as to how to respond, and then takes the appropriate action.

[Figure: diagram of the OODA loop]

The OODA model provides an explanation of what happened. He was oriented based on being in a certain community of discourse. This led to a decision and action. The action carried a prediction about what would happen (deviation from which is taken as feedback on whether the initial orientation was appropriate). The fact that I answered from within another community of discourse forced him to reorient aggressively, which explains why he was in shock, not speaking or moving.

This caused a reset of his own OODA loop, making him go back to square one, back to observing.

(Coincidentally, part of what the OODA loop illuminates is how to use your opponent’s mental models against them: once they have acted on a model predicting a result, give them a different result so that they have to reset the whole loop. Do this multiple times and they will start lagging, mistakes will accumulate, and then you go in for the kill. This was not what I was attempting to do to my friend.)

Still, what does it mean to reset the loop? To have to go back to observation?



Here is a further model detailing what is going on. I like how smoothly it integrates with the other ones, like all of them are circling the same things from really different perspectives and traditions.

[Figure: diagram of surprise as model error]

According to this model, surprise is model error: the model predicted one thing and something else happened. In terms of the OODA loop, he oriented using the mental model of us being in one community of discourse, and me being oriented to another one at that time provided him with a reply that his model did not account for in the least.

Hence, utter shock.
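The two models combine into a toy sketch (entirely my own illustration; Boyd's loop is not a formal algorithm): orientation supplies a prediction, surprise is the prediction error, and a large enough error resets the loop to observation.

```python
class OODAAgent:
    """Toy agent: orientation = a predictive model; surprise = model error."""

    def __init__(self, predict, tolerance):
        self.predict = predict        # orientation: situation -> expected response
        self.tolerance = tolerance
        self.stage = "act"

    def step(self, situation, observed):
        error = abs(self.predict(situation) - observed)  # surprise as model error
        if error > self.tolerance:
            self.stage = "observe"    # loop reset: back to square one
        else:
            self.stage = "act"
        return self.stage

# "I bet the plane is late" expects, say, commiseration (response level 1);
# "let's check" lands far outside the model (level 9) and resets the loop.
agent = OODAAgent(predict=lambda s: 1, tolerance=3)
print(agent.step("bet about the plane", observed=9))  # observe
```

The numbers are arbitrary; the point is only the structure: no shock without a prediction, and no loop reset without a prediction badly missed.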



This episode stayed in my mind for a year: I was stunned at his shock and couldn’t make sense of it. I can now, and it took me only one year and serendipitously learning about 4 models (frames, discourse community, surprise as model error, ooda loop).

And this is super interesting and analysing is fun, but also this is insane, people don’t do this – remembering past episodes from one year before and using concepts they’ve learned meanwhile to put together their understanding of what happened then. (Do they?)

Then, why did I?

I don’t know why, but this is really important. (1)


  1. – This sentence usually comes when I’m focusing. I find something, some handle, some piece, and it’s painfully clear it matters, but the history isn’t set yet: it is not yet clear why it matters. This happens very frequently and has thus far been right each time. (Which is not reason for me to believe that will be the case in the future [thanks Hume], but it shifts the probability mass, or however a “Bayesian” would frame it.)




  1. Look into what has been said about discourse and discourse communities
  2. Look at Jaakko Hintikka and Van Benthem on the logic of communities
  3. sensemaking
  4. “ordering of experience”
  5. Talking between and from one community to the other (nvc creates a third language to do this)
  6. See how my linking of frames and discourse matches to weick’s view on organizational sensemaking

On why speaking to Hedgehogs doesn’t come naturally to me

People are sometimes frustrated by their inability to place me in existing communities. Especially if they are intellectual communities. This makes sense. If they can place me then it is much easier to engage, they can model the parts of my thinking that I haven’t made explicit to be a copy of the general thinking of the community hivemind and it will be a good enough approximation. Unfortunately this cannot happen with my thinking.

The fact that it doesn’t happen leads to a second reason for frustration. The realization that I have read all the same arguments, and agreed with all the same premises as they have agreed with, and yet am not acting in the way that they are. I’m not pledging allegiance to the same causes. Why?

Is it that I lack the ability to take ideas seriously? Certainly that is partially the case. My mind drifts from idea to idea without any sort of “reasonability” barrier, and actually believing those ideas (which would imply acting on them) would be problematic.

Moreover, it certainly is not the case that I’m outside the scope of what Robin Hanson describes as Homo Hypocritus. And it certainly is the case that I have a standing problem that blocks me from feeling like a group-member in general. But I do not think these fully explain what is happening.

I believe that my expressing epistemic agreement and then not acting the way others who express agreement act is caused by the fact that I’m thinking from fundamentally orthogonal epistemological presumptions. Being placed in an existing (intellectual) community, and acting in a way that reflects how people who have read the arguments shared within that community act, requires a certain hedgehoginess (1) of thinking that I lack. This hedgehoginess is the ability to believe what you conclude.

I lack this piece.

I will attempt to explain why I believe I am justified in not believing what I conclude. This whole essay serves as an attempt to facilitate others being charitable to my shortcoming, in the same way Paul Graham tried to make managers charitable to makers’ time. If I succeed, I create a bridge across this shortcoming despite the different mental configurations.

In what follows I first present my guess at the problem’s origin. I then present a formal depiction of what I think I lack, and why I’m justified in lacking it. I end with musings about how to overcome this difference.


Problem origin

I believe the frustration arises due to an inference that goes something like this:

  1. If someone takes the ideas A ∧ B ∧ C seriously, then they will do D.
  2. You don’t do D.
  3. Therefore you don’t take the ideas A ∧ B ∧ C seriously. [1,2]

I will attempt to argue that the correct inference is:

  1. If someone takes the ideas A ∧ B ∧ C seriously and they have hedgehog-piece 1, then they will do D.
  2. You don’t do D.
  3. Therefore you don’t take the ideas A ∧ B ∧ C seriously, or you do not have hedgehog-piece 1. [1,2]

And, of course, I claim that the second disjunct is the correct one.
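The difference between the two inferences can be checked mechanically. Here is a small truth-table sketch (names are mine for illustration: S for “takes the ideas seriously”, H for “has hedgehog-piece 1”, D for “does D”) showing that from the premises of the second inference, the disjunctive conclusion is valid while the bare “you don’t take the ideas seriously” conclusion is not:

```python
# Truth-table validity check: a conclusion follows iff it holds
# in every assignment where all the premises hold.
from itertools import product

def valid(premises, conclusion):
    return all(
        conclusion(*m)
        for m in product([True, False], repeat=3)
        if all(p(*m) for p in premises)
    )

implies = lambda a, b: (not a) or b

# Premise 1: (S and H) -> D;  Premise 2: not D.
premises = [lambda s, h, d: implies(s and h, d),
            lambda s, h, d: not d]

# Disjunctive conclusion: not S or not H  -- valid.
print(valid(premises, lambda s, h, d: (not s) or (not h)))  # True

# Bare conclusion: not S  -- invalid (H alone may be what's missing).
print(valid(premises, lambda s, h, d: not s))               # False
```

The second check fails precisely on the assignment where the ideas are taken seriously but hedgehog-piece 1 is absent, which is the case I am claiming for myself.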


Hedgehog-piece 1, formally

I proceed to present (2) a very intuitive principle of reasoning that I believe others possess and that I lack, which explains the differences we get frustrated over. I then present a paradox that follows from accepting the principle. I end with an explanation of why the paradox arises.

The Principle of Closure

The principle of closure is defined as follows:

“Necessarily, if S has justified beliefs in some propositions and comes to believe that q solely on the basis of competently deducing it from those propositions, while retaining justified beliefs in the propositions throughout the deduction, then S has a justified belief that q.”

My hypothesis is that hedgehog-piece 1 is the principle of closure. I also hypothesize that others have not reasoned themselves to it, but that it is a natural piece of their mental configuration the same way it is a naturally missing piece of my mental configuration. (I’m deliberately leaving these terms fuzzy. A case of choosing roughly correct over precisely wrong.)

In the next section I replicate a paradox that I believe undermines the principle of closure.

The Preface paradox

“It is customary for authors of academic books to include in the preface of their books statements such as “any errors that remain are my sole responsibility.” Occasionally they go further and actually claim there are errors in the books, with statements such as “the errors that are found herein are mine alone.”

(1) Such an author has written a book that contains many assertions, and has factually checked each one carefully, submitted it to reviewers for comment, etc. Thus, he has reason to believe that each assertion he has made is true.

(2) However, he knows, having learned from experience, that, despite his best efforts, there are very likely undetected errors in his book. So he also has good reason to believe that there is at least one assertion in his book that is not true.

Thus, he has good reason, from (1), to rationally believe that each statement in his book is true, while at the same time he has good reason, from (2), to rationally believe that the book contains at least one error. Thus he can rationally believe that the book both does and does not contain at least one error.”
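The aggregation at the heart of the paradox is easy to make concrete. In this sketch the per-assertion probability and the number of assertions are my own illustrative assumptions, not figures from the quoted text:

```python
# Preface paradox, numerically: each individual assertion is very
# probably true, yet the conjunction of all of them is probably false.
p_each = 0.999   # assumed probability that any single assertion is true
n_claims = 2000  # assumed number of assertions in the book

p_all_true = p_each ** n_claims          # independence assumed for simplicity
p_at_least_one_error = 1 - p_all_true

print(f"P(any given claim is true) = {p_each}")
print(f"P(no errors in the book)   = {p_all_true:.3f}")   # ~0.135
print(f"P(at least one error)      = {p_at_least_one_error:.3f}")  # ~0.865
```

So the author is justified in each belief taken alone, and also justified in expecting at least one error overall.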

Diagnosing the Preface Paradox

In this section I replicate a diagnosis of why the Preface Paradox holds and what it means for the Principle of Closure.

“Consider a very long sequence of competently performed simple single-premise deductions, where the conclusion of one deduction is the premise of the next. Suppose that I am justified in believing the initial premise (to a very high degree), but have no other evidence about the intermediate or final conclusions. Suppose that I come to believe the conclusion (to a very high degree) solely on the basis of going through the long deduction. I should think it likely that I’ve made a mistake somewhere in my reasoning. So it is epistemically irresponsible for me to believe the conclusion. My belief in the conclusion is unjustified.”

“Diagnosis of the preface paradox: Having a justified belief is compatible with there being a small risk that the belief is false. Having a justified belief is incompatible with there being a large risk that the belief is false. Risk can aggregate over deductive inferences. In particular, risk can aggregate over conjunction introduction.”

“(T)here is a natural diagnosis of what’s going on: A thinker’s rational degree of belief drops ever so slightly with each deductive step. Given enough steps, the thinker’s rational degree of belief drops significantly. To put the point more generally, the core insight is simply this: If deduction is a way of extending belief – as the Williamsonian line of thought suggests – then there is some risk in performing any deduction. This risk can aggregate, too.“
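The same arithmetic applies to the long single-premise chain: if each competently performed step preserves belief with a reliability just short of certainty, rational credence in the final conclusion decays geometrically. A sketch, with the initial credence and per-step reliability as assumed illustrative numbers:

```python
# Rational degree of belief after a chain of n deductive steps,
# each carrying a tiny risk of error.
def credence_after(n_steps, initial=0.99, step_reliability=0.999):
    # Each step multiplies credence by its reliability.
    return initial * step_reliability ** n_steps

for n in (1, 100, 1000, 5000):
    print(f"{n:>5} steps -> credence {credence_after(n):.3f}")
```

One step leaves credence essentially untouched; a few thousand steps drive it toward zero, which is exactly the "risk can aggregate" point.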



The acceptance of the preface paradox as a counter-argument to the principle of closure makes it so that I can say of an argument that concludes A: “Yes, I think that argument is valid and sound; but I don’t believe that A.” Understandably, this frustrates the people arguing for A.

Alas, I didn’t reason myself into this acceptance. It is a very formal description of what has been characteristic of my reasoning for the longest time. And that is as far as I can say without falling into a narrative fallacy. I give this formal description in an attempt to make my reasoning less opaque, and hopefully less frustrating to others.

This hypothesis explains why consilience is my automatic go-to principle for figuring stuff out, and why I’m attracted to many weak arguments over one strong argument. It also explains why I frustrate hedgehogs and vice-versa, which I explore in more detail below. Further, it predicts that posts on this blog will be very hit-or-miss, as each talks to a specific community at a time. So if so far you have had no luck, don’t despair, dear reader!


Why I frustrate hedgehogs and vice-versa

Here is Venkatesh Rao on the upsetting process through which to change Hedgehogs’ and Foxes’ beliefs:

“It is tedious to undermine even though it is lightly held. A strong view requires an opponent to first expertly analyze the entire belief complex and identify its most fundamental elements, and then figure out a falsification that operates within the justification model accepted by the believer. This second point is complex. You cannot undermine a belief except by operating within the justification model the believer uses to interpret it. A strong view can only be undermined by hanging it by its own petard, through local expertise.”

And conversely:

“To get a fox to change his or her mind on the other hand, you have to undermine an individual belief in multiple ways and in multiple places, since chances are, any idea a fox holds is anchored by multiple instances in multiple domains, connected via a web of metaphors, analogies and narratives. To get a fox to change his or her mind in extensive ways, you have to painstakingly undermine every fragmentary belief he or she holds, in multiple domains. There is no core you can attack and undermine. There is not much coherence you can exploit, and few axioms that you can undermine to collapse an entire edifice of beliefs efficiently. Any such collapses you can trigger will tend to be shallow, localized and contained. The fox’s beliefs are strongly held because there is no center, little reliance on foundational beliefs and many anchors. Their thinking is hard to pin down to any one set of axioms, and therefore hard to undermine.”


Interspecies communication

In his depiction of Foxes and Hedgehogs, Rao misses the one dimension that matters to me in the context of this essay: Explicitness. This is because I believe that explicitness is a sine qua non for communication and that reasons matter only insofar as they are communicable.

I divide the challenges of communication by species: the fox faces one challenge, the hedgehog another. Below, I make these challenges explicit.

Challenges to communicable reasons

The challenge for the fox is to learn how to introspect on the reasons it uses to decide, and to communicate those.

This is a challenge because introspection is difficult:

“This study tested the prediction that introspecting about the reasons for one’s preferences would reduce satisfaction with a consumer choice. Subjects evaluated two types of posters and then chose one to take home. Those instructed to think about their reasons chose a different type of poster than control subjects and, when contacted 3 weeks later, were less satisfied with their choice. When people think about reasons, they appear to focus on attributes of the stimulus that are easy to verbalize and seem like plausible reasons but may not be important causes of their initial evaluations. When these attributes imply a new evaluation of the stimulus, people change their attitudes and base their choices on these new attitudes. Over time, however, people’s initial evaluation of the stimulus seems to return, and they come to regret choices based on the new attitudes.” (3)

But there is some evidence that introspection can be trained. (4) (5) Further, the cost of not introspecting is being wrong: sometimes (frequently?) you will just believe the wrong things for the wrong reasons. Making reasons explicit helps overcome this.


The challenge for the hedgehog is to make the fundamental beliefs and justification model accepted explicit, and communicate those.

There is not much I can say about this. Hopefully some hedgehog friend can take up the challenge and report back. I understand this is asking someone to see the unseen, or the background of whatever they are looking at. I understand this is non-trivial.



If I have been successful, this essay will dispel some of the frustration of my epistemic interlocutors. It will have done so by making my thinking explicit, which is what I concluded foxes ought to do in order to improve communication. (Yes, going meta, very LW.)

Hopefully I managed to explain this fundamental cog in how I think and why it might seem that I don’t take ideas seriously, when I do.


  1. Hedgehogs and foxes
  2. This whole section is composed of quotes from Schechter, J. (2013). Rational self-doubt and the failure of closure. Philosophical Studies, 163(2), 429–452; except for the preface paradox, which comes from here
  3. Quoted from here; citation is Wilson, Timothy D., Douglas J. Lisle, Jonathan W. Schooler, Sara D. Hodges, Kristen J. Klaaren, and Suzanne J. LaFleur. “Introspecting about reasons can reduce post-choice satisfaction.” Personality and Social Psychology Bulletin 19 (1993): 331-331
  4. Fox, M. C., Ericsson, K. A., & Best, R. (2011). Do procedures for verbal reporting of thinking have to be reactive? A meta-analysis and recommendations for best reporting methods. Psychological bulletin, 137(2), 316.
  5. Gendlin, E. T. (2012). Focusing-oriented psychotherapy: A manual of the experiential method. Guilford Press.


  • This is why it is difficult for me to talk to hedgehogs and to be convinced by them, and vice-versa (they need to hack away at my beliefs in many places; I need a ton of specialized knowledge).
    • How to fix this?
      • Making things explicit (understanding why you value many perspectives; them being as clear as possible about the local knowledge needed to go for the neck)
  • To what extent does this discussion overlap with the cluster vs sequence thinking?
  • Hedgehog, fox; fragility, robustness and anti-fragility
  • Can I give a strong argument for “many weak arguments” of the form of “If many weak arguments can be generated for one side that cannot be generated from the other, and there is no strong argument either way this provides evidence for this one side” that is acceptable to Hedgehogs?
  • Drill deeper into hedgehoginess/foxiness being a collection of pieces
  • “Why not have two types of reasoning and have them communicate, localizing the wrong pieces and removing them?”
    • Reason works better in a community with each arguing for different sides, not in an individual.
    • Why I believe in epistemic communities over epistemic individuals as the place where reason thrives (Arguments for one side vs the other; not one person painstakingly trying to get better). [Reason is social, specialisation, etc.]
  • http://www.ribbonfarm.com/2011/07/31/on-being-an-illegible-person/