Talk on Multiverse-wide Cooperation via Correlated Decision-Making

In the past few months, I thought a lot about the implications of non-causal decision theory. In addition to writing up my thoughts in a long paper that we plan to publish on the FRI website soon, I also prepared a presentation, which I delivered to some researchers at FHI and my colleagues at FRI/EAF. Below you can find a recording of the talk.

The slides are available here.

Given the original target audiences, the talk assumes prior knowledge of a few topics.

Anthropic uncertainty in the Evidential Blackmail

I’m currently writing a piece on anthropic uncertainty in Newcomb problems. The idea is that whenever someone simulates us to predict our actions, this leads us to have anthropic uncertainty about whether we’re in this simulation or not. (If we knew whether we were in the real world or in the simulation, then the simulation wouldn’t fulfill its purpose anymore.) This kind of reasoning changes quite a lot about the answers that decision theories give in predictive dilemmas. It makes their reasoning “more updateless”, since they reason from a more impartial stance: one from which they don’t yet know their exact position in the thought experiment.

This topic isn’t new, but it hasn’t been discussed in-depth before. As far as I am aware, it has been brought up on LessWrong by gRR and in two blog posts by Stuart Armstrong. Outside LessWrong, there is a post by Scott Aaronson, and one by Andrew Critch. The idea is also mentioned in passing by Neal (2006, p. 13). Are there any other sources and discussions of it that I have overlooked?

In this post, I examine what the assumption that predictions or simulations lead to anthropic uncertainty implies for the Evidential Blackmail (also XOR Blackmail), a problem which is often presented as a counter-example to evidential decision theory (EDT) (Cf. Soares & Fallenstein, 2015, p. 5; Soares & Levinstein, 2017, pp. 3–4). A similar problem has been introduced as “Yankees vs. Red Sox” by Arntzenius (2008), and discussed by Ahmed and Price (2012). I would be very grateful for any kind of feedback on my post.

We could formalize the blackmailer’s procedure in the Evidential Blackmail roughly as follows:

def blackmailer():
    # Simulate the agent's policy under the assumption that it receives the letter.
    your_action = your_policy("letter")
    stock = predict_stock()  # either "retain" or "fall"
    if stock == "retain" and your_action == "pay":
        return "letter"
    elif stock == "fall" and your_action == "not pay":
        return "letter"
    else:
        return "no letter"

Let p denote the probability P(retain) with which our stock retains its value a. The blackmailer asks us for an amount of money b, where 0<b<a. The ex ante expected utilities are now:

EU(pay) = P(letter|pay) * (a – b) + P(no letter & retain|pay) * a = p (a – b),

EU(not pay) = P(no letter & retain|not pay) * a = p a.

According to the problem description, P(no letter & retain|pay) is 0, and P(no letter & retain|not pay) is p.1 As long as we don’t know whether a letter has been sent or not (even if it might already be on its way to us), committing to not paying gives us only information about whether the letter has been sent, not about our stock, so we should commit not to pay.
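
To make the ex ante comparison concrete, here is a minimal sketch that enumerates the two possible worlds for each policy. The values of a, b, and p are illustrative placeholders, and the helper blackmailer_sends_letter just restates the blackmailer’s procedure from above:

a, b, p = 100.0, 10.0, 0.3

def blackmailer_sends_letter(stock, policy):
    # The letter goes out iff (retain and you would pay) or (fall and you would not pay).
    return (stock == "retain" and policy == "pay") or (stock == "fall" and policy == "not pay")

def ex_ante_eu(policy):
    eu = 0.0
    for stock, prob in (("retain", p), ("fall", 1 - p)):
        wealth = a if stock == "retain" else 0.0
        if blackmailer_sends_letter(stock, policy) and policy == "pay":
            wealth -= b  # we only pay upon receiving the letter
        eu += prob * wealth
    return eu

print(ex_ante_eu("pay"))      # p * (a - b) = 27.0
print(ex_ante_eu("not pay"))  # p * a       = 30.0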

Now for the situation in which we have already received the letter. (All of the following probabilities will be conditioned on “letter”.) We don’t know whether we’re in the simulation or not. But what we do if we’re in the simulation can actually change our probability that we’re in the simulation in the first place. Note that the blackmailer has to simulate us one time in any case, regardless of whether our stock goes down or not. So if we are in the simulation and we receive the letter, P(retain|pay) is still equal to P(retain|not pay): neither paying nor not paying gives us any evidence about whether our stock retains its value or not, conditional on being in the simulation. But if we are in the simulation, we can influence whether the blackmailer sends us a letter in the real world. In the simulation, our action determines whether the real letter is sent in the case where our stock retains its value, or in the case where it falls.

Let’s begin by calculating EDT’s expected utility of not paying. We will lose all money for certain if we’re in the real world and don’t pay, so we only consider the case where we’re in the simulation:

EU(not pay) = P(sim & retain|not pay) * a.

For both SSA and SIA, if our stock doesn’t go down and we don’t pay up, then we’re certain to be in the simulation: P(sim|retain, not pay) = 1, while we could be either simulated or real if our stock falls: P(sim|fall, not pay) = 1/2. Moreover, P(sim & retain|not pay) = P(sim|retain, not pay) * P(retain|not pay). Under SSA, P(retain|not pay) is just p.2 We hence get

EU_SSA(not pay) = P(sim|retain, not pay) * p * a = p a.

Our expected utility for paying is:

EU_SSA(pay) = P(sim & retain|pay) * (a – b) + P(not sim|pay) * (a – b)

= P(sim|retain, pay) * p * (a – b) + P(not sim|pay) * (a – b).

If we pay up and the stock retains its value, there is exactly one of us in the simulation and one of us in the real world, so P(sim|retain, pay) = 1/2, while we’re sure to be in the simulation for the scenario in which our stock falls: P(sim|fall, pay) = 1. Knowing both P(sim & retain|pay) and P(sim & fall|pay), we can calculate P(not sim|pay) = p/2. This gives us

EU_SSA(pay) = 1/2 * p * (a – b) + 1/2 * p * (a – b) = p (a – b).

Great, EDT + SSA seems to calculate exactly the same payoffs as all other decision theories – namely, that by paying the blackmailer, one just loses the money one pays and gains nothing.
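
As a minimal sketch, plugging the SSA probabilities derived above into the two formulas reproduces these payoffs (a, b, and p are the same illustrative placeholders as before):

a, b, p = 100.0, 10.0, 0.3

# Anthropic probabilities from the text, all conditional on having received the letter:
p_sim_given_retain_pay = 0.5        # one simulated and one real letter-recipient
p_not_sim_given_pay = p / 2
p_sim_given_retain_not_pay = 1.0    # only the simulated copy receives the letter

eu_ssa_pay = p_sim_given_retain_pay * p * (a - b) + p_not_sim_given_pay * (a - b)
eu_ssa_not_pay = p_sim_given_retain_not_pay * p * a

print(eu_ssa_pay)      # p * (a - b) = 27.0
print(eu_ssa_not_pay)  # p * a       = 30.0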

For SIA probabilities, P(retain|letter) depends on whether we pay or don’t pay. If we pay, then there are (in expectation) 2 p observers in the “retain” world, while there are (1 – p) observers in the “fall” world. So our updated P(retain|letter, pay) should be (2 p)/(1 + p). If we don’t pay, it’s p/(2 – p) respectively. Using the above probabilities and Bayes’ theorem, we have P(sim|pay) = 1/(1 + p) and P(sim|not pay) = 1/(2 – p). Hence,

EU_SIA(not pay) = P(sim & retain|not pay) * a = (p a)/(2 – p),

and

EU_SIA(pay) = P(sim|pay) * P(retain|sim, pay) * (a – b) + P(not sim|pay) * (a – b)

= (p (a – b))/(1 + p) + (p (a – b))/(1 + p)

= (2 p (a – b))/(1 + p).

It seems like paying the blackmailer would be better here than not paying, if p and b are sufficiently low.
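
Again as a minimal sketch (same illustrative placeholder values), the SIA probabilities derived above give:

a, b, p = 100.0, 10.0, 0.3

p_sim_given_pay = 1 / (1 + p)
p_retain_given_sim_pay = p
p_sim_and_retain_given_not_pay = p / (2 - p)

eu_sia_pay = p_sim_given_pay * p_retain_given_sim_pay * (a - b) + (1 - p_sim_given_pay) * (a - b)
eu_sia_not_pay = p_sim_and_retain_given_not_pay * a

print(eu_sia_pay)                   # 2 p (a - b) / (1 + p), about 41.5
print(eu_sia_not_pay)               # p a / (2 - p), about 17.6
print(eu_sia_pay > eu_sia_not_pay)  # True: EDT + SIA pays for these values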

Why doesn’t SIA give the ex ante expected utilities, as SSA does? Up until now I have just assumed correlated decision-making, so that the decisions of the simulated us will also be those of the real-world us (and of course the other way around – that’s how the blackmail works in the first place). The simulated copy hence also gets credited with the impact of our real copy. The problem is that SIA thinks we’re more likely to be in worlds with more observers. So the worlds in which we have additional impact due to correlated decision-making get double-counted. In the world where we pay the blackmailer, there are two observers if the stock retains its value (probability p), but only one if it falls (probability 1 – p). If we don’t pay the blackmailer, there is only one observer in the retain-world and two in the fall-world. SIA hence slightly favors paying the blackmailer, since that makes the retain-world more likely.

To remedy the problem of double-counting for EDT + SIA, we could use something along the lines of Stuart Armstrong’s Correlated Decision Principle (CDP). First, we aggregate the “EDT + SIA” expected utilities of all observers. Then, we divide this expected utility by the number of individuals we are deciding for. For EU_CDP(pay), there is with probability 1 an observer in the simulation, and with probability p one in the real world. To get the aggregated expected utility, we thus have to multiply EU(pay) by (1 + p). Since we have decided for two individuals, we divide this EU by 2 and get EU_CDP(pay) = ((2 p (a – b))/(1 + p)) * 1/2 * (1 + p) = p (a – b).

For EU_CDP(not pay), it gets more complex: the number of individuals any observer is making a decision for is actually just 1 – namely, the observer in the simulation. The observer in the real world doesn’t get his expected utility from his own decision, but from influencing the other observer in the simulation. On the other hand, we multiply EU(not pay) by (2 – p), since there is one observer in the simulation with probability 1, and with probability (1 – p) there is another observer in the real world. Putting this together, we get EU_CDP(not pay) = ((p a)/(2 – p)) * (2 – p) = p a. So EDT + SIA + CDP arrives at the same payoffs as EDT + SSA, although it is admittedly a rather messy and informal approach.
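
A minimal sketch of this CDP correction, again with the same illustrative placeholder values:

a, b, p = 100.0, 10.0, 0.3

eu_sia_pay = 2 * p * (a - b) / (1 + p)
eu_sia_not_pay = p * a / (2 - p)

# Aggregate over the expected number of letter-receiving observers,
# then divide by the number of individuals the decision is made for.
eu_cdp_pay = eu_sia_pay * (1 + p) / 2          # two correlated deciders
eu_cdp_not_pay = eu_sia_not_pay * (2 - p) / 1  # only the simulated copy decides

print(eu_cdp_pay)      # p * (a - b) = 27.0, the ex ante value
print(eu_cdp_not_pay)  # p * a       = 30.0, the ex ante value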

I conclude that, when taking into account anthropic uncertainty, EDT doesn’t give in to the Evidential Blackmail. This is true for SSA and possibly also for SIA + CDP. Fortunately, at least for SSA, we have avoided any kind of anthropic funny business. Note that this is not some kind of dirty hack: if we grant the premise that simulations have to involve anthropic uncertainty, then, by definition of the thought experiment (a simulation is necessarily involved in the Evidential Blackmail), EDT doesn’t actually pay the blackmailer. Of course, this still leaves open the question of whether we have anthropic uncertainty in all problems involving simulations, and hence whether my argument applies to all conceivable versions of the problem. Moreover, there are other anthropic problems, such as the one introduced by Conitzer (2015a), in which EDT + SSA is still exploitable (in the absence of a method to “bind itself”).

1 P(no letter & retain|not pay) = P(no letter|retain, not pay) * P(retain|not pay) = 1 * P(retain|not pay) = P(retain|not pay, letter) * P(letter|not pay) + P(retain|not pay, no letter)* P(no letter|not pay) = p.

2 This becomes apparent if we compare the Evidential Blackmail to Sleeping Beauty. SSA is the “halfer position”, which means that after updating on being an observer (receiving the letter), we should still assign the prior probability p, regardless of how many observers there are in either of the two possible worlds.

3 The result that EDT and SIA lead to actions that are not optimal ex ante is also featured in several publications about anthropic problems, e.g., Arntzenius, 2002; Briggs, 2010; Conitzer, 2015b; Schwarz, 2015.


Ahmed, A., & Price, H. (2012). Arntzenius on “Why ain’cha rich?”. Erkenntnis. An International Journal of Analytic Philosophy, 77(1), 15–30.

Arntzenius, F. (2002). Reflections on Sleeping Beauty. Analysis, 62(1), 53–62.

Arntzenius, F. (2008). No Regrets, or: Edith Piaf Revamps Decision Theory. Erkenntnis. An International Journal of Analytic Philosophy, 68(2), 277–297.

Briggs, R. (2010). Putting a value on Beauty. Oxford Studies in Epistemology, 3, 3–34.

Conitzer, V. (2015a). A devastating example for the Halfer Rule. Philosophical Studies, 172(8), 1985–1992.

Conitzer, V. (2015b). Can rational choice guide us to correct de se beliefs? Synthese, 192(12), 4107–4119.

Neal, R. M. (2006, August 23). Puzzles of Anthropic Reasoning Resolved Using Full Non-indexical Conditioning. arXiv [math.ST]. Retrieved from http://arxiv.org/abs/math/0608592

Schwarz, W. (2015). Lost memories and useless coins: revisiting the absentminded driver. Synthese, 192(9), 3011–3036.

Soares, N., & Fallenstein, B. (2015, July 7). Toward Idealized Decision Theory. arXiv [cs.AI]. Retrieved from http://arxiv.org/abs/1507.01986

Soares, N., & Levinstein, B. (2017). Cheating Death in Damascus. Retrieved from https://intelligence.org/files/DeathInDamascus.pdf

The average utilitarian’s solipsism wager

The following prudential argument is relatively common in my circles: We probably live in a simulation, but if we don’t, our actions matter much more. Thus, expected value calculations are dominated by the utility under the assumption that we (or some copies of ours) are in the real world. Consequently, the simulation argument affects our prioritization only slightly — we should still mostly act under the assumption that we are not in a simulation.

A commonly cited analogy is due to Michael Vassar: “If you think you are Napoleon, and [almost] everyone that thinks this way is in a mental institution, you should still act like Napoleon, because if you are, your actions matter a lot.” An everyday application of this kind of argument is the following: Probably, you will not be in an accident today, but if you are, the consequences for your life are enormous. So, you better fasten your seat belt.

Note how these arguments do not affect the probabilities we assign to some event or hypothesis. They are only about the event’s (or hypothesis’) prudential weight — the extent to which we tailor our actions to the case in which the event occurs (or the hypothesis is true).

For total utilitarians (and many other consequentialist value systems), similar arguments apply to most theories postulating a large universe or multiverse. To the extent that it makes a difference for our actions, we should tailor them to the assumption that we live in a large multiverse with many copies of us because under this assumption we can affect the lives of many more beings.

For average utilitarians, the exact opposite applies. Even if they have many copies, they will have an impact on a much smaller fraction of beings if they live in a large universe or multiverse. Thus, they should usually base their actions on the assumption of a small universe, such as a universe in which Earth is the only inhabited planet. This may already have some implications, e.g. via the simulation argument or the Fermi paradox. If they also take the average over time — I do not know whether this is the default for average utilitarianism — they would also base their actions on the assumption that there are just a few past and future agents. So, average utilitarians are subject to a much stronger Doomsday argument.

Maybe the bearing of such prudential arguments is even more powerful, though. There is some chance that metaphysical solipsism is true: the view that only my (or your) own mind exists and that everything else is just an illusion. If solipsism were true, our impact on average welfare (or average preference fulfillment) would be enormous, perhaps 7.5 billion times bigger than it would be under the assumption that Earth exists — about 100 billion times bigger if you also count humans that have lived in the past. Solipsism seems to deserve a probability larger than one in 7.5 (or 100) billion. (In fact, I think solipsism is likely enough for this to qualify as a non-Pascalian argument.) So, perhaps average utilitarians should maximize primarily for their own welfare?
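
As a rough sketch of the underlying arithmetic (the probability of solipsism and the population figure below are illustrative placeholders, not estimates):

p_solipsism = 1e-6      # placeholder credence that only my mind exists
n_beings = 7.5e9        # number of present beings if the world is real
delta = 1.0             # some fixed improvement to my own welfare

# Effect on average welfare under each hypothesis:
impact_if_solipsism = delta / 1         # I am the only being
impact_if_world = delta / n_beings      # the improvement is averaged over everyone

solipsism_term = p_solipsism * impact_if_solipsism
world_term = (1 - p_solipsism) * impact_if_world

# The solipsism term dominates whenever p_solipsism exceeds roughly 1 / n_beings.
print(solipsism_term > world_term)  # True for these placeholder numbers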

Acknowledgement

The idea of this post is partly due to Lukas Gloor.

A Non-Comprehensive List of Human Values

Human values are said to be complex (cf. Stewart-Williams 2015, section “Morality Is a Mess”; Muehlhauser and Helm 2012, ch. 3, 4, 5.3). As evidence, the following is a non-comprehensive list of things that many people care about:

Abundance, art, asceticism, autarky, authority, autonomy, beauty, benevolence, challenge, community, competence, competitiveness, complexity, cooperation, creativity, crime, critical thinking, curiosity, democracy, dignity, diligence, diversity, duties, emotion, equality, excellence, excitement, experience, fairness, faithfulness, family, free will, freedom, friendship, frugality, fulfillment, fun, gender differences, gender equality, happiness, health, honesty, humbleness, idealism, improvement, intelligence, justice, knowledge, law abidance, life, love, loyalty, modesty, nature, novelty, openness, optimism, organization, pain, parsimony, peace, privacy, progress, promises, prosperity, purity, rationality, religion, respect, rights, sadness, safety, sanctity, self-determination, simplicity, sincerity, society, spirituality, stability, striving, suffering, surprise, technology, tolerance, tradition, truth, variety, veracity, welfare.

Note that from the inside, most of these values feel distinct from each other. Also, note that most of these do not feel instrumental to each other. For instance, people often want to find out the truth even when that truth is not useful for, e.g., reducing suffering or preserving tradition.

Some (articles with) lists that helped me to compile this list are Keith‑Spiegel’s moral characteristics list, moral foundations theory, Your Dictionary’s Examples of Morals, Eliezer Yudkowsky’s 31 laws of fun, table A1 in Bain et al.’s Collective Futures and Peter Levine’s an alternative to Moral Foundations Theory.

“Betting on the Past” by Arif Ahmed

[This post assumes knowledge of decision theory, as discussed in Eliezer Yudkowsky’s Timeless Decision Theory and in Arbital’s Introduction to Logical Decision Theory.]

I recently discovered an interesting thought experiment, “Betting on the Past” by Cambridge philosopher Arif Ahmed. It can be found in his book Evidence, Decision and Causality, which is an elaborate defense of Evidential Decision Theory (EDT). I believe that Betting on the Past may be used to money-pump non-EDT agents, refuting Causal Decision Theories (CDT), and potentially even ones that use logical conditioning, such as Timeless Decision Theory (TDT) or Updateless Decision Theory (UDT). At the very least, non-EDT decision theories are unlikely to win this bet. Moreover, no conspicuous perfect predicting powers, genetic influences, or manipulations of decision algorithms are required to make Betting on the Past work, and anyone can replicate the game at home. For these reasons, it might make a more compelling case in favor of EDT than the Coin Flip Creation, a problem I recently proposed in an attempt to defend EDT’s answers in medical Newcomb problems. In Ahmed’s thought experiment, Alice faces the following decision problem:

Betting on the Past: In my pocket (says Bob) I have a slip of paper on which is written a proposition P. You must choose between two bets. Bet 1 is a bet on P at 10:1 for a stake of one dollar. Bet 2 is a bet on P at 1:10 for a stake of ten dollars. So your pay-offs are as in [Figure 1]. Before you choose whether to take Bet 1 or Bet 2 I should tell you what P is. It is the proposition that the past state of the world was such as to cause you now to take Bet 2. [Ahmed 2014, p. 120]

Ahmed goes on to specify that Alice could indicate which bet she’ll take by either raising or lowering her hand. One can find a detailed discussion of the thought experiment’s implications, as well as a formal analysis of CDT’s and EDT’s decisions in Ahmed’s book. In the following, I want to outline a few key points.

Would CDT win in this problem? Alice is betting on a past state of the world. She can’t causally influence the past, and she’s uncertain whether the proposition is true or not. In either case, Bet 1 strictly dominates Bet 2: no matter which state the past is in, Bet 1 always yields a higher utility. For these reasons, causal decision theories would take Bet 1. Nevertheless, as soon as Alice comes to a definite decision, she updates on whether the proposition is true or false. If she’s a causal agent, she then finds out that she has lost: the past state of the world was such as to cause her to take Bet 1, so the proposition is false. If she had taken Bet 2, she would have found out that the proposition was correct, and she would have won, albeit a smaller amount than if she had won with Bet 1.

Betting on the Past seems to qualify as a kind of Newcomb’s paradox; it seems to have an equivalent payoff matrix (Figure 1).

Figure 1: Betting on the Past has a similar payoff matrix to Newcomb’s paradox.

              P is true    P is false
Take Bet 1        10           -1
Take Bet 2         1          -10
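
To make the contrast concrete, here is a minimal sketch comparing CDT’s and EDT’s expected utilities for the two bets. It assumes, as an idealization of the deterministic setup, that Alice’s choice is perfectly correlated with the truth of P:

# Payoffs from Figure 1.
payoff = {("bet 1", True): 10, ("bet 1", False): -1,
          ("bet 2", True): 1,  ("bet 2", False): -10}

def eu_cdt(action, credence_p):
    # CDT holds its credence in P fixed across actions.
    return credence_p * payoff[(action, True)] + (1 - credence_p) * payoff[(action, False)]

def eu_edt(action):
    # EDT conditions on the action: P says the past caused you to take Bet 2.
    p_true = 1.0 if action == "bet 2" else 0.0
    return p_true * payoff[(action, True)] + (1 - p_true) * payoff[(action, False)]

# Bet 1 dominates for any fixed credence, so CDT takes Bet 1 ...
print(eu_cdt("bet 1", 0.5), eu_cdt("bet 2", 0.5))  # 4.5 -4.5
# ... while EDT takes Bet 2 and wins the smaller amount.
print(eu_edt("bet 1"), eu_edt("bet 2"))            # -1.0 1.0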

Furthermore, its causal structure seems to resemble those of, e.g., the Smoking Lesion or Solomon’s problem, suggesting that it is a kind of medical Newcomb problem. In medical Newcomb problems, a “Nature” node determines both the present state of the world (whether the agent is sick/will win the bet) and the agent’s decision (see Figure 2). In this regard, they differ from Newcomb’s original problem, where said node refers to the agent’s decision algorithm.

Figure 2: Betting on the Past (left) has a similar causal structure to medical Newcomb problems (right).

One could object that Betting on the Past is not a medical Newcomb problem, since the outcomes conditional on our actions here are certain, while, e.g., in the Smoking Lesion, observing our actions only shifts our probabilities by degrees. I believe this shouldn’t make a crucial difference. On the one hand, we can conceive of absolutely certain medical Newcomb cases like the Coin Flip Creation. On the other hand, Newcomb’s original problem is often formalized with absolute certainties as well. I’d be surprised if probabilistic vs. certain reasoning made a difference to decision theories. First, we can always approximate certainties to an arbitrarily high degree, and we might then ask ourselves why a negligible further increase in certainty should at some point suddenly change the recommended action. Second, we’re never really certain in the real world anyway, so if the two cases were different, this would render useless all thought experiments that use absolute certainties.

If Betting on the Past is indeed a kind of medical Newcomb problem, this would be an interesting conclusion. It would follow that if one prefers Bet 2, one should also one-box in medical Newcomb problems. And taking Bet 2 seems so obviously correct! I point this out because one-boxing in medical Newcomb problems is what EDT would do, and it is often put forward as both a counterexample to EDT and as the decision problem that separates EDT from Logical Decision Theories (LDT), such as TDT or UDT. (See e.g. Yudkowsky 2010, p.67)

Before we examine the case for EDT further, let’s take a closer look at what LDTs would do in Betting on the Past. As far as I understand, LDTs would take correlations with other decision algorithms into account, but they would ignore “retrocausality” (i.e. smoke in the smoker’s lesion, chew gum in the chewing gum problem, etc.). If there is a purely physical cause, then this causal node isn’t altered in the logical counterfactuals that an LDT agent reasons over. Perhaps if the bet was about the state of the world yesterday, LDT would still take Bet 2. Clearly, LDT’s algorithm already existed yesterday, and it can influence this algorithm’s output; so if it chooses Bet 2, it can change yesterday’s world and make the proposition true. But at some point, this reasoning has to break down. If we choose a more distant point in the past as a reference for Alice’s bet – maybe as far back as the birth of our universe – she’ll eventually be unable to exert any possible influence via logical counterfactuals. At some point, the correlation becomes a purely physical one. All she can do at that point is what opponents of evidential reasoning would call “managing the news” (Lewis, 1981) – she can merely try to go for the action that gives her the best Bayesian update.

So, do Logical Decision Theories get it wrong? I’m not sure about that; they come in different versions, and some haven’t yet been properly formalized, so it’s hard for me to judge. I can very well imagine that e.g. Proof-Based Decision Theory would take Bet 2, since it could prove P to be either true or false, contingent on the action it would take. I would argue, though, that if a decision theory takes Bet 2 – and if I’m right about Betting on the Past being a medical Newcomb problem – then it appears it would also have to “one-box”, i.e. take the option recommended by EDT, in other medical Newcomb problems.

If all of this is true, it might imply that we don’t really need LDT’s logical conditioning and that EDT’s simple Bayesian conditioning on actions could suffice. The only remaining difference between LDT and EDT would then be EDT’s lack of updatelessness. What would an updateless version of EDT look like? Some progress on this front has already been made by Everitt, Leike, and Hutter 2015. Caspar Oesterheld and I hope to be able to say more about it soon ourselves.

A Better Framing of Newcomb’s Problem

While I disagree with James M. Joyce on the correct solution to Newcomb’s problem, I agree with him that the standard framing of Newcomb’s problem (from Nozick 1969) can be improved upon. Indeed, I very much prefer the framing he gives in chapter 5.1 of The Foundations of Causal Decision Theory, which (according to Joyce) is originally due to J. H. Sobel:

Suppose there is a brilliant (and very rich) psychologist who knows you so well that he can predict your choices with a high degree of accuracy. One Monday as you are on the way to the bank he stops you, holds out a thousand dollar bill, and says: “You may take this if you like, but I must warn you that there is a catch. This past Friday I made a prediction about what your decision would be. I deposited $1,000,000 into your bank account on that day if I thought you would refuse my offer, but I deposited nothing if I thought you would accept. The money is already either in the bank or not, and nothing you now do can change the fact. Do you want the extra $1,000?” You have seen the psychologist carry out this experiment on two hundred people, one hundred of whom took the cash and one hundred of whom did not, and he correctly forecast all but one choice. There is no magic in this. He does not, for instance, have a crystal ball that allows him to “foresee” what you choose. All his predictions were made solely on the basis of knowledge of facts about the history of the world up to Friday. He may know that you have a gene that predetermines your choice, or he may base his conclusions on a detailed study of your childhood, your responses to Rorschach tests, or whatever. The main point is that you now have no causal influence over what he did on Friday; his prediction is a fixed part of the fabric of the past. Do you want the money?

I prefer this over the standard framing because people can remember the offer and the balance of their bank account better than box 1 and box 2. For some reason, I also find it easier to explain this thought experiment without referring to the thought experiment itself in the middle of the explanation. So, now whenever I describe Newcomb’s problem, I start with Sobel’s rather than Nozick’s version.

Of course, someone who wants to explore decision theory more deeply also needs to learn about the standard version, if only because people sometimes use “one-boxing” and “two-boxing” (the options in Newcomb’s original problem) to denote the analogous choices in other thought experiments. (Even if there are no boxes in these other thought experiments!) But luckily it does not take more than a few sentences to describe the original Newcomb problem based on Sobel’s version. You only need to explain that Newcomb’s problem replaces your bank account with an opaque box whose content you always keep, and puts the offer into a second, transparent box. And then the question is whether you stick with one box or go home with both.

Peter Thiel on Startup Culture

I recently read Peter Thiel’s Zero to One. All in all, it is an informative read. I found parts of ch. 10 on startup culture particularly interesting. Here’s the section “What’s under Silicon Valley’s Hoodies”:

Unlike people on the East Coast, who all wear the same skinny jeans or pinstripe suits depending on their industry, young people in Mountain View and Palo Alto go to work wearing T-shirts. It’s a cliché that tech workers don’t care about what they wear, but if you look closely at those T-shirts, you’ll see the logos of the wearers’ companies—and tech workers care about those very much. What makes a startup employee instantly distinguishable to outsiders is the branded T-shirt or hoodie that makes him look the same as his co-workers. The startup uniform encapsulates a simple but essential principle: everyone at your company should be different in the same way—a tribe of like-minded people fiercely devoted to the company’s mission.

Max Levchin, my co-founder at PayPal, says that startups should make their early staff as personally similar as possible. Startups have limited resources and small teams. They must work quickly and efficiently in order to survive, and that’s easier to do when everyone shares an understanding of the world. The early PayPal team worked well together because we were all the same kind of nerd. We all loved science fiction: Cryptonomicon was required reading, and we preferred the capitalist Star Wars to the communist Star Trek. Most important, we were all obsessed with creating a digital currency that would be controlled by individuals instead of governments. For the company to work, it didn’t matter what people looked like or which country they came from, but we needed every new hire to be equally obsessed.

In the section “Of cults and consultants” of the same chapter, he goes on:

In the most intense kind of organization, members hang out only with other members. They ignore their families and abandon the outside world. In exchange, they experience strong feelings of belonging, and maybe get access to esoteric “truths” denied to ordinary people. We have a word for such organizations: cults. Cultures of total dedication look crazy from the outside, partly because the most notorious cults were homicidal: Jim Jones and Charles Manson did not make good exits.

But entrepreneurs should take cultures of extreme dedication seriously. Is a lukewarm attitude to one’s work a sign of mental health? Is a merely professional attitude the only sane approach? The extreme opposite of a cult is a consulting firm like Accenture: not only does it lack a distinctive mission of its own, but individual consultants are regularly dropping in and out of companies to which they have no long-term connection whatsoever.

[…]

The best startups might be considered slightly less extreme kinds of cults. The biggest difference is that cults tend to be fanatically wrong about something important. People at a successful startup are fanatically right about something those outside it have missed. You’re not going to learn those kinds of secrets from consultants, and you don’t need to worry if your company doesn’t make sense to conventional professionals. Better to be called a cult—or even a mafia.