Multiverse-wide cooperation via correlated decision making – Summary

This is a short summary of some of the main points from my paper on multiverse-wide superrationality. For details, caveats and justifications, see the full paper.

The target audience for this post consists of:

  • people who have already thought about the topic and thus don’t want to read through the long explanations given in the paper;
  • people who have already read (some of) the full paper and just want to refresh their memory;
  • people who don’t yet know whether they should read the full paper and thus want to know whether the content is interesting or relevant to them.
If you are not in any of these groups, this post may be confusing and not very helpful for understanding the main ideas.

Main idea

  • Take the values of agents who use your decision algorithm into account, to make it more likely that they do the same. I’ll use Hofstadter’s (1983) term superrationality to refer to this kind of cooperation.
  • Whereas acausal trade as it is usually understood seems to require mutual simulation and is thus hard to get right as a human, superrationality is easy to apply for humans (if they know how they can benefit agents that use the same decision algorithm).
  • Superrationality may not be relevant among agents on Earth, e.g. because on Earth we already have causal cooperation and few people use the same decision algorithm as we use. But if we think that we might live in a vast universe or multiverse (as seems to be a common view among physicists, see, e.g., Tegmark (2003)), then there are (potentially infinitely) many agents with whom we could cooperate in the above way.
  • This multiverse-wide superrationality (MSR) suggests that when deciding between policies in our part of the multiverse, we should essentially adopt a new utility function (or, more generally, a new set of preferences) which takes into account the preferences of all agents with our decision algorithm. I will call that our compromise utility function (CUF). Whatever CUF we adopt, the others will (be more likely to) adopt a structurally similar CUF. E.g., if our CUF gives more weight to our values, then the others’ CUF will also give more weight to their values. The gains from trade appear to be highest if everyone adopts the same CUF. If this is the case, multiverse-wide superrationality has strong implications for what decisions we should make.
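
The claim that a shared CUF yields gains from trade can be illustrated with a toy calculation. This is only a sketch under strong assumptions: two agents in symmetric situations who run the same decision algorithm, and made-up payoff numbers. The names ACTIONS, choose and outcome_per_agent are hypothetical and do not come from the paper.

```python
# Hypothetical payoffs: each action yields
# (value for the actor's own goals, value for the other agent's goals).
# The numbers are illustrative only.
ACTIONS = {
    "selfish": (10.0, 0.0),
    "compromise": (8.0, 5.0),
}


def choose(weight_own: float, weight_other: float) -> str:
    """Pick the action that maximizes a weighted sum of both agents' values."""
    return max(
        ACTIONS,
        key=lambda a: weight_own * ACTIONS[a][0] + weight_other * ACTIONS[a][1],
    )


def outcome_per_agent(weight_own: float, weight_other: float) -> float:
    """Value each agent ends up with when BOTH agents apply the same weights.

    Assumption: the agents run the same decision algorithm in symmetric
    situations, so they pick the same action. Each agent then receives the
    'own' payoff of its action plus the 'other' payoff of its counterpart's.
    """
    own, to_other = ACTIONS[choose(weight_own, weight_other)]
    return own + to_other


selfish_total = outcome_per_agent(1.0, 0.0)  # utility function ignores the other agent
shared_total = outcome_per_agent(1.0, 1.0)   # CUF weights both agents equally
print(selfish_total, shared_total)
```

With these numbers, each agent ends up with 10 if both ignore each other and with 13 if both adopt the equally weighted CUF, even though neither agent causally affects the other's choice.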

The superrationality mechanism

  • Superrationality works without reciprocity. For example, imagine there is one agent for every integer and that for every i, agent i can benefit agent i+1 at low cost to herself. If all the agents use the same decision algorithm, then agent i should benefit agent i+1 to make it more likely that agent i-1 also cooperates in the same way. That is, agent i should give something to an agent that cannot in any way return the favor. This means that when cooperating superrationally, you don’t need to identify which agents can help you.
  • What should the new criterion for making decisions, our compromise utility function, look like?
    • Harsanyi’s (1955) aggregation theorem suggests that it should be a weighted sum of the utility functions of all the participating agents.
    • To maximize gains from trade, everyone should adopt the same weights.
    • Variance-voting (Cotton-Barratt 2013; MacAskill 2014, ch. 3) is a promising candidate.
    • If some of the values require coordination (e.g., if one of the agents wants there to be at least one proof of the Riemann hypothesis in the multiverse), then things get more complicated.
  • “Updatelessness” has some implications. E.g., it means that one should, under certain conditions, accept a superrational compromise that is bad for oneself.
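
The weighted-sum form of the compromise utility function can be sketched as follows. This is a simplified reading of variance normalization, not the exact procedure from the cited works: it assumes a finite set of candidate policies with equal measure, rescales each agent's utilities to mean 0 and variance 1 over those policies, and then sums them with equal weights. The agents, policies and utility numbers are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical utilities of three agents over three candidate policies.
# The agents use very different scales, which is why normalization matters.
UTILITIES = {
    "agent_a": {"p1": 0.0, "p2": 1.0, "p3": 2.0},
    "agent_b": {"p1": 300.0, "p2": 100.0, "p3": 200.0},
    "agent_c": {"p1": 5.0, "p2": 5.0, "p3": 6.0},
}


def normalize(utility):
    """Rescale one utility function to mean 0 and variance 1 over the policies."""
    values = list(utility.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # An agent that is indifferent between all policies contributes nothing.
        return {p: 0.0 for p in utility}
    return {p: (u - mu) / sigma for p, u in utility.items()}


def compromise_utility(utilities, weights=None):
    """Weighted sum of the variance-normalized utility functions."""
    weights = weights or {agent: 1.0 for agent in utilities}
    normalized = {agent: normalize(u) for agent, u in utilities.items()}
    policies = next(iter(utilities.values())).keys()
    return {
        p: sum(weights[agent] * normalized[agent][p] for agent in utilities)
        for p in policies
    }


cuf = compromise_utility(UTILITIES)
best_policy = max(cuf, key=cuf.get)

# For comparison: summing the raw, unnormalized utilities.
raw = {p: sum(u[p] for u in UTILITIES.values()) for p in UTILITIES["agent_a"]}
best_raw_policy = max(raw, key=raw.get)
print(best_policy, best_raw_policy)
```

In this toy case the raw sum is dominated by agent_b's large numbers and picks p1, whereas the normalized compromise picks p3. Every agent can run the same computation and arrive at the same weights, which is what the requirement of a shared CUF asks for.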

The values of the other agents

  • To maximize the compromise utility function, it is very useful (though not strictly necessary, see section “Interventions”) to know what other agents with similar decision algorithms care about.
  • The orthogonality thesis (Bostrom 2012) implies that the values of the other agents are probably different from ours, which means that taking them into account makes a difference.
  • Not all aspects of the values of agents with our decision algorithm are relevant:
    • Only the consequentialist parts of their values matter (though something like minimizing the number of rule violations committed by all agents is a perfectly fine consequentialist value system).
    • Only values that apply to our part of the multiverse are relevant. (Some agents may care exclusively or primarily about their part of the multiverse.)
    • Humans, at least, evaluate distant things differently from nearby ones. Because we are far away from most agents with our decision algorithm, we only need to consider what they care about in distant things.
    • Superrationalists may care more about their idealized values, so we may try to idealize their values. However, we should be very careful to idealize only in ways consistent with their meta-preferences. (Otherwise, their values may be mis-idealized.)
  • There are some ways to learn about what other superrational agents care about.
    • The empirical approach: We can survey the relevant aspects of human values. The values of humans who take superrationality seriously are particularly relevant.
      • An example of relevant research is Bain et al.’s (2013) study on what people care about in future societies. They found that people put most weight on how warm, caring and benevolent members of these societies are. If we believe that construal level theory (see Trope and Liberman (2010) for an excellent summary) is roughly correct, then such results should carry over to evaluations of other psychologically distant societies. Although these results have been replicated a few times (Bain et al. 2012; Park et al. 2015; Judge and Wilson 2015; Bain et al. 2016), they are tentative and merely exemplify relevant research in this domain.
      • Another interesting data point is the values of the EA/LW/SSC/rationalist community, to my knowledge the only group of people who plausibly act on superrationality.
    • The theoretical approach: We could think about the processes that affect the distribution of values in the multiverse.
      • Biological evolution
      • Cultural evolution (see, e.g., Henrich 2015)
      • Late great filters
        • For example, if a lot of civilizations self-destruct with weapons of mass destruction, then the compromise utility function may contain a lot more peaceful values than an analysis based on biological and cultural evolution suggests.
      • The transition to whole brain emulations (Hanson 2016)
      • The transition to de novo AI (Bostrom 2014)
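
The late-filter point is a selection effect, which can be made concrete with a small reweighting calculation. The sketch below assumes just two value types and uses made-up prior shares and survival probabilities. PRIOR, SURVIVAL and surviving_shares are hypothetical names and carry no empirical content.

```python
# Hypothetical shares of value types among young civilizations, and
# hypothetical probabilities of surviving a late filter. All numbers are made up.
PRIOR = {"peaceful": 0.3, "aggressive": 0.7}
SURVIVAL = {"peaceful": 0.8, "aggressive": 0.2}


def surviving_shares(prior, survival):
    """Share of each value type among the civilizations that pass the filter."""
    unnormalized = {v: prior[v] * survival[v] for v in prior}
    total = sum(unnormalized.values())
    return {v: w / total for v, w in unnormalized.items()}


posterior = surviving_shares(PRIOR, SURVIVAL)
print(posterior)
```

Under these assumed numbers, peaceful values make up 30% of civilizations before the filter but roughly 63% of those that survive it, so an analysis that ignored the filter would underweight them.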

Interventions

  • There are some general ways in which we can effectively increase our compromise utility function without knowing its exact content.
    • Many meta-activities don’t require any such knowledge as long as we think that it can be acquired in the future. E.g., we could convince other people of MSR, do research on MSR, etc.
    • Sometimes, very small bits of knowledge suffice to identify promising interventions. For example, if we believe that the consequentialist parts of human values are a better approximation of the consequentialist parts of other agents’ values than non-consequentialist human values, then we should make people more consequentialist (without necessarily promoting any particular consequentialist morality).
    • Another relevant point is that no matter how well we know the content of the compromise utility function, the argument in favor of maximizing it in our part of the universe is still just as valid. Thus, even if we know very little about its content, we should still do our best to maximize it. (That said, we will often be better at maximizing the values of humans, in great part because we know and understand these values better.)
  • Meta-activities
    • Further research
    • Promoting multiverse-wide superrationality
  • Ensuring that superintelligent AIs have a decision theory that reasons correctly about superrationality is probably the most important intervention ultimately (although promoting multiverse-wide superrationality among humans can be instrumental for doing so).
  • There are some interventions in the moral advocacy space which align people’s preferences more with those of other superrational agents about our universe.
    • Promoting consequentialism
      • This is also good because consequentialism enables cooperation with the agents in other parts of the multiverse.
    • Promoting pluralism (e.g., convincing utilitarians to also take things other than welfare into account)
    • Promoting concern for benevolence and warmth (or whatever other value is much more strongly represented in high-construal than in low-construal preferences)
    • Facilitating moral progress (i.e., presenting people with the arguments for both sides). Valuing preference idealization is probably more common than disvaluing it.
    • Promoting multiverse-wide preference utilitarianism
  • Promoting causal cooperation
