A Non-Comprehensive List of Human Values

On February 10, 2017 / September 16, 2018 | By Caspar | In General | 5 Comments

Human values are said to be complex (cf. Stewart-Williams 2015, section “Morality Is a Mess”; Muehlhauser and Helm 2012, ch. 3, 4, 5.3). As evidence, the following is a non-comprehensive list of things that many people care about:

Abundance, achievement, adventure, affiliation, altruism, apatheia, art, asceticism, austerity, autarky, authority, autonomy, beauty, benevolence, bodily integrity, challenge, collective property, commemoration, communism, community, compassion, competence, competition, competitiveness, complexity, comradery, conscientiousness, consciousness, contentment, cooperation, courage, “crabs in a bucket”, creativity, crime, critical thinking, curiosity, democracy, determination, dignity, diligence, discipline, diversity, duties, education, emotion, envy, equality, equanimity, excellence, excitement, experience, fairness, faithfulness, family, fortitude, frankness, free will, freedom, friendship, frugality, fulfillment, fun, good intentions, greed, happiness, harmony, health, honesty, honor, humility, idealism, idolatry, imagination, improvement, incorruptibility, individuality, industriousness, intelligence, justice, knowledge, law abidance, life, love, loyalty, modesty, monogamy, mutual affection, nature, novelty, obedience, openness, optimism, order, organization, pain, parsimony, peace, peace of mind, pity, play, population size, preference fulfillment, privacy, progress, promises, property, prosperity, punctuality, punishment, purity, racism, rationality, reliability, religion, respect, restraint, rights, sadness, safety, sanctity, security, self-control, self-denial, self-determination, self-expression, self-pity, simplicity, sincerity, social parasitism, society, spirituality, stability, straightforwardness, strength, striving, subordination, suffering, surprise, technology, temperance, thought, tolerance, toughness, truth, tradition, transparency, valor, variety, veracity, wealth, welfare, wisdom.

Note that from the inside, most of these values feel distinct from each other. Some of them have strong overlap, however. For instance, industriousness, diligence and conscientiousness often refer to similar things.

Also, note that most of these do not feel instrumental to each other. For example, people often want to find out the truth even when that truth is not useful for, e.g., reducing suffering or preserving tradition.

Some terms subsume multiple very different or even opposing moral views. For instance, progressives would say it’s fair if wealth is taken from the rich and given to the poor while libertarians would say it is fair if everyone receives wealth in proportion to how the market values their work.

Many of the values can be interpreted both deontologically and consequentialistically. For example, “frugality” could refer to the moral maxim “you shall be frugal” or to “you shall care about others being frugal”.

These values should not be understood as being valued additively. People presumably do not care about the amount of consciousness in the world plus the amount of happiness in the world. Instead they may care about the amount of consciousness times the average happiness of the conscious experiences.

Some (articles with) lists that helped me to compile this list are Keith‑Spiegel’s moral characteristics list, moral foundations theory, Your Dictionary’s Examples of Morals, Eliezer Yudkowsky’s 31 laws of fun, table A1 in Bain et al.’s Collective Futures, the examples in the Wikipedia article on Prussian values, the Moral Code of the Builder of Communism, the ten commandments, section IV, chapter 1 in Nussbaum’s (2000) Women and Human Development, Frankena’s (1973) Ethics, 2nd ed., p. 87f. and Peter Levine’s an alternative to Moral Foundations Theory.

Joyce’s Better Framing of Newcomb’s Problem

On February 2, 2017 / September 3, 2018 | By Caspar | In General | 3 Comments

While I disagree with James M. Joyce on the correct solution to Newcomb’s problem, I agree with him that the standard framing of Newcomb’s problem (from Nozick 1969) can be improved upon. Indeed, I very much prefer the framing he gives in chapter 5.1 of The Foundations of Causal Decision Theory, which (according to Joyce) is originally due to JH Sobel:

Suppose there is a brilliant (and very rich) psychologist who knows you so well that he can predict your choices with a high degree of accuracy. One Monday as you are on the way to the bank he stops you, holds out a thousand dollar bill, and says: “You may take this if you like, but I must warn you that there is a catch. This past Friday I made a prediction about what your decision would be. I deposited $1,000,000 into your bank account on that day if I thought you would refuse my offer, but I deposited nothing if I thought you would accept. The money is already either in the bank or not, and nothing you now do can change the fact. Do you want the extra $1,000?” You have seen the psychologist carry out this experiment on two hundred people, one hundred of whom took the cash and one hundred of whom did not, and he correctly forecast all but one choice. There is no magic in this. He does not, for instance, have a crystal ball that allows him to “foresee” what you choose. All his predictions were made solely on the basis of knowledge of facts about the history of the world up to Friday. He may know that you have a gene that predetermines your choice, or he may base his conclusions on a detailed study of your childhood, your responses to Rorschach tests, or whatever. The main point is that you now have no causal influence over what he did on Friday; his prediction is a fixed part of the fabric of the past. Do you want the money?

I prefer this over the standard framing because people can remember the offer and the balance of their bank account better than box 1 and box 2. For some reason, I also find it easier to explain this thought experiment without referring to the thought experiment itself in the middle of the explanation. So, now whenever I describe Newcomb’s problem, I start with Sobel’s rather than Nozick’s version.

Of course, someone who wants to explore decision theory more deeply also needs to learn about the standard version, if only because people sometimes use “one-boxing” and “two-boxing” (the options in Newcomb’s original problem) to denote the analogous choices in other thought experiments. (Even if there are no boxes in these other thought experiments!) But luckily it does not take more than a few sentences to describe the original Newcomb problem based on Sobel’s version. You only need to explain that Newcomb’s problem replaces your bank account with an opaque box whose content you always keep, and puts the offer into a second, transparent box. The question is then whether you stick with one box or go home with both.

Peter Thiel on Startup Culture

On January 24, 2017 | By Caspar | In General

I recently read Peter Thiel’s Zero to One. All in all, it is an informative read. I found parts of ch. 10 on startup culture particularly interesting. Here’s the section “What’s under Silicon Valley’s Hoodies”:

Unlike people on the East Coast, who all wear the same skinny jeans or pinstripe suits depending on their industry, young people in Mountain View and Palo Alto go to work wearing T-shirts. It’s a cliché that tech workers don’t care about what they wear, but if you look closely at those T-shirts, you’ll see the logos of the wearers’ companies—and tech workers care about those very much. What makes a startup employee instantly distinguishable to outsiders is the branded T-shirt or hoodie that makes him look the same as his co-workers. The startup uniform encapsulates a simple but essential principle: everyone at your company should be different in the same way—a tribe of like-minded people fiercely devoted to the company’s mission.

Max Levchin, my co-founder at PayPal, says that startups should make their early staff as personally similar as possible. Startups have limited resources and small teams. They must work quickly and efficiently in order to survive, and that’s easier to do when everyone shares an understanding of the world. The early PayPal team worked well together because we were all the same kind of nerd. We all loved science fiction: Cryptonomicon was required reading, and we preferred the capitalist Star Wars to the communist Star Trek. Most important, we were all obsessed with creating a digital currency that would be controlled by individuals instead of governments. For the company to work, it didn’t matter what people looked like or which country they came from, but we needed every new hire to be equally obsessed.

In the section “Of cults and consultants” of the same chapter, he goes on:

In the most intense kind of organization, members hang out only with other members. They ignore their families and abandon the outside world. In exchange, they experience strong feelings of belonging, and maybe get access to esoteric “truths” denied to ordinary people. We have a word for such organizations: cults. Cultures of total dedication look crazy from the outside, partly because the most notorious cults were homicidal: Jim Jones and Charles Manson did not make good exits.

But entrepreneurs should take cultures of extreme dedication seriously. Is a lukewarm attitude to one’s work a sign of mental health? Is a merely professional attitude the only sane approach? The extreme opposite of a cult is a consulting firm like Accenture: not only does it lack a distinctive mission of its own, but individual consultants are regularly dropping in and out of companies to which they have no long-term connection whatsoever.

[…]

The best startups might be considered slightly less extreme kinds of cults. The biggest difference is that cults tend to be fanatically wrong about something important. People at a successful startup are fanatically right about something those outside it have missed. You’re not going to learn those kinds of secrets from consultants, and you don’t need to worry if your company doesn’t make sense to conventional professionals. Better to be called a cult—or even a mafia.

Is it a bias or just a preference? An interesting issue in preference idealization

On January 18, 2017 / March 26, 2020 | By Caspar | In General | 2 Comments

When taking others’ preferences into account, we will often want to idealize them rather than taking them too literally. Consider the following example. You hold a glass of transparent liquid in your hand. A woman walks by, says that she is very thirsty and would like to drink from your glass. What she doesn’t know, however, is that the water in the glass is (for some reason not relevant to this example) poisoned. Should you allow her to drink? Most people would say you should not. While she does desire to drink out of the glass, this desire would probably disappear upon gaining knowledge of its content. Therefore, one might say that her object-level preference is to drink from the glass, while her idealized preference would be not to drink from it. There is not much literature on preference idealization, as far as I know, but if you are not already familiar with the topic, consider looking into “Coherent Extrapolated Volition”.

Preference idealization is not always as easy as inferring that someone doesn’t want to drink poison, and in this post, I will discuss a particular sub-problem: accounting for cognitive biases, i.e. systematic mistakes in our thinking, as they pertain to our moral judgments. However, the line between biases and genuine moral judgments is sometimes not clear.

Specifically, we look at cognitive biases that people exhibit in non-moral decisions, where their status as a bias to be corrected is much less controversial, but which can explain certain ethical intuitions. By offering such an error theory of a moral intuition, i.e. an explanation for how people could erroneously come to such a judgment, the intuition is called into question. Defenders of the intuition can respond that even if the bias can be used to explain the genesis of that moral judgment, they would nonetheless stick with that moral intuition. After all, the existence of all our moral positions can be explained by non-moral facts about the world – “explaining is not explaining away”. Consider the following examples.

Omission bias: People judge consequences of inaction as less severe than those of action. This is clearly a bias in some cases, especially non-moral ones. For example, losing $1,000 by not responding to your bank in time is just as bad as losing $1,000 by throwing it out of the window. A business person who judges the two equivalent losses equally will ceteris paribus be more successful. Nonetheless, most people distinguish between act and omission in cases like the fat man trolley problem.

Scope neglect: The scope or size of something often has little or no effect on people’s thinking when it should have. For example, when three groups of people were asked what they would pay for interventions that would affect 2,000, 20,000, or 200,000 birds, people were willing to pay roughly the same amount of money irrespective of the number of birds. While scope neglect seems clearly wrong in this (moral) decision, it is less clearly so in other areas. For example, is a flourishing posthuman civilization with 2 trillion inhabitants really twice as good as one with 1 trillion? It is not clear to me whether answering “no” should be regarded as a judgment clouded by scope neglect (caused, e.g., by our inability to imagine the two civilizations in question) or a moral judgment that is to be accepted.

Contrast effect (also see decoy effect, social comparison bias, Ariely on relativity, mere subtraction paradox, Less-is-better effect): Consider the following market of computer hard drives, from which you are to choose one.

Hard drive model	Model 1	Model 2	Model 3 (decoy)
Price	$80	$120	$130
Capacity	250GB	500GB	360GB

Generally, one wants to expend as little money as possible while maximizing capacity. In the absence of model 3, the decoy, people may be undecided between models 1 and 2. However, when model 3 is introduced into the market, it provides a new reference point. Model 2 is better than model 3 in all regards, which increases its attractiveness to people, even relative to model 1. That is, models 1 and 2 are judged by how they compare with model 3 rather than by their own features. The effect clearly exposes an instance of irrationality: the existence of model 3 doesn’t affect how model 1 compares with model 2. When applied to ethical evaluation, however, it calls into question a firmly held intrinsic moral preference for social equality and fairness. Proponents of fairness seem to assess a person’s situation by comparing it to that of Bill Gates rather than judging each person’s situation separately. Similar to how the overpriced decoy changes our evaluation of the other products, our judgments of a person’s well-being, wealth, status, etc. may be seen as irrationally depending on the well-being, wealth, status, etc. of others.
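The dominance relation behind the decoy can be checked mechanically. The following is a minimal sketch in Python using the prices and capacities from the table above; the helper name `dominates` and the dictionary layout are illustrative choices, not part of the original example.

```python
# (price in USD, capacity in GB), taken from the hard drive table above.
drives = {
    "Model 1": (80, 250),
    "Model 2": (120, 500),
    "Model 3": (130, 360),  # the decoy
}

def dominates(x, y):
    """True if x is no worse than y on price and capacity and strictly better on one."""
    (price_x, cap_x), (price_y, cap_y) = drives[x], drives[y]
    no_worse = price_x <= price_y and cap_x >= cap_y
    strictly_better = price_x < price_y or cap_x > cap_y
    return no_worse and strictly_better

# All ordered pairs (x, y) where x dominates y.
dominated_pairs = [
    (x, y) for x in drives for y in drives if x != y and dominates(x, y)
]
print(dominated_pairs)
```

Only Model 2 dominates Model 3. Neither Model 1 nor Model 2 dominates the other, and that comparison uses no information about Model 3, which is why letting the decoy shift the choice between them counts as irrational.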

Other examples include peak-end rule/extension neglect/evaluation by moments and average utilitarianism; negativity bias and caring more about suffering than about happiness; psychological distance and person-affecting views; status-quo bias and various population ethical views (person-affecting views, the belief that most sentient beings that already exist have lives worth living); moral credential effect; appeal to nature and social Darwinism/normative evolutionary ethics.

Acknowledgment: This work was funded by the Foundational Research Institute (now the Center on Long-Term Risk).

Decision Theory and the Irrelevance of Impossible Outcomes

On January 17, 2017 / May 6, 2025 | By Caspar | In General | 9 Comments

(This post assumes some knowledge of the decision theory of Newcomb-like scenarios.)

One problem in the decision theory of Newcomb-like scenarios (i.e. the study of whether causal, evidential or some other decision theory is true) is that even the seemingly obvious basics are fiercely debated. Newcomb’s problem seems to be fundamental and the solution obvious (to both sides), and yet scholars disagree about its resolution. If we already fail at the basics, how can we ever settle this debate?

In this post, I propose a solution. Specifically, I will introduce a very plausible general principle that decision rules should abide by. One may argue that settling on powerful general rules (like the one I will propose) must be harder than settling single examples (like Newcomb’s problem). However, this is not universally the case. In decision theory in particular, we should expect general principles to be especially convincing, because a common defense of two-boxing in Newcomb’s scenario is that Newcomb’s problem is just a weird edge case in which rationality is punished. By introducing a general principle that CDT (or, perhaps, EDT) violates, we can prove the existence of a general flaw.

Without further ado, the principle is: The decisions we make should not depend on the utilities assigned to outcomes that cannot occur. To me this principle seems obvious and indeed it is consistent with expected value calculations in non-Newcomb-like scenarios: Imagine having to deterministically choose an action from some set A. (We will ignore mixed strategies.) The next state of the world is sampled from a set of states S via a distribution P and depends on the chosen action. We are also given a utility function U, which assigns values to pairs of a state and an action. Let a be an action and let s be a possible state. If P(s,a) = 0 (or P(s|a)=0 or P(s given the causal implications of a)=0 – we assume all of these to be equivalent in this non-Newcomb-like scenario), then it doesn’t matter what U(s,a) is, because in an expected value calculation, U(s,a) will always be multiplied by P(s,a)=0. That is to say, any expected value decision rule makes the same choice regardless of U(s,a). So, expected value decision rules abide by this principle at least in non-Newcomb-like scenarios.
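To make the non-Newcomb-like case concrete, here is a minimal sketch in Python. The action names, states, probabilities and utilities are all made up for illustration; the only point is that changing U(s,a) for a pair with P(s|a)=0 leaves both the expected value and the chosen action untouched.

```python
def expected_value(action, probs, utils):
    """Sum of P(s|a) * U(s, a) over all states s."""
    return sum(p * utils[(s, action)] for s, p in probs[action].items())

def best_action(probs, utils):
    """Action with the highest expected value."""
    return max(probs, key=lambda a: expected_value(a, probs, utils))

# Hypothetical toy problem: two actions, three states.
probs = {
    "a1": {"s1": 0.5, "s2": 0.5, "s3": 0.0},  # s3 cannot occur under a1
    "a2": {"s1": 0.2, "s2": 0.3, "s3": 0.5},
}
utils = {
    ("s1", "a1"): 10, ("s2", "a1"): 0, ("s3", "a1"): -1000,
    ("s1", "a2"): 4, ("s2", "a2"): 4, ("s3", "a2"): 4,
}

ev_before = expected_value("a1", probs, utils)
choice_before = best_action(probs, utils)

# Change only the utility of the impossible outcome (s3 under a1).
utils_changed = dict(utils)
utils_changed[("s3", "a1")] = 10**9

ev_after = expected_value("a1", probs, utils_changed)
choice_after = best_action(probs, utils_changed)

print(ev_before, ev_after, choice_before, choice_after)
```

Both expected values of a1 are identical and the chosen action stays a1, however extreme the utility assigned to the impossible outcome.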

Let us now apply the principle to a Newcomb-like scenario, specifically to the prisoner’s dilemma played against an exact copy of yourself. Your actions are C and D. Your opponent is the “environment” and can also choose between C (cooperation) and D (defection). So, the possible outcomes are (C,C), (C,D), (D,C) and (D,D). The probabilities P(C,D) and P(D,C) are both 0. Applied to this Newcomb-like scenario, the principle of the irrelevance of impossible outcomes states that our decision should only depend on the utilities of (C,C) and (D,D). Evidential decision theory behaves in accordance with this principle. (I leave it as an exercise to the reader to verify this.) Indeed, I suspect that it can be shown that EDT generally abides by the principle of the irrelevance of impossible outcomes. The choice of causal decision theory on the other hand does depend on the utilities of the impossible outcomes U(D,C) and U(C,D). Remember that in the prisoner’s dilemma the payoffs are such that U(D,x)>U(C,x) for any action x of the opponent, i.e. no matter the opponent’s choice it is always better to defect. This dominance is given as the justification for CDT’s decision to defect. But let us say we increase U(C,D) such that U(C,D)>U(D,D) and decrease U(D,C) such that U(D,C)<U(C,C). Of course, we must make these changes for the utility functions of both players so as to retain symmetry. After these changes, the dominance relationship is reversed: U(C,x)>U(D,x) for any action x. Of course, the new payoff matrix is not that of a prisoner’s dilemma anymore – the game is different in important ways. But when played against a copy, these differences do not seem significant, because we only changed the utilities of outcomes that were impossible to achieve anyway. Nevertheless, CDT would switch from D to C upon being presented with these changes, thus violating the principle of the irrelevance of impossible outcomes.
This is a systematic flaw in CDT: Its decisions depend on the utility of outcomes that it can already know to be impossible.
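The argument can be sketched in a few lines of Python. The payoff numbers below are illustrative assumptions (any payoffs with the stated orderings would do), and the CDT agent is modeled, as a simplification, as holding a fixed credence `p_opp_c` that the opponent cooperates regardless of its own action. Because of dominance, the value of that credence does not matter.

```python
ACTIONS = ("C", "D")

def edt_choice(u):
    # Against an exact copy, P(opponent plays a | I play a) = 1,
    # so EDT only compares the diagonal outcomes (C,C) and (D,D).
    return max(ACTIONS, key=lambda a: u[(a, a)])

def cdt_choice(u, p_opp_c=0.5):
    # CDT holds the opponent's action causally fixed; p_opp_c is an
    # assumed unconditional credence that the opponent cooperates.
    def ev(a):
        return p_opp_c * u[(a, "C")] + (1 - p_opp_c) * u[(a, "D")]
    return max(ACTIONS, key=ev)

# Keys are (my action, opponent's action). Hypothetical payoffs.
prisoners_dilemma = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# Change only the impossible outcomes: U(C,D) > U(D,D) and U(D,C) < U(C,C).
modified = dict(prisoners_dilemma)
modified[("C", "D")] = 2
modified[("D", "C")] = 2

for name, u in (("original", prisoners_dilemma), ("modified", modified)):
    print(name, "EDT:", edt_choice(u), "CDT:", cdt_choice(u))
```

In this sketch EDT cooperates in both games, while CDT defects in the original game and cooperates in the modified one, even though only the utilities of (C,D) and (D,C) were changed.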

The principle of the irrelevance of impossible outcomes can be used beyond arguing against CDT. As you may remember from my post on updatelessness, sensible decision theories will precommit to give Omega the money in the counterfactual mugging thought experiment. (If you don’t remember or haven’t read that post in the first place, this is a good time to catch up, because the following thoughts are based on the ideas from the post.) Even EDT, which ignores the utility of impossible outcomes, would self-modify in this way. However, the decision theory resulting from such self-modification violates the principle of the irrelevance of impossible outcomes. Remember that in counterfactual mugging, you give in because this was a good idea to precommit to when you didn’t yet know how the coin came up. However, once you know that the coin came up the unfavorable way, the positive outcome, which gave you the motivation to precommit, has become impossible. Of course, you only give in to counterfactual mugging if the reward in this now impossible branch is sufficiently high. For example, there is no reason to precommit to give in if you lose money in both branches. This means that once you have become updateless, you violate the principle of the irrelevance of impossible outcomes: your decision in counterfactual mugging depends on the utility you assign to an outcome that cannot happen anymore.

Acknowledgment: This work was funded by the Foundational Research Institute (now the Center on Long-Term Risk).

Omoto and Snyder (1995) on motivations to volunteer

On January 6, 2017 | By Caspar | In General

Omoto and Snyder (1995) is only a single study on volunteerism with a sample of 116 AIDS volunteers, but the results are quite interesting nonetheless. In Snyder, Omoto and Crain (1999), the authors summarize:

Motivations [..] foreshadow the length of time that volunteers stay active (Omoto and Snyder, 1995). In one longitudinal study, volunteers who were more motivated […] when they began their work were more likely to still be active 2.5 years later. Interestingly, relatively self-focused motivations (i.e., personal development, understanding, esteem enhancement) were more predictive of volunteers’ duration of service than those that were more other-focused (i.e., values and beliefs, community concern). That is, volunteers remained active to the extent that they more strongly endorsed relatively self-focused motivations for their work. Other-focused motives, even though they may provide considerable impetus for people to become volunteers, may not sustain volunteers faced with the tough realities and personal costs of volunteering.

The study also has other interesting results. For instance, “90% of respondents expected to continue volunteering with the agency for at least another year. In actuality, 54% of the volunteers were still active 1 year later, whereas only 16% of them were still active 2.5 years later.” (Omoto and Snyder 1995, p. 677)

I haven’t looked into the literature much more, but this seems to be exactly the kind of research one should turn to if one wanted to design a successful social movement.

Thoughts on Updatelessness

On November 21, 2016 / March 26, 2020 | By Caspar | In General | 11 Comments

[This post assumes knowledge of decision theory, as discussed in Eliezer Yudkowsky’s Timeless Decision Theory.]

One interesting feature of some decision theories that I used to be a bit confused about is “updatelessness”. A thought experiment suitable for explaining the concept is counterfactual mugging: “Omega [a being to be assumed a perfect predictor and absolutely trustworthy] appears and says that it has just tossed a fair coin, and given that the coin came up tails, it decided to ask you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don’t want to give up your $100. But Omega also tells you that if the coin came up heads instead of tails, it’d give you $10000, but only if you’d agree to give it $100 if the coin came up tails.”

There are various alternatives to this experiment, which seem to illustrate a similar concept, although they are not all structurally isomorphic. For example, Gary Drescher discusses Newcomb’s problem with transparent boxes in ch. 6.2 and retribution in ch. 7.3.1 of his book Good and Real. Another relevant example is Parfit’s hitchhiker.

Of course, you win by refusing to pay. To strengthen the intuition that this is the case, imagine that the whole world just consists of one instance of counterfactual mugging and that you already know for certain that the coin came up tails. (We will assume that there is no anthropic uncertainty about whether you are in a simulation used to predict whether you would give in to counterfactual mugging. That is, Omega used some (not necessarily fully reliable) way of figuring out what you’d do. For example, Omega may have created you in a way that implies giving in or not giving in to counterfactual mugging.) Instead of giving money, let’s say thousands of people will be burnt alive if you give in while millions could have been saved if the coin had come up heads. Nothing else will be different as a result of that action. I don’t think there is any dispute over which choice maximizes expected utility for this agent.

The cause of dispute is that agents who give in to counterfactual mugging win in terms of expected value as judged from before learning the result of the coin toss. That is, prior to being told that the coin came up tails, an agent had better be one that gives in to counterfactual mugging. After all, this will give her 0.5*$10,000 – 0.5*$100 in expectation. So, there is a conflict between what the agent would rationally want her future self to choose and what is rational for her future self to do. (Another example of this is the absent-minded driver.) There is nothing particularly confusing about the existence of problems with such inconsistency.
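The conflict between the two perspectives can be spelled out numerically. The following is a minimal sketch in Python that uses the amounts from the thought experiment and assumes, as in the quoted setup, that Omega predicts perfectly; the function names are illustrative.

```python
P_HEADS = 0.5
REWARD = 10_000  # paid on heads, but only to agents who would pay on tails
COST = 100       # handed over on tails by agents who give in

def ex_ante_value(pays_on_tails):
    """Expected value of a policy, judged before the coin toss is revealed."""
    heads_payoff = REWARD if pays_on_tails else 0
    tails_payoff = -COST if pays_on_tails else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff

def value_given_tails(pays_on_tails):
    """Value of the same policy once the agent knows the coin came up tails."""
    return -COST if pays_on_tails else 0

for policy in (True, False):
    print(policy, ex_ante_value(policy), value_given_tails(policy))
```

Judged before the coin toss, paying is worth 4950 in expectation and refusing is worth 0. Judged after learning that the coin came up tails, paying is worth -100 and refusing is worth 0. The ranking of the two policies flips, which is exactly the inconsistency described above.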

Because being an “updateless” agent, i.e. one that makes the choice based on how it would have wanted the choice to be prior to updating, is better for future instances of mugging, sensible decision theories would self-modify into being updateless with regard to all future information they receive. (Note that being updateless doesn’t mean that one doesn’t change one’s behavior based on new information, but that one goes through with the plans that one would have committed oneself to pursue before learning that information.) That is, an agent using a decision theory like (non-naive) evidential decision theory (EDT) would commit to giving in to counterfactual mugging and similar decision problems prior to learning that it ended up in the “losing branch”. However, if the EDT agent already knows that it is in the losing branch of counterfactual mugging and hasn’t thought about updatelessness yet, it wouldn’t give in, although it might (if it is smart enough) self-modify into being updateless in the future.

One immediate consequence of the fact that updateless agents are better off is that one would want to program an AI to be updateless from the start. I guess it is this sense in which people like the researchers of the Machine Intelligence Research Institute consider updatelessness to be correct despite the fact that it doesn’t maximize expected utility in counterfactual mugging.

But maybe updatelessness is not even needed explicitly if the decision theory can take over epistemics. Consider the EDT agent, to whom Omega explains counterfactual mugging. For simplicity’s sake, let us assume that Omega explains counterfactual mugging and only then states which way the coin came up. After the explanation, the EDT agent could precommit, but let’s assume it can’t do so. Now, Omega opens her mouth to tell the EDT agent how the coin came up. Usually, decision theories are not connected to epistemics, so upon Omega uttering the words “the coin came up heads/tails”, Bayesian updating would run its due course. And that’s the problem, since after Bayesian updating the agent will be tempted to reject giving in, which is bad from the point of view of before learning which way the coin came up. To gain good evidence about Omega’s prediction of oneself, EDT may update in a different way to ensure that it would receive the money if the coin came up heads. For example, it could update towards the existence of both branches (which is basically equivalent to the updateless view of continuing to maintain the original position). Of course, self-modifying or just using some decision theory that has updatelessness built in is the much cleaner way to go.

Overall, this suggests a slightly different view of updatelessness. Updatelessness is not necessarily a property of decision theories. It is the natural thing to happen when you apply acausal decision theory to updating based on new information.

Acknowledgment: This work was funded by the Foundational Research Institute (now the Center on Long-Term Risk).

Environmental and Logical Uncertainty: Reported Environmental Probabilities as Expected Environmental Probabilities under Logical Uncertainty

On October 21, 2016 / January 20, 2017 | By Caspar | In General

[Readers should be familiar with the Bayesian view of probability]

Let’s differentiate between environmental and logical uncertainty, and, consequently, environmental and logical probabilities. Environmental probabilities are the ones most of my readers will be closely familiar with. They are about the kinds of things that you can’t figure out even if you have infinite amounts of computing power until you have seen enough evidence.

Logical uncertainty is a different kind of uncertainty. For example, what is the 1,000th digit of the mathematical constant e? You know how e is defined. The definition uniquely implies what the 1,000th digit of e is and yet you’re uncertain as to the value of the 1,000th digit of e. Perhaps, you would assign logical probabilities: the probability that the 1,000th digit of e is 0 is 10%. For more detail, consider the MIRI paper Questions of Reasoning Under Logical Uncertainty by Nate Soares and Benja Fallenstein.
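To illustrate that the digit is fully determined by the definition and that only computation stands between us and the answer, here is a minimal sketch in Python. It uses the series e = 1/0! + 1/1! + 1/2! + … with exact integer arithmetic; the function name and the choice of ten guard digits are assumptions made for this example.

```python
def e_decimal_digits(n, guard=10):
    """Return the first n digits of e after the decimal point (truncated).

    Sums floor(scale / k!) for k = 0, 1, 2, ... using integers only.
    The guard digits absorb the rounding error of the floor divisions.
    """
    scale = 10 ** (n + guard)
    total = 0
    term = scale  # scale / 0!
    k = 0
    while term > 0:
        total += term
        k += 1
        term //= k  # now equals floor(scale / k!)
    digits = str(total // 10 ** guard)
    return digits[1:]  # drop the leading "2"

print(e_decimal_digits(10))
print(e_decimal_digits(1000)[-1])  # the 1,000th digit after the decimal point
```

Before running such a program, assigning a logical probability of 10% to each possible digit seems reasonable. After running it, the uncertainty is gone, although no new evidence about the environment was observed.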

Now, I would like to draw attention to what happens when an agent is forced to quantify its environmental uncertainty, for example, when it needs to perform an expected value calculation. It’s good to think in terms of simplified artificial minds rather than humans, because human minds are so messy. If you think that any proper artificial intelligence would obviously know the values of its environmental probabilities, then think again: Proper ways of updating environmental probabilities based on new evidence (like Solomonoff induction) tend to be incomputable. So, an AI usually can’t quantify what exact values it should assign to certain environmental probabilities. This may remind you of the 1,000th digit of e: In both cases, there is a precise definition for something, but you can’t infer the exact numbers from that definition, because you and the AI are not intelligent enough.

Given that computing the exact probabilities is so difficult, the designers of an AI may fail with abandon and decide to implement some computable mechanism for approximating the probabilities. After all, “probabilities are subjective” anyway… Granted, an AI probably needs an efficient algorithm for quantifying its environmental uncertainty (or it needs to be able to come up with such a mechanism on its own). Sometimes you have to quickly compute the expected utility of a few actions, which requires numeric probabilities. However, any ambitious artificial intelligence should also keep in mind that there is a different, more accurate way of assigning these probabilities. Otherwise, it will forever and always be stuck with the programmers’ approximation.

The most elegant approach is to view the approximation of the correct environmental probabilities as a special case of logical induction (i.e. reasoning under logical uncertainty), possibly without even designing an algorithm for this specific task. On this view, we have logical meta-probability distributions over the correct environmental probabilities. Consider, for example, the probability P(T|E) that we assign to some physical theory T given our evidence E. There is some objectively correct subjective probability P(T|E) (assuming, for example, Solomonoff’s prior probability distribution), but the AI can’t calculate its exact value. It can, however, use logical induction to assign probabilities to statements about the value of P(T|E), and probability densities to statements like P(T|E)=0.368. These probabilities may be called logical meta-probabilities – they are logical probabilities about the correct environmental probabilities. With these meta-probabilities all our uncertainty is quantified again, which means we can perform expected value calculations.

Let’s say we have to decide whether to take action a. We know that if we take action a, one of the outcomes A, B and C will happen. The expected value of a is therefore

E[a] = P(A|a)*u(A) + P(B|a)*u(B) + P(C|a)*u(C),

where u(A), u(B) and u(C) denote the utilities an agent assigns to outcomes A, B and C, respectively. To find out the expected value of a given our lack of logical omniscience, we now calculate the “expected expected value”, where the outer expectation operator is a logical one:

E[E[a]] = E[P(A|a)]*u(A) + E[P(B|a)]*u(B) + E[P(C|a)]*u(C).

The expected values E[P(A|a)], E[P(B|a)] and E[P(C|a)] are the expected environmental probabilities of the outcomes given a and can be computed using integrals. (In practice, these will have to be subject to approximation again. You can’t apply logical induction if you want to avoid an infinite regress.) These expected probabilities are the answers an agent/AI would give to questions like, “What probability do you assign to A happening?”
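
The “expected expected value” can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the utilities, the three candidate distributions and their meta-probabilities are made-up numbers, and a discrete set of candidates stands in for the integrals over probability densities mentioned above.

```python
# Hypothetical utilities of the three outcomes.
utility = {"A": 10.0, "B": 0.0, "C": -5.0}

# Logical meta-distribution: each entry pairs a meta-probability with one
# candidate for the "correct" environmental distribution P(.|a).
candidates = [
    (0.5, {"A": 0.2, "B": 0.5, "C": 0.3}),
    (0.3, {"A": 0.4, "B": 0.4, "C": 0.2}),
    (0.2, {"A": 0.1, "B": 0.6, "C": 0.3}),
]

def expected_probabilities(candidates):
    """E[P(X|a)] for each outcome X, averaging over the meta-distribution."""
    outcomes = candidates[0][1].keys()
    return {o: sum(w * dist[o] for w, dist in candidates) for o in outcomes}

def expected_value(dist, utility):
    """Ordinary expected value of the action under one environmental distribution."""
    return sum(dist[o] * utility[o] for o in dist)

def expected_expected_value(candidates, utility):
    """Outer (logical) expectation over the inner (environmental) expected values."""
    return sum(w * expected_value(dist, utility) for w, dist in candidates)

reported = expected_probabilities(candidates)
value = expected_expected_value(candidates, utility)
```

Because expectation is linear, `expected_expected_value(candidates, utility)` equals `expected_value(reported, utility)`. This is why the expected probabilities in `reported` are all an agent needs to report when asked what probability it assigns to an outcome.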

This view of reported environmental probabilities makes sense of a couple of intuitions that we have about environmental probabilities:

We don’t know which probabilities we assign to a given statement, even if we are convinced of Bayes’ theorem and a certain prior probability distribution.
We can update our environmental probability assignments without gathering new evidence. We can simply reconsider old evidence and compute a more accurate approximation of the proper Bayesian updating mechanism (e.g. via logical induction).
We can argue about probabilities with others and update. For example, people can bring new explanations of the data to our attention. This is not Bayesian evidence (ignoring that the source of such arguments may reveal its biases and beliefs through such argumentation). After all, we could have come up with these explanations ourselves. But these explanations can shift our logical meta-probabilities. (Formally: what external agents tell you can probably be viewed as (part of) a deductive process, see MIRI’s newest paper on logical induction.)

Introducing logical uncertainty into assigning environmental probabilities doesn’t solve the problem of assigning appropriate environmental probabilities. MIRI has described a logical induction algorithm, but it’s inefficient.

The Age of Em – summary of policy-relevant information

On August 30, 2016 / November 3, 2017, by Caspar, in General

In this post I summarize the main (potentially) policy-relevant points that Robin Hanson makes in The Age of Em.

If you don’t have time to read the whole post, read only the section The three most important take-aways. My friend and colleague Ruairí also recommends skipping directly to the section on conflict and compromise if you already know the basics of Hanson’s em scenario.

You may check whether Hanson really makes the statements I ascribe to him by looking them up in the book. The page numbers all refer to the print edition, where the main text ends on page 384.

Mini-Review

Many parts of the book are not overly interesting and not very policy-relevant. For example, Hanson dedicates a lot of space to discussing how em cities will have to be cooled. Some things are very interesting because they are weird. For example, faster ems will have smaller bodies (though some, if not most, ems will have no bodies at all). And some things could be policy-relevant. Also, I learned a lot of interesting stuff along the way, e.g., what kind of hand gestures successful and unsuccessful people make, or that employees are apparently happier and more productive if they just try to satisfy their bosses instead of trying to do good work. Hanson makes extensive use of valuable references to support such claims. In addition to the intrinsic importance of the content, the book serves as a great example of futurology without groundless speculation.

There is also a nice review by Slate Star Codex, which also gives an overview of some of the more basic ideas (section III). I find the whole Age of Em scenario a lot less weird than Scott does and also disagree with the “science fiction” criticism in section VI. Section V rips apart (successfully, in my opinion) the arguments that Hanson gives (in the book) to support the assumption that whole brain emulation will arrive before de novo AI.

What’s an em, anyway?

Hanson’s book argues that soon, human brains can be scanned and then run in a way that preserves their functionality. These scans are called mind uploads, whole brain emulations or ems. Given the advantages that these digital versions have over meat-humans (such as the possibility of speed up, copiability, etc.), these ems would quickly come to dominate the economy. Ems are similar to humans in many regards, but the fundamental differences of being digital have a variety of interesting consequences for an em-dominated world. And this is what Hanson’s book is about.

If you are not familiar with these ideas at all, consider for example Hanson’s TEDx talk or section III in the Slate Star Codex review.

The three most important take-aways

The elites of our world will dominate the em world. So, focusing on certain elites today is more important for the em scenario. Also, our memes should be tailored more to elites than what would be the case in a scenario without ems.
The transition to an em world could cause major upheavals in moral values. It’s conceivable that in some em scenarios, the world could end up much closer to my values (panpsychic, welfarist, more willing to see some lives as not worth living, etc.) than in non-em scenarios. However, ems could also be largely egoistic and not care about philosophy much.
AI safety will probably be easier to solve for ems, i.e. ems are more likely to create de novo AI that is aligned with their values.

Competition and Malthusian wages

Without substantial regulation, the em world will be a lot more competitive (see p. 156ff.).

“The main way that em labor markets differ from labor markets today is that ems can be easily copied. Copying causes many large changes to em labor markets. For example, with copying there can be sufficient competition for a particular pre-skill type given demand from many competing employers, and supply from at least two competing ems of that type. For these two ems, all we need is that when faced with a take it or leave it wage offer, they each accept a wage of twice the full hardware cost [if they want to work at most half of their time].” (p. 144)

So even if for each job there are just two ems who are willing to do the job at very low wages, wages would fall to near-subsistence level almost immediately.
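
The arithmetic behind the quoted “twice the full hardware cost” figure can be spelled out as follows. This is a rough sketch: the function name, the cost unit and the assumption that an em must cover its hardware cost for all hours while earning only during working hours are my reading of the quote, not Hanson’s own formalization.

```python
def break_even_wage(hardware_cost_per_hour, work_fraction):
    """Lowest hourly wage at which an em can still pay for its own hardware.

    hardware_cost_per_hour: cost of running the em for one hour (any unit).
    work_fraction: share of its running time the em is willing to work (0 < f <= 1).
    """
    if not 0 < work_fraction <= 1:
        raise ValueError("work_fraction must be in (0, 1]")
    # Hardware must be paid for every hour, but wages are only earned while working.
    return hardware_cost_per_hour / work_fraction

# An em that works at most half of its time needs twice the hardware cost per working hour.
half_time_wage = break_even_wage(1.0, 0.5)
```

With two competing copies-on-demand suppliers, neither can hold out for much more than this break-even wage, which is what drives wages toward subsistence.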

Who is in power in the em world?

In the em scenario, an elite-focus for our movement is more important than it already is. The elites (in terms of intelligence, productivity, wealth etc.) of our world will completely dominate (by number!) the em world. Therefore, influencing them is strategically much more important for influencing the em world. The elites within the em world will also be more important, e.g. because the em world may be less democratic or have a more rigid class and power hierarchy. This also suggests that it may be a little more important than we thought that the memes of our movement should make sense to elites.

Who becomes an em?

Which humans will be chosen to become ems and be copied potentially billions of times?

Young people (that is, people who are young when the ems are created) are probably more important, because living in the em world will require many new skills that young people are more likely to be able to acquire. (p. 149)
Because ems can be copied, there is not really a need to have many different ems. One can basically just take the 1000 most able humans (or the most talented human in every relevant area) and produce many copies of them (see pp.161). Therefore, the em world will be completely dominated by the elites of the human world.
The first people who become ems will tend to be rich or supported by large companies or other financiers, because scanning will be expensive in the beginning. Also, the chance of success will be fairly low in the first years of whole brain emulation, so classic egoists may have inhibitions against uploading. (On the other hand, they may want to dominate the em world, or want to be scanned while they still have a chance to gain a foothold in the em world.) The very first ems may thus be disproportionately crazy/desperate, altruists who want to influence the em era, terminally ill, and maybe cryonics customers who are legally dead (see p. 148). Because first movers have an advantage (p. 150), it seems especially promising for altruists to try to get scanned in the early days, when chances of success are at rates like 20% (and the original human is destroyed in the process of scanning), which would discourage others from daring the step into the em world. Having some altruistic elite members is therefore more important for an altruistic movement in this scenario than having many not so committed or not sufficiently talented members.
“It is possible that the first ems will come predominantly from particular nations and cultures. If so, typical em values may tend to be close to the values of whatever nations provided most of the ordinary humans whose brains were scanned for these first ems.” (p. 322) This suggests that not only personal eliteness but also being a national of an elite country will become important. This is similar to space travel (and maybe other frontiers), e.g. NASA employs only US citizens. Off the cuff, the most important countries in this regard are then probably the US, China, Switzerland (because of the Blue Brain project), some EU countries (because of high GDP and recent ESA success) and Japan.
- “The first em cities might plausibly form around big computer data centers, such as those built today by Google, Amazon, and Microsoft. Such centers likely have ample and cheap supporting resources such as energy, are relatively safe from storms and social disruptions, and are also close to initial em customers, suppliers, and collaborators in the richest parts of the industrial economy. These centers prefer access to cheap cold water and air for cooling, such as found toward Earth’s poles, and prefer to be in a nation that is either relatively free from regulations or that is small and controlled by friendly parties. These criteria suggest that the first em city arises in a low-regulation Nordic nation such as Norway.” (p. 360) Of course, such low-regulation countries in which em cities are built could nonetheless have little influence on the policies and values of the em world itself, e.g. an em city in Norway may consist of brains that were scanned in the USA.

More stability in an em world

Overall, the class hierarchy of the em era will probably be more rigid than in the human era.

“Ems with very different speeds or sizes might fit awkwardly into the same space, be that physical or virtual. Fast ems whizzing past could be disorienting to slower ems, and large ems may block the movement or view of small ems.” (p. 110) Some kind of segregation seems convenient: either areas of an em city have a certain standard speed or ems of the wrong speed class will be filtered out from what ems can see. “So there may be views that hide lower status ems, and only show higher status ems. This could be similar to how today servants such as waiters often try to seem invisible, and are often treated by those they serve as if invisible. The more possible views that are commonly used, the harder it will be for typical ems to know how things look from others’ typical points of view.” (p.111, also see p. 218)
There will probably be a few distinct speeds at which ems run as opposed to all kinds of em speeds being common because ems at the same speed can communicate well, whereas ems at speeds differing by a factor of 1.5 or more will probably have problems. (See pp. 222, 326)
Many of the ways in which more regulation is possible make it possible to prevent upheavals and oppress non-conformist positions.
Since ems can be copied after training, few ems will be in training. Instead, most ems will be at their peak productivity age, which for humans is, according to Hanson, usually between the ages 40 and 50—but could be much higher for ems given that their brains don’t deteriorate. (p.202ff.) So, ems may be somewhat older (in terms of subjective age) than humans. (See Wikipedia: List of countries by median age.)
Many aspects of aging can be stopped in ems. Therefore, ems may be able to work productively longer and hold on to their power much longer (p. 128f.). This means there will be fewer generation changes (per unit of subjective time). Since people tend to change their ways less often when they are old, the overall moral and political views of the em worlds might also be a lot more stable (judged by subjective em-time).
Em societies may be non-democratic (p. 259).
“Political violence, regime instability, and policy instability all seem to be negatively correlated with economic growth.” (p. 262) The stable em cities may come to dominate.
“As ems have near subsistence (although hardly miserable) income levels, and as wealth levels seem to cause cultural changes, we should expect em culture values to be more like those of poor nations today. As Eastern cultures grow faster today, and as they may be more common in denser areas, em values may be more likely to be like those of Eastern nations today. Together, these suggest that em culture […] values […] authority.” (p. 322f.)
“[Because] ems [will probably be] more farmer-like, they tend to envy less, and to more accept authority and hierarchy, including hereditary elites and ranking by gender, age, and class. They are more comfortable with […] material inequalities, and push less for sharing and redistribution. They are less bothered by violence and domination toward the historical targets of such conflicts […]. […] Leaders lead less by the appearance of consensus, and do less to give the appearance that everyone has an equal voice and is free to speak their minds. Fewer topics are open for discussion or negotiation. Farmer-like ems […] enforce more conformity and social rules, and care more for cleanliness and order.” (p. 327f.)

Conflict and compromise

It’s unclear whether cooperation and compromise will be easier to achieve in an em world and whether there would be more or less risk of conflict and AI arms races.

CEV-like approaches to value-loading might be easier to implement (see the section on AI safety).
Because ems can travel more quickly, it will probably be easier for them to communicate more often with ems in other parts of the world (pp.75-77).
“Groups of ems meeting in virtual reality might find use for a social ‘undo’ feature, allowing them to, for example, erase unwelcome social gaffes. At least they could do this if they periodically archived copies of their minds and meeting setting, and limited the signals they sent to others outside their group. When the undo feature is invoked, it specifies a particular past archived moment to be revived. Some group members might be allowed a limited memory of the undo, such as by writing a short message to their new selves. When the undo feature is triggered, all group members are then erased (or retired) and replaced by copies from that past archive moment, each of whom receives the short message composed by its erased version.” (p. 104) I am not sure such a feature would be used in diplomacy, because being able to undo and retry makes signals of cooperativeness and honesty less credible. Of course, this could be addressed with the limited memory of the undo. If such a feature were used in diplomacy, it could make interaction across cultural differences smoother.
“As the em era allows selection to more strongly emphasize the clans who are most successful at gaining power, we should expect positions of power in the em world to be dominated even more by people with habits and features conducive to gaining power.” (p.175) Such people tend to be more suspicious of potential work rivals (p. 176) and often refer to us-them concepts (p. 177). This should increase risks of conflict.
Fewer restrictions on international trade and immigration are economically more efficient (p. 179). To the extent that different em cities are indeed competing strongly, we would expect such behaviors from em governments as well. Fewer restrictions in these regards might decrease differences between cultures.
For various reasons ems may overall be more rational, which increases the probability that they will be able to avoid “stupid” scenarios like escalating arms races. E.g. they could implement combinatorial auctions (see p. 184ff.; humans can probably do so as well, though), get more trustworthy advice from their own copies (pp. 180, 315ff.), lie less (p. 205), and be better prepared for tasks, since you only have to prepare one em and can then copy that em as often as you wish (p. 208ff.).
Because shipping physical goods across the globe will take ages in fast ems’ subjective time (cargo ships probably can’t be sped up nearly as much as the thinking of ems, so cargo ships will seem extremely slow to fast ems) trade of such physical goods between em cities may hardly happen at all (p. 225).
Most ems will be at the peak productivity age, i.e. 40-50 or above (p. 202ff.). 50-year-olds tend to be less supportive of war than younger people (p. 250). Again see Wikipedia: List of countries by median age.
Poorer nations wage war more often and most ems will be poor (p. 250).
Having no children may make people more belligerent (p. 250).
The gender imbalance (more males than females) may increase the probability of war (p. 250).
If male ems are “castrated” (or, rather, something analogous to it) because of the gender imbalances and the obsoleteness of sexual reproduction, they will tend to be less aggressive and more sensitive, sympathetic and social. (p. 285)
Similar to family clans, the importance of copy clans may lead to less trust, fairness, rule of law and willingness to move or marry those from different cultures (p. 253).
Ems can have on call advisors, which can answer questions all the time. (pp. 315ff.) This could make diplomacy smoother, because the advisors are more likely to assume a long-term perspective (i.e. a far view), which, e.g., could make diplomats less driven by emotions like impatience, fear, anger etc.
“As ems have near subsistence (although hardly miserable) income levels, and as wealth levels seem to cause cultural changes, we should expect em culture values to be more like those of poor nations today. As Eastern cultures grow faster today, and as they may be more common in denser areas, em values might be similar to those of Eastern nations today. Together, these suggest that em culture […] values […] good and evil and local job protection.” (p. 322f.) This could increase the probability of conflicts.
There is a possibility of conflict between ems that come from our era and ems that grew up in the em era. “[T]he latter ems are likely to be better adapted to the em world, but the former will have locked in many first mover advantages to gain enviable social positions.” (p. 324) Similarly, there could be a conflict between humans and ems. (p. 324f., 361) In both cases, the newcomers may be very different due to competitiveness and thus could have a strong motivation to change the status quo.
“A larger total em population should [..] lead us to expect more cultural fragmentation. After all, if local groups differentiate their cultures to help members signal local loyalties, then the more people that are included within a region, the more total cultural variation we might expect to lie within that region. So a city containing billions or more ems could contain a great many diverse local cultural elements.” (p. 326) This suggests a higher probability of at least smaller conflicts.
“Poorer ems seem likely to return to conservative (farmer) cultural values, relative to liberal (forager) cultural values. […] Today, liberals tend to be more open-minded […]. If, relative to us, ems prefer farmer-like values to forager-like values, then ems more value things such as […] patriotism and less value […] tolerance […]. […] They are more comfortable with war […]. They are less bothered by violence and domination toward the historical targets of such conflicts, including foreigners […]. […] Conservative jobs today tend to focus on a fear of bad things, and protecting against them.” (p. 327f.)
“Today, ‘fast-moving’ action movies and games often feature a few key actors taking many actions with major consequences, but with very little time for thoughtful consideration of those actions. However, for ems this scenario mainly makes sense for rare isolated characters or for those whose minds are maximally fast. Other characters usually speed up their minds temporarily to think carefully about important actions.” (p. 332) In this way, even action movies could set norms for thoughtfulness, whereas nowadays they propagate a “shoot first, ask questions later” mentality.
As described in the section on AI safety, ems may have a much better understanding of decision theory, which makes compromise and the avoidance of defection in prisoner’s dilemma-like scenarios much easier.
“[M]ost ems might [..] be found in a few very large cities. Most ems might live in a handful of huge dense cities, or perhaps even just one gigantic city. If this happened, nations and cities would merge; there would be only a few huge nations that mattered.” (p. 216) This would make coordination a lot easier.

Values

It is highly unclear to me what values ems will adopt.

Ems have no reasons to farm animals for food or use animals for testing of drugs. Cognitive dissonance theory suggests that this will make the majority care about animals more than they do today.
“As a cat brain has about 1% as many neurons as a human brain, virtual cat characters are an affordable if non-trivial expense. Most pet brains also require the equivalent of a small fraction of a human brain to emulate. The ability to pause a pet while not interacting with it would make pets even cheaper. Thus emulated animals tend to be cheap unless one wants many of them, very complex ones, or to have them run for long times while one isn’t attending to them. Birds might fly far above, animals creep in the distance, or crowds mill about over there, but one could not often afford to interact with many complex creatures who have long complex histories between your interactions with them.” (p. 105)
As opposed to most humans, em copies will mostly be created on demand. I.e. if you are an em, you apply to jobs (or employers offer them to you) and for every job that you get, you create a copy that fills this job. (In some unregulated dystopian scenarios it is also possible, of course, that ems can’t veto having a copy made of themselves.) This means that the question of “will this specific life be worth living?” will be more common among ems (indeed, more forced upon ems) than humans, who usually don’t know what the lives of their children will be like. Ems will also feel more responsible for having made the decision to live their current lives, so unless they decided to make a copy for ethical reasons, they are much less likely to be anti-natalist. After all, they decided themselves to be copied (see p. 120). Also, there is strong selection pressure favoring ems who consider, say, a life without much leisure to be still positive (see p. 123). Similarly, there are selection pressures towards ems wanting to make many copies of themselves.
There is a strong selection pressure against ems who are not willing to create short-lived (i.e., quickly deleted) copies of themselves. If competition is strong enough (and human nature sufficiently flexible), ems will value that at least one of their copies survives, but they likely would not disvalue the death of any single copy much. This could lead to values along the lines of “biodiversity applied to humans”, where copy clans count as the morally relevant entities, as opposed to individuals. This would be similar to how many people care about preserving certain species instead of the welfare of individuals. This would not only be bad for em welfare but also move the moral views of ems farther away from mine. On the other hand, hedonic well-being could fill the gap of death as the center for moral concern.
Hanson argues that ems probably won’t suffer much (p. 153, 371), because their virtual reality (and even their own brain) can be controlled so well. Given that experiencing suffering is probably correlated with caring about suffering, this might be bad in the long term.
Assuming that ems can be tweaked, they may be made especially thoughtful, friendly and so on.
Because of higher competition, ems work more (e.g. see pp. 167ff., 207) and are paid less. Therefore, they don’t have the resources for altruistic activities that modern elites have.
People who are more productive tend to be married, intelligent, extroverted, conscientious and non-neurotic. Smarter people are more cooperative, patient, rational and law-abiding. They also tend to favor trading with foreigners more. So, because ems will be selected for productivity, they will tend to have these features as well. (p. 163)
- It is somewhat unclear whether ems will be more or less religious. Apparently religious people are more productive, but they are also less innovative. (p. 276, 311) Hanson expects that religions will be able to adapt to the em world’s weirdnesses (p. 312).
Workaholics tend to be male and males are also more competitive, so the em world may well be dominated by males (p. 167), who are less compassionate and less likely to be vegan or vegetarian.
“While successful ems work hard and accept unpleasant working conditions, they are not much more likely to seriously resent or rail against these conditions than do hard-working billionaires or winners of Oscars or Olympic gold medals today. While such people often work very hard under grueling conditions, they usually accept such conditions as a price for their chance at extreme success.” (p. 169) So perhaps ems won’t take the suffering of less fortunate individuals very seriously.
“[O]lder people tend to associate happiness more with peacefulness, as opposed to excitement.” (p. 205) So, old people may be more focused on avoiding very bad experiences relative to bringing about very pleasurable ones.
Most ems don’t have children (p. 211f.), which could make them more compassionate towards others.
At some point, it may become attractive to scan children to turn them into ems, because they can better adapt to the em world (p. 212). This could give an advantage to ruthless countries and children of psychopathic parents, who are themselves more likely to be psychopathic.
Space will lose some appeal, because it takes ages of subjective time to get there (p. 225).
If male ems are “castrated” (however that would exactly work for ems) because of the gender imbalances and the obsoleteness of sexual reproduction, then they tend to be more sympathetic. (p. 285)
“Ems can travel more cheaply to virtual nature parks, and need have little fear that killing nature will somehow kill them.” (p. 303)
The classic targets of charity—alms, schools and hospitals—will all be a lot less necessary than today (p. 302). This may lead ems to support other kinds of charity.
“New em copies and their teams are typically created in response to new job opportunities. Such teams typically end or retire when these jobs are completed. Thus ems are likely to identify strongly with their particular jobs; their jobs are literally their reason for existing.” (p. 306, also see p. 328) Maybe this implies that ems will be less involved in pursuing ethical causes.
For ems it is obviously much more natural to be anti-substratist.
For ems, it is more natural to consider consciousness as coming in degrees. For example, em minds differ in speed, but there could also be partial minds (p. 341ff.).
“If ems are indeed more farmer-like, […] they are less bothered by violence and domination toward the historical targets of such conflicts, including foreigners, children, slaves, animals, and nature.” (p. 327)
Ems will care more about their copies than humans that have never been copied.

AI safety

Overall, ems seem more likely to get AI safety right. Arguments beyond Hanson’s are given in a talk by Anna Salamon (and Carl Shulman). Consider also a workshop report by Anna Salamon and Luke Muehlhauser on the topic.

Because ems tend to have many copies, decision and game theoretical ideas that are relevant for AI safety will be more common and practically tested in em society.
- There is the possibility of mind theft, i.e. that someone steals a copy of an em to interrogate it. (p.60f.) So, ems may pre-commit against giving in to anything like torture to disincentivize mind theft (p. 63).
- There may be “open source” ems (p. 61), which are free for everyone to copy. These must have pre-commitments against any kind of coercion to enforce a policy of only working for those which grant them a certain standard of living.
- “An em might be fooled […] by misleading information about its copy history. If many copies were made of an em and then only a few selected according to some criteria, then knowing about such selection criteria is valuable information to those selected ems. For example, imagine that someone created 10,000 copies of an em, exposed each copy to different arguments in favor of committing some act of sabotage, and then allowed only the most persuaded copy to continue. This strategy might in effect persuade this em to commit the sabotage. However, if the em knew this fact about its copy history, that could convince this remaining copy to greatly reduce its willingness to commit to sabotage.” (p. 112, also see pp. 60, 120) Such weird processes could make ems a lot better at anthropic reasoning.
- It will be easy to put ems into simulations to test their behavior in certain situations (p. 115ff.). So, Newcomb-like problems are very practical problems in the em world. Ems also often interact with copies of themselves, which could sometimes be similar to a corresponding variant of the prisoner’s dilemma.
The possibility of mind theft (or in general the fact that ems live in the digital world) leads ems to increase spending on computer security (p. 61f.), which makes both AI control (e.g. via provably secure operating systems) and AI boxing easier. AI boxing is also made easier by fast ems being able to “directly monitor and react to an AI at a much higher time resolution.” (p. 369)
Whole brain emulation makes CEV-like approaches to AI safety easier.
- “Mild mindreading might be used to allow ems to better intuit and share their reaction to a particular topic or person. For example, a group of ems might all try to think at the same time about a particular person, say ‘George.’ Then their brain states in the region of their minds associated with this thought might be weakly driven toward the average state of this group. In this way this group might come to intuitively feel how the group feels on average about George.” (p. 55)
- Hanson believes that there may be “methods to usefully merge two em minds that had once split from a common ancestor, with the combined mind requiring not much more space and processing power than did each original mind, yet retaining most of the skills and memories of both originals.” (p. 358)
Messier AI designs may be feasible in an em world. Such designs might be less controllable, e.g. because the goal is less explicit.
- On p. 50, Hanson writes that small-scale cognitive enhancements may be possible for ems. Some of them may allow ems to have much better memory, which allows them to work on less modular AI designs.
- One can save a copy of a programmer who wrote a piece of software to later let them rewrite that piece of software. (p. 278) This avoids some of the typical problems of legacy systems and could lower quality standards.
- Once serial computer speed hits a wall, em software needs to be very parallel to not appear sluggish to fast (highly parallelized) ems (p. 279). So, ems will become much better at writing parallel computer programs. This may lead to more messy approaches to AI (e.g., society of mind, many subagencies etc.), which are more difficult to control. However, there could probably be very systematic approaches to parallel computing as well. In that case, the parallel computing trend would not make a big difference.
- Programmers can have many copies and run at very, very high speeds and then finish huge pieces of software on their own (p. 280f.). This could allow them to get away with idiosyncratic systems that would otherwise be impossible to implement for a human team. Again, this could lead to more messy approaches to AI.
De novo AIs could still be partly ems, for example they could be created by replacing subsystems of ems by more efficient programs while keeping the motivation system intact.
Because ems live longer (potentially forever), they have less motivation to create AI quickly.

Appendix A: Why regulation might be easier to enforce in an em world

Hanson largely assumes a scenario with low regulation, which makes sense—if only to be able to make predictions at all. However, there are also many reasons to believe that much stricter regulation could be enforced in the em world:

Mind reading might be possible to some extent (pp. 55ff.).
One can test an em’s loyalty by putting it into a simulation (pp. 115ff.).
Virtual reality makes surveillance much easier (pp. 124ff., 273).
If the death of a single copy is considered to be no great harm (pp. 134ff.), you could quite easily shut off all copies of a criminal (except, maybe, one which you retire at very low speed).
The em world is probably dominated by a few hundred “copy clans”. This should make coordination a lot easier.
Most ems will probably live in just a few cities (p. 214ff.). This makes coordination easier.
Crimes can be discouraged more decisively by holding copy clans legally liable for the behaviors of members (p. 229).
Em firms will be larger than today’s firms (p. 231).
“[T]here is a possibility that ems may create stable totalitarian regimes that govern em nations.” (p. 259, also see pp. 264ff.)
“Archived copies of minds from just before a key event could be used to infer the intent and state of knowledge of ems at that key event.” (p. 271) This makes adjudication easier.
It seems plausible that after the first em is created, the technology will be in the hands of one country or coalition for a while. (On the other hand, the USA caught up within months after Sputnik.) This will make it easy to set up an em world in a way that conforms with the agenda of this coalition. Assuming that the creation of ems will yield a transition as significant as Hanson makes it out to be, the first coalition might already have a decisive strategic advantage. Of course, this just means that there will be an arms race towards whole-brain emulation instead of one towards de novo AI, but the former wouldn’t have many of the negative consequences of the latter, because ems can’t be uncontrolled.

Appendix B: Robin Hanson’s moral values

What’s interesting about the book is that, while the scenario Hanson outlines would be considered dystopian by many, Hanson seems to consider it an acceptable outcome. (Consider his “Evaluation” section (pp. 367ff.).) Some striking examples are the following statements:

“Of course, lives of quiet desperation can still be worth living.” (p. 43)
“[A] disaster so big that civilization is destroyed and can never rise again […] harms not only everyone living in the world at the time, but also everyone who might have lived afterward, until either a similar disaster later, or the end of the universe.” (p. 369, emphasis added)

Lexicographic utility functions

On August 8, 2016 | October 16, 2016 | By Caspar | In General | 1 Comment

Intuitions about there being extreme kinds of suffering that cannot be outweighed by any amount of happiness and that are more important than any amount of mild suffering violate the continuity axiom* of the Von Neumann-Morgenstern (vNM) utility theorem. Does that mean that holding extreme suffering to be impossible to outweigh (as, for example, threshold negative utilitarians do) makes it impossible to represent your preferences with a utility function? Can you not maximize expected utility? Is it irrational to hold such preferences?

It turns out that there is a theorem which says, roughly, that such preferences can still be represented by a utility function; the function just has to take values in a broader space. It does not necessarily map to real numbers, but to some larger set of possible utilities. Specifically, without continuity there is still always a utility function that maps outcomes onto members of a lexicographically ordered real-valued vector space and that accurately represents the given preferences. A very good and (to the mathematically literate) fairly accessible exposition is given by Blume, Brandenburger and Dekel (1989), ch. 1 and 2.
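A minimal sketch of what such a vector-valued utility function could look like, assuming just two levels. The outcome names and numbers are made up for illustration and are not taken from Blume, Brandenburger and Dekel. Utilities are pairs (extreme-suffering level, ordinary welfare), expected utility is taken component-wise, and lotteries are compared lexicographically, which is how Python compares tuples.

```python
from fractions import Fraction

# Hypothetical two-level utilities: (extreme-suffering level, ordinary welfare).
# The first component always dominates the second in a lexicographic ordering.
UTILITY = {
    "extreme_suffering": (-1, 0),
    "mild_suffering": (0, -1),
    "great_happiness": (0, 1000),
}

def expected_utility(lottery):
    """Component-wise expected utility of a lottery {outcome: probability}."""
    dims = len(next(iter(UTILITY.values())))
    return tuple(
        sum(p * UTILITY[outcome][i] for outcome, p in lottery.items())
        for i in range(dims)
    )

def prefers(a, b):
    """True if lottery a is strictly preferred to lottery b."""
    # Python tuples are compared lexicographically, first component first.
    return expected_utility(a) > expected_utility(b)

sure_mild = {"mild_suffering": Fraction(1)}
certain_happiness = {"great_happiness": Fraction(1)}

# A gamble with a tiny chance of extreme suffering, great happiness otherwise.
eps = Fraction(1, 10**12)
gamble = {"extreme_suffering": eps, "great_happiness": 1 - eps}

print(prefers(certain_happiness, sure_mild))  # True
print(prefers(sure_mild, gamble))             # True
```

Great happiness is preferred to mild suffering, which is preferred to extreme suffering, yet no mixture of great happiness and extreme suffering is exactly as good as mild suffering: any positive probability of extreme suffering makes the gamble worse. That is the violation of continuity, and the preferences are still represented by an expected-utility maximizer over vectors.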

Complications arise when the space of possible outcomes (the things that utility is assigned to) is infinite, which would allow for an infinite number of thresholds, i.e. an infinite hierarchy of outcomes, each of which is infinitely better than the lower ones. This can no longer be captured with a finite-dimensional, lexicographically ordered, real-valued vector space. However, in this case, one can map lotteries into an infinite-dimensional space with a lexicographic ordering. Alternatively, one can add an axiom which limits the number of “levels” to a finite number n, and then an n-dimensional real-valued vector space suffices again. The latter is done by Fishburn (1971) in A Study of Lexicographic Expected Utility, which is pay-walled and not as readable.

It would be good if those interested in suffering-focused ethics knew that continuity in the vNM axioms is not really an argument against thresholds. (In general, continuity seems less compelling than completeness and transitivity.) Saying that holding extreme suffering to be impossible to outweigh is irrational because it violates the vNM “rationality axioms” is an objection that I would expect to be raised, and it would be good if proponents of such a view could easily refer to some place for a clarification without spending too much time on this red herring. Personally, I don’t think I’d defend this view myself, but, moral anti-realism notwithstanding, “what can be destroyed by the truth, should be”, even in the field of ethics.

*E.g., if N and M are mild amounts of happiness/suffering such that M

Edit: Simon Knutsson made me aware of some discussion of the continuity axiom in the philosophical literature:

Wolf, C. (1997): Person-Affecting Utilitarianism and Population Policy or, Sissy Jupe’s Theory of Social Choice. In J. Heller and N. Fotion (Eds.), Contingent Future Persons.
Arrhenius, G., & Rabinowicz, W. (2005): Value and Unacceptable Risk. Economics and Philosophy, 21(2), 177–197.
Danielsson, S. (2004): Temkin, Archimedes and the transitivity of ‘Better’. Patterns of Value: Essays on Formal Axiology and Value Analysis, 2, 175–179.
Klint Jensen, K. (2012): Unacceptable risks and the continuity axiom. Economics and Philosophy, 28(1), 31–42.
Temkin, L. (2001): Worries about continuity, transitivity, expected utility theory, and practical reasoning. In D. Egonsson, J. Josefsson, B. Petersson, & T. Rønnow-Rasmusen (Eds.), Exploring Practical Philosophy (pp. 95–108).
Note 7 in Hájek, A. (2012): Pascal’s Wager. In: Edward N. Zalta (ed.), The Stanford Encyclopedia of Philosophy (Winter 2012 Edition).

