Summary of Achen and Bartels’ Democracy for Realists

Posted on June 18, 2017 by Caspar in General

I just finished binge-reading Achen and Bartels’ great book Democracy for Realists and decided to write up a summary and a few comments to aid my memory and share some of the most interesting insights.

The folk theory of democracy

(Since chapter 1 contains little of interest besides giving a foretaste of later chapters, I will start with the content of chapter 2.) The “folk theory” of democracy is roughly the following:

Voters have a set of informed policy preferences (e.g., on abortion, social security, climate change, taxes, etc.) and vote for the candidate or party whose policy preferences most resemble their own (similar to how vote advice applications operate). That is, people vote based on the issues. Parties are then assumed to cater to the voters’ preferences to maximize their chance of getting elected. This way the people get what they want (as is guaranteed, under certain theoretical assumptions, by the median voter theorem).
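
To make the median voter theorem a bit more concrete, here is a minimal sketch of my own (it is not from the book). It assumes the textbook setting: a single left-right dimension, an odd number of voters, and each voter preferring whichever platform is closer to their ideal point. The electorate, the 0-to-1 scale and the grid of rival platforms are all made up for illustration.

```python
import random
import statistics

def share_preferring(ideal_points, a, b):
    """Fraction of voters whose ideal point is strictly closer to platform a than to b."""
    closer_to_a = sum(1 for x in ideal_points if abs(x - a) < abs(x - b))
    return closer_to_a / len(ideal_points)

random.seed(0)
# Hypothetical electorate: 1001 voters with ideal points on a 0-to-1 left-right scale.
voters = [random.random() for _ in range(1001)]
median_platform = statistics.median(voters)

# Pit the median voter's ideal point against a grid of rival platforms.
rivals = [i / 100 for i in range(101)]
median_always_wins = all(
    share_preferring(voters, median_platform, r) > 0.5
    for r in rivals
    if r != median_platform
)
print(median_always_wins)
```

Under these assumptions the median platform wins every pairwise contest, which is why vote-seeking parties are expected to converge on it. The chapters summarized below are largely about why the assumptions fail for real voters.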

Achen and Bartels argue that this folk theory of democracy does not describe what is happening in real-world democracies:

Voters are often badly informed: “Michael Delli Carpini and Scott Keeter (1996) surveyed responses to hundreds of specific factual questions in U.S. opinion surveys over the preceding 50 years to provide an authoritative summary of What Americans Know about Politics and Why It Matters. In 1952, Delli Carpini and Keeter found, only 44% of Americans could name at least one branch of government. In 1972, only 22% knew something about Watergate. In 1985, only 59% knew whether their own state’s governor was a Democrat or a Republican. In 1986, only 49% knew which one nation in the world had used nuclear weapons (Delli Carpini and Keeter 1996, 70, 81, 74, 84). Delli Carpini and Keeter (1996, 270) concluded from these and scores of similar findings that ‘large numbers of American citizens are woefully underinformed and that overall levels of knowledge are modest at best.’” (p. 36f.)
- Interestingly, the increasing availability of information has done little to change this. “[I]t is striking how little seems to have changed in the decades since survey research began to shed systematic light on the nature of public opinion. Changes in the structure of the mass media have allowed people with an uncommon taste for public affairs to find an unprecedented quantity and variety of political news; but they have also allowed people with more typical tastes to abandon traditional newspapers and television news for round-the-clock sports, pet tricks, or pornography, producing an increase in the variance of political information levels but no change in the average level of political information (Baum and Kernell 1999; Prior 2007). Similarly, while formal education remains a strong predictor of individuals’ knowledge about politics, substantial increases in American educational attainment have produced little apparent increase in overall levels of political knowledge. When Delli Carpini and Keeter (1996, 17) compared responses to scores of factual questions asked repeatedly in opinion surveys over the past half century, they found that ‘the public’s level of political knowledge is little different today than it was fifty years ago.’” (p. 37)
- This lack of knowledge seems to matter for policy preferences – uninformed voters cannot use heuristics to mimic the choices of informed voters. “[S]ome scholars have […] asked whether uninformed citizens – using whatever ‘information shortcuts’ are available to them – manage to mimic the preferences and choices of better informed people. Alas, statistical analyses of the impact of political information on policy preferences have produced ample evidence of substantial divergences between the preferences of relatively uninformed and better informed citizens (Delli Carpini and Keeter 1996, chap. 6; Althaus 1998). Similarly, when ordinary people are exposed to intensive political education and conversation on specific policy issues, they often change their mind (Luskin, Fishkin, and Jowell 2002; Sturgis 2003). Parallel analyses of voting behavior have likewise found that uninformed citizens cast significantly different votes than those who were better informed. For example, Bartels (1996) estimated that actual vote choices fell about halfway between what they would have been if voters had been fully informed and what they would have been if everyone had picked candidates by flipping coins.” (p. 39f.)
- Wisdom-of-the-crowd-type arguments often don’t apply in politics because the opinions of different people are often biased in the same direction: “Optimism about the competence of democratic electorates has often been bolstered (at least among political scientists) by appeals to what Converse (1990) dubbed the ‘miracle of aggregation’ – an idea formalized by the Marquis de Condorcet more than 200 years ago and forcefully argued with empirical evidence by Benjamin Page and Robert Shapiro (1992). Condorcet demonstrated mathematically that if several jurors make independent judgments of a suspect’s guilt or innocence, a majority are quite likely to judge correctly even if every individual juror is only modestly more likely than chance to reach the correct conclusion.
  
  Applied to electoral politics, Condorcet’s logic suggests that the electorate as a whole may be much wiser than any individual voter. The crucial problem with this mathematically elegant argument is that it does not work very well in practice. Real voters’ errors are quite unlikely to be statistically independent, as Condorcet’s logic requires. When thousands or millions of voters misconstrue the same relevant fact or are swayed by the same vivid campaign ad, no amount of aggregation will produce the requisite miracle; individual voters’ ‘errors’ will not cancel out in the overall election outcome, especially when they are based on constricted flows of information (Page and Shapiro 1992, chaps. 5, 9). If an incumbent government censors or distorts information regarding foreign policy or national security, the resulting errors in citizens’ judgments obviously will not be random. Less obviously, even unintentional errors by politically neutral purveyors of information may significantly distort collective judgment, as when statistical agencies or the news media overstate or understate the strength of the economy in the run-up to an election (Hetherington 1996).” (p.40f.)
Voters don’t have many strong policy preferences.
- Their stated preferences are sensitive to framing effects. Some examples from p. 30f:
  “[E]xpressed political attitudes can be remarkably sensitive to seemingly innocuous variations in question wording or context. For example, 63% to 65% of Americans in the mid-1980s said that the federal government was spending too little on “assistance to the poor”; but only 20% to 25% said that it was spending too little on “welfare” (Rasinski 1989, 391). “Welfare” clearly had deeply negative connotations for many Americans, probably because it stimulated rather different mental images than “assistance to the poor” (Gilens 1999). Would additional federal spending in this domain have reflected the will of the majority, or not? We can suggest no sensible way to answer that question. […] [I]n three separate experiments conducted in the mid-1970s, almost half of Americans said they would “not allow” a communist to give a speech, while only about one-fourth said they would “forbid” him or her from doing so (Schuman and Presser 1981, 277). In the weeks leading up to the 1991 Gulf War, almost two-thirds of Americans were willing to “use military force,” but fewer than half were willing to “engage in combat,” and fewer than 30% were willing to “go to war” (Mueller 1994, 30).”
- Many voters have no opinions on many current issues (p. 31f.).
- People’s policy preferences are remarkably inconsistent over time with correlations of just 0.3 to 0.5 between the stated policy preferences on two occasions that are two years apart.
Many voters don’t know the positions of the competing parties on the issues, which makes it hard for them to vote for a party based on their policy preferences (p. 32).
- Lau and Redlawsk (1997; 2006) “found that about 70% of voters, on average, chose the candidate who best matched their own expressed preferences.” (p. 40)
If one asks people to place their own policy positions and those of the parties on a seven-point issue scale, then issue proximity and vote choice will correlate. But this can be explained by more than one set of causal relationships. Of course, the naive interpretation is that people form a policy opinion and learn about the candidates’ opinions independently. Based on those, they decide which party to vote for. But this model of policy-oriented evaluation is only one possible explanation of the observed correlation between perceived issue proximity and voting behavior. Another is persuasion: Voters already prefer some party, know that party’s policies and then adjust their opinions to better match that party’s opinion. The third is projection: People already know which party to vote for, have some opinions on policy but don’t actually know what the party stands for. They then project their policy positions onto those of the party. (p. 42) Achen and Bartels report on evidence showing that policy-oriented evaluation is only a small contributor to the correlation between perceived issue proximity and vote choices. (p. 42-45)
They argue that, empirically, elected candidates often don’t represent the median voter. (p. 45-49)
To my surprise, they use Arrow’s impossibility theorem to argue against the feasibility of fair preference aggregation (pp. 26ff.). (See here for a nice video introduction.) Somehow, I always had the impression that Arrow’s impossibility theorem wouldn’t make a difference in practice. (As Arrow himself said, “Most [voting] systems are not going to work badly all of the time. All I proved is that all can work badly at times.”)
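
The point about the “miracle of aggregation” quoted above can be illustrated with a small simulation. This is my own sketch, not something from the book, and all numbers in it are invented: each voter is right with probability 0.55, and in the correlated case a shared shock (say, a misleading campaign ad) flips that to 0.45 for everyone in 30% of elections.

```python
import random

def majority_correct_rate(n_voters, p_correct, shared_error_prob, trials, rng):
    """Estimate how often a simple majority picks the correct option.

    With probability shared_error_prob, a common shock misleads the whole
    electorate in that election, so each voter is right only with
    probability 1 - p_correct instead of p_correct.
    """
    wins = 0
    for _ in range(trials):
        misled = rng.random() < shared_error_prob
        p = 1 - p_correct if misled else p_correct
        correct_votes = sum(1 for _ in range(n_voters) if rng.random() < p)
        if correct_votes > n_voters / 2:
            wins += 1
    return wins / trials

rng = random.Random(0)
# Independent errors: Condorcet's setting.
independent = majority_correct_rate(1001, 0.55, 0.0, 500, rng)
# Correlated errors: everyone is occasionally misled by the same thing.
correlated = majority_correct_rate(1001, 0.55, 0.3, 500, rng)
print(independent, correlated)
```

With independent errors the majority is right almost every time, even though each voter is barely better than a coin flip. With the shared shock, the majority is wrong in roughly the fraction of elections in which the shock occurs, and adding more voters does not help.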

A weaker form of the folk theory is that, while voters may not know specific issues well enough to have an opinion, they do have some ideological preference (such as liberalism or conservatism). But this fails for similar reasons:

“Converse […] scrutinized respondents’ answers to open-ended questions about political parties and candidates for evidence that they understood and spontaneously employed the ideological concepts at the core of elite political discourse. He found that about 3% of voters were clearly classifiable as “ideologues,” with another 12% qualifying as “near-ideologues”; the vast majority of voters (and an even larger proportion of nonvoters) seemed to think about parties and candidates in terms of group interests or the “nature of the times,” or in ways that conveyed “no shred of policy significance whatever” (Converse 1964, 217–218; also Campbell et al. 1960, chap. 10).”
Correlations between different policy views are only modest. This itself is not necessarily a bad thing but evidence against ideological voting. (If people fell into distinct ideological groups like liberals, conservatives, etc., one would observe such correlations. E.g., one may expect strong correlations between positions on foreign and domestic policy given that there are such correlations among political parties.) (p. 32f.)
- This appears to conflict to some extent with how Haidt’s moral foundations theory characterizes the differences between liberals and conservatives. According to Haidt, conservatives form a cluster of people who care much more about loyalty, authority and sanctity than liberals. This predicts correlations between positions on topics in these domains, e.g. gay marriage and immigration (assuming that people’s loyalty, authority and sanctity intuitions tend to have similar content). However, it doesn’t seem to predict correlations between views on, say, aid to education and isolationism, which were the type of variables asked about in the study by Converse (1964) that Achen and Bartels refer to.
“Even in France, the presumed home of ideological politics, Converse and Pierce (1986, chap. 4) found that most voters did not understand political ‘left’ and ‘right.’ When citizens do understand the terms, they may still be uncertain or confused about where the parties stand on the left-right dimension (Butler and Stokes 1974, 323–337). Perhaps as a result, their partisan loyalties and issue preferences are often badly misaligned. In a 1968 survey in Italy, for example, 50% of those who identified with the right-wing Monarchist party took left-wing policy positions (Barnes 1971, 170). […] [C]areful recent studies have repeatedly turned up similar findings. For example, Elizabeth Zechmeister (2006, 162) found “striking, systematic differences … both within and across the countries” in the conceptions of “left” and “right” offered by elite private college students in Mexico and Argentina, while André Blais (personal communication) found half of German voters unable to place the party called “Die Linke” – the Left – on a left-right scale.” (p. 34f.)

Direct democracy

Chapter 3 discusses direct democracy. Besides making the point that everyone seems to believe that “more democracy” is a good thing (pp. 52-60, 70), they argue against a direct democracy version of the folk theory. In my view, the evidence presented in chapter 2 of the book (and the previous section of this summary) already provides strong reasons for skepticism and I think the best case against a direct democracy folk theory is based on arguments of this sort. In line with this view, Achen and Bartels re-iterate some of the arguments, e.g. that the average Joe often adopts other people’s policy preferences rather than making up his own mind (p. 73-76).

Most of the qualitatively new evidence presented in this section, on the other hand, seems quite weak to me. Much of it seems to be aimed at showing that direct democracy has yielded bad results. For example, based on the ratings of Arthur Schlesinger Jr., the Wall Street Journal, C-SPAN and Siena College, the introduction of primary elections hasn’t increased the quality of presidents (p. 66). As they concede themselves, the data set is so small and the ratings of presidents so contentious that this evidence is not very strong at all. They also argue that direct democracy sometimes leads to transparently silly decisions, but the evidence seems essentially anecdotal to me.

Another interesting point of the section is that, in addition to potential ideological motives, politicians usually have strategic reasons to support the introduction of “more democratic” procedures:

[T]hroughout American history, debates about desirable democratic procedures have not been carried out in the abstract. They have always been entangled with struggles for substantive political advantage. In 1824, “politicos in all camps recognized” that the traditional congressional caucus system would probably nominate William Crawford; thus, “how people felt about the proper nominating method was correlated very highly indeed with which candidate they supported” (Ranney 1975, 66). In 1832, “America’s second great party reform was accomplished, not because the principle of nomination by delegate conventions won more adherents than the principle of nomination by legislative caucuses, but largely because the dominant factional interests … decided that national conventions would make things easier for them” (Ranney 1975, 69).

Similarly, Ranney (1975, 122) noted that the most influential champion of the direct primary, Robert La Follette, was inspired “to destroy boss rule at its very roots” when the Republican Party bosses of Wisconsin twice passed him over for the gubernatorial nomination. And in the early 1970s, George McGovern helped to engineer the Democratic Party’s new rules for delegate selection as cochair of the party’s McGovern-Fraser Commission, and “praised them repeatedly during his campaign for the 1972 nomination”; but less than a year later he advocated repealing some of the most significant rules changes. Asked why McGovern’s views had changed, “an aide said, ‘We were running for president then’” (Ranney 1975, 73–74).

I expect that this is quite a common phenomenon in deciding which decision process to use. E.g., when an organization decides which decision procedure to use (e.g., who will make the decision, what kind of evidence is accepted as valid), members of the organization might base a decision on these processes less on general principles (e.g., balance, avoidance of cognitive biases and groupthink) than on which decision process will yield the favored results in specific object-level decisions (e.g., who gets a raise, whether my preferred project is funded).

I guess processes that are instantiated for only a single decision are affected even more strongly by this problem. An example is deciding on how to do AI value loading, e.g. which idealization procedures to use.

The Retrospective Theory of Political Accountability

In chapter 4, Achen and Bartels discuss an attractive alternative to the folk theory: retrospective voting. On this view, voters decide not so much based on policy preferences but on how well the candidates or parties have performed in the past. For example, a president under whom the economy improved may be re-elected. This theory is plausible as a descriptive theory for a number of reasons:

There is quite some empirical evidence that retrospective voting describes what voters are doing (ch. 5-7).
Retrospective voting, i.e. evaluating whether the passing term went well, is much easier than policy-based voting, i.e. deciding which candidate’s proposed policies will work better in the future (p. 91f.).

The retrospective theory also has some normative appeal:

It selects for good leaders (p. 98-100).
It incentivizes politicians to do what is best for the voters (p. 100-102).
To some extent it allows politicians to do what is best for the voters even if the voters disagree on what is best (p. 91).

While Achen and Bartels agree that retrospective voting is a large part of the descriptive picture, they also argue that, at least in the way it is implemented by real-world voters, “its implications for democracy are less unambiguously positive than existing literature tends to suggest”:

Proceeding on the theme of the ignorance of the electorate, voters’ evaluation of the past term and the current situation is unreliable (p. 92f.). For example, their perception of environmental threats does not correlate much with that of experts (p. 106), they think crime is increasing when it is in fact stable or decreasing (p. 107) and they cannot assess the state of the economy (p. 107f.).
- Media coverage, partisan bias, popular culture, etc. often shape people’s judgments (p. 107, 138-142).
Voters are unable to differentiate whether bad times are an incumbent’s fault or not (p. 93). Consequently, there is some evidence that incumbents tend to be punished for shark attacks, droughts and floods (ch. 5).
“The theories of retrospective voting we have considered assume that voters base their choices at the polls entirely on assessments of how much the incumbent party has contributed to their own or the nation’s well-being. However, when voters have their own ideas about good policy, sensible or not, they may be tempted to vote for candidates who share those ideas, as in the spatial model of voting discussed in chapter 2. In that case incumbent politicians may face a dilemma: should they implement the policies voters want or the policies that will turn out to contribute to voters’ welfare?” (p. 109, also see pp. 108-111)
- “[E]lected officials facing the issue of fluoridating drinking water in the 1950s and 1960s were significantly less likely to pander to their constituents’ ungrounded fears when longer terms gave them some protection from the “sudden breezes of passion” that Hamilton associated with public opinion.” (p. 110)
The electorate’s decisions are often based only on the most recent events, in particular the economic growth in the past year or so (cf. the peak-end rule). This not only makes their judgments worse than necessary (as they throw information away), it also gives the incumbent the wrong incentives. Indeed, there is some evidence of a “political business cycle”, i.e. politicians attempting to maximize growth, in particular growth of real income, in the last year of their term. (See chapter 6. Additional evidence is given in ch. 7.)
“Another way to examine the effectiveness of retrospective voting is to see what happens after each election. If we take seriously the notion that reelection hinges on economic competence, one implication is that we should expect to see more economic growth when the incumbent party is reelected than when it is dismissed by the voters. In the former case the incumbent party has presumably been retained because its past performance makes it a better than average bet to provide good economic management in the future. In the latter case the new administration is presumably an unknown quantity, a random draw from some underlying distribution of economic competence. A secondary implication of this logic is that future economic performance should be less variable when the incumbent party is retained, since reelected administrations are a truncated subsample of the underlying distribution of economic competence (the worst economic performers having presumably been weeded out at reelection time).” (p. 164) Based on a tiny sample (US presidential elections between 1948 and 2008), this does not seem to be the case. Of course, one could argue that the new administration often is not a random quantity – the parties in US presidential elections are almost always the same and the candidates have often proven themselves in previous political roles. In fact, the challenger may have a longer track record than the incumbent. For example, this may come to be the case in 2020.
Using a subset of the same tiny sample, they show that post-reelection economic growth is not a predictor of popular vote margin (p. 166-168). So, retrospective voting as current voters apply it doesn’t seem to work in selecting competent leaders. That said, and as Achen and Bartels acknowledge themselves (p. 168), the evidence they use is only very tentative.
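
The “truncated subsample” logic in the quote above can be sketched in a few lines. This is a toy model of my own, not Achen and Bartels’ analysis: competence and luck are both standard normal, growth in a term is competence plus luck, and voters reelect exactly when first-term growth is positive. It shows what the idealized retrospective theory predicts, which is the pattern the book fails to find in the actual data.

```python
import random
import statistics

rng = random.Random(0)

def growth(competence):
    """Growth in one term: competence plus luck (hypothetical toy model)."""
    return competence + rng.gauss(0, 1)

reelected_growth = []  # second-term growth of incumbents who were retained
newcomer_growth = []   # first-term growth of the administrations replacing losers

for _ in range(20000):
    competence = rng.gauss(0, 1)
    if growth(competence) > 0:
        # Voters retain the incumbent; competence carries over, luck is redrawn.
        reelected_growth.append(growth(competence))
    else:
        # Voters dismiss the incumbent; the newcomer is a fresh random draw.
        newcomer_growth.append(growth(rng.gauss(0, 1)))

mean_reelected = statistics.mean(reelected_growth)
mean_newcomer = statistics.mean(newcomer_growth)
var_reelected = statistics.pvariance(reelected_growth)
var_newcomer = statistics.pvariance(newcomer_growth)
print(mean_reelected, mean_newcomer, var_reelected, var_newcomer)
```

In this toy world, reelected administrations deliver higher average growth with lower variance than newcomers, because reelection filters on a noisy signal of competence. If the real data show neither effect, then either voters are not filtering on competence or the signal they use is mostly luck.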

Overall, the electorate’s evaluation of a candidate may be some indicator of how well they are going to perform in the future, but it is an imperfect and manipulable one.

Group loyalties and social identities

In addition to retrospective voting, Achen and Bartels tentatively propose that group loyalties and social identities play a big role in politics. Whereas the retrospection theory appears to be relatively well-studied, this new theory is much less worked out so far (pp. 230f.).

It seems clear that vast parts of psychology and social psychology in particular – Achen and Bartels refer to ingroups and outgroups, Asch’s conformity experiments, cognitive dissonance, rationalization, etc. – should be a significant explanatory factor in political science. Indeed, Achen and Bartels start chapter 8 by stating that the relevance of social psychology for politics has been recognized by past generations of researchers (pp. 213-222); it only became unpopular when some theories that it was associated with failed (pp. 222-225).

Achen and Bartels discuss a few ways in which social groups, identities and loyalties influence voting behavior:

While voters’ retrospection focuses on the months leading up to the election, these short-term retrospections translate into the formation of long-term partisan loyalties. So, in a way, partisan loyalties are, in part, the cumulation of these short-term retrospections (p. 197-199).
Many people are loyal to one party (p. 233).
People adopt the political views of the groups they belong to or identify with (p. 219f., 222f., 246-, p. 314).
- People often adopt the party loyalties of their parents (p. 233f.).
- People adopt the views of their party (or project their views onto the party) (ch. 10). Party identification also influences one’s beliefs about factual matters. For example, when an opposing party is in office people judge the economy as worse (pp. 276-284).
People reject the political views of groups that they dislike (pp. 284-294).
People choose candidates based on what they perceive to be best for their group (p. 229).
Catholic voters (even ones who rarely go to church) tend to prefer Catholic candidates, even if the candidate emphasizes the separation of church and state (pp. 238-246).
If, say, Catholics discriminate against Jews, then Jews are much less likely to vote for a Catholic candidate or a party dominated by Catholics (p. 237f.).
Better-informed voters are often influenced more strongly by identity issues, presumably because they are more aware of them (pp. 284-294). For example, they are sometimes less likely than worse-informed voters to get the facts right (p. 283).
“When political candidates court the support of groups, they are judged in part on whether they can ‘speak our language.’ Small-business owners, union members, evangelical Christians, international corporations – each of these has a set of ongoing concerns and challenges, and a vocabulary for discussing them. Knowing those concerns, using that vocabulary, and making commitments to take them seriously is likely to be crucial for a politician to win their support (Fenno 1978).”

Unfortunately, I think that Achen and Bartels stretch the concept of identity-based voting a bit too much. The clearest example is their analysis of the case of abortion (pp. 258-266). Women tend to have more stable views on abortion than men. They are also more likely to leave the Republican party if they are pro-choice and less likely to assimilate their opinions to those of their party. Achen and Bartels’ explanation is that women’s vote is affected by their identifying as women. But I don’t see why it is necessary to bring the concept of identity into this. A much simpler explanation would be that voters are, to some extent, selfish and thus put more weight on the issues that are most relevant to them. If this counts as voting based on identity, is there any voting behavior that cannot be ascribed to identities?

I also find many of the explanations based on social identity unsatisfactory – they often don’t really explain a phenomenon. For example, Achen and Bartels argue that the partisan realignment of white southerners in the second half of the 20th century was not so much driven by racial policy issues but by white southern identity (pp. 246-258). But they don’t explain how white southern identity led people into the open arms of the Republicans. For example, was it that Republicans explicitly appealed to that identity? Or did southern opinion leaders change their mind based on policy issues?

Implications for democracy

Chapter 11 serves as a conclusion of the book. It summarizes some of the points made in earlier sections but also discusses the normative implications.

Unsurprisingly, Achen and Bartels argue against naive democratization:

[E]ffective democracy requires an appropriate balance between popular preferences and elite expertise. The point of reform should not simply be to maximize popular influence in the political process but to facilitate more effective popular influence. We need to learn to let political parties and political leaders do their jobs, too. Simple-minded attempts to thwart or control political elites through initiatives, direct primaries, and term limits will often be counterproductive. Far from empowering the citizenry, the plebiscitary implications of the folk theory have often damaged people’s real interests. (p. 303)

At the same time, they again point out that elite political judgment is often not much better than that of the worse-informed majority. In addition to being more aware of identity issues, the elites are a lot better at rationalizing, which makes them sound more rational, but often does not yield more rational opinions (p. 309-311).

Another interesting point they make is that it is usually the least-informed voters who decide who wins an election because the non-partisan swing voters tend to be relatively uninformed (p. 312, also p. 32).

Achen and Bartels give some reasons why democracy might be better than its alternatives. I think the arguments, as given in the book, vary drastically in appeal, but here are all five:

“[E]lections generally provide authoritative, widely accepted agreement about who shall rule. In the United States, for example, even the bitterly contested 2000 presidential election – which turned on a few hundred votes in a single state and a much-criticized five-to-four Supreme Court decision – was widely accepted as legitimate. A few Democratic partisans continued to grumble that the election had been “stolen”; but the winner, George W. Bush, took office without bloodshed, or even significant protest, and public attention quickly turned to other matters.” This makes sense, although it would have been interesting to test this argument empirically. I.e., is violent power struggle more or less prevalent in democracies than in other forms of government, such as hereditary monarchies? (I would guess that it is less prevalent in democracies.)
“[I]n well-functioning democratic systems, parties that win office are inevitably defeated at a subsequent election. They may be defeated more or less randomly, due to droughts, floods, or untimely economic slumps, but they are defeated nonetheless. Moreover, voters seem increasingly likely to reject the incumbent party the longer it has held office, reinforcing the tendency for governmental power to change hands. This turnover is a key indicator of democratic health and stability. It implies that no one group or coalition can become entrenched in power, unlike in dictatorships or one-party states where power is often exercised persistently by a single privileged segment of society. And because the losers in each election can reasonably expect the wheel of political fortune to turn in the not-too-distant future, they are more likely to accept the outcome than to take to the streets.” (p. 317) Here it is not so clear whether this constant change is a good thing. Having the same party, group or person rule for long stretches of time ensures stability and avoids friction between consecutive legislations. It also ensures that office is most of the time held by politicians with experience. Presumably, Achen and Bartels are right in judging high turnover as beneficial, but they have little evidence to back it up.
“[E]lectoral competition also provides some incentives for rulers at any given moment to tolerate opposition. The notion that citizens can oppose the incumbent rulers and organize to replace them, yet remain loyal to the nation, is fundamental both to real democracy and to social harmony.” (p. 317f.) This also seems non-obvious. Perhaps the monarchist could argue that only rulers who do not have to worry about losing their position can fruitfully engage with criticism. They also have less reason to get the press under their control (although, empirically, dictators usually use their power to limit the press in ways that democratic governments cannot).
“[A] long tradition in political theory stemming from John Stuart Mill (1861, chap. 3) has emphasized the potential benefits of democratic citizenship for the development of human character (Pateman 1970). Empirical scholarship focusing squarely on effects of this sort is scant, but it suggests that democratic political engagement may indeed have important implications for civic competence and other virtues (Finkel 1985; 1987; Campbell 2003; Mettler 2005). Thus, participation in democratic processes may contribute to better citizenship, producing both self-reinforcing improvements in ‘civic culture’ (Almond and Verba 1963) and broader contributions to human development.” (p. 318) This may be true, but it appears to be a relatively weak consideration. Perhaps, the monarchist could counter that doing away with elections saves people more time than the improvements in “civic culture” are worth. They may not be as virtuous, but maybe they can nonetheless spend more time with their family and friends or create more economic value.
“Finally, reelection-seeking politicians in well-functioning democracies will strive to avoid being caught violating consensual ethical norms in their society. As Key (1961a, 282) put it, public opinion in a democracy ‘establishes vague limits of permissiveness within which governmental action may occur without arousing a commotion.’ Thus, no president will strangle a kitten on the White House lawn in view of the television cameras. Easily managed governmental tasks will get taken care of, too. Chicago mayors will either get the snow cleared or be replaced, as Mayor Michael Bilandic learned in the winter of 1979. Openly taking bribes will generally be punished. When the causal chain is clear, the outcome is unambiguous, and the evaluation is widely shared, accountability will be enforced (Arnold 1990, chap. 3). So long as a free press can report dubious goings-on and a literate public can learn about them, politicians have strong incentives to avoid doing what is widely despised. Violations occur, of course, but they are expensive; removal from office is likely. By contrast, in dictatorships, moral or financial corruption is more common because public outrage has no obvious, organized outlet. This is a modest victory for political accountability.” (p. 318f.) Of the five reasons given, I find this one the most convincing. It basically states that retrospective voting and to some extent even the folk theory work, they just don’t work as well as one might naively imagine. So, real-world democracy doesn’t do a better job than a coin flip at representing people’s “real opinions” on controversial issues like abortion. Democracy does ensure, however, that important, universally agreed upon measures will be implemented.

In their last section, Achen and Bartels propose an idea for how to make governments more responsive to the interests of the people. Noting that elites have much more influence, they suggest that economic and social equality, as well as limitations on lobbying and campaign financing, could make governments more responsive to the preferences of the people. While plausibly helpful, these ideas are much more trite than the rest of the book.

General comments

Overall I recommend reading the book if you’re interested in the topic.
Since I don’t know the subject area particularly well, I read a few reviews of the book (Paris 2016; Schwennicke, Cohen, Roberts, Sabl, Mares, and Wright 2017; Malhotra 2016; Mann 2016; Cox 2017; Somin 2016). All of these seemed positive overall. Some even said that large parts of the book are more mainstream than the authors claim (which is a good thing in my book).
It’s quite Americentric. Sometimes an analysis of studies conducted in the US is followed by references to papers confirming the results in other countries, but often it is not. In many ways, politics in the US is different from politics in other countries: e.g., only two parties matter, and the variability in wealth and education within the US is much bigger than in many other Western nations. This makes me unsure to what extent many of the results carry over to other countries. Often it is also an unnecessary limitation of sample sizes. E.g., one analysis (p. 165) relates whether the incumbent party was replaced to post-presidential-election income and GDP growth in the years 1948-2008 in the US. It seems hard to conclude all that much from 16 data points. Perhaps taking a look at other countries would have been a cheap way to increase the sample size. Because the book is not about the details of particular democratic systems, it seems quite accessible to non-American readers with only superficial knowledge of US politics and history.
It often gives a lot of detail on how empirical evidence was gathered and analyzed. E.g., the entire chapter seven is about how people’s voting behavior after the Great Depression – which is often explained by policy preferences (in the US related to Roosevelt’s New Deal) – can be explained well by retrospective voting.
I also feel like the book is fairly balanced despite the authors’ view differing somewhat from the mainstream within political science. E.g., they often mention explicitly what the mainstream view is and refer to studies supporting that view. I also feel like they are relatively transparent about how reliable or tentative the empirical evidence for some parts of the book is.
A similar book is Jason Brennan’s Against Democracy, which I haven’t read. As suggested by the names, Against Democracy differs from Democracy for Realists in that it proposes epistocracy as an alternative form of government.

Acknowledgements

I thank Max Daniel and Stefan Torges for comments.

Talk on Multiverse-wide Cooperation via Correlated Decision-Making

On May 16, 2017 | By Caspar | In General | 4 Comments

In the past few months, I thought a lot about the implications of non-causal decision theory. In addition to writing up my thoughts in a long paper that we plan to publish on the FRI website soon, I also prepared a presentation, which I delivered to some researchers at FHI and my colleagues at FRI/EAF. Below you can find a recording of the talk.

The slides are available here.

Given the original target audiences, the talk assumes prior knowledge of a few topics:

Some decision theory
- The prisoner’s dilemma
- Newcomb-like problems and Douglas Hofstadter’s superrationality
  - Although it doesn’t cover cooperation/superrationality, I think Yudkowsky’s Newcomb’s Problem and Regret of Rationality is a good first exposition.
  - Paul Almond (2010): On Causation and Correlation, Part 1: Evidential decision theory is correct.
  - Douglas Hofstadter (1983): Dilemmas for Superrational Thinkers, Leading Up to a Luring Lottery. Scientific American, 248(6).
  - Acausal trade
  - If you want a lot of detail on this branch of decision theory, see Arif Ahmed’s book on Evidence, Decision, and Causality.
- The orthogonality thesis
Superintelligence. See, e.g.,
- the very accessible two-part introduction (part 1, part 2) by Tim Urban on Wait But Why,
- the more elaborate and academic Superintelligence by Nick Bostrom, and
- Altruists Should Prioritize Artificial Intelligence by Lukas Gloor.
The differentiation between consequentialism versus deontology and virtue ethics. Also consider effective altruism.
On the notion of the multiverse or parallel universes, see Max Tegmark’s Parallel Universes.
Gains from trade, see Brian Tomasik’s Gains from Trade through Compromise.

Anthropic uncertainty in the Evidential Blackmail

On May 12, 2017 / March 27, 2020 | By Johannes Treutlein | In General | 3 Comments

I’m currently writing a piece on anthropic uncertainty in Newcomb problems. The idea is that whenever someone simulates us to predict our actions, this leads us to have anthropic uncertainty about whether we’re in this simulation or not. (If we knew whether we were in the real world or in the simulation, then the simulation wouldn’t fulfill its purpose anymore.) This kind of reasoning changes quite a lot about the answers that decision theories give in predictive dilemmas. It makes their reasoning “more updateless”, since they reason from a more impartial stance: a stance from which they don’t yet know their exact position in the thought experiment.

This topic isn’t new, but it hasn’t been discussed in-depth before. As far as I am aware, it has been brought up on LessWrong by gRR and in two blog posts by Stuart Armstrong. Outside LessWrong, there is a post by Scott Aaronson, and one by Andrew Critch. The idea is also mentioned in passing by Neal (2006, p. 13). Are there any other sources and discussions of it that I have overlooked?

In this post, I examine what the assumption that predictions or simulations lead to anthropic uncertainty implies for the Evidential Blackmail (also known as XOR Blackmail), a problem which is often presented as a counter-example to evidential decision theory (EDT) (cf. Soares & Fallenstein, 2015, p. 5; Soares & Levinstein, 2017, pp. 3–4). A similar problem has been introduced as “Yankees vs. Red Sox” by Arntzenius (2008), and discussed by Ahmed and Price (2012). I would be very grateful for any kind of feedback on my post.

We could formalize the blackmailer’s procedure in the Evidential Blackmail something like this:

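    def blackmailer():
        # your_policy, receive_letter and predict_stock are left abstract here:
        # your_policy simulates how you respond to receiving the letter,
        # predict_stock returns "retain" or "fall".
        your_action = your_policy(receive_letter)
        stock = predict_stock()
        if stock == "retain" and your_action == "pay":
            return "letter"
        elif stock == "fall" and your_action == "not pay":
            return "letter"
        else:
            return "no letter"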

Let p denote the probability P(retain) with which our stock retains its value a. The blackmailer asks us for an amount of money b, where 0<b<a. The ex ante expected utilities are now:

EU(pay) = P(letter|pay) * (a – b) + P(no letter & retain|pay) * a = p (a – b),

EU(not pay) = P(no letter & retain|not pay) * a = p a.

According to the problem description, P(no letter & retain|pay) is 0, and P(no letter & retain|not pay) is p.¹ As long as we don’t know whether a letter has been sent or not (even if it might already be on its way to us), committing to not paying gives us only information about whether the letter has been sent, not about our stock, so we should commit not to pay.
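The ex ante calculation can be checked by enumerating the two stock outcomes. This is only a minimal sketch: the helper names and the concrete numbers for p, a and b are illustrative assumptions, and the blackmailer here takes the policy's action and the stock outcome as arguments instead of predicting them.

```python
from fractions import Fraction

def blackmailer(policy_action, stock):
    """Return "letter" or "no letter" for a given policy response and stock outcome."""
    if stock == "retain" and policy_action == "pay":
        return "letter"
    if stock == "fall" and policy_action == "not pay":
        return "letter"
    return "no letter"

def ex_ante_eu(policy_action, p, a, b):
    """Expected utility of committing to policy_action before any letter arrives."""
    total = 0
    for stock, prob in (("retain", p), ("fall", 1 - p)):
        value = a if stock == "retain" else 0
        # We only hand over b if a letter actually arrives and our policy is to pay.
        if blackmailer(policy_action, stock) == "letter" and policy_action == "pay":
            value -= b
        total += prob * value
    return total

# Illustrative numbers (assumptions, not taken from the problem statement).
p, a, b = Fraction(1, 2), 100, 10
print(ex_ante_eu("pay", p, a, b))      # equals p * (a - b)
print(ex_ante_eu("not pay", p, a, b))  # equals p * a
```

Using exact fractions avoids rounding noise, so the results can be compared directly with p (a – b) and p a.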

Now for the situation in which we have already received the letter. (All of the following probabilities will be conditioned on “letter”.) We don’t know whether we’re in the simulation or not. But what we do if we’re in the simulation can actually change our probability that we’re in the simulation in the first place. Note that the blackmailer has to simulate us one time in any case, regardless of whether our stock goes down or not. So if we are in the simulation and we receive the letter, P(retain|pay) is still equal to P(retain|not pay): neither paying nor not paying gives us any evidence about whether our stock retains its value or not, conditional on being in the simulation. But if we are in the simulation, we can influence whether the blackmailer sends us a letter in the real world. In the simulation, our action decides whether we receive the real-world letter in the cases where we keep our money or in the cases where we lose it.

Let’s begin by calculating EDT’s expected utility of not paying. We will lose all money for certain if we’re in the real world and don’t pay, so we only consider the case where we’re in the simulation:

EU(not pay) = P(sim & retain|not pay) * a.

For both SSA and SIA, if our stock doesn’t go down and we don’t pay up, then we’re certain to be in the simulation: P(sim|retain, not pay) = 1, while we could be either simulated or real if our stock falls: P(sim|fall, not pay) = 1/2. Moreover, P(sim & retain|not pay) = P(retain|sim, not pay) * P(sim|not pay) = P(sim|retain, not pay) * P(retain|not pay). Under SSA, P(retain|not pay) is just p.² We hence get

EU_SSA(not pay) = P(sim|retain, not pay) * p * a = p a.

Our expected utility for paying is:

EU_SSA(pay) = P(sim & retain|pay) * (a – b) + P(not sim|pay) * (a – b)

= P(sim|retain, pay) * p * (a – b) + P(not sim|pay) * (a – b).

If we pay up and the stock retains its value, there is exactly one of us in the simulation and one of us in the real world, so P(sim|retain, pay) = 1/2, while we’re sure to be in the simulation for the scenario in which our stock falls: P(sim|fall, pay) = 1. Knowing both P(sim & retain|pay) and P(sim & fall|pay), we can calculate P(not sim|pay) = p/2. This gives us

EU_SSA(pay) = 1/2 * p * (a – b) + 1/2 * p * (a – b) = p (a – b).

Great, EDT + SSA seems to calculate exactly the same payoffs as all other decision theories – namely, that by paying the Blackmailer, one just loses the money one pays the blackmailer, but gains nothing.
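The SSA bookkeeping above can be reproduced by listing, for each world, which copies of us see the letter. The following is a sketch under the assumptions used in the text (the simulated copy always sees the letter, and each copy is credited with the real-world payoff); the function names are hypothetical.

```python
from fractions import Fraction

def letter_observers(stock, action):
    """Copies of us that see the letter: the simulation always, the real copy only if it is sent."""
    observers = ["sim"]
    letter_sent = (stock == "retain") == (action == "pay")
    if letter_sent:
        observers.append("real")
    return observers

def ssa_joint(action, p):
    """P(world, observer | letter, action) under SSA: keep the prior over worlds,
    then split each world's probability evenly among its observers."""
    joint = {}
    for stock, prior in (("retain", p), ("fall", 1 - p)):
        obs = letter_observers(stock, action)
        for o in obs:
            joint[(stock, o)] = prior / len(obs)
    return joint

def eu_ssa(action, p, a, b):
    """Expected utility under EDT + SSA, given that we received the letter."""
    payoff = {"retain": a - b if action == "pay" else a, "fall": 0}
    return sum(prob * payoff[stock] for (stock, _), prob in ssa_joint(action, p).items())

p = Fraction(1, 3)  # illustrative value
print(ssa_joint("pay", p))  # P(sim & retain|pay) and P(not sim|pay) both come out as p/2
print(eu_ssa("pay", p, 90, 30), eu_ssa("not pay", p, 90, 30))
```

Because SSA leaves the prior over worlds untouched, the sum collapses to the ex ante values p (a – b) and p a.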

For SIA probabilities, P(retain|letter) depends on whether we pay or don’t pay. If we pay, then there are (in expectation) 2 p observers in the “retain” world, while there are (1 – p) observers in the “fall” world. So our updated P(retain|letter, pay) should be (2 p)/(1 + p). If we don’t pay, it’s p/(2 – p) respectively. Using the above probabilities and Bayes’ theorem, we have P(sim|pay) = 1/(1 + p) and P(sim|not pay) = 1/(2 – p). Hence,

EU_SIA(not pay) = P(sim & retain|not pay) * a = (p a)/(2 – p),

and

EU_SIA(pay) = P(sim) * P(retain|sim) * (a – b) + P(not sim) * (a – b)

= (p (a – b))/(1 + p) + (p (a – b))/(1 + p)

= (2 p (a – b))/(1 + p).

It seems like paying the blackmailer would be better here than not paying, if p and b are sufficiently low.
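To make “sufficiently low” concrete: rearranging EU_SIA(pay) > EU_SIA(not pay) for 0 < p < 1 gives b < 3 a (1 – p) / (2 (2 – p)). The sketch below checks this rearrangement numerically; the function names and example numbers are assumptions for illustration only.

```python
from fractions import Fraction

def eu_sia_pay(p, a, b):
    """EU_SIA(pay) = 2 p (a - b) / (1 + p), as derived in the text."""
    return 2 * p * (a - b) / (1 + p)

def eu_sia_not_pay(p, a):
    """EU_SIA(not pay) = p a / (2 - p), as derived in the text."""
    return p * a / (2 - p)

def max_tempting_demand(p, a):
    """Demands b strictly below this bound make paying look better under EDT + SIA."""
    return 3 * a * (1 - p) / (2 * (2 - p))

p, a = Fraction(1, 2), 100   # illustrative numbers
bound = max_tempting_demand(p, a)
for b in (bound - 10, bound, bound + 10):
    print(b, eu_sia_pay(p, a, b) > eu_sia_not_pay(p, a))
```

As p approaches 1 the bound goes to 0, and as p approaches 0 it approaches 3a/4, which matches the observation that paying only looks attractive when both p and b are low.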

Why doesn’t SIA give the ex ante expected utilities, as SSA does? Up until now I have just assumed correlated decision-making, so that the decisions of the simulated us will also be those of the real-world us (and of course the other way around – that’s how the blackmail works in the first place). The simulated us hence also gets attributed the impact of our real copy. The problem is now that SIA thinks we’re more likely to be in worlds with more observers. So the worlds in which we have additional impact due to correlated decision-making get double-counted. In the world where we pay the blackmailer, there are two observers for p, while there is only one observer for (1 – p). If we don’t pay the blackmailer, there is only one observer for p, and two observers for (1 – p). SIA hence slightly favors paying the blackmailer, to make the p-world more likely.

To remedy the problem of double-counting for EDT + SIA, we could use something along the lines of Stuart Armstrong’s Correlated Decision Principle (CDP). First, we aggregate the “EDT + SIA” expected utilities of all observers. Then, we divide this expected utility by the number of individuals who we are deciding for. For EU_CDP(pay), there is with probability 1 an observer in the simulation, and with probability p one in the real world. To get the aggregated expected utility, we thus have to multiply EU_SIA(pay) by (1 + p). Since we have decided for two individuals, we divide this EU by 2 and get EU_CDP(pay) = ((2 p (a – b))/(1 + p)) * 1/2 * (1 + p) = p (a – b).

For EU_CDP(not pay), it gets more complex: the number of individuals any observer is making a decision for is actually just 1, namely, the observer in the simulation. The observer in the real world doesn’t get his expected utility from his own decision, but from influencing the other observer in the simulation. On the other hand, we multiply EU_SIA(not pay) by (2 – p), since there is one observer in the simulation with probability 1, and with probability (1 – p) there is another observer in the real world. Putting this together, we get EU_CDP(not pay) = ((p a)/(2 – p)) * (2 – p) = p a. So EDT + SIA + CDP arrives at the same payoffs as EDT + SSA, although it is admittedly a rather messy and informal approach.
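The correction can be written out as a short check. This is a sketch of the informal procedure described here, not of Armstrong’s principle in general; the function names are hypothetical.

```python
from fractions import Fraction

def eu_cdp_pay(p, a, b):
    eu_sia = 2 * p * (a - b) / (1 + p)
    expected_observers = 1 + p   # simulated copy always, real copy with probability p
    decided_for = 2              # the decision is counted for both copies
    return eu_sia * expected_observers / decided_for

def eu_cdp_not_pay(p, a):
    eu_sia = p * a / (2 - p)
    expected_observers = 2 - p   # simulated copy always, real copy with probability 1 - p
    decided_for = 1              # only the simulated copy's decision carries the utility
    return eu_sia * expected_observers / decided_for

# The corrected values agree with the ex ante ones for every p tried here.
for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    assert eu_cdp_pay(p, 100, 10) == p * (100 - 10)
    assert eu_cdp_not_pay(p, 100) == p * 100
```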

I conclude that, when taking into account anthropic uncertainty, EDT doesn’t give in to the Evidential Blackmail. This is true for SSA and possibly also for SIA + CDP. Fortunately, at least for SSA, we have avoided any kind of anthropic funny-business. Note that this is not some kind of dirty hack: if we grant the premise that simulations have to involve anthropic uncertainty, then by definition of the thought experiment (because there is necessarily a simulation involved in the Evidential Blackmail), EDT doesn’t actually pay the blackmailer. Of course, this still leaves open the question of whether we have anthropic uncertainty in all problems involving simulations, and hence whether my argument applies to all conceivable versions of the problem. Moreover, there are other anthropic problems, such as the one introduced by Conitzer (2015a), in which EDT + SSA are still exploitable (in the absence of a method to “bind themselves”).

Acknowledgement

I wrote this post while working for the Foundational Research Institute, which is now the Center on Long-Term Risk.

² This becomes apparent if we compare the Evidential Blackmail to Sleeping Beauty. SSA is the “halfer position”, which means that after updating on being an observer (receiving the letter), we should still assign the prior probability p, regardless of how many observers there are in either of the two possible worlds.

³ The result that EDT and SIA lead to actions that are not optimal ex ante is also featured in several publications about anthropic problems, e.g., Arntzenius, 2002; Briggs, 2010; Conitzer, 2015b; Schwarz, 2015.

Ahmed, A., & Price, H. (2012). Arntzenius on ‘Why ain’cha rich?’. Erkenntnis. An International Journal of Analytic Philosophy, 77(1), 15–30.

Arntzenius, F. (2002). Reflections on Sleeping Beauty. Analysis, 62(1), 53–62.
Arntzenius, F. (2008). No Regrets, or: Edith Piaf Revamps Decision Theory. Erkenntnis. An International Journal of Analytic Philosophy, 68(2), 277–297.

Briggs, R. (2010). Putting a value on Beauty. Oxford Studies in Epistemology, 3, 3–34.

Conitzer, V. (2015a). A devastating example for the Halfer Rule. Philosophical Studies, 172(8), 1985–1992.

Conitzer, V. (2015b). Can rational choice guide us to correct de se beliefs? Synthese, 192(12), 4107–4119.

Neal, R. M. (2006, August 23). Puzzles of Anthropic Reasoning Resolved Using Full Non-indexical Conditioning. arXiv [math.ST]. Retrieved from http://arxiv.org/abs/math/0608592

Schwarz, W. (2015). Lost memories and useless coins: revisiting the absentminded driver. Synthese, 192(9), 3011–3036.

Soares, N., & Fallenstein, B. (2015, July 7). Toward Idealized Decision Theory. arXiv [cs.AI]. Retrieved from http://arxiv.org/abs/1507.01986

Soares, N., & Levinstein, B. (2017). Cheating Death in Damascus. Retrieved from https://intelligence.org/files/DeathInDamascus.pdf

The average utilitarian’s solipsism wager

On March 15, 2017 / March 26, 2020 | By Caspar | In General | 3 Comments

The following prudential argument is relatively common in my circles: We probably live in a simulation, but if we don’t, our actions matter much more. Thus, expected value calculations are dominated by the utility under the assumption that we (or some copies of ours) are in the real world. Consequently, the simulation argument affects our prioritization only slightly — we should still mostly act under the assumption that we are not in a simulation.

A commonly cited analogy is due to Michael Vassar: “If you think you are Napoleon, and [almost] everyone that thinks this way is in a mental institution, you should still act like Napoleon, because if you are, your actions matter a lot.” An everyday application of this kind of argument is the following: Probably, you will not be in an accident today, but if you are, the consequences for your life are enormous. So, you better fasten your seat belt.

Note how these arguments do not affect the probabilities we assign to some event or hypothesis. They are only about the event’s (or hypothesis’) prudential weight — the extent to which we tailor our actions to the case in which the event occurs (or the hypothesis is true).

For total utilitarians (and many other consequentialist value systems), similar arguments apply to most theories postulating a large universe or multiverse. To the extent that it makes a difference for our actions, we should tailor them to the assumption that we live in a large multiverse with many copies of us because under this assumption we can affect the lives of many more beings.

For average utilitarians, the exact opposite applies. Even if they have many copies, they will have an impact on a much smaller fraction of beings if they live in a large universe or multiverse. Thus, they should usually base their actions on the assumption of a small universe, such as a universe in which Earth is the only inhabited planet. This may already have some implications, e.g. via the simulation argument or the Fermi paradox. If they also take the average over time — I do not know whether this is the default for average utilitarianism — they would also base their actions on the assumption that there are just a few past and future agents. So, average utilitarians are subject to a much stronger Doomsday argument.

Maybe the bearing of such prudential arguments is even more powerful, though. There is some chance that metaphysical solipsism is true: the view that only my (or your) own mind exists and that everything else is just an illusion. If solipsism were true, our impact on average welfare (or average preference fulfillment) would be enormous, perhaps 7.5 billion times bigger than it would be under the assumption that Earth exists (or about 100 billion times bigger if you also count humans that have lived in the past). Solipsism seems to deserve a probability larger than one in 7.5 (or 100) billion. (In fact, I think solipsism is likely enough for this to qualify as a non-Pascalian argument.) So, perhaps average utilitarians should maximize primarily for their own welfare?
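The arithmetic behind this wager can be spelled out as probability times impact. The sketch below assumes a deliberately crude model in which helping one mind moves average welfare by 1/population; the credences are hypothetical and only serve to show where the balance tips.

```python
from fractions import Fraction

def prudential_weights(q_solipsism, population):
    """Probability times impact on average welfare for the two hypotheses."""
    impact_if_solipsism = Fraction(1)           # my mind is the whole average
    impact_if_earth = Fraction(1, population)   # my mind is one among `population`
    return {
        "solipsism": q_solipsism * impact_if_solipsism,
        "Earth exists": (1 - q_solipsism) * impact_if_earth,
    }

population = 7_500_000_000
for q in (Fraction(1, 10**6), Fraction(1, 10**11)):   # hypothetical credences
    weights = prudential_weights(q, population)
    print(q, "->", max(weights, key=weights.get))
```

In this model the solipsism hypothesis dominates exactly when its probability exceeds 1/(population + 1), which is where the "one in 7.5 billion" threshold comes from.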

Acknowledgements

The idea of this post is partly due to Lukas Gloor. This work was funded by the Foundational Research Institute (now the Center on Long-Term Risk).

A Non-Comprehensive List of Human Values

On February 10, 2017 / September 16, 2018 | By Caspar | In General | 5 Comments

Human values are said to be complex (cf. Stewart-Williams 2015, section “Morality Is a Mess”; Muehlhauser and Helm 2012, ch. 3, 4, 5.3). As evidence, the following is a non-comprehensive list of things that many people care about:

Abundance, achievement, adventure, affiliation, altruism, apatheia, art, asceticism, austerity, autarky, authority, autonomy, beauty, benevolence, bodily integrity, challenge, collective property, commemoration, communism, community, compassion, competence, competition, competitiveness, complexity, comradery, conscientiousness, consciousness, contentment, cooperation, courage, “crabs in a bucket”, creativity, crime, critical thinking, curiosity, democracy, determination, dignity, diligence, discipline, diversity, duties, education, emotion, envy, equality, equanimity, excellence, excitement, experience, fairness, faithfulness, family, fortitude, frankness, free will, freedom, friendship, frugality, fulfillment, fun, good intentions, greed, happiness, harmony, health, honesty, honor, humility, idealism, idolatry, imagination, improvement, incorruptibility, individuality, industriousness, intelligence, justice, knowledge, law abidance, life, love, loyalty, modesty, monogamy, mutual affection, nature, novelty, obedience, openness, optimism, order, organization, pain, parsimony, peace, peace of mind, pity, play, population size, preference fulfillment, privacy, progress, promises, property, prosperity, punctuality, punishment, purity, racism, rationality, reliability, religion, respect, restraint, rights, sadness, safety, sanctity, security, self-control, self-denial, self-determination, self-expression, self-pity, simplicity, sincerity, social parasitism, society, spirituality, stability, straightforwardness, strength, striving, subordination, suffering, surprise, technology, temperance, thought, tolerance, toughness, truth, tradition, transparency, valor, variety, veracity, wealth, welfare, wisdom.

Note that from the inside, most of these values feel distinct from each other. Some of them have strong overlap, however. For instance, industriousness, diligence and conscientiousness often refer to similar things.

Also, note that most of these do not feel instrumental to each other. For example, people often want to find out the truth even when that truth is not useful for, e.g., reducing suffering or preserving tradition.

Some terms subsume multiple very different or even opposing moral views. For instance, progressives would say it’s fair if wealth is taken from the rich and given to the poor while libertarians would say it is fair if everyone receives wealth in proportion to how the market values their work.

Many of the values can be interpreted both deontologically and consequentialistically. For example, “frugality” could refer to the moral maxim “you shall be frugal” or to “you shall care about others being frugal”.

These values should not be understood as being valued additively. People presumably do not care about the amount of consciousness in the world plus the amount of happiness in the world. Instead they may care about the amount of consciousness times the average happiness of the conscious experiences.

Some (articles with) lists that helped me to compile this list are Keith‑Spiegel’s moral characteristics list, moral foundations theory, Your Dictionary’s Examples of Morals, Eliezer Yudkowsky’s 31 laws of fun, table A1 in Bain et al.’s Collective Futures, the examples in the Wikipedia article on Prussian values, the Moral Code of the Builder of Communism, the ten commandments, section IV, chapter 1 in Nussbaum’s (2000) Women and Human Development, Frankena’s (1973) Ethics, 2nd ed., p. 87f. and Peter Levine’s an alternative to Moral Foundations Theory.

“Betting on the Past” by Arif Ahmed

On February 6, 2017 / March 27, 2020 | By Johannes Treutlein | In General | 5 Comments

[This post assumes knowledge of decision theory, as discussed in Eliezer Yudkowsky’s Timeless Decision Theory and in Arbital’s Introduction to Logical Decision Theory.]

I recently discovered an interesting thought experiment, “Betting on the Past” by Cambridge philosopher Arif Ahmed. It can be found in his book Evidence, Decision and Causality, which is an elaborate defense of Evidential Decision Theory (EDT). I believe that Betting on the Past may be used to money-pump non-EDT agents, refuting Causal Decision Theories (CDT), and potentially even ones that use logical conditioning, such as Timeless Decision Theory (TDT) or Updateless Decision Theory (UDT). At the very least, non-EDT decision theories are unlikely to win this bet. Moreover, no conspicuous perfect predicting powers, genetic influences, or manipulations of decision algorithms are required to make Betting on the Past work, and anyone can replicate the game at home. For these reasons, it might make a more compelling case in favor of EDT than the Coin Flip Creation, a problem I recently proposed in an attempt to defend EDT’s answers in medical Newcomb problems. In Ahmed’s thought experiment, Alice faces the following decision problem:

Betting on the Past: In my pocket (says Bob) I have a slip of paper on which is written a proposition P. You must choose between two bets. Bet 1 is a bet on P at 10:1 for a stake of one dollar. Bet 2 is a bet on P at 1:10 for a stake of ten dollars. So your pay-offs are as in [Figure 1]. Before you choose whether to take Bet 1 or Bet 2 I should tell you what P is. It is the proposition that the past state of the world was such as to cause you now to take Bet 2. [Ahmed 2014, p. 120]

Ahmed goes on to specify that Alice could indicate which bet she’ll take by either raising or lowering her hand. One can find a detailed discussion of the thought experiment’s implications, as well as a formal analysis of CDT’s and EDT’s decisions in Ahmed’s book. In the following, I want to outline a few key points.

Would CDT win in this problem? Alice is betting on a past state of the world. She can’t causally influence the past, and she’s uncertain whether the proposition is true or not. In either case, Bet 1 strictly dominates Bet 2: no matter which state the past is in, Bet 1 always yields a higher utility. For these reasons, causal decision theories would take Bet 1. Nevertheless, as soon as Alice comes to a definite decision, she updates on whether the proposition is true or false. If she’s a causal agent, she then finds out that she has lost: the past state of the world was such as to cause her to take Bet 1, so the proposition is false. If she had taken Bet 2, she would have found out that the proposition was correct, and she would have won, albeit a smaller amount than if she had won with Bet 1.

Betting on the Past seems to qualify as a kind of Newcomb’s paradox; it seems to have an equivalent payoff matrix (Figure 1).

Figure 1: Betting on the Past has a similar payoff matrix to Newcomb’s paradox (payoffs in dollars)

	P is true	P is false
Take Bet 1	10	-1
Take Bet 2	1	-10

Furthermore, its causal structure seems to resemble that of, e.g., the Smoking Lesion or Solomon’s problem, suggesting that it is a kind of medical Newcomb problem. In medical Newcomb problems, a “Nature” node determines both the present state of the world (whether the agent is sick/will win the bet) and the agent’s decision (see Figure 2). In this regard, they differ from Newcomb’s original problem, where said node refers to the agent’s decision algorithm.

Figure 2: Betting on the Past (left) has a similar causal structure to medical Newcomb problems (right).

One could object to Betting on the Past being a medical Newcomb problem, since the outcomes conditional on our actions here are certain, while e.g. in the Smoking Lesion, observing our actions only shifts our probabilities by degrees. I believe this shouldn’t make a crucial difference. On the one hand, we can conceive of absolutely certain medical Newcomb cases like the Coin Flip Creation. On the other hand, Newcomb’s original problem is often formalized with absolute certainties as well. I’d be surprised if probabilistic vs. certain reasoning made a difference to decision theories. First, we can always approximate certainties to an arbitrarily high degree. We might then ask ourselves why a negligible further increase in certainty would at some point suddenly completely change the recommended action. Secondly, we’re never really certain in the real world anyway, so if the two cases were different, this would render useless all thought experiments that use absolute certainties.

If Betting on the Past is indeed a kind of medical Newcomb problem, this would be an interesting conclusion. It would follow that if one prefers Bet 2, one should also one-box in medical Newcomb problems. And taking Bet 2 seems so obviously correct! I point this out because one-boxing in medical Newcomb problems is what EDT would do, and it is often put forward as both a counterexample to EDT and as the decision problem that separates EDT from Logical Decision Theories (LDT), such as TDT or UDT. (See e.g. Yudkowsky 2010, p.67)

Before we examine the case for EDT further, let’s take a closer look at what LDTs would do in Betting on the Past. As far as I understand, LDTs would take correlations with other decision algorithms into account, but they would ignore “retrocausality” (i.e. smoke in the Smoking Lesion, chew gum in the chewing gum problem, etc.). If there is a purely physical cause, then this causal node isn’t altered in the logical counterfactuals that an LDT agent reasons over. Perhaps if the bet were about the state of the world yesterday, LDT would still take Bet 2. Clearly, LDT’s algorithm already existed yesterday, and it can influence this algorithm’s output; so if it chooses Bet 2, it can change yesterday’s world and make the proposition true. But at some point, this reasoning has to break down. If we choose a more distant point in the past as a reference for Alice’s bet (maybe as far back as the birth of our universe), she’ll eventually be unable to exert any possible influence via logical counterfactuals. At some point, the correlation becomes a purely physical one. All she can do at that point is what opponents of evidential reasoning would call “managing the news” (Lewis, 1981): she can merely try to go for the action that gives her the best Bayesian update.

So, do Logical Decision Theories get it wrong? I’m not sure about that; they come in different versions, and some haven’t yet been properly formalized, so it’s hard for me to judge. I can very well imagine that e.g. Proof-Based Decision Theory would take Bet 2, since it could prove P to be either true or false, contingent on the action it would take. I would argue, though, that if a decision theory takes Bet 2 – and if I’m right about Betting on the Past being a medical Newcomb problem – then it appears it would also have to “one-box”, i.e. take the option recommended by EDT, in other medical Newcomb problems.

If all of this is true, it might imply that we don’t really need LDT’s logical conditioning and that EDT’s simple Bayesian conditioning on actions could suffice. The only remaining difference between LDT and EDT would then be EDT’s lack of updatelessness. What would an updateless version of EDT look like? Some progress on this front has already been made by Everitt, Leike, and Hutter 2015. Caspar Oesterheld and I hope to be able to say more about it soon ourselves.

Acknowledgement

I wrote this post while working for the Foundational Research Institute, which is now the Center on Long-Term Risk.

Joyce’s Better Framing of Newcomb’s Problem

On February 2, 2017 | September 3, 2018 | By Caspar | In General | 3 Comments

While I disagree with James M. Joyce on the correct solution to Newcomb’s problem, I agree with him that the standard framing of Newcomb’s problem (from Nozick 1969) can be improved upon. Indeed, I very much prefer the framing he gives in chapter 5.1 of The Foundations of Causal Decision Theory, which (according to Joyce) is originally due to JH Sobel:

Suppose there is a brilliant (and very rich) psychologist who knows you so well that he can predict your choices with a high degree of accuracy. One Monday as you are on the way to the bank he stops you, holds out a thousand dollar bill, and says: “You may take this if you like, but I must warn you that there is a catch. This past Friday I made a prediction about what your decision would be. I deposited $1,000,000 into your bank account on that day if I thought you would refuse my offer, but I deposited nothing if I thought you would accept. The money is already either in the bank or not, and nothing you now do can change the fact. Do you want the extra $1,000?” You have seen the psychologist carry out this experiment on two hundred people, one hundred of whom took the cash and one hundred of whom did not, and he correctly forecast all but one choice. There is no magic in this. He does not, for instance, have a crystal ball that allows him to “foresee” what you choose. All his predictions were made solely on the basis of knowledge of facts about the history of the world up to Friday. He may know that you have a gene that predetermines your choice, or he may base his conclusions on a detailed study of your childhood, your responses to Rorschach tests, or whatever. The main point is that you now have no causal influence over what he did on Friday; his prediction is a fixed part of the fabric of the past. Do you want the money?

I prefer this over the standard framing because people can remember the offer and the balance of their bank account better than box 1 and box 2. For some reason, I also find it easier to explain this thought experiment without having to refer back to the setup in the middle of the explanation. So now, whenever I describe Newcomb’s problem, I start with Sobel’s rather than Nozick’s version.

Of course, someone who wants to explore decision theory more deeply also needs to learn about the standard version, if only because people sometimes use “one-boxing” and “two-boxing” (the options in Newcomb’s original problem) to denote the analogous choices in other thought experiments. (Even if there are no boxes in these other thought experiments!) But luckily it does not take more than a few sentences to describe the original Newcomb problem based on Sobel’s version. You only need to explain that Newcomb’s problem replaces your bank account with an opaque box whose content you always keep; and puts the offer into a second, transparent box. And then the question is whether you stick with one box or go home with both.
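To make the disagreement concrete, here is a minimal sketch of the two expected-value calculations for Sobel’s version. The amounts come from the quoted passage. Treating the psychologist’s track record (199 correct forecasts out of 200) as his accuracy for both kinds of prediction is my simplifying assumption, and the function names are purely illustrative.

```python
# Numbers from Sobel's framing; the accuracy model is an assumption.
OFFER = 1_000          # the bill the psychologist holds out
DEPOSIT = 1_000_000    # deposited on Friday if he predicted refusal
ACCURACY = 199 / 200   # "he correctly forecast all but one choice"


def edt_value(action: str) -> float:
    """Expected payoff when the action is treated as evidence about the prediction."""
    if action == "refuse":
        return ACCURACY * DEPOSIT
    # Accepting makes it likely that he predicted acceptance and deposited nothing.
    return OFFER + (1 - ACCURACY) * DEPOSIT


def cdt_value(action: str, p_deposit: float) -> float:
    """Expected payoff when the probability of the deposit is held fixed across actions."""
    bonus = OFFER if action == "accept" else 0
    return bonus + p_deposit * DEPOSIT


if __name__ == "__main__":
    for action in ("refuse", "accept"):
        print(action, round(edt_value(action)), round(cdt_value(action, 0.5)))
```

Under evidential reasoning, refusing comes out far ahead. Under causal reasoning, accepting is better by exactly the size of the offer, whatever fixed probability one assigns to the deposit.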

Peter Thiel on Startup Culture

On January 24, 2017 | By Caspar | In General

I recently read Peter Thiel’s Zero to One. All in all, it is an informative read. I found parts of ch. 10 on startup culture particularly interesting. Here’s the section “What’s under Silicon Valley’s Hoodies”:

Unlike people on the East Coast, who all wear the same skinny jeans or pinstripe suits depending on their industry, young people in Mountain View and Palo Alto go to work wearing T-shirts. It’s a cliché that tech workers don’t care about what they wear, but if you look closely at those T-shirts, you’ll see the logos of the wearers’ companies—and tech workers care about those very much. What makes a startup employee instantly distinguishable to outsiders is the branded T-shirt or hoodie that makes him look the same as his co-workers. The startup uniform encapsulates a simple but essential principle: everyone at your company should be different in the same way—a tribe of like-minded people fiercely devoted to the company’s mission.

Max Levchin, my co-founder at PayPal, says that startups should make their early staff as personally similar as possible. Startups have limited resources and small teams. They must work quickly and efficiently in order to survive, and that’s easier to do when everyone shares an understanding of the world. The early PayPal team worked well together because we were all the same kind of nerd. We all loved science fiction: Cryptonomicon was required reading, and we preferred the capitalist Star Wars to the communist Star Trek. Most important, we were all obsessed with creating a digital currency that would be controlled by individuals instead of governments. For the company to work, it didn’t matter what people looked like or which country they came from, but we needed every new hire to be equally obsessed.

In the section “Of cults and consultants” of the same chapter, he goes on:

In the most intense kind of organization, members hang out only with other members. They ignore their families and abandon the outside world. In exchange, they experience strong feelings of belonging, and maybe get access to esoteric “truths” denied to ordinary people. We have a word for such organizations: cults. Cultures of total dedication look crazy from the outside, partly because the most notorious cults were homicidal: Jim Jones and Charles Manson did not make good exits.

But entrepreneurs should take cultures of extreme dedication seriously. Is a lukewarm attitude to one’s work a sign of mental health? Is a merely professional attitude the only sane approach? The extreme opposite of a cult is a consulting firm like Accenture: not only does it lack a distinctive mission of its own, but individual consultants are regularly dropping in and out of companies to which they have no long-term connection whatsoever.

[…]

The best startups might be considered slightly less extreme kinds of cults. The biggest difference is that cults tend to be fanatically wrong about something important. People at a successful startup are fanatically right about something those outside it have missed. You’re not going to learn those kinds of secrets from consultants, and you don’t need to worry if your company doesn’t make sense to conventional professionals. Better to be called a cult—or even a mafia.

Is it a bias or just a preference? An interesting issue in preference idealization

On January 18, 2017 | March 26, 2020 | By Caspar | In General | 2 Comments

When taking others’ preferences into account, we will often want to idealize them rather than taking them too literally. Consider the following example. You hold a glass of transparent liquid in your hand. A woman walks by, says that she is very thirsty and would like to drink from your glass. What she doesn’t know, however, is that the water in the glass is (for some reason not relevant to this example) poisoned. Should you allow her to drink? Most people would say you should not. While she does desire to drink out of the glass, this desire would probably disappear upon gaining knowledge of its content. Therefore, one might say that her object-level preference is to drink from the glass, while her idealized preference would be not to drink from it. There is not much literature on preference idealization, as far as I know, but if you are not already familiar with the idea, consider looking into “Coherent Extrapolated Volition”.

Preference idealization is not always as easy as inferring that someone doesn’t want to drink poison, and in this post, I will discuss a particular sub-problem: accounting for cognitive biases, i.e. systematic mistakes in our thinking, as they pertain to our moral judgments. However, the line between biases and genuine moral judgments is sometimes not clear.

Specifically, we look at cognitive biases that people exhibit in non-moral decisions, where their status as a bias to be corrected is much less controversial, but which can explain certain ethical intuitions. By offering such an error theory of a moral intuition, i.e. an explanation for how people could erroneously come to such a judgment, the intuition is called into question. Defenders of the intuition can respond that even if the bias can be used to explain the genesis of that moral judgment, they would nonetheless stick with that moral intuition. After all, the existence of all our moral positions can be explained by non-moral facts about the world – “explaining is not explaining away”. Consider the following examples.

Omission bias: People judge consequences of inaction as less severe than those of action. This is clearly a bias in some cases, especially non-moral ones. For example, losing $1,000 by not responding to your bank in time is just as bad as losing $1,000 by throwing it out of the window. A business person who judges the two equivalent losses equally will ceteris paribus be more successful. Nonetheless, most people distinguish between act and omission in cases like the fat man trolley problem.

Scope neglect: The scope or size of something often has little or no effect on people’s thinking when it should have. For example, when three groups of people were asked what they would pay for interventions that would affect 2,000, 20,000, or 200,000 birds, people were willing to pay roughly the same amount of money irrespective of the number of birds. While scope neglect seems clearly wrong in this (moral) decision, it is less clearly so in other areas. For example, is a flourishing posthuman civilization with 2 trillion inhabitants really twice as good as one with 1 trillion? It is not clear to me whether answering “no” should be regarded as a judgment clouded by scope neglect (caused, e.g., by our inability to imagine the two civilizations in question) or a moral judgment that is to be accepted.

Contrast effect (also see decoy effect, social comparison bias, Ariely on relativity, mere subtraction paradox, Less-is-better effect): Consider the following market of computer hard drives, from which you are to choose one.

Hard drive model	Model 1	Model 2	Model 3 (decoy)
Price	$80	$120	$130
Capacity	250GB	500GB	360GB

Generally, one wants to expend as little money as possible while maximizing capacity. In the absence of model 3, the decoy, people may be undecided between models 1 and 2. However, when model 3 is introduced into the market, it provides a new reference point. Model 2 is better than model 3 in all regards, which increases its attractiveness to people, even relative to model 1. That is, models 1 and 2 are judged by how they compare with model 3 rather than by their own features. The effect clearly exposes an instance of irrationality: the existence of model 3 doesn’t affect how model 1 compares with model 2. When applied to ethical evaluation, however, it calls into question a firmly held intrinsic moral preference for social equality and fairness. Proponents of fairness seem to assess a person’s situation by comparing it to that of Bill Gates rather than judging each person’s situation separately. Similar to how the overpriced decoy changes our evaluation of the other products, our judgments of a person’s well-being, wealth, status, etc. may be seen as irrationally depending on the well-being, wealth, status, etc. of others.
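The structure of the decoy example can be checked mechanically. The following is a minimal sketch using the prices and capacities from the table above; the `Drive` type and the `dominates` helper are names I made up for illustration.

```python
from typing import NamedTuple


class Drive(NamedTuple):
    name: str
    price: int      # US dollars, lower is better
    capacity: int   # gigabytes, higher is better


def dominates(a: Drive, b: Drive) -> bool:
    """True if a is at least as good as b on both attributes and strictly better on one."""
    no_worse = a.price <= b.price and a.capacity >= b.capacity
    strictly_better = a.price < b.price or a.capacity > b.capacity
    return no_worse and strictly_better


m1 = Drive("Model 1", 80, 250)
m2 = Drive("Model 2", 120, 500)
m3 = Drive("Model 3 (decoy)", 130, 360)

if __name__ == "__main__":
    # Model 2 beats the decoy on both price and capacity ...
    print(dominates(m2, m3))
    # ... but neither of models 1 and 2 dominates the other, with or without the decoy.
    print(dominates(m1, m2), dominates(m2, m1))
```

The comparison between models 1 and 2 uses only their own attributes, so adding or removing model 3 cannot change it. That is why a shift in preference caused by the decoy counts as irrational.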

Other examples include peak-end rule/extension neglect/evaluation by moments and average utilitarianism; negativity bias and caring more about suffering than about happiness; psychological distance and person-affecting views; status-quo bias and various population ethical views (person-affecting views, the belief that most sentient beings that already exist have lives worth living); moral credential effect; appeal to nature and social Darwinism/normative evolutionary ethics.

Acknowledgment: This work was funded by the Foundational Research Institute (now the Center on Long-Term Risk).

Decision Theory and the Irrelevance of Impossible Outcomes

On January 17, 2017 | May 6, 2025 | By Caspar | In General | 9 Comments

(This post assumes some knowledge of the decision theory of Newcomb-like scenarios.)

One problem in the decision theory of Newcomb-like scenarios (i.e. the study of whether causal, evidential or some other decision theory is true) is that even the seemingly obvious basics are fiercely debated. Newcomb’s problem seems to be fundamental and the solution obvious (to both sides), and yet scholars disagree about its resolution. If we already fail at the basics, how can we ever settle this debate?

In this post, I propose a solution. Specifically, I will introduce a very plausible general principle that decision rules should abide by. One may argue that settling on powerful general rules (like the one I will propose) must be harder than settling single examples (like Newcomb’s problem). However, this is not universally the case. In decision theory in particular, we should expect general principles to be especially convincing, because a common defense of two-boxing in Newcomb’s scenario is that Newcomb’s problem is just a weird edge case in which rationality is punished. By introducing a general principle that CDT (or, perhaps, EDT) violates, we can prove the existence of a general flaw.

Without further ado, the principle is: The decisions we make should not depend on the utilities assigned to outcomes that cannot occur. To me this principle seems obvious, and indeed it is consistent with expected value calculations in non-Newcomb-like scenarios: Imagine having to deterministically choose an action from some set A. (We will ignore mixed strategies.) The next state of the world is sampled from a set of states S via a distribution P and depends on the chosen action. We are also given a utility function U, which assigns values to pairs of a state and an action. Let a be an action and let s be a state. If P(s,a) = 0 (or P(s|a)=0 or P(s given the causal implications of a)=0 – we assume all of these to be equivalent in this non-Newcomb-like scenario), then it doesn’t matter what U(s,a) is, because in an expected value calculation, U(s,a) will always be multiplied by P(s,a)=0. That is to say, any expected value decision rule gives the same outcome regardless of U(s,a). So, expected value decision rules abide by this principle at least in non-Newcomb-like scenarios.
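As a small illustration of this argument, the sketch below implements a plain expected value rule and changes the utility of a zero-probability pair. The actions, states, probabilities and utilities are hypothetical numbers chosen only for the example, and utilities are assumed to be finite.

```python
def expected_value(action, states, P, U):
    """Sum of P(s, a) * U(s, a) over all states s."""
    return sum(P[(s, action)] * U[(s, action)] for s in states)


def best_action(actions, states, P, U):
    """Pick the action with the highest expected value."""
    return max(actions, key=lambda a: expected_value(a, states, P, U))


# Hypothetical toy problem: state s2 cannot occur after action a1.
actions = ["a1", "a2"]
states = ["s1", "s2"]
P = {("s1", "a1"): 1.0, ("s2", "a1"): 0.0,
     ("s1", "a2"): 0.5, ("s2", "a2"): 0.5}
U = {("s1", "a1"): 1.0, ("s2", "a1"): 0.0,
     ("s1", "a2"): 2.0, ("s2", "a2"): 2.0}

# Same problem, but with a huge utility on the impossible pair (s2, a1).
U_changed = dict(U)
U_changed[("s2", "a1")] = 1e9

if __name__ == "__main__":
    print(best_action(actions, states, P, U))
    print(best_action(actions, states, P, U_changed))
```

Both calls select the same action, because the changed utility is multiplied by a probability of 0 and never enters the sum.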

Let us now apply the principle to a Newcomb-like scenario, specifically to the prisoner’s dilemma played against an exact copy of yourself. Your actions are C (cooperation) and D (defection). Your opponent is the “environment” and can also choose between C and D. Writing each outcome as (your action, opponent’s action), the possible outcomes are (C,C), (C,D), (D,C) and (D,D). Because an exact copy always chooses the same action as you, the probabilities P(C,D) and P(D,C) are both 0. Applied to this Newcomb-like scenario, the principle of the irrelevance of impossible outcomes states that our decision should only depend on the utilities of (C,C) and (D,D). Evidential decision theory behaves in accordance with this principle. (I leave it as an exercise to the reader to verify this.) Indeed, I suspect that it can be shown that EDT generally abides by the principle of the irrelevance of impossible outcomes. The choice of causal decision theory, on the other hand, does depend on the utilities U(D,C) and U(C,D) of the impossible outcomes. Remember that in the prisoner’s dilemma the payoffs are such that U(D,x)>U(C,x) for any action x of the opponent, i.e. no matter the opponent’s choice it is always better to defect. This dominance is given as the justification for CDT’s decision to defect. But let us say we increase U(C,D) such that U(C,D)>U(D,D) and decrease U(D,C) such that U(D,C)<U(C,C). Of course, we must make these changes for the utility functions of both players so as to retain symmetry. After these changes, the dominance relationship is reversed: U(C,x)>U(D,x) for any action x. Of course, the new payoff matrix is not that of a prisoner’s dilemma anymore – the game is different in important ways. But when played against a copy, these differences do not seem significant, because we only changed the utilities of outcomes that were impossible to achieve anyway. Nevertheless, CDT would switch from D to C upon being presented with these changes, thus violating the principle of the irrelevance of impossible outcomes.
This is a systematic flaw in CDT: Its decisions depend on the utility of outcomes that it can already know to be impossible.
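The argument can be replayed with concrete numbers. The sketch below is only an illustration under my own assumptions: the payoff values are hypothetical, only your own utility function is modeled (the game is symmetric), and CDT is modeled as an expected value rule that holds the probability of the opponent’s action fixed regardless of your own action.

```python
ACTIONS = ("C", "D")


def edt_choice(U):
    """Against an exact copy, conditioning on my action fixes the copy's action."""
    return max(ACTIONS, key=lambda a: U[(a, a)])


def cdt_choice(U, p_opp_c):
    """Treat the copy's action as fixed: it plays C with probability p_opp_c."""
    def ev(a):
        return p_opp_c * U[(a, "C")] + (1 - p_opp_c) * U[(a, "D")]
    return max(ACTIONS, key=ev)


# Keys are (my action, opponent's action). Hypothetical payoffs.
prisoners_dilemma = {("C", "C"): 3, ("C", "D"): 0,
                     ("D", "C"): 5, ("D", "D"): 1}
# Only the impossible outcomes (C,D) and (D,C) are changed.
modified_game = {("C", "C"): 3, ("C", "D"): 2,
                 ("D", "C"): 2, ("D", "D"): 1}

if __name__ == "__main__":
    for name, game in (("PD", prisoners_dilemma), ("modified", modified_game)):
        print(name, "EDT:", edt_choice(game), "CDT:", cdt_choice(game, 0.5))
```

With these numbers, the EDT choice is the same in both games, while the CDT choice flips from D to C for any fixed belief about the opponent, since the dominance relation is reversed.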

The principle of the irrelevance of impossible outcomes can be used beyond arguing against CDT. As you may remember from my post on updatelessness, sensible decision theories will precommit to give Omega the money in the counterfactual mugging thought experiment. (If you don’t remember or haven’t read that post in the first place, this is a good time to catch up, because the following thoughts are based on the ideas from the post.) Even EDT, which ignores the utility of impossible outcomes, would self-modify in this way. However, the decision theory resulting from such self-modification violates the principle of the irrelevance of impossible outcomes. Remember that in counterfactual mugging, you give in because this was a good idea to precommit to when you didn’t yet know how the coin came up. However, once you know that the coin came up the unfavorable way, the positive outcome, which gave you the motivation to precommit, has become impossible. Of course, you only give in to counterfactual mugging if the reward in this now impossible branch is sufficiently high. For example, there is no reason to precommit to give in if you lose money in both branches. This means that once you have become updateless, you violate the principle of the irrelevance of impossible outcomes: your decision in counterfactual mugging depends on the utility you assign to an outcome that cannot happen anymore.
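Here is a minimal sketch of the ex ante calculation behind precommitment in counterfactual mugging. The cost and reward figures and the fair coin are hypothetical values chosen for illustration, and `should_precommit` is a made-up helper name.

```python
def should_precommit(cost: float, reward: float, p_favorable: float = 0.5) -> bool:
    """Compare the policies "pay when asked" and "never pay" before the coin flip.

    In the unfavorable branch Omega asks for `cost`. In the favorable branch
    Omega pays `reward`, but only to agents who would have paid.
    """
    ev_pay = p_favorable * reward - (1 - p_favorable) * cost
    ev_refuse = 0.0
    return ev_pay > ev_refuse


if __name__ == "__main__":
    print(should_precommit(cost=100, reward=10_000))  # large reward in the other branch
    print(should_precommit(cost=100, reward=50))      # reward too small to justify paying
    print(should_precommit(cost=100, reward=-10))     # losing money in both branches
```

The cost paid in the unfavorable branch is the same in all three calls. Only the reward differs, and it sits in the branch that has become impossible by the time the agent is asked to pay. That is the dependence on an impossible outcome described above.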

Acknowledgment: This work was funded by the Foundational Research Institute (now the Center on Long-Term Risk).
