Suicide as wireheading

In my last post, you learned about wireheading. In this post, I’ll bring to attention a specific example of wireheading: suicide or, as one would call it for robots, self-destruction. What differentiates self-destruction from more commonly discussed forms of wireheading is that it does not lead to a pleasurable or very positive internal state. But like them, it is a measure that does not solve problems in the external world so much as it changes one’s state of mind (into nothingness).

Let’s consider a reinforcement learner with an internal module that generates rewards between -1 and 1. There will probably be situations in which the reinforcement learner expects to receive mainly negative rewards in the future. Assuming that zero utility is assigned to nonexistence (similar to how humans think of the states prior to their existence, or phases of sleep, as neutral to their hedonic well-being), a reinforcement learner may well want to end its existence to increase its utility. It should be noted that typical reinforcement learners as studied in the AI literature don’t have a concept of death. However, a reinforcement learner that is meant to work in the real world would have to think about its death in some way (which may be completely confused by default). One example view is that the “reinforcement learner” actually has a utility function that computes utility as the sum of the outputs of its physical reward module. In this case, suicide reduces the number of outputs of the reward module, which can increase expected utility if the remaining rewards are expected to be negative on balance. So we have another example of potentially rational wireheading.
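
To make this concrete, here is a minimal sketch, assuming such a sum-of-rewards utility function and illustrative (made-up) expected rewards; the function names and numbers are only for the example:

```python
# Minimal sketch (illustrative numbers): an agent whose utility is the sum of
# its reward module's future outputs compares continuing to exist with
# self-destruction, which cuts the sum short and is valued at zero.

def utility_of_continuing(expected_rewards):
    """Expected utility of staying alive: the sum of expected future rewards."""
    return sum(expected_rewards)

def utility_of_suicide():
    """After self-destruction the reward module produces no further outputs,
    so the remaining sum is zero (nonexistence assigned zero utility)."""
    return 0.0

# An agent that mostly expects negative rewards from here on ...
expected_rewards = [-0.8, -0.5, 0.2, -0.9, -0.6]

# ... maximizes this notion of utility by self-destructing.
if utility_of_suicide() > utility_of_continuing(expected_rewards):
    print("Self-destruction maximizes expected utility here:", utility_of_suicide())
```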

However, for many agents, self-destruction is irrational. For example, if my goal is to reduce suffering, I may feel bad once I learn about instances of extreme suffering and about how much suffering there is in the world. Killing myself would end my bad feelings, but it would also prevent me from achieving my goal of reducing suffering. It is therefore irrational given my goals.

The actual motivations of real-world people who commit suicide are usually a lot more complicated. Many instances of suicide do seem to be about escaping bad mental states. But many people also attempt to use their death to achieve goals in the outside world, the most prominent example being seppuku (or harakiri), in which suicide is performed to maximize one’s reputation or honor.

As a final note, looking at suicide through the lens of wireheading provides one way of explaining why so many beings who live very bad lives don’t commit suicide. If an animal’s goals are things like survival, health, and mating that correlate with reproductive success, the animal can’t achieve its goals by suicide. Even if the animal expects its life to continue in an extremely bad and unsuccessful way with near certainty, it behaves rationally if it continues to try to survive and reproduce rather than alleviate its own suffering. Avoiding pain is only one of the animal’s goals, and if intelligent, rational agents are to be prevented from committing suicide in a Darwinian environment, reducing their own pain had better not be the dominating consideration. In sum, the fact that an agent does not commit suicide tells you little about the state of its well-being if it has goals about the outside world, which we should expect to be the case for most evolved beings.

2 thoughts on “Suicide as wireheading”

  1. > It should be noted that typical reinforcement learners as studied in the AI literature don’t have a concept of death.

    Yeah, and probably neither do most animals. Of course, one could hard-code this into an RL agent by creating a new action (suicide) that leads to a new state (death), and assign that state a certain reward of 0 and make it an absorbing state.
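
    A minimal sketch of that construction, assuming a tiny two-state MDP with hypothetical reward values (staying alive costs -0.5 per step, the death state is absorbing with reward 0) and a discount factor of 0.9, solved by value iteration:

    ```python
    import numpy as np

    # Tiny MDP sketch (hypothetical values): states "alive" and "dead"; the
    # extra action "suicide" leads to an absorbing death state with reward 0.
    ALIVE, DEAD = 0, 1
    STAY, SUICIDE = 0, 1
    gamma = 0.9  # discount factor

    # Immediate rewards r[state, action]; staying alive yields -0.5 per step.
    r = np.array([[-0.5, 0.0],   # alive: stay -> -0.5, suicide -> 0
                  [ 0.0, 0.0]])  # dead: absorbing, always 0

    # Deterministic transitions next_state[state, action].
    next_state = np.array([[ALIVE, DEAD],
                           [DEAD,  DEAD]])

    # Value iteration.
    V = np.zeros(2)
    for _ in range(100):
        V = np.array([max(r[s, a] + gamma * V[next_state[s, a]]
                          for a in (STAY, SUICIDE))
                      for s in (ALIVE, DEAD)])

    print(V[ALIVE])  # 0.0: with persistently negative rewards, the suicide action is optimal
    ```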

    > Even if the animal expects its life to continue in an extremely bad and unsuccessful way with near certainty, it behaves rationally if it continues to try to survive and reproduce rather than alleviate its own suffering.

    Yeah, but this rationality isn’t due to long chains of reasoning by animals about what best accomplishes their goals. Rather, evolution hard-codes this rationality into animals via their wills to live, etc.

    1. > Of course, one could hard-code this into an RL agent by creating a new action (suicide) that leads to a new state (death), and assign that state a certain reward of 0 and make it an absorbing state.

      Yes, that would make them consider one particular kind of suicide (if you tell the AI in advance; you can’t “explore” suicide), but the AI would still not get what happens when it is run over by a train.

      > Yeah, but this rationality isn’t due to long chains of reasoning by animals about what best accomplishes their goals. Rather, evolution hard-codes this rationality into animals via their wills to live, etc.

      Sure! 🙂 (A will to live would still be a statement about goals, I guess, but they may have hard-coded rules against the kinds of things that could kill them.)
