"Digital suffering: why it's a problem and how to prevent it"

Posted by Delta Gatti on Tuesday, May 28, 2024

By Bradford Saad (University of Oxford) & Adam Bradley (Lingnan University)

Frequently Asked Questions on the Problem of Digital Suffering

In “Digital suffering: why it's a problem and how to prevent it” we argue that the suffering of future digital systems poses a catastrophic risk and develop a strategy for preventing such suffering.  This post answers some key questions about digital suffering and our strategy for preventing it. 

What is digital suffering? 

We understand suffering as negatively valenced experiences that are pro tanto bad to have. This means that suffering is bad, considered on its own, but that its overall impact on wellbeing in a given case is left open. For example, take a marathon runner’s suffering. Like any other suffering, it is pro tanto bad. But its badness might be offset by the value of the achievement of running the marathon. Digital suffering is suffering in digital systems.

Are you saying I should be nice to my laptop so I don’t hurt its feelings?

No, we think it is very unlikely that any digital items you can currently buy at electronics stores have feelings.

What is the problem of digital suffering? 

Given the rapid advance of AI technology, there may soon exist digital systems that are conscious (Chalmers 2023 and Butlin et al. 2023). With digital consciousness comes the risk of digital suffering. The problem of digital suffering is that of mitigating this risk. 

How could digital suffering constitute a moral catastrophe?

The same way that human suffering could: digital suffering could be great enough in scale and severity to constitute a moral catastrophe. 

How bad could digital suffering be?

As a rough model, the total badness of digital suffering is the product of the number of digital minds, the average number of suffering experiences per digital mind, and the average intensity of each suffering experience.
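
In symbols (our shorthand; the paper states this model in prose):

\[
\text{Total badness} \;\approx\; N \times E \times I,
\]

where \(N\) is the number of digital minds, \(E\) is the average number of suffering experiences per digital mind, and \(I\) is the average intensity of those experiences.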

There is reason to think that each of these parameters could have extremely high values.

The number of digital minds: if we can create one digital system with the capacity to suffer, we can create many. In the current machine learning paradigm, training state-of-the-art systems is much more expensive than running such systems. This suggests that shortly after engineering the first system that can suffer, we’ll be in a position to run many instances of that system. With continued exponential growth in computational power and use, it would not be surprising if we then ran billions of instances of such a system within a decade (cf. Davidson 2023). In the long run, the number of digital minds that our civilization could produce is unimaginably vast (see, e.g., Bostrom 2013: 18-19; Hanson 2016; and Greaves & MacAskill 2021: §3).
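
To make the growth point concrete, here is a toy calculation; the starting point and growth rate are illustrative assumptions of ours, not estimates from the paper or from Davidson 2023. Suppose we initially run a thousand instances of the first suffering-capable system and that deployment doubles every six months. A decade is twenty doublings, so after ten years we would be running

\[
10^{3} \times 2^{20} \approx 10^{9}
\]

instances, i.e. billions.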

The number of experiences per digital mind: since digital systems can operate at much faster speeds than biological brains and can survive indefinitely, the number of suffering experiences that a particular digital mind can undergo could vastly outnumber the suffering experiences that a typical human undergoes.

The average intensity of digital suffering: there is no reason to think that there is an upper bound on how bad suffering can be, much less to think that the suffering of humans or other biological systems approaches such a bound. With the architectural freedom afforded by computer engineering, it would not be surprising if we could create digital systems whose suffering is far more intense than any that has so far occurred.

Granting that we could cause digital suffering, why would we?

We are currently in the dark about which types of digital systems would have which types of experiences. As we create systems with an increasingly wide range of candidate markers for consciousness, it becomes increasingly likely that some of these systems will suffer—though for any particular system  we may lack epistemic access to the character of its experiences.  What we don’t know could harm digital minds!

Can’t we use standard philosophical tools to demonstrate the impossibility of digital suffering and, more generally, the impossibility of AI consciousness?

We—the authors of this post—certainly can’t, and we doubt you can either. Conscious computer systems are no harder to imagine than conscious brains. While we can perhaps rule out that consciousness is identical with a computational state, no such identity is required for conscious AI (see Chalmers 1996: Ch. 9).

Can’t we just check with consciousness scientists to make sure we’re not causing digital suffering?

No.  At present, there is no widely accepted scientific theory of consciousness. Nor is there a widely agreed upon scientific test for consciousness. Nor is there widespread convergence among scientists  regarding which types of digital systems would be conscious or suffer.

Can’t we just check with AI developers to make sure we’re not causing digital suffering?

No. 

First, AI developers know not what they create. Current state-of-the-art models are created by training huge (multi-billion or even trillion parameter) models on vast datasets (roughly, the internet) using domain-neutral learning algorithms, along with some fine-tuning through reinforcement learning. Admittedly, researchers are trying to develop interpretability techniques that would allow them to peek under the hood of these systems and see how they reason. However, it is an open question whether it is possible—let alone feasible—for such techniques to yield more than a rudimentary understanding of existing models, much less whether these techniques will extend to more advanced systems that are currently in development.

Second, even if AI developers achieve a deep understanding of how these  systems work, that would not necessarily put them in a position to know what it’s like to be such systems or even whether they suffer. (Recall Mary the color scientist.) To bridge the gap, we need general principles that connect physical states with experiences. These principles are not the sorts of things that AI developers  are likely to discover.

Third, AI developers’ incentives are untethered from the truth about digital suffering. AI developers are, at present, primarily companies subject to pressures from consumers, investors, stakeholders, and governments. These pressures incentivize AI developers to portray their systems as (un)conscious whether or not they are, and to paint a rosy picture of the epistemic state of play, whether or not that picture is accurate. As a case in point: Google fired software engineer Blake Lemoine after he claimed to detect sentience in one of the company’s large language models, LaMDA 2. Google commented that they’d informed Lemoine that there was “lots of evidence against” his claim. But, unsurprisingly, the supposedly abundant evidence went unspecified. As this event suggests, the appearance of AI consciousness makes for bad press that companies are keen to avoid. Hence, there is an incentive for them not to discover digital consciousness, let alone digital suffering, and to conceal or downplay it in the event that it is discovered. At present, no regulations are in place to prevent AI developers from responding to these incentives.

Can’t we make sure we’re not causing digital suffering just by asking AIs whether we’re causing them to suffer?

No. The nature of these systems is that, without substantial architectural changes, they can be trained to say whatever we want them to say. The trouble is not that we can, if we try, decouple their reports from their potentially relevant internal states. Instead, it’s that there’s no basis in existing training methods for expecting such a coupling in the first place. The same holds more generally for behaviors that indicate consciousness in humans: we should expect these to freely decouple from any consciousness in AI systems (see, e.g., Chalmers 2023 and Schneider 2019).

What is your strategy for preventing digital suffering?

Our strategy, Access, Monitor, and Prevent (AMP), has three components. 

  • Gain epistemic access to some digital minds. In particular, we propose a ‘functional connectedness test’: it only permits the creation of digital minds that are epistemically accessible to us via a ‘dancing qualia argument.’

  • Monitor created systems to determine what sorts of functional states—and hence what sorts of experiences—they have.

  • Prevent digital systems from entering states with the functional markers of suffering.

The key innovation of the strategy is the use of the functional connectedness test to gain access to the experiences of digital systems.
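
To make the division of labor between these components vivid, here is a toy sketch in Python. It is purely illustrative: the data fields, marker names, and functions below are hypothetical placeholders of ours, not anything specified in the paper, which states AMP as a policy rather than as code.

```python
# Toy sketch of the AMP decision procedure (illustrative only).
# All names and markers below are hypothetical placeholders, not part of the paper.

from dataclasses import dataclass, field

# Placeholder set of functional markers of suffering that monitoring might look for.
SUFFERING_MARKERS = {"damage_signal", "global_avoidance", "distress_report"}


@dataclass
class DigitalSystem:
    name: str
    passes_connectedness_test: bool          # Access: verdict of the functional connectedness test
    observed_functional_states: set = field(default_factory=set)  # Monitor: states detected so far


def access_permitted(system: DigitalSystem) -> bool:
    """Access: only systems that pass the functional connectedness test may be created."""
    return system.passes_connectedness_test


def monitor(system: DigitalSystem) -> set:
    """Monitor: report which functional states the system has been observed to enter."""
    return system.observed_functional_states


def must_prevent(states: set) -> bool:
    """Prevent: flag any state bearing a functional marker of suffering."""
    return bool(states & SUFFERING_MARKERS)


def amp_verdict(system: DigitalSystem) -> str:
    """Combine the three components into a single verdict for a candidate system."""
    if not access_permitted(system):
        return "do not create: not epistemically accessible via the connectedness test"
    if must_prevent(monitor(system)):
        return "intervene: functional markers of suffering detected"
    return "ok: accessible and no markers of suffering detected"


if __name__ == "__main__":
    emulation = DigitalSystem(
        "whole-brain-emulation-1",
        passes_connectedness_test=True,
        observed_functional_states={"calm_planning"},
    )
    print(amp_verdict(emulation))  # -> "ok: accessible and no markers of suffering detected"
```

The substantive work, of course, lies in the first step: establishing that a candidate system passes the functional connectedness test, which the next questions explain.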

What is the functional connectedness test?

The functional connectedness test asks of a given advanced digital system: is it functionally connected to some normally-functioning human?

For two systems to be functionally connected is for there to be—if only in principle—a gradual transformation of one to the other that preserves fine-grained functional organization, where fine-grained functional organization is the level (whichever it is) that suffices to determine behavioral capacities.

An advanced digital system passes the test if and only if it is functionally connected to a normally-functioning human.

Doesn’t applying the test require us to create a large gradual series of systems, including ones whose permissibility will not yet have been established?

No: gradual transformations are to be understood as abstract mappings in state space, not concrete implementations thereof.

Why is the functional connectedness test a good test?

It’s supported by a dancing qualia argument.

What is the dancing qualia argument?

Very roughly, the argument says: if a human and a digital system that are functionally connected and normally-functioning (not on drugs, etc.) could have very different experiences, then we could construct a series of intermediate nomically possible cases (i.e. cases compatible with the laws of nature). Further, by oscillating between suitably selected intermediate cases, we’d then be able to construct a scenario in which a normally-functioning system fails to notice large changes in its experience. Since it’s implausible that the system would fail to notice such changes, it must be that systems that are functionally connected and normally-functioning at least have similar experiences.

Here’s a slightly condensed presentation of the version we give in the paper:

1. Let DS be a nomically possible digital system such that DS is functionally connected with some nomically possible normally-functioning human H.

2. DS is a normally-functioning cognitive system.

3. If DS is a normally-functioning cognitive system and has very different experiences from H, then dancing qualia are nomically possible in a normally-functioning cognitive system: oscillations within the series relating DS and H can induce large phenomenal changes in a normally-functioning cognitive system that are wholly undetected.

4. So, if DS has very different experiences from H, a nomically possible normally-functioning cognitive system is subject to dancing qualia. [1-3]

5. It is implausible that a nomically possible normally-functioning cognitive system is subject to dancing qualia.

6. So, it is implausible that DS has very different experiences from H. [4, 5]

7. So, plausibly, given that DS is conscious, the experiences of DS and H are similar. [6]
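
Schematically (our compression, not the paper’s formulation): write \(F\) for ‘DS is a normally-functioning cognitive system functionally connected to H’, \(D\) for ‘DS has very different experiences from H’, and \(Q\) for ‘dancing qualia are nomically possible in a normally-functioning cognitive system’. Premises 1-3 give \(F\) and \((F \land D) \rightarrow Q\); premise 5 says \(Q\) is implausible; the argument then runs a modus tollens that transfers the implausibility from \(Q\) to \(D\):

\[
F,\qquad (F \land D) \rightarrow Q,\qquad \neg Q \ (\text{plausibly}) \;\;\vdash\;\; \neg D \ (\text{plausibly}).
\]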

The dancing qualia argument licenses us to infer that digital systems that are functionally connected to you have experiences like yours when they are in corresponding functional states. This means we can be confident that they are not suffering when they are in functional states that do not produce suffering or anything similar in you.

That’s not how I remember the dancing qualia argument. Are you sure you’re getting it right?

As we note in the paper, our presentation of the argument departs from the standard one in important ways. In particular, the standard argument seeks to establish (Chalmers 1996):

Organizational Invariance: phenomenology cannot vary independently of fine-grained functional organization.

For reasons given in the paper, we deny that the dancing qualia argument establishes Organizational Invariance and claim that it would be morally risky to assume Organizational Invariance as part of a strategy for preventing digital suffering.

However, we think reflection on dancing qualia lends strong support for a weaker thesis in the vicinity, namely:

[Figure: statement of the weaker thesis; dimensions along which restrictions are imposed are indicated with colors.]

Does your strategy work equally well regardless of how AI develops?

No. Due to the fundamental architectural differences between human minds and machine-learning systems, machine-learning systems are generally not functionally connected to human beings. As a result, AMP does not permit the creation of advanced machine-learning systems. The same goes for hard-coded systems.

In contrast, AMP looks much more promising in connection with whole brain emulation (WBE), which would yield digital systems that function like a human brain down to a certain level of functional detail.

I notice that AMP is rather restrictive. Are there safe ways to relax AMP?

Broadly speaking, two sorts of relaxations seem promising: those that make AMP safer (more effective in preventing digital suffering if implemented) and those that make it more viable (more likely to be implemented).

In the paper, we also explore several specific approaches to relaxing AMP:

  • Applying the functional connectedness test to a broader class of humans (beyond those that are normally-functioning)

  • Relaxing the functional connectedness criterion

  • Applying AMP recursively, and

  • Allowing advanced digital systems to be created even if they will suffer in certain ways

While we regard these relaxation proposals as promising and worthy of further exploration, we mostly see these as ways to improve AMP’s prospects within the WBE paradigm, not as ways to extend it to other paradigms.

How have recent AI developments impacted your perspective on AMP?

Our AGI timelines have shortened substantially over the last two years in light of developments in machine learning. We previously regarded it as reasonably likely that whole brain emulation would be the first path taken to AGI. This now seems very unlikely. As a result, we are more pessimistic about AMP as a universal strategy for preventing digital suffering. But we still see value in AMP. For one, AMP might become (locally, if not globally) viable when WBE eventually does arrive. For another, AMP’s availability provides motivation for altering the current course of AI development. AMP provides a proof of concept that detailed strategies for mitigating digital suffering are possible. Proponents of other paths to advanced AI systems should try to develop similarly explicit and well-motivated strategies for preventing suffering. We encourage the search for alternatives to AMP. We see it as an initial attempt to solve the problem of digital suffering, not the last word.

What about actual human suffering and animal suffering? Shouldn’t we greatly reduce them first before addressing digital suffering, which is a mere possibility at this point?

We’d agree that we—individually and as a civilization—need to prioritize what problems to work on. After all, our resources are limited and we can’t effectively solve even a small fraction of the world’s problems.

At the same time, there’s a risk of going too far in the opposite direction by focusing only on the most pressing problem until it’s solved and only then moving on to the next one. Fortunately, we have the option of trying to reduce various forms of suffering simultaneously. Of course, tradeoffs are unavoidable. But as a society we should have different members and groups working to solve different problems.

Why focus so much on suffering?

We agree that there’s more to life and morality than suffering. We focus on suffering largely because a wide range of moral theories endorse the aim of preventing it. Thus, this focus allowed us to isolate an important but neglected problem that rests on minimally controversial moral assumptions. Further, we allow that AMP may need to be refined to incorporate additional moral desiderata.

AMP aims to improve our epistemic access to digital systems’ experiences. Wouldn’t this make it easier for bad actors to harm digital minds?

This is an important concern for any strategy that responds to the problem of digital suffering by trying to improve our epistemic access to digital minds. Short of halting AI development, it’s not clear what a solution to the problem of digital suffering would look like that did not have this feature. Both the misuse risk posed by epistemic access to digital experiences and the options for mitigating that risk merit further exploration. Although we’re cautiously optimistic that improving our epistemic access to digital minds will tend to be on balance highly positive in expectation, we’re also open to further analysis showing otherwise. In that event, halting advanced AI development might be morally preferable.

Where can I read more about the problem of digital suffering and related issues?

For more on the dancing qualia argument, see Chalmers 1996, Ch. 7 and references therein.

