Nick Bostrom doesn't worry about the same things most philosophers worry about. The Oxford professor has written about the possibility that we're all living in a simulation, the likelihood that all intelligent life on Earth will become extinct in the next century, and whether or not morality can exist in an infinitely large universe. His latest book, Superintelligence, is all about what happens if and when general artificial intelligence (AI) emerges — and why that could mean a world dominated by machines.
The basic argument is simple. Many experts believe that at some point artificial intelligence will advance to where it not only exceeds human intelligence, but is capable of expanding its own intelligence, setting off an exponential "intelligence explosion." In theory, these hyper-intelligent machines could be used to serve human ends. They could cure diseases and resolve intractable scientific quandaries. In an extreme case, they could wholly replace human workers, enabling humankind to quit working and live comfortably off the robots' labor.
But the problem, Bostrom argues, is that superintelligent machines will be so much more intelligent than humans that they most likely won't remain tools. They'll become goal-driven actors in their own right, and their goals may not be compatible with those of humans. Indeed, they might not be compatible with the continued existence of humans. Please consult the Terminator franchise for more on how that situation plays out.
I talked with Bostrom about his book, why he believes superintelligence is a real possibility, and what, if anything, humans can do to avoid its worst pitfalls.
Dylan Matthews: What do you mean by superintelligence, and what is the path that you think is most plausible to get there?
Nick Bostrom: What I mean, roughly speaking, is any intellect that radically outperforms the best human minds in every field, including scientific creativity, general wisdom, and social skills.
I guess I have a bunch of different theses in the book. The basic idea is that I think at some point we will create machines that are superintelligent, and that the first machine to attain superintelligence may become extremely powerful to the point of being able to shape the future according to its preferences.
So that's step one in the argument. Step two is that it looks quite difficult to design a seed AI such that its preferences, if fully implemented, would be consistent with the survival of humans and the things we care about. Therefore, we face this great challenge of solving the control problem for advanced artificial agents, and we've got to solve it before we solve the other problem, which is how to create machines that are superintelligent in the first place.
Dylan Matthews: Artificial intelligence has existed as a field for about 60 years. It seems like we are getting somewhere with very algorithmic tasks, but for the broader intelligence that you spoke of earlier — wisdom, creativity, and so forth — progress has been slower.
What makes you think that kind of artificial intelligence is possible?
Nick Bostrom: Sixty years is not very long in the scheme of things. Human civilization has accomplished a lot of things that took much longer than 60 years to do.
We did a survey of opinions among some of the leading experts in artificial intelligence, which I report in the book. One of the questions we asked was, "By what year do you think there is a 50 percent probability that we will have human-level machine intelligence?" The median answer to that was 2040 or 2050, depending on precisely which group of experts we polled.
There is a lot of uncertainty around that, of course; it could happen much sooner or much later. But it suggests that it is a very mainstream opinion among experts that there is a real chance this may happen over the next few decades, or at least in this century.
Dylan Matthews: I suppose one possible response to that is, "Well, these are people who have devoted their lives to AI research. If they were not optimistic about the chances for progress, presumably they would have done something else."
And there are people outside of the field — I am sure you are familiar with Hubert Dreyfus's arguments about this — who think there is something fundamental about human cognition that is not model-able in that way. What do you make of those conceptual arguments about its limitations?
Nick Bostrom: It is certainly possible that this expert group is biased. I don't buy the particular argument by Dreyfus, or any of the other attempts to show that it is, in principle, impossible to create generally intelligent machines.
We have an existence proof of general intelligence. We have the human brain, which produces general intelligence. And one pathway towards machine intelligence is by figuring out how the human brain accomplishes this by studying the neural network that is inside our heads. We can, perhaps, discern what algorithms and computational architectures are used in there and then do something similar in computers. That might not be the way that we will actually first get to the goal, but in principle it is something that we could do to make incremental progress.
An even more radical approach is to literally copy a particular human brain, as in the approach of whole-brain emulation. We would not have to understand at any higher level of description how the brain produces intelligence. We would just need to understand what the components do.
It might well be that the purely artificial approach will beat these other approaches to the punch. Another consideration is that even if it were outside the reach of current human intelligence to conceive and engineer artificial intelligence (which I do not believe it is), human biological intelligence is itself something that can be enhanced, and I believe will be enhanced, perhaps in the latter half of this century. So we also need to consider that there could be these much-enhanced human scientists and computer scientists who will be able to make progress even if we were stumped.
Dylan Matthews: One of the more interesting sections of the book, I thought, was on how you think AI will wind up affecting the physical world. Walk me through how you see that process unfolding, and why, if machines get this intelligent, they'd want to get involved in the physical world in that way?
Nick Bostrom: Well, there are two different questions there. One is whether they would want to do that, and the other question is how they could do so, assuming that they did want to.
If we begin with the latter of those two questions, I think that the real danger lies not in the body of the AI, but in the mind. You could imagine different robots equipped with machine guns or rockets or stuff like that. That's not where I think the danger lies.
Suppose that we initially put it in a box, disconnected the internet cable, and allowed it to communicate only by typing text on a screen, in the hope that this would keep it confined. But in order to get any use out of it at all, we have to interact with it. At that point a huge vulnerability opens up, which is the human itself. Humans are not secure systems. Even today, mere mortal humans often succeed in manipulating other humans, and hackers often use social engineering to break into computer systems. So if we imagine a superintelligent manipulator and persuader, it seems likely it would succeed in finding a human accomplice, or otherwise using us as its arms and legs, if it didn't have direct access to physical manipulators.
Now the question as to why it might want to do this is interesting and important, and my claim would be that for a wide range of different possible goals that the superintelligence might have, achieving access to physical resources would be a convergent instrumental value. That is, access to resources is useful for achieving a very wide range of different final goals.
There are some other instrumental goals as well that would be convergent for a wide range of tasks, like making yourself smarter, preventing humans from switching you off, inventing new technologies, preserving your current goal system: these are all instrumentally useful sub-goals to a wide range of possible final goals that the superintelligence might have.
Dylan Matthews: I think, for some people, the stumbling block here is the concept of AIs wanting things. You could imagine a program that takes a formalized mathematical theorem, and then it produces a proof. If something could do that for the Riemann Hypothesis, we would all say it's an astounding accomplishment of artificial intelligence.
But it is hard to think of such a program wanting things, and having goals, and looking out for itself. How do you see that kind of capability arising? And is it possible for humans to avoid creating machines with that capability while still creating extremely intelligent machines?
Nick Bostrom: Perhaps. I call that "tool AI," and there is a subsection on it.
Today, if you had some math AI and you tried to enlist it to help you find a proof, maybe that would involve taking some lemmas and seeing whether combining them in a certain sequence would deliver the proof.
But with a more powerful general AI, you might also find different kinds of plans: for example, plans for how to allocate your computational resources, or how to obtain more computational resources that would make you more likely to discover the proof. If you start to think about how to obtain more computational resources, one of those plans might call for taking over other microprocessors so that you could instantiate your program on more hardware, and then a whole host of other things come with that.
To achieve general intelligence as opposed to domain specific intelligence, you would need a very sophisticated model about how the world works, how humans work, and how physical objects work. And once you are searching for plans in that very general world model, you might have machines that look as if they have will.
One approach that one might think would obviously be the safest bet is to try to engineer a system that would not be of this agent-like character: that is, to try to build a tool AI. It is still worth exploring that further, but it's a lot less obvious than it looks that it actually is a fruitful avenue.
For a start, you might end up with an agent even if you didn't set out to create it. So, if there are these internal processes within the system that amount to very powerful search processes for plans, it might well be that one of these internal processes will exhibit agent-like behavior, even if you didn't define the whole system to be an agent. And these agent-like processes that emerge spontaneously might then be more dangerous, because they were not put in on purpose.
Dylan Matthews: What kind of restrictions, if any, could be helpful in increasing the probability that we will develop friendly general AI rather than amoral, or even malevolent, general AI?
Nick Bostrom: I unfortunately don't see much hope at the current time, if we are thinking about regulations that would be helpful in this context. I think, at the moment, it would make more sense to try to accelerate work on the control problem, rather than trying to slow down work on AI, because there are just so many incentives for various people and companies to try to make advances in faster hardware, better understanding of how the brain works, cleverer algorithms, etc.
It is very hard to see how that would slow down to any significant degree, at least at the current level of understanding of what the problem is, whereas even fairly limited resources could do a lot to speed up our progress towards solving the control problem. There might be only half a dozen or so people working on the control problem today, whereas there are tens of thousands, if not hundreds of thousands, of people working towards general AI.
Dylan Matthews: One interesting discussion towards the end of the book is about what a world dominated by superintelligences would look like, about whether there would be one large superintelligence or a collection of them.
I think the idea that there'd only be one might be non-intuitive for a lot of people. What's the case for that outcome?
Nick Bostrom: I discuss two different classes of scenarios. There is the scenario where the transition to the machine intelligence era is very rapid and you have an intelligence explosion at some point. In that case, it is likely that there will be one superintelligence able to form a singleton: a world order where, at the highest level of decision-making, there is only one agent.
Say you go from human-level intelligence to superintelligence within four hours or two days or some very short period of time like that. In that case, it is very unlikely that there will be two or more development teams undergoing this transition in parallel. Usually, the leader is at least weeks or months ahead of the next-best project. In that case, you might have this singleton outcome, where there is one thing that will shape the future according to its preferences.
But then there is the other class of scenarios, the multipolar outcome, where you have a slower transition playing out over years or decades. In this case, there would probably be multiple systems that are roughly comparable at any given time, such that none of them is able to dictate the future. I think the singleton outcome is more likely, but we certainly cannot, at this stage, preclude the multipolar outcome either.
In the singleton outcome, everything would depend on what this AI would want. And that, in turn, would depend on whether we had succeeded in solving this very difficult control problem, which is an open question. I am hoping my book will increase the probability that we will solve it by attracting some of the brightest minds to work on this problem.
In the multipolar outcome, it might be harder for us to influence the long-term future, though various institutional arrangements could make a difference. But there is the question of whether those institutions would be stable as these digital minds proliferate and continue to evolve in competition with one another. I think in that scenario some form of global coordination may be necessary in order to avoid developments going in whichever direction the AIs are pointed, which might not be very friendly to human values.
This interview has been edited for clarity and length.