There are teams of researchers in academia and at major AI labs these days working on the problem of AI ethics, or the moral concerns raised by AI systems. These efforts tend to be especially focused on data privacy concerns and on what is known as AI bias — AI systems that, using training data with bias often built in, produce racist or sexist results, such as refusing women credit card limits they’d grant a man with identical qualifications.
There are also teams of researchers in academia and at some (though fewer) AI labs that are working on the problem of AI alignment: the risk that, as our AI systems become more powerful, our oversight methods and training approaches will become less and less adequate for the task of getting them to do what we actually want. Ultimately, we’ll have handed humanity’s future over to systems with goals and priorities we don’t understand and can no longer influence.
Today, that often means that AI ethicists and those in AI alignment are working on similar problems. Improving the understanding of the internal workings of today’s AI systems is one approach to solving AI alignment, and is crucial for understanding when and where models are being misleading or discriminatory.
And in some ways, AI alignment is just the problem of AI bias writ (terrifyingly) large: We are assigning more societal decision-making power to systems that we don’t fully understand and can’t always audit, and that lawmakers don’t know nearly well enough to effectively regulate.
As impressive as modern artificial intelligence can seem, right now those AI systems are, in a sense, “stupid.” They tend to have very narrow scope and limited computing power. To the extent they can cause harm, they mostly do so either by replicating the harms in the data sets used to train them or through deliberate misuse by bad actors.
But AI won’t stay stupid forever, because lots of people are working diligently to make it as smart as possible.
Part of what makes current AI systems limited in the dangers they pose is that they don’t have a good model of the world. Yet teams are working to train models that do have a good understanding of the world. The other reason current systems are limited is that they aren’t integrated with the levers of power in our world — but other teams are trying very hard to build AI-powered drones, bombs, factories, and precision manufacturing tools.
That dynamic — where we’re pushing ahead to make AI systems smarter and smarter, without really understanding their goals or having a good way to audit or monitor them — sets us up for disaster.
And not in the distant future, but as soon as a few decades from now. That’s why it’s crucial to have AI ethics research focused on managing the implications of modern AI, and AI alignment research focused on preparing for powerful future systems.
Not just two sides of the same coin
So do these two groups of experts charged with making AI safe actually get along?
These are two camps, and they’re two camps that sometimes stridently dislike each other.
From the perspective of people working on AI ethics, experts focusing on alignment are ignoring real problems we already experience today in favor of obsessing over future problems that might never come to be. Often, the alignment camp doesn’t even know what problems the ethics people are working on.
“Some people who work on longterm/AGI-style policy tend to ignore, minimize, or just not consider the immediate problems of AI deployment/harms,” Jack Clark, co-founder of the AI safety research lab Anthropic and former policy director at OpenAI, wrote this weekend.
From the perspective of many AI alignment people, however, lots of “ethics” work at top AI labs is basically just glorified public relations, chiefly designed so tech companies can say they’re concerned about ethics and avoid embarrassing PR snafus, while doing nothing to change the big-picture trajectory of AI development. In surveys of AI ethics experts, most say they don’t expect development practices at top companies to change to prioritize moral and societal concerns.
(To be clear, many AI alignment people also direct this complaint at others in the alignment camp. Lots of people are working on making AI systems more powerful and more dangerous, with various justifications for how this helps learn how to make them safer. From a more pessimistic perspective, nearly all AI ethics, AI safety, and AI alignment work is really just work on building more powerful AIs — but with better PR.)
Many AI ethics researchers, for their part, say they’d love to do more but are stymied by corporate cultures that don’t take them very seriously and don’t treat their work as a key technical priority, as former Google AI ethics researcher Meredith Whittaker noted in a tweet:
“I have an AI ethics joke but it has to be approved by PR, legal, and our partners in the Department of Defense before I can tell it.” (Meredith Whittaker, @mer__edith, July 26, 2020)
A healthier AI ecosystem
The AI ethics/AI alignment battle doesn’t have to exist. After all, climate researchers studying the present-day effects of warming don’t tend to bitterly condemn climate researchers studying long-term effects, and researchers working on projecting the worst-case scenarios don’t tend to claim that anyone working on heat waves today is wasting time.
You could easily imagine a world where the AI field was similar — and much healthier for it.
Why isn’t that the world we’re in?
My instinct is that the AI infighting is related to the very limited public understanding of what’s happening with artificial intelligence. When public attention and resources feel scarce, people find wrongheaded projects threatening — after all, those other projects are getting engagement that comes at the expense of their own.
Lots of people — even lots of AI researchers — do not take concerns about the safety impacts of their work very seriously.
“At the different large-scale labs (where large-scale = multiple thousands of GPUs), there are different opinions among leadership on how important safety is. Some people care about safety a lot, some people barely care about it. If safety issues turn out to be real, uh oh!” (Jack Clark, @jackclarkSF, August 6, 2022)
Sometimes leaders dismiss long-term safety concerns out of a sincere conviction that AI will be very good for the world, so the moral thing to do is to speed full ahead on development.
Sometimes it’s out of the conviction that AI isn’t going to be transformative at all, at least not in our lifetimes, and so there’s no need for all this fuss.
Sometimes, though, it’s out of cynicism — experts know how powerful AI is likely to be, and they don’t want oversight or accountability because they think they’re superior to any institution that would hold them accountable.
The public is only dimly aware that experts have serious safety concerns about advanced AI systems, and most people have no idea which projects are priorities for long-term AI alignment success, which are concerns related to AI bias, and what exactly AI ethicists do all day, anyway. Internally, AI ethics people are often siloed and isolated at the organizations where they work, and have to battle just to get their colleagues to take their work seriously.
It’s these big-picture gaps with AI as a field that, in my view, drive most of the divides between short-term and long-term AI safety researchers. In a healthy field, there’s plenty of room for people to work on different problems.
But in a field struggling to define itself and fearing it’s not positioned to achieve anything at all? Not so much.
A version of this story was initially published in the Future Perfect newsletter.