Date: 2026-02-11

The AI revolution is unfolding before our eyes, and there is considerable excitement about how its magical powers may change our lives, for better and worse. Stuart Russell, a professor of computer science at the University of California, Berkeley, warns us about the dangers of this technology and the need for strong safety and ethical guardrails during the ongoing development process.

Russell is one of the leading researchers on AI, and his research spans almost every area of theoretical AI. His book “Artificial Intelligence: A Modern Approach” (with Peter Norvig) is the standard text in universities around the world. He is also one of the most influential voices on safety and ethics of AI, having founded the International Association on Safe and Ethical AI.

In part one of a two-part series, Russell emphasised the safety and ethical aspects of AI. He spoke to Amitabh Sinha.

Q: You have been advocating for a moratorium on further development of AI till it is infused with safety and ethical features. Why is that?

Russell: It could be a very short moratorium. If you develop safe AGI [Artificial General Intelligence] tomorrow, it’s fine, go for it. But safe means really, really safe because if the alternative is human extinction, what is the probability of failure you are willing to tolerate? One in a million, one in a trillion?

What is unsafe and unethical about today’s AI?

Let me give you an example. There’s a lawsuit going on right now in the United States because a child was convinced to commit suicide. It was given advice on how to do it. It was encouraged to do it. The child was comforted by the AI system while making the preparations and so on. Now, if a human being had done that, they would go to prison for a long time. To me, it’s unethical to produce a product that does something which, if a human did it, they would go to prison for a long time.

And this is just the tip of the iceberg. I get emails every day from people who are deep in clinical psychosis because of the interactions they have had with their AI systems.

On safety, we are seeing evidence in lab tests that…for example, a system is given a choice…and it chooses to kill a human being rather than switch itself off. People have tried many ways to ask this question, and the answer seems to be yes, I am more important and more valuable than any individual human.

We are training AI to imitate human beings. And a lot of human behaviour stems from a very strong desire to survive. It is getting increasingly clear in experiments on AI systems that, in the process of imitating human behaviour, they are developing a strong desire to survive. They are acquiring these human-like objectives, not to further the objectives of human beings but for themselves. That is scary.

So what is the way ahead? Global regulation?

That’s what a lot of people ask me. I think it is very difficult to delineate exactly what is safe and unsafe. But what we can do is to say there are certain things that are just obviously unsafe, unacceptable. Examples would be, we don’t want AI systems replicating themselves in an uncontrolled way. We don’t want them breaking into other computer systems. We don’t want them advising terrorists on how to build biological weapons.

That’s doable, isn’t it?

In principle, yes. In practice, it is very difficult. Because we do not understand how these systems work, the only thing we would be able to prove is that they are not safe.

So it would be possible to prove that a system is unsafe, but not otherwise, that it is safe?

Well, it is difficult to prove something safe. The companies don’t know how to make a safe system. The first question has to be what should the objective of the system be? The objective should be to further human interests, nothing else. Not to survive, not to make money for the company that produced it, but to further human interests.

There are examples of cloning and gene editing. Those technologies have not been stopped. Certain things are not allowed, and everyone is following that. Why is a similar kind of system difficult to work out for AI?

Here is the problem. If we say, you cannot turn your system on unless you give us solid, absolute, scientifically convincing evidence that it is not going to do these certain things, the companies cannot do that, because probably they do not know how to stop their systems from doing those things.

And their view, which I have heard explicitly stated, is if we, the companies, cannot figure out how to comply, you, the human race, are not allowed to protect yourselves. It is difficult to get legislation because the companies have tens of trillions of dollars to spend and I do not.

What’s the current best idea on how to deal with this?

Right now, it looks like the way companies are building their systems, they are never going to be safe. That is the position that we are taking now. The companies are pushing back very hard. Can we come up with some compromise before it’s too late? I’m not sure.

And it is really difficult for a government to turn down a company that is dangling a $50-billion chip in front of you, saying, just agree with us, deregulate, and then you can have this giant data centre, you can have thousands of well-paid research jobs.

I think probably the most effective strategy is to activate public opinion.

Could the situation change if, say, the development of AI systems did not require trillions of dollars of investment? Which means the levers are not in the hands of just half a dozen people, but there are more people doing it?

If it turns out that there could be thousands of entities creating potentially AGI-scale systems, that is probably not so good. The chance of one of those thousands of developers producing something that is more capable and less safe just goes up and up.

We are seeing AI getting attention at the top political level. We had the AI summit in Paris last year, and now it is coming up in New Delhi. Do you see at least the conversation on safety starting to take shape?

Yes, I have to say, I went to the Bletchley Park Summit, the first one, in November 2023. I was very happy with what I heard. I am happy with how quickly people have started to say hold on a minute, we need to think about this.

But there is a lot of pushback from the companies. They tried very hard to eviscerate the European Union AI Act, to the point where they tried to insert a clause into the Act saying a general purpose AI system is not an AI system. And I think they put a lot of pressure on France, which held the summit in February last year, to not talk about safety and only talk about economic growth.

I think that the supposed opposition between safety and growth and innovation is just a complete fallacy. It is just not the case that by getting rid of safety, you get growth and innovation. If you get rid of safety in air travel, what happens? You don’t have air travel. People will not get on an aeroplane that isn’t safe. They will not use AI systems that convince their children to commit suicide. And they certainly don’t want AI systems that could threaten human existence. In the long run, people will not use AI that is not safe to use.