OpenAI outlines AI safety plan, allowing board to reverse CEO decisions

OpenAI released guidelines on how the ChatGPT maker plans to deal with extreme risks from its most powerful AI systems. PHOTO: REUTERS

SAN FRANCISCO – Artificial intelligence (AI) company OpenAI said its board can choose to hold back the release of an AI model even if the company’s leadership has deemed it safe, another sign of the start-up empowering its directors to bolster safeguards for developing the cutting-edge technology.

The arrangement was spelled out in a set of guidelines released on Dec 18 on how the ChatGPT maker plans to deal with what it may deem to be extreme risks from its most powerful AI systems.

The release of the guidelines also follows a period of turmoil at OpenAI after chief executive Sam Altman was briefly ousted by the board, putting a spotlight on the balance of power between directors and the company’s C-suite.

The Microsoft-backed OpenAI will deploy its latest technology only if it is deemed safe in specific areas such as cyber security and nuclear threats. The company is also creating an advisory group to review safety reports and send them to the company’s executives and board. While executives will make decisions, the board can reverse those decisions.

Since OpenAI launched ChatGPT a year ago, the potential dangers of AI have been top of mind for both AI researchers and the general public.

Generative AI technology has dazzled users with its ability to write poetry and essays, but also sparked safety concerns with its potential to spread disinformation and manipulate humans.

OpenAI’s recently announced “preparedness” team said it will continuously evaluate its AI systems to figure out how they fare across four different categories – including potential cyber security issues as well as chemical, nuclear and biological threats – and work to lessen any hazards the technology appears to pose.

Specifically, the company is monitoring for what it calls “catastrophic” risks, which it defines in the guidelines as “any risk which could result in hundreds of billions of dollars in economic damage or lead to the severe harm or death of many individuals”.

Mr Aleksander Madry, who is leading the preparedness group and is on leave from a faculty position at the Massachusetts Institute of Technology, told Bloomberg News his team will send a monthly report to a new internal safety advisory group. That group will then analyse Mr Madry’s team’s work and send recommendations to Mr Altman and the company’s board, which was overhauled after ousting the CEO.

Mr Altman and his leadership team can make a decision about whether to release a new AI system based on these reports, but the board has the right to reverse that decision.

OpenAI announced the formation of the “preparedness” team in October, making it one of three separate groups overseeing AI safety at the start-up.

The other two are “safety systems”, which looks at current products such as GPT-4, and “superalignment”, which focuses on extremely powerful – and hypothetical – AI systems that may exist in the future.

Mr Madry said his team will repeatedly evaluate OpenAI’s most advanced, unreleased AI models, rating them “low”, “medium”, “high”, or “critical” for different types of perceived risks. The team will also make changes in hopes of reducing potential dangers it spots in AI and measure their effectiveness. OpenAI will only roll out models that are rated “medium” or “low”, according to the new guidelines.
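As described, the deployment gate amounts to a simple rule: a model can be rolled out only if its rating in every tracked risk category is “medium” or lower. The following is a minimal sketch of that rule in Python; the category names, function names and ordered scale are illustrative assumptions for clarity, not OpenAI’s actual implementation.

# Hypothetical sketch of the deployment gate described in the guidelines.
# All identifiers and category names here are illustrative assumptions.

RISK_LEVELS = ["low", "medium", "high", "critical"]  # ordered from least to most severe
DEPLOYMENT_THRESHOLD = "medium"  # only "medium" or "low" ratings may be rolled out

def can_deploy(ratings: dict[str, str]) -> bool:
    """Return True only if every risk category is rated at or below the threshold."""
    threshold_index = RISK_LEVELS.index(DEPLOYMENT_THRESHOLD)
    return all(RISK_LEVELS.index(level) <= threshold_index for level in ratings.values())

# Example: a "high" rating in any single category would hold the model back.
example_ratings = {
    "cyber_security": "low",
    "chemical_and_biological": "medium",
    "nuclear": "high",
}
print(can_deploy(example_ratings))  # False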

“AI is not something that just happens to us that might be good or bad,” Mr Madry said. “It’s something we’re shaping.”

Mr Madry said he hopes other companies will use OpenAI’s guidelines to evaluate potential risks from their AI models as well.

The guidelines, he added, are a formalisation of many processes OpenAI followed previously when evaluating AI technology it has already released. He and his team came up with the details over the past couple of months, he said, and got feedback from others within OpenAI.

In April, a group of AI industry leaders and experts signed an open letter calling for a six-month pause in developing systems more powerful than OpenAI’s GPT-4, citing potential risks to society.

A May Reuters/Ipsos poll found that more than two-thirds of Americans are concerned about the possible negative effects of AI and 61 per cent believe it could threaten civilisation. BLOOMBERG, REUTERS
