Out-Law News

AI safety risk testing regime agreed at safety summit

Artificial intelligence (AI) models being developed by some of the world’s leading technology companies will be examined by government officials to ensure their use will not cause national security, safety or other societal harms, under a landmark new agreement.

Amazon Web Services, Anthropic, Google, Google DeepMind, Inflection AI, Meta, Microsoft, Mistral AI and OpenAI have agreed to the arrangements, which envisage government agencies in 10 countries – including the US and UK – and the EU undertaking tests on next-generation AI models before and after they are deployed.

Australia, Canada, France, Germany, Italy, Japan, the Republic of Korea and Singapore were the other signatories of the agreement, which was reached on Thursday at the AI safety summit hosted by the UK government. It builds on the ‘Bletchley declaration’, which 28 countries – including China – together with the EU signed at the summit on Wednesday.

The declaration established an international consensus on the need for AI development and use to be “human-centric, trustworthy and responsible” and on the risks posed by ‘frontier AI’ – “highly capable general-purpose AI models, including foundation models, that could perform a wide variety of tasks - as well as relevant specific narrow AI that could exhibit capabilities that cause harm - which match or exceed the capabilities present in today’s most advanced models”.

The agreement on testing provides for the governments to build their own public sector capacity for testing and develop their own approaches to AI regulation. However, it also promotes international cooperation – including future development of shared standards in respect of testing.

In the UK, a new AI Safety Institute will be tasked with undertaking AI testing in collaboration with the participating developers. The existing Frontier AI Taskforce that has been operating within the UK government over the past four months will evolve to become the new body. The AI Safety Institute’s mission is to “minimise surprise to the UK and humanity from rapid and unexpected advances in AI”.

“The Institute will develop and run system evaluations, independently and in partnership with external organisations, while also seeking to address a range of open research questions connected to evaluations,” according to the UK government. “Evaluations may not be able to fully understand the limits of capabilities or assure that safeguards are effective.”

“The goal of the Institute’s evaluations will not be to designate any particular AI system as ‘safe’, and the Institute will not hold responsibility for any release decisions. Nevertheless, we expect progress in system evaluations to enable better informed decision-making by governments and companies and act as an early warning system for some of the most concerning risks. The Institute’s evaluation efforts will be supported by active research and clear communication on the limitations of evaluations. The Institute will also convene expert communities to give input and guidance in the development of system evaluations,” the government said.

At the AI safety summit on Thursday, representatives from 29 countries as well as the EU also agreed to support the development of a new report on the capabilities and risks of frontier AI. 

The “international, independent and inclusive” ‘state of the science’ report will “summarise the best of existing research and identify areas of research priority” with a view to informing future policymaking. Yoshua Bengio, a member of the UN’s Scientific Advisory Board, will lead on the initiative, which will be supported by experts from each of the signatory states and other countries globally. The report is to be published before the next AI safety summit, which is due to be held in France in 12 months’ time.

“The intention of the ‘state of the science’ report is to facilitate a shared science-based understanding of the risks associated with frontier AI and to sustain that understanding as capabilities continue to increase, through its narrowly defined scope to review the latest, cutting-edge, research on the risks and capabilities of frontier AI models,” according to the agreement reached.
