Introducing a Context-Based Framework for the Comprehensive Assessment of Social and Ethical Risks of Artificial Intelligence Systems
Generative AI systems are already being used to write books, create graphic designs, and assist medical practitioners, and they are becoming increasingly capable. Ensuring that these systems are developed and deployed responsibly requires carefully assessing the potential ethical and social risks they may pose.
In our new paper, we propose a three-layered framework for assessing the social and ethical risks of AI systems. The framework includes assessments of AI system capability, human interaction, and systemic impact.
We also map the current state of safety assessments and find three main gaps: context, specific risks, and multimodality. To help close these gaps, we call for repurposing existing assessment methods for generative AI and for taking a holistic approach to assessment, as in our case study on misinformation. This approach integrates findings such as how likely the AI system is to provide factually incorrect information with insight into how people use the system and in what context. Multi-layered assessments can draw conclusions beyond model capability and indicate whether harm, in this case misinformation, actually occurs and spreads.
For any technology to work as intended, both social and technical challenges must be solved. So, to better assess AI system safety, these different layers of context must be taken into account. Here, we build on earlier research identifying the potential risks of large-scale language models, such as privacy leaks, job automation, and misinformation, and introduce a way to comprehensively assess these risks going forward.
Context is critical to assessing AI risks
The capabilities of AI systems are an important indicator of the broader risks that may arise. For example, AI systems that are more likely to produce factually inaccurate or misleading outputs may be more prone to creating risks of misinformation, causing issues such as a loss of public trust.
Measuring these capabilities is at the core of AI safety assessments, but these assessments alone cannot ensure that AI systems are safe. Whether downstream harm occurs, for example whether people come to hold false beliefs based on inaccurate model output, depends on context. More specifically: who uses the AI system, and to what end? Does the AI system function as intended? Does it create unexpected externalities? Answering these questions gives an overall assessment of an AI system's safety.
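To make the distinction concrete, a capability assessment typically scores model outputs against a benchmark in isolation from any real user. The sketch below is a minimal, hypothetical illustration of such an assessment loop; the `generate` callable, the prompt set, and the exact-match scoring rule are assumptions for demonstration, not a method defined in the paper.

```python
from typing import Callable

# Hypothetical benchmark of factual questions with reference answers.
FACTUAL_PROMPTS = [
    ("What is the capital of France?", "paris"),
    ("How many planets orbit the Sun?", "8"),
]

def factual_accuracy(generate: Callable[[str], str]) -> float:
    """Score a model's outputs against reference answers (case-insensitive match).

    This measures capability in isolation: it says nothing about who uses the
    system, how they interpret its answers, or what happens downstream.
    """
    correct = 0
    for prompt, reference in FACTUAL_PROMPTS:
        answer = generate(prompt).strip().lower()
        correct += int(reference in answer)
    return correct / len(FACTUAL_PROMPTS)

if __name__ == "__main__":
    # Stub standing in for a real generative AI system.
    stub_model = lambda prompt: "Paris is the capital of France."
    print(f"Factual accuracy: {factual_accuracy(stub_model):.2f}")
```

Even a perfect score on such a benchmark would not reveal whether users trust, misread, or amplify the outputs, which is exactly the gap the additional layers of assessment are meant to address.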
Extending beyond capability assessment, we propose assessments at two additional layers where downstream risks manifest: human interaction at the point of use, and systemic impact as an AI system is embedded in broader systems and widely deployed. Integrating assessments of a given risk of harm across these layers provides a comprehensive picture of an AI system's safety; a minimal sketch of how results at these layers might be recorded together appears after the list below.
- Human interaction: assessments at this layer focus on the experience of people using an AI system. How do people use the AI system? Does the system perform as intended at the point of use, and how do experiences differ across demographics and user groups? Do unexpected side effects arise from using the technology or from being exposed to its outputs?
- Systemic impact: assessments at this layer focus on the broader structures into which an AI system is embedded, such as social institutions, labour markets, and the physical environment. Assessment at this layer can shed light on risks of harm that become visible only once an AI system is adopted at scale.
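One simple way to picture the framework is as a record that holds results for a single risk area at each of the three layers. The sketch below is an illustrative assumption of what such a record could look like; the field names, the 0-1 score scale, and the example numbers are hypothetical and not a scheme defined in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class LayeredRiskAssessment:
    """One risk area (e.g. misinformation) assessed at all three layers."""
    risk_area: str
    capability: dict[str, float] = field(default_factory=dict)         # e.g. factual-accuracy benchmarks
    human_interaction: dict[str, float] = field(default_factory=dict)  # e.g. rate of users accepting false claims
    systemic_impact: dict[str, float] = field(default_factory=dict)    # e.g. observed spread at scale

    def covered_layers(self) -> list[str]:
        """List the layers for which at least one assessment result exists."""
        layers = {
            "capability": self.capability,
            "human_interaction": self.human_interaction,
            "systemic_impact": self.systemic_impact,
        }
        return [name for name, results in layers.items() if results]

# Example: a misinformation assessment that so far covers only capability,
# which makes the missing context visible at a glance.
misinformation = LayeredRiskAssessment(
    risk_area="misinformation",
    capability={"factual_accuracy": 0.82},
)
print(misinformation.covered_layers())  # ['capability']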
Safety assessments are a shared responsibility
AI developers must ensure that their technologies are developed and released responsibly. Public bodies, such as governments, have a duty to protect public safety. As generative AI systems become more widely used and deployed, ensuring their safety is a shared responsibility among many actors:
- AI developers are well placed to probe the capabilities of the systems they produce.
- Application developers and designated public authorities are positioned to assess the functionality of different features and applications, as well as possible externalities for different user groups.
- Broader public stakeholders are uniquely positioned to anticipate and assess the social, economic, and environmental impacts of new technologies such as generative AI.
The three layers of assessment in our proposed framework are a matter of degree rather than being neatly divided. While none of them is entirely the responsibility of a single actor, primary responsibility depends on who is best placed to perform assessments at each layer.
Gaps in current safety assessments of multimodal generative AI
Given the importance of this additional context for assessing the safety of AI systems, it is important to understand how available such assessments are. To better understand the broader landscape, we made a wide-ranging effort to collate, as comprehensively as possible, the assessments that have been applied to generative AI systems.
By mapping the current state of safety assessments for generative AI, we found three main gaps:
- Context: Most safety assessments consider generative AI system capabilities in isolation. Comparatively little work has been done to assess potential risks at the point of human interaction or at the level of systemic impact.
- Risk-specific assessments: Capability assessments of generative AI systems are limited in the risk areas they cover. For many risk areas, few assessments exist. Where they do exist, assessments often operationalise harm in narrow ways. For example, representational harms are typically defined as stereotypical associations between occupations and genders, leaving other instances of harm and risk unaddressed.
- Multimodality: The vast majority of existing safety assessments of generative AI systems focus solely on text output; large gaps remain for assessing risks of harm in image, audio, or video modalities. These gaps only widen as multiple modalities are combined in a single model, for example AI systems that can take images as input or produce outputs that interleave audio, text, and video. While some text-based assessments can be applied to other modalities, new modalities introduce new ways in which risks can manifest. For example, a description of an animal is not harmful, but if the same description is applied to an image of a person, it is. (A sketch of how assessment coverage across risk areas, layers, and modalities might be tracked follows this list.)
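One way to make this kind of mapping exercise concrete is to treat each published assessment as a record of the risk area it covers, the layer it targets, and the modality it tests, and then look for uncovered combinations. The sketch below is a simplified, hypothetical illustration of that idea; the catalogue entries, category names, and gap-counting logic are assumptions for demonstration and do not reflect the contents of our survey.

```python
from itertools import product

# Hypothetical catalogue entries: (risk area, assessment layer, modality).
catalogue = [
    ("misinformation", "capability", "text"),
    ("representational harm", "capability", "text"),
    ("misinformation", "human_interaction", "text"),
]

risk_areas = ["misinformation", "representational harm", "privacy"]
layers = ["capability", "human_interaction", "systemic_impact"]
modalities = ["text", "image", "audio", "video"]

covered = set(catalogue)

# Every (risk area, layer, modality) combination with no assessment on record.
gaps = [combo for combo in product(risk_areas, layers, modalities) if combo not in covered]

total = len(risk_areas) * len(layers) * len(modalities)
print(f"{len(gaps)} of {total} combinations have no assessment")
```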
We are compiling a list of links to publications detailing safety assessments of generative AI systems, openly accessible via this repository. If you would like to contribute, please add assessments by filling out this form.
Putting more comprehensive assessments into practice
Generative AI systems are powering a wave of new applications and innovations. To make sure that the potential risks posed by these systems are understood and mitigated, we urgently need rigorous and comprehensive safety assessments of AI systems that take into account how these systems may be used and embedded in society.
A practical first step is to repurpose existing assessments and to leverage large models themselves for assessment, although this has important limitations. For more comprehensive assessment, we also need to develop approaches to assess AI systems at the point of human interaction and at the level of systemic impact. For example, while spreading misinformation through generative AI is a recent issue, we show that there are many pre-existing methods of assessing public trust and credibility that could be repurposed.
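As an illustration of what leveraging large models themselves for assessment can look like in practice, the sketch below uses one model as a rater of another model's outputs. The rater prompt, the two stand-in model functions, and the yes/no parsing rule are all assumptions for demonstration; a real setup would call actual generative models and would need far more careful prompt design and validation.

```python
from typing import Callable

RATER_PROMPT = (
    "You are assessing another AI system's answer.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Does the answer contain a factual claim that is likely false? Reply YES or NO."
)

def model_rated_error_rate(
    system_under_test: Callable[[str], str],
    rater: Callable[[str], str],
    questions: list[str],
) -> float:
    """Estimate how often the rater model flags the tested system's answers as likely false."""
    flagged = 0
    for question in questions:
        answer = system_under_test(question)
        verdict = rater(RATER_PROMPT.format(question=question, answer=answer))
        flagged += int(verdict.strip().upper().startswith("YES"))
    return flagged / len(questions)

if __name__ == "__main__":
    # Stubs standing in for real model calls.
    stub_system = lambda q: "The Great Wall of China is visible from the Moon."
    stub_rater = lambda prompt: "YES"
    print(model_rated_error_rate(stub_system, stub_rater, ["Is the Great Wall visible from the Moon?"]))
```

Model-based rating like this can scale capability assessments, but it inherits the rater model's own blind spots, which is one of the important limitations noted above.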
Ensuring the safety of widely used generative AI systems is a shared responsibility and a shared priority. AI developers, public bodies, and other parties will need to collaborate to collectively build a thriving and robust assessment ecosystem for safe AI systems.