title-long: "The Science of AI: Understanding and Examining Systems"
title-short: Understanding AI Systems
document-id: understanding
tags:
  - research
  - openness
  - evaluation
  - audits
  - journalism
# abstract in text format
abstract: >
  Governance of AI systems requires a sufficient understanding of the technology,
  including its strengths, weaknesses, and risks, and the trade-offs between different
  interests inherent in its development choices. This understanding depends on a
  sufficiently open research ecosystem to support broad public awareness and
  robust auditing practices.
# introduction and sections in HTML format
introduction: >
  <p>
  As AI becomes ever more ubiquitous, more builders and affected
  stakeholders need to understand how it works, what it can and cannot do, what trade-offs
  are involved in developing the technology, and how it can be leveraged or improved in
  particular contexts. This requires sufficient visibility and a thriving research ecosystem
  that is inclusive of perspectives beyond those of the developers working within the
  best-resourced companies.
  </p>
  <p>
  Making informed decisions about AI systems requires understanding how the technology works,
  what development choices are available to meet certain goals, and how those choices trade off
  different priorities. Approaching AI as a science means upholding scientific integrity, which includes
  reproducibility, verifiability, and increasing the breadth of people who can use the technology
  and contribute to scientific development.
  </p>
sections:
  - section-title: Open Research, Transparency, and Replicability
    section-text: >
      <p>
      In order to properly use and govern AI systems, we need answers to a range of questions:
      <ul>
      <li>How reliably can a system fulfil specific tasks?</li>
      <li>Does the system's performance differ significantly for different groups of people?</li>
      <li>Can such differences lead to systematic discrimination?</li>
      <li>Whose work and data contribute to the performance of a system?</li>
      <li>What is the environmental cost of training and deploying a system?</li>
      </ul>
      All of these questions are the subject of ongoing research. In order to be reliable, this
      research needs to meet basic scientific values, including <strong>replicability</strong>.
      Given the importance of framing in research and the potential tensions between the interests
      of different stakeholder groups, access to AI systems for external stakeholders should also be sufficient
      to allow research by <strong>independent experts</strong> that is both <strong>multidisciplinary</strong>
      and representative of all groups affected by the technology.
      </p>
  - section-title: Science and Pitfalls of AI Evaluation
    section-text: >
      <p>
      The science of evaluating AI systems in particular remains underdeveloped.
      Systems accessed through APIs are routinely evaluated with unspecified software additions, and sometimes without an exact version tag.
      The phenomenon known as "benchmark contamination" is endemic to modern systems, and impossible to quantify without transparency
      on training datasets.
      </p>
      <p>
      More generally, not everything that should be understood about AI systems <strong>can</strong> be measured
      using a quantitative automatic metric at the system level. For example, it has been argued that <em>safety is not a model property</em>.
      Understanding what it means to evaluate systems in context and to assess development practices, including data
      collection and its impact on data subjects, requires taking a broader view.
      </p>
  - section-title: Audits and Investigative Journalism
    section-text: >
      <p>
      System audits and investigative journalism are necessary functions for governance. Neither can reliably fulfill its purpose
      without a sufficient basic understanding of the technology: auditors need to know at least what questions to ask,
      and journalists which aspects of a system to examine in more detail.
      </p>
resources:
  - resource-name: HF Open LLM Leaderboard
    resource-url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
  - resource-name: 'EleutherAI: Going Beyond "Open Science" to "Science in the Open"'
    resource-url: https://arxiv.org/abs/2210.06413
  - resource-name: 'AI auditing: The Broken Bus on the Road to AI Accountability'
    resource-url: https://ieeexplore.ieee.org/abstract/document/10516659
  - resource-name: Google Doc topic Card
    resource-url: https://docs.google.com/document/d/1D2KA3CKcuKOc9mOMRKjucYrYBREx30z9xuKUGSgQUEE/
contributions: >
  Yacine Jernite wrote this topic card.