The AI Policy Hub is an interdisciplinary initiative at UC Berkeley dedicated to training graduate student researchers to develop governance and policy frameworks to guide artificial intelligence, today and into the future.
On April 20, 2023, the AI Policy Hub held its inaugural AI Policy Research Symposium, a showcase of the research carried out by the initial cohort of AI Policy Hub fellows, a group of six graduate students whose work addresses topics such as improving the explainability of AI systems, understanding the consequences of AI-enabled surveillance, and minimizing the harms of AI-based tools that are increasingly used in criminal prosecutions.
The event was moderated by the AI Policy Hub’s two directors: Jessica Newman, Research Fellow at the Center for Long-Term Cybersecurity (CLTC) and Director of the Artificial Intelligence Security Initiative, and Brandie M. Nonnecke, Director of the CITRIS Policy Lab and Associate Research Professor at UC Berkeley’s Goldman School of Public Policy.
“We launched the AI Policy Hub a year ago with the ambition of fostering an interdisciplinary initiative at UC Berkeley to support and enable cohorts of graduate researchers who are passionate about safe and equitable AI,” Newman said in her opening remarks. “We believe that governing AI technologies is one of the most important challenges facing humanity today, and that we need as many people working on these challenges from as many backgrounds and areas of expertise as possible.”
Newman explained that, in keeping with CLTC’s future-oriented focus, the AI Policy Hub works to anticipate AI-related challenges that policymakers will face as technologies evolve. “Our mission is to cultivate an interdisciplinary research community to anticipate and address policy opportunities for safe and beneficial AI,” Newman said. “To us this means we have to investigate what is happening today from a broad perspective, looking at the intersections of the technical, political, societal, legal and ethical dimensions, while also looking over the horizon to ensure policies are not out of date by the time they are enacted.”
Weighing Limits on AI
The event’s first keynote was presented by Stuart Russell, Professor of Computer Science at UC Berkeley, holder of the Smith-Zadeh Chair in Engineering, and Director of the Center for Human-Compatible AI and the Kavli Center for Ethics, Science, and the Public. Russell’s talk was recorded in advance as he was attending a meeting of the Organisation for Economic Co-operation and Development (OECD) in Paris, where he was “hammering out the final details of the language of the definition of artificial intelligence” that will likely be used by lawmakers in the U.S., the EU, and elsewhere.
Russell said that different regulating bodies are struggling to agree on a single definition of AI. “There is a persistent misunderstanding of the difference between the method by which the AI system is constructed, which does have implications for how its accuracy should be evaluated, and the object itself, the output of the construction process,” he said.
He explained that recent rapid advances in AI have raised new questions about how such technologies will be managed in the future. “Once you have machines that are more capable than human beings, it’s rather hard for us to stop them doing whatever they want to do,” Russell said. “So that leads to the question, what do they want to do?… You have a system that supposedly exhibits sparks of artificial general intelligence (AGI) that may or may not have its own internal goals, and it’s being released presumably to eventually hundreds of millions or billions of people. And it’s completely unregulated.”
Early efforts to limit AI technologies have exposed challenges, Russell said. For example, “depending on how they are prompted, applications like ChatGPT may still give unlicensed medical advice, or offer advice on how to commit suicide or develop banned weapons,” he noted. “Researchers have overall reduced the bad behavior by 29 percent compared to previous systems, basically by saying bad dog whenever it misbehaves, but that’s not really a particularly reliable form of producing guaranteed safe systems. Imagine if we simply said bad dog to a nuclear power station every time it explodes. That’s probably not the way you’d want to go about it.”
Russell pushed back against those who argue against regulation of AI, describing the risk as “analogous to waiting on developing a planetary defense system against asteroids until a large asteroid slams into the earth…. I don’t see why anyone thinks it’s a good idea to wait until artificial general intelligence arrives and then think about how to regulate it because it may, as Alan Turing predicted, be too late.”
Programs like the AI Policy Hub are important, Russell suggested, because they create a bridge for translation between policymakers and technologists. “Developing sensible policy around AI is extremely important,” he said. “I don’t think one should underestimate the difficulty of that for policymakers who have absolutely no exposure to what AI is really about and how these systems work, and who are often informed about it by lobbyists who have a vested interest…. It’s also difficult for AI researchers who have no knowledge or experience in how policy should be created…. There’s a huge opportunity, therefore, for people who understand both AI and policy, and who have real, solid, technical backgrounds in both of these areas, to make a huge difference. This is a really important time to be thinking and talking about AI policy.”
AI Policy Hub Fellows
The second part of the symposium featured a series of short presentations by the six graduate student researchers in the AI Policy Hub.
Alexander Asemota, a third-year PhD student in the UC Berkeley Department of Statistics, discussed his research on explainability in machine learning. He explained that he is investigating how “counterfactuals” — statements about something that did not happen, rather than what did — could be an effective approach for explaining the outputs of artificial intelligence.
AI is “increasingly used to make decisions in high-stakes contexts, for example, approving or denying someone for a loan,” Asemota explained, but “systems use complicated models that no one may understand. Even if the developer understands a model, how would they explain those decisions to someone who has no background in AI? Counterfactual explanations are a promising solution to this problem.”
“A counterfactual can answer the question, what if I had done x?” he said. “For example, if I apply for a loan, and I was denied, I can ask, if my income had been $1,000 higher, would I have been accepted for the loan? This approach can tell someone what they need to do to get the solution they want, and provides actionable feedback.”
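To make the idea concrete, here is a minimal sketch of a counterfactual search over a toy loan model (the scoring rule, feature names, and dollar thresholds are hypothetical, not taken from Asemota’s research): given an applicant who was denied, the search looks for the smallest income increase that would flip the decision.

```python
# Minimal, hypothetical illustration of a counterfactual explanation for a loan decision.
# The scoring rule, feature names, and thresholds are invented for this sketch.

def loan_model(income, debt):
    """Toy scorer: approve when a simple weighted score clears a fixed threshold."""
    score = 0.00005 * income - 0.0001 * debt
    return score >= 2.0  # True = approved, False = denied


def income_counterfactual(income, debt, step=500, max_increase=50_000):
    """Find the smallest income increase (in $step increments) that flips a denial."""
    if loan_model(income, debt):
        return 0  # already approved; no change needed
    for increase in range(step, max_increase + step, step):
        if loan_model(income + increase, debt):
            return increase
    return None  # no income-only counterfactual within the search range


if __name__ == "__main__":
    needed = income_counterfactual(income=58_500, debt=10_000)
    if needed is not None:
        print(f"If your income had been ${needed:,} higher, the loan would have been approved.")
    else:
        print("No income-only counterfactual found in range.")
```

Practical counterfactual methods search over many features at once and weigh how realistic each suggested change is, rather than brute-forcing a single variable as this toy does.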
His research has diverse policy implications, Asemota said. “Organizations that use counterfactuals for explanations should compare the recommendations to observed changes,” he said. “This improves the feasibility of recommendations, and it assists in differentiating between multiple different recommendations. So you can see which recommendations are more likely than others, based on observed changes in the past…. And though counterfactuals have significant potential, regulators really should scrutinize their use to make sure that they aren’t violating regulations. With that in mind, I’ve been working on a white paper directed to the Consumer Financial Protection Bureau on AI and explanation.”
AI Policy Hub Fellow Zoe Kahn presented her research using qualitative methods to understand the perspectives and experiences of communities that may be negatively impacted by AI and machine learning technologies. Kahn is working on a project that uses data-intensive methods to allocate humanitarian aid to individuals experiencing extreme poverty in Togo. Her research examines a cash aid program in the country that delivers money digitally, with individuals’ eligibility determined through automated methods.
“The criteria that was used to determine who would receive money from this program (and who would not) was based on a machine-learning model that uses mobile phone metadata,” she said. “What we were really interested in understanding were the perspectives and experiences of people living in rural Togo who are really impacted by this particular program. These are often people who have no technical training, little to no formal education, and varying levels of literacy.”
Kahn sought to provide subjects in her study with an “effective mental model about the data that’s recorded by mobile phone operators, as well as an intuition for how that information could be used to identify people in their community who are wealthier, and people in their community who are poor.” She and her team developed a visual aid to convey this understanding, and found it was “really useful as a way of providing people with an explanation.”
“Participants took up elements of the visual aid during the interviews both as a way of answering our questions, but also as a way of articulating their own ideas in new and interesting ways,” she said. “Not only do these methods enable policymakers to engage in what I’d call ‘good governance,’ but in my experience doing this work in Togo, it really is a way to treat people with dignity. My hope is that the methods we have developed in Togo can be used as a way to really meaningfully engage people who may not have technical training or familiarity with digital technology.”
Micah Carroll, a third-year Artificial Intelligence PhD student, presented his research on recommender algorithms, software used in a variety of everyday settings, for example, to recommend content on Netflix and Spotify or display products on Amazon based on a user’s past behavior. “One thing that’s understudied in this area is that user preferences are subject to change, and recommender algorithms themselves may have incentives to change people’s preferences — and their moods, beliefs, behaviors, and so on,” Carroll said.
Carroll studies the impacts of recommenders that are based upon reinforcement learning (RL), which are designed to “modify the state of the environment in order to get high rewards.” The use of RL creates “clear incentives for the recommendation system to try to manipulate the user into engaging more with the content they’re shown, given that the recommender system is being rewarded for keeping users engaged on the platform for longer,” he explained. “The question remains, can real recommender systems actually manipulate users? And I think there’s definitely a variety of evidence that shows that this seems in fact quite plausible.”
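A stripped-down simulation can make that incentive concrete (the two-item environment, drift dynamics, and reward numbers below are invented for illustration and are not drawn from Carroll’s work): when a user’s preferences drift toward whatever they are shown, and the recommender is rewarded for engagement, a policy that deliberately shifts preferences can accumulate more reward than one that simply serves the user’s current tastes.

```python
# Toy simulation of the incentive to shift user preferences.
# Dynamics, rewards, and parameters are invented for illustration only.

def simulate(policy, steps=200, drift=0.05, engagement=(1.0, 1.5)):
    """Run a two-item recommender against a user whose preference drifts
    toward whatever is shown. Returns cumulative engagement reward."""
    p = 0.2  # user's initial affinity for item 1 (the high-engagement item)
    total = 0.0
    for _ in range(steps):
        item = policy(p, engagement)
        affinity = p if item == 1 else 1.0 - p
        total += engagement[item] * affinity   # reward = engagement actually obtained
        p += drift * (item - p)                # preferences drift toward what is shown
    return total


def myopic(p, engagement):
    """Serve whichever item best matches the user's current preference."""
    return 0 if engagement[0] * (1.0 - p) >= engagement[1] * p else 1


def preference_shifting(p, engagement):
    """Always push the high-engagement item, shifting preferences toward it."""
    return 1


if __name__ == "__main__":
    print("myopic policy reward:   ", round(simulate(myopic), 1))
    print("shifting policy reward: ", round(simulate(preference_shifting), 1))
    # The shifting policy ends up with more total engagement, illustrating why a
    # recommender optimized for long-run engagement is incentivized to change
    # what the user wants, not just predict it.
```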
Companies should increase their transparency for AI systems that interface with humans, and describe in more detail what types of algorithms they’re using and how they’re using them, Carroll said. He also proposes the use of “auditing pathways” to enable routine auditing by third-party auditors, as well as “manipulation benchmarking standards” that can help measure the impacts of manipulation. “One of the most important insights of this line of work is that manipulation can happen even without designer intent,” he said. “We should be worried about misuse and intentional manipulation that individuals might try to enact, but even companies that are for the most part trying their best might engage accidentally in manipulation.”
The next presentation was by Zhouyan Liu, a former investigative journalist who is now a student in UC Berkeley’s Master of Public Policy degree program, conducting empirical studies on China’s technology policy, digital surveillance, and privacy.
Liu said that China’s “surveillance actions raise our concerns about basic human rights and privacy issues,” but he emphasized that “surveillance itself is in fact one of the oldest political actions, not limited to China or the present day…. My question is, when the oldest political practice meets the latest technology, what has changed? What exactly has a technology changed about surveillance, and what impact has it had?”
China’s government relies on surveillance, Liu explained, because “prevention is better than the cure,” as technology empowers authoritarian rulers to detect problems and potential threats before people resist, and it provides the “manpower needed to collect large amounts of data” and the “ability to efficiently and comprehensively aggregate data about the same individuals from different fields and departments and perform automatic analysis.” His study sought to determine whether China’s government uses surveillance to limit unrest, and whether it improves regime security by reducing the number of protests or ordinary criminal offenses.
Using data drawn in part from the Chinese government’s pilot of the Sharp Eyes surveillance program, Liu concluded that AI-based surveillance did increase the government’s ability to intervene in civil disputes, but did not result in a significant change in the number of political opponents arrested. And while civil protests decreased, crime rates remained unchanged.
“The implication of the study is first and foremost significant for individuals living in authoritarian countries,” Liu said. “Even if you do not express any dissatisfaction or protest against the current regime, you’re still being monitored automatically by the system. On the other hand, such a system actually has a huge impact on the structure of authoritarian governments themselves. Some grassroots officials explicitly told me that they do not like this AI system, because AI technology takes away the power that originally belonged to them, and decisions and data are made at higher levels. This will be a shock to the entire government structure, and what political consequences they will bring is still unknown.”
Angela Jin, a second-year PhD student at UC Berkeley, presented a study on the use of evidentiary AI in the US criminal legal system, focused on probabilistic genotyping software (PGS), which is increasingly introduced in court cases to prosecute defendants despite not having been properly validated for accuracy. PGS is used to link a genetic sample to a person of interest in an investigation.
“These cases have led to many calls for an independent assessment of the validity of these software tools, but publicly available information about existing validation studies continues to lack details needed to perform this assessment,” Jin said. “Despite these concerns, PGS tools have been used in a total of over 220,000 court cases worldwide. And evidentiary software systems are not just being applied to DNA evidence. Tools have been developed in use for many other applications, such as fingerprint identification, gunshot detection, and tool mark analysis.”
Jin’s study focused on internal validation studies, a series of tests that a lab conducts on the PGS prior to using it in casework. “These studies are commonly referred to in courtroom discussions, along with the standards that guide them, as indicators of PGS reliability and validity,” she said. “But these standards provide ambiguous guidance that leaves key details up to interpretation by individual forensic labs…. Second, these standards do not account for the technical complexity of these systems.”
She explained that she and her collaborators are creating a testing framework for probabilistic genotyping software. “In this work, our goal is to eventually influence testing requirements; creating more rigorous testing standards will help us move toward more reliable use of PGS in the US criminal legal system,” she said. “My work also seeks to incorporate defense attorney perspectives into the creation of such testing requirements and policies, especially as policymakers increasingly seek to incorporate perspectives from diverse stakeholders…. Together, these projects seek to influence policy as a way to move us towards responsible use of PGS.”
The final AI Policy Hub Fellow to present was Cedric Whitney, a third-year PhD student at Berkeley’s School of Information whose research focuses on using mixed methods to tackle questions of AI governance. He presented research building on prior work he carried out at the Federal Trade Commission (FTC) focused on algorithmic disgorgement and the “right to be forgotten” in AI systems.
Whitney explained that algorithmic disgorgement is “a remedy requiring a party that profits from illegal or wrongful acts to give up any profit that they made as a result of that illegal or wrongful conduct. The purpose of this is to prevent unjust enrichment and to make illegal conduct unprofitable. The FTC recently began using this remedy as a novel tool in settlements with companies that have developed AI products on top of illegally collected data.”
A related challenge, Whitney explained, is that “when a company builds a product on top of illegally collected data, just enforcing that they delete that data doesn’t actually remove the value they derive from it. A model trained on data doesn’t suddenly self-destruct post data deletion. By requiring that a company also delete the effective work product, algorithmic disgorgement takes aim at the true benefit of illegal data collection in these settings. My research has been looking at the landscape for use of this tool.”
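A minimal sketch illustrates the point (the records and the trivial model below are hypothetical, used only to show the mechanics): once a model has been fit, its learned parameters, and whatever value they carry, persist after the underlying records are deleted, which is why disgorgement targets the derived work product as well.

```python
# Minimal sketch: a model's learned parameters survive deletion of its training data.
# The data and model here are hypothetical, purely to illustrate the point.

def fit_mean_predictor(records):
    """'Train' a trivial model: predict the average spend per labeled group."""
    grouped = {}
    for group, spend in records:
        grouped.setdefault(group, []).append(spend)
    return {group: sum(vals) / len(vals) for group, vals in grouped.items()}


if __name__ == "__main__":
    # Imagine these records were collected unlawfully.
    training_data = [("A", 120.0), ("A", 80.0), ("B", 300.0), ("B", 260.0)]
    model = fit_mean_predictor(training_data)

    # Deleting the data (the conventional remedy) does not remove what was learned...
    training_data.clear()
    print("records remaining:", len(training_data))   # 0
    print("model still predicts:", model)             # learned averages persist

    # ...so disgorgement also requires deleting the derived work product.
    model.clear()
    print("after disgorgement:", model)               # {}
```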
Through his research, he hopes to “create a resource for policymakers and technologists to better understand what algorithmic disgorgement can and cannot do, both to assist in functional implementation, and prevent it from being portrayed as a panacea for all algorithmic harm…. It should be a priority to increase regulatory funding to meet the hiring needs necessary for utilizing this and other governance tools,” as “there is a high burden on technical capacity to even begin utilizing them in an effective manner.”
A Different Perspective on Regulation
The closing keynote of the event was delivered by Pamela Samuelson, the Richard M. Sherman ’74 Distinguished Professor of Law and Information at UC Berkeley and a Director of the Berkeley Center for Law & Technology.
Samuelson described the AI Policy Hub as “one of the best things I’ve encountered at Berkeley,” as she agreed that “the problem in getting good regulation in this space is that the policymakers, even if they’re well intentioned (and not all of them are), don’t actually know anything about technology…. On the other hand, AI researchers, assuming that they are well intended, don’t know anything about policy and what the policy levers are out there. These two communities need to work together, and what they really need is people like you. There are certainly going to be jobs for people who are able to integrate technology knowledge and also policy expertise.”
She noted that the challenge of governing AI is that the thorniest issues emerge from the details. “Everybody wants to say there should be careful risk assessment and mitigation of risk, and it’s important for there to be transparency and for things to be explainable and contestable. But the problem is that it’s easy to reach consensus at this very high level of generality, and very difficult to get it down to, here’s what you can actually do on a day-to-day basis.”
How AI is regulated will vary by geographic region, Samuelson noted. “The Europeans like to think that they can think things out in advance, and then everybody will comply,” she said. “But I think that we need something that’s a little bit more flexible, and that recognizes that the state of AI is going to continue to evolve. Auditing requirements are going to be important. But again, we don’t have any standards right now for what a good audit might look like…. The idea of providing some access for researchers to be able to do some independent testing of the accuracy of these systems is something that I expect will happen over time.”
Samuelson pointed out that many of the harms that could result from AI-based technologies already are regulated by existing laws. “Even though you look at the US right now, and it doesn’t really have much in the way of AI regulation, there are lots of agencies that in fact will be enabling some sort of regulation,” she said. “The FAA is going to be regulating the use of AI in airplanes, at least I hope so, and the FDA in the use of AI in medical devices. Courts can still adjudicate product liability. So we may need some special rules, but there are lots of laws already on the books that will apply to AI, we just haven’t been thinking about that.”