The UC Berkeley Center for Long-Term Cybersecurity (CLTC) is proud to announce the recipients of our 2020 research grants. In total, 22 different groups of researchers will share nearly $1 million in funding to support a broad range of initiatives addressing cybersecurity and digital security issues at the intersection of technology and society, including secure machine learning, data protection policy, detecting malicious photo manipulation, and more. Five of the projects were jointly funded with the UC Berkeley Center for Technology, Society, and Policy, a multi-disciplinary research center focused on emergent social and policy issues of technology.
The purpose of CLTC’s research funding is to address the most interesting and complex challenges of today’s socio-technical security environment, and to grapple with the broader challenges of the next decade’s environment.
Some of the grants are renewals of previously funded projects that have already yielded important results, including research on secure machine learning, the security of “smart” city infrastructure, privacy controls for always-listening devices, and detecting manipulated images. New initiatives to be funded include studies on the privacy and security of mobile health apps; understanding and defending against nation-state targeting of dissidents’ devices; and secure authentication in blockchain environments.
All principal investigators (PIs) have a UC Berkeley research affiliation, and some of the initiatives involve partners from outside institutions. The funded projects support researchers from a broad array of disciplines and academic units, including the Department of Electrical Engineering and Computer Science (EECS), the School of Information, the School of Social Welfare, the International Computer Science Institute, the Simons Institute, the Department of Statistics, and other social science units.
“CLTC is delighted to be able to support Berkeley researchers working at the forefront of cybersecurity for the fifth year in a row,” said Ann Cleaveland, Executive Director of CLTC. “The research being done by our grantees is crucial for informing changes in the world of cybersecurity behaviors, technologies, policies, markets, and beyond. Ultimately, this work is about strengthening trust in digital systems, and is more important than ever as we enter the new decade. Congratulations to our 2020 grantees.”
Below are titles, lists of primary researchers, and short descriptions for projects that will be funded through the UC Berkeley Center for Long-Term Cybersecurity’s 2020 research grants. Learn more about our 2016, 2017, 2018, and 2019 grantees.
Summary Descriptions of CLTC 2020 Research Grantees
A Cryptographic Study of Data Protection Laws
Prashant Vasudevan, Postdoctoral Researcher, EECS, UC Berkeley
We live in the age of data: every day, data is collected about us by the websites we visit, the devices we wear, and more, and this data affects many aspects of our lives, from shopping recommendations to credit scores. Consequently, laws that seek to regulate the processing of individuals’ personal data and give people more control over their data are beginning to take shape in several parts of the world, such as the GDPR in the EU and the CCPA in California. In a number of cases, however, the complexity of the technologies and systems involved in data processing leaves gaps in our understanding of their properties and capabilities, leading to incomplete, unclear, and sometimes undesirable specifications in the laws that regulate them.
This project aims to address these concerns in a few important cases by providing technical analyses of (and formal definitions for) some fundamental concepts that these laws speak about — concepts that seem intuitively clear at first glance, yet turn out to require careful treatment in the context of complex systems. In particular, we seek to address interpretations of the “right to be forgotten” and the “right of access to data” using established cryptographic paradigms.
An Open Research Privacy Toolkit
Paul Laskowski, Adjunct Assistant Professor, School of Information, UC Berkeley; Nitin Kohli, PhD Student, School of Information, UC Berkeley
As algorithms for statistical learning advance, the needs of social science research increasingly conflict with the privacy concerns of individuals in databases. Even when databases are “anonymized,” today’s algorithms are able to exploit statistical patterns and reveal secret information. As a result, some organizations are dramatically scaling back the extent to which they make data available for social science research.
The goal of this project is to develop new techniques for supporting social science research while maintaining formal privacy guarantees. We will build a system that is capable of taking arbitrary code written by a researcher, evaluating its behavior under a bootstrap procedure, and inserting noise to mask information about individuals. Due to the unrestricted nature of code, such a setting is incompatible with classic definitions in differential privacy. Instead, we will employ a statistical lens to estimate the distribution of privacy losses and provide users with guarantees that are asymptotically precise. We will further study how our bootstrap system is related to regularization in machine learning, proving results that connect privacy and generalization error.
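To give a flavor of how bootstrap-based noise calibration might look, here is a minimal sketch, assuming a simple numeric dataset, a researcher-supplied statistic, and a heuristic Laplace-noise scale; the function names and the calibration rule are illustrative assumptions, not the toolkit’s actual design or its formal guarantees.

```python
# Illustrative sketch only: bootstrap-calibrated noise for a researcher-supplied
# statistic. The toolkit's real mechanism and guarantees will differ; the names
# (bootstrap_release, n_boot) and the scale heuristic are hypothetical.
import numpy as np

def bootstrap_release(data, statistic, n_boot=1000, rng=None):
    """Release statistic(data) with noise scaled to its bootstrap variability."""
    rng = rng or np.random.default_rng()
    n = len(data)
    # Re-run the researcher's code on resampled datasets to estimate how much
    # the output depends on individual records.
    boot_values = np.array([
        statistic(data[rng.integers(0, n, size=n)]) for _ in range(n_boot)
    ])
    # Heuristic: add Laplace noise on the order of the bootstrap standard error.
    scale = boot_values.std()
    return statistic(data) + rng.laplace(loc=0.0, scale=scale)

# Example: a noisy mean of a synthetic dataset.
incomes = np.random.default_rng(0).lognormal(mean=10, sigma=1, size=500)
print(bootstrap_release(incomes, np.mean))
```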
Crafting Public Policy for Reinforcement Learning Applications
Thomas Gilbert, PhD Candidate, Department of Sociology, UC Berkeley; Sarah Dean, PhD Student, Department of Electrical Engineering and Computer Science, UC Berkeley; Tom Zick, PhD Candidate, College of Letters & Science, UC Berkeley; McKane Andrus, MS, Department of Electrical Engineering and Computer Science, UC Berkeley
A team of graduate students, Graduates for Engaged and Extended Scholarship around computing and Engineering (GEESE), will build on their prior CLTC-funded project and develop suggested interventions in the design, training, and deployment of reinforcement learning (RL) systems that can be integrated into social infrastructure. These deliverables will include a preliminary survey paper that specifies a landscape of possible medium-term outcomes, a purposeful convening of RL experts and policymakers to identify shared concerns, and a final research paper that proposes regulatory policies to ensure system accountability and sociotechnical safety. Throughout, GEESE will leverage its extensive experience facilitating cross-disciplinary discussion to integrate the perspectives of developers, policy makers, and the existential risk community around the medium-term trajectory of RL.
Cybersecurity for Non-Primary and Primary Users of Always-On Internet of Things Devices: An Ethnographic, Participatory, and Multidisciplinary Design Approach
James Pierce, Research Engineer, CITRIS and the Banatao Institute, UC Berkeley; Richmond Wong, PhD Student, School of Information, UC Berkeley
Current design and user-oriented security/privacy research focuses on individual awareness, choice, and consent. Despite some successes, decades of research highlight significant limitations of informed-consent and usable-choice approaches: policies are often ignored, overly time-consuming, and can actually decrease trust, even when they are clearly written; and privacy decisions and preferences vary widely according to experience, personality, identity, ability, and situation. These issues are further complicated by the rise of vulnerable “always-on” Internet of Things (IoT) devices, such as AI-equipped smart speakers, wearable activity trackers, and smart security cameras. Because IoT is innately physical, spatial, and distributed, these technologies affect people, communities, and activities beyond the frame of an individual primary user. Non-primary users, such as roommates, guests, neighbors, domestic workers, renters, and passers-by, are also affected — often unknowingly, with little recourse, and with potential for great harm. This project combines qualitative fieldwork, participatory design activities, and design prototyping to understand the diverse needs and vulnerabilities of primary and non-primary IoT users and subjects, and to develop hybrid digital and physical solutions that address the unique cybersecurity challenges of IoT.
The Cybersecurity of “Smart Infrastructure”
Alison Post, Associate Professor, Department of Political Science, UC Berkeley; Karen Trapenberg Frick, Associate Professor, City & Regional Planning, UC Berkeley; Kenichi Soga, Chancellor’s Professor, Civil and Environmental Engineering, UC Berkeley; Marti Hearst, Professor, EECS & School of Information, UC Berkeley
Urban infrastructure, such as water and sanitation systems, subways, power grids, and flood defense systems, is crucial for social and economic life, yet vulnerable to natural hazards, such as earthquakes or floods, that could disrupt services. New sensor systems can potentially provide early warnings of problems, and thus help avert system failure or allow for evacuations before catastrophes. However, introducing such “smart infrastructure” can increase the risk of cyberattack. In this project, we examine perceptions of the countervailing risks that natural hazards and cyberattacks pose to infrastructure systems once new sensor systems are introduced, as well as variation in the extent to which smart city technologies pose cyber risks. We also design and evaluate the efficacy of new approaches to communicating these countervailing risks, drawing on recent advances in data visualization and political psychology.
Designing a Privacy Curriculum for the Elderly
Alisa Frik, Postdoctoral Researcher, International Computer Science Institute (ICSI), UC Berkeley; Samy Cherfaoui, Undergraduate Student, EECS, UC Berkeley; Julia Bernd, Staff Researcher, International Computer Science Institute (ICSI), UC Berkeley
Prior work has demonstrated that older adults are particularly unaware of and susceptible to online privacy and security risks, due to limited technological literacy and experience and declining physical and mental abilities. Increasing awareness of privacy and security among the elderly is especially critical because older adults are disproportionately targeted for Internet crime and fraud, owing to the perceived wealth accumulated in retirement savings, and because they are less able to assess the quality and validity of the information they receive. It is therefore imperative to develop a framework that ensures older adults are not only informed about standard privacy practices, but can also correctly assess future threats. Many resources aimed at teaching privacy to a mass audience are geared toward younger generations, who have fundamentally different attitudes, experiences, and physical and cognitive abilities. In this seed project, we will pilot a study that develops and tests a concrete curriculum for teaching security and privacy best practices to older adults. The curriculum will build on the Teaching Privacy modules (with the assistance of one of the researchers involved in their development), while adapting the user interface to the particular needs and abilities of older adults.
Detecting Images Generated by Neural Networks
Alexei Efros, Professor, EECS, UC Berkeley
Recent years have seen great advances in computer vision and machine learning. But with these advances comes an ethical dilemma: as our methods get better, so do the tools for malicious image manipulation. While malicious image manipulation was once the domain of well-resourced dictators, spy agencies, and unscrupulous photojournalists, recent advances have made it possible to create fake images with only basic computer skills, and social networks have made it easier than ever to disseminate them.
We propose to detect fake images by developing algorithms that exploit the limited representational power of deep convolutional neural networks. Generative networks can produce only a subset of the possible images that could appear in the world; we hypothesize that, as a consequence of this limitation, their outputs contain subtle differences from real images that can be detected. We plan to investigate this idea in two directions: 1) analyzing the limitations on the representational space of these networks, and 2) using the limitations we discover to create methods that can detect images that were generated by neural networks.
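As a rough illustration of the second direction, one common baseline in this line of work is to train a simple classifier on frequency-domain features, since upsampling layers in typical generators often leave periodic spectral artifacts. The sketch below is not the project’s method; it uses placeholder random arrays where real photos and generator outputs would go.

```python
# Illustrative baseline only: classify images as real vs. generated using
# log-magnitude spectra as features. Data below is placeholder noise.
import numpy as np
from sklearn.linear_model import LogisticRegression

def spectral_features(images):
    """images: (N, H, W) grayscale arrays in [0, 1] -> flattened log-spectra."""
    spectra = np.abs(np.fft.fft2(images, axes=(-2, -1)))
    return np.log1p(spectra).reshape(len(images), -1)

# Placeholders: in practice these would be real photos and generator outputs.
rng = np.random.default_rng(0)
real = rng.random((200, 32, 32))
fake = rng.random((200, 32, 32))

X = spectral_features(np.concatenate([real, fake]))
y = np.concatenate([np.zeros(len(real)), np.ones(len(fake))])
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", clf.score(X, y))
```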
Examining The Third-Party Tracking Ecosystem
Serge Egelman, Research Director, Usable Security & Privacy, International Computer Science Institute (ICSI) & EECS, UC Berkeley; Primal Wijesekera, International Computer Science Institute (ICSI), UC Berkeley; Ahmad Bashir, Postdoctoral Researcher, International Computer Science Institute (ICSI), UC Berkeley
Many mobile apps and online services use “third-party trackers,” which send data about specific user behaviors to various other companies. The purposes of these transmissions can include profiling individual users to target them with specific ads, amassing personal information to sell to data brokers, or monitoring activities to identify how users interact with a given app, feature, or website. Our prior research suggests that neither consumers nor regulators fully understand the breadth of these data collection practices. Through this follow-up project, we plan to perform a series of experiments to study this ecosystem in depth.
How Do Vulnerable Patients Understand Data Privacy as It Pertains to mHealth Interventions?
Laura Gomez-Pathak, PhD Student, School of Social Welfare, UC Berkeley
As mobile health (mHealth) interventions stand to take on a dominant role in safety-net healthcare settings, many challenges to data privacy need to be considered. Users from marginalized backgrounds face a greater risk of detrimental consequences when the privacy or security of mHealth apps that collect critical, sensitive, and private health information is breached. However, there is little evidence on the privacy and security knowledge, attitudes, and apprehensions of users from vulnerable groups. The Digital Health Equity and Access Lab (dHEAL), in the UC Berkeley School of Social Welfare, has developed an adaptive text-messaging intervention to encourage physical activity among low-income ethnic minority patients with comorbid depression and diabetes. The intervention is delivered through a smartphone app that uses machine learning to predict which categories of text messages are most effective, based on participants’ contextual variables. This qualitative study seeks to understand patients’ knowledge of and attitudes toward the privacy and security of their own data, including having their step count and location tracked by a smartphone app, as well as how they understand the risks and benefits of using a digital health self-management intervention. The aims of this study are to explore the cybersecurity implications of mobile applications that collect the personal data of underserved individuals and to create a framework for researchers to protect the privacy of vulnerable research subjects.
Keystone: An Open Framework for Architecting TEEs
Dawn Song, Professor, EECS, UC Berkeley; Shweta Shivaji Shinde, Postdoctoral Scholar, EECS, UC Berkeley; David Kohlbrenner, Postdoctoral Scholar, EECS, UC Berkeley
Trusted execution environments (TEEs) are found in a range of devices, from embedded sensors to cloud servers, spanning diverse choices of cost, power constraints, and security threat models. However, each of the current vendor-specific TEEs makes a fixed set of trade-offs, with little room for customization. Our project, Keystone, is the first open-source framework for building customized TEEs. Keystone uses simple abstractions provided by the hardware, such as memory isolation and a programmable layer underneath untrusted components (e.g., the OS). Using these abstractions, Keystone builds reusable TEE core primitives that allow platform-specific modifications and application-specific features. Keystone-based TEEs can run on unmodified RISC-V hardware, and we have demonstrated the strength of our design with several proof-of-concept benchmark and application integrations. In this project, we propose to fully develop case studies where Keystone proves suitable for deploying a TEE, and then to explore how Keystone can be adapted for a concrete set of devices, workloads, and application complexities.
Law Enforcement Access to Digital Data: Understanding the Everyday Processes
Yan Fang, PhD Student, Jurisprudence and Social Policy, UC Berkeley
During criminal investigations, U.S. law enforcement agencies often seek evidence held by third-party businesses. Many of these companies have established policies on how to respond to law enforcement requests for information. How do government agencies navigate these policies? This project studies this question through semi-structured interviews with investigators and prosecutors for criminal law enforcement agencies.
Measuring and Defending Against New Trends in Nation-State Surveillance of Dissidents
William Marczak, Research Scientist, ICSI; Postdoctoral Researcher, EECS
Targeted nation-state hacking against dissidents’ devices and online accounts is a growing problem with significant real-world consequences for targets, including physical harm. While initial research efforts have mapped out part of the ecosystem of these attacks, attackers are increasingly “going dark” by adapting their tools and techniques to compromise target devices with no target interaction. These “zero-click” attacks do not leave digital traces—such as suspicious emails or SMS messages—that targets can identify and forward to researchers. This research will develop a deeper understanding of these new attack techniques by engaging with and monitoring targets, by developing new forms of internet scanning to track abuse and pinpoint victims, and by building and deploying defensive tools.
Novel Metrics for Robust Machine Learning
Michael Mahoney, Professor, Statistics, UC Berkeley; Ben Erichson, Postdoctoral Scholar, Statistics, UC Berkeley
Although deep neural networks (DNNs) have achieved impressive performance in several applications, they also exhibit well-known sensitivities and security concerns that can emerge for a variety of reasons, including adversarial attacks, backdoor attacks, and lack of fairness in classification. Hence, it is important to better understand these risks in order to roll out reliable machine-learning tools for human-facing applications as well as basic scientific applications, from biology and health to engineering and physics. Characterizing the sensitivity and security of these models matters because such applications are often mission-critical, and a failure can have severe consequences.
The project is organized into two main thrusts. First, we will design new robust DNN architectures by exploiting the dynamical-systems perspective on machine learning, which opens the opportunity to introduce ideas from scientific computing and numerical analysis. Second, we will focus on adversarial machine learning, developing a better understanding of adversarial examples and studying the trade-offs of robust adversarial learning and its impact on the fairness of a trained model.
Obscuring Authorship: Neural Methods for Adversarial Stylometry and Text-Based Differential Privacy
Matthew Sims, Postdoctoral Scholar and Lecturer, School of Information, UC Berkeley
The continual improvement of models for author attribution—the task of inferring the author of an anonymized document—indicates potential benefits but also substantial risks in the context of privacy and cybersecurity. Such improvements pose particular threats to whistleblowers and other individuals who might have strong political or security-related reasons for wanting to conceal their identities. Even when the candidate set is too large to identify a specific author, it is hypothetically possible to determine sensitive attributes about an individual that could be used for detrimental or biased purposes. The primary goals of this project are to provide a thorough review of prior research into author obfuscation and to suggest new approaches for improving current systems. In particular, we will create a benchmark dataset to better measure the performance of current and future models for adversarial stylometry while also proposing new neural methods specifically tailored to this task.
Privacy Controls for Always-Listening Devices
Award Winner: 2020 Cal Cybersecurity Research Fellowship
Nathan Malkin, PhD Student, EECS, UC Berkeley
Intelligent voice assistants and other microphone-equipped Internet of Things devices offer great convenience at the cost of very high privacy risks. The goal of our research is to develop privacy controls for devices that listen all the time — beyond a few specific keywords. More specifically, our goal is for users to be able to specify restrictions on these devices — what the devices can hear and what is off limits — and for our system to be able to enforce their preferences. During the first phase of our research, we investigated people’s expectations for these devices and how they varied across different individuals and situations. We also developed potential privacy-preserving approaches. For the next phase, we propose implementing these techniques for enforcing user preferences and evaluating their effectiveness and usability across several different dimensions and criteria.
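For intuition, a hypothetical enforcement point might check each transcribed utterance against user-specified topic restrictions before anything leaves the device; the rule format, keyword lists, and function names below are illustrative assumptions rather than the project’s design.

```python
# Hypothetical sketch of enforcing user-specified restrictions on what an
# always-listening device may process. Not the project's actual mechanism.
BLOCKED_TOPICS = {"medical", "finances"}          # chosen by the user
TOPIC_KEYWORDS = {
    "medical": {"doctor", "prescription", "diagnosis"},
    "finances": {"bank", "salary", "mortgage"},
}

def may_forward(transcript: str) -> bool:
    """Return True only if the utterance matches no blocked topic."""
    words = set(transcript.lower().split())
    return not any(words & TOPIC_KEYWORDS[t] for t in BLOCKED_TOPICS)

print(may_forward("remind me to call the doctor tomorrow"))  # False: blocked
print(may_forward("add milk to the shopping list"))          # True: allowed
```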
Secure Authentication in Blockchain Environments
Giulio Malavolta, Postdoctoral Fellow, Computer Science, UC Berkeley
Bitcoin and blockchain systems have brought us to the brink of a technological revolution: they allow us to bypass the need for centralized trusted entities to run protocols at large scale. However, the decentralized nature of these systems brings unique challenges, including user authentication. While cryptography provides strong solutions to this problem, it often relies on users’ ability to reliably store a large amount of secret information. What happens if a user’s key is lost or stolen? Blockchain systems lack the traditional fallback mechanisms that allow one to recover from such an event. In this project, we initiate the study of two-factor authentication mechanisms over blockchains and distributed systems as a solution to this problem, and we explore solutions based on established cryptographic primitives. We aim to design new cryptographic schemes that are efficient enough to be practically applicable, and to develop a comprehensive understanding of their security guarantees.
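As a toy illustration of the underlying idea, splitting signing authority across two factors means that neither a lost device share nor a stolen second-factor share alone reveals the key; the sketch below uses simple XOR-based 2-of-2 secret sharing and is not the construction the project will pursue.

```python
# Toy sketch only: 2-of-2 secret sharing of a signing key across two factors.
import secrets

def split_secret(key: bytes):
    """Additive (XOR) sharing: both shares are needed to recover the key."""
    share1 = secrets.token_bytes(len(key))
    share2 = bytes(a ^ b for a, b in zip(key, share1))
    return share1, share2

def combine(share1: bytes, share2: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share1, share2))

signing_key = secrets.token_bytes(32)
device_share, second_factor_share = split_secret(signing_key)
assert combine(device_share, second_factor_share) == signing_key
print("either share alone reveals nothing about the signing key")
```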
Secure Machine Learning
David Wagner, Professor, EECS, UC Berkeley
We will study how to harden machine-learning classifiers against adversarial attack, exploring general mechanisms for making deep-learning classifiers more robust, with a special focus on security for autonomous vehicles. Current schemes fail badly in the presence of an attacker who is actively trying to fool or manipulate the model, so there is a need for better defenses. We will explore three specific approaches to defending machine learning: generative models, checking internal consistency, and improvements to adversarial training.
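For readers unfamiliar with adversarial training, the sketch below shows the basic loop on a toy logistic-regression model: generate FGSM-style perturbations of the training inputs, then update the model on those perturbed examples. The project studies deep networks and more sophisticated defenses, so this is only an illustration of the general technique.

```python
# Minimal sketch of adversarial training (FGSM) on a toy logistic regression.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
w_true = rng.normal(size=10)
y = (X @ w_true > 0).astype(float)

w = np.zeros(10)
eps, lr = 0.1, 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    # FGSM: perturb each input in the direction that most increases its loss.
    grad_x = (sigmoid(X @ w) - y)[:, None] * w[None, :]
    X_adv = X + eps * np.sign(grad_x)
    # Train on the perturbed examples instead of the clean ones.
    grad_w = X_adv.T @ (sigmoid(X_adv @ w) - y) / len(y)
    w -= lr * grad_w

print("robust-trained accuracy on clean data:",
      ((sigmoid(X @ w) > 0.5) == y).mean())
```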
Projects Jointly Funded with the Center for Technology, Society & Policy (CTSP)
Data for Defenders
Rachel Warren, Master’s Student, School of Information, UC Berkeley; Tiffany Pham, Master’s Student, School of Information, UC Berkeley; Sneha Chowdhary, Master’s Student, School of Information, UC Berkeley; Jyen Yiee Wong, Master’s Student, School of Information, UC Berkeley
Developing A Common Vocabulary Around Privacy And Security Concepts With Elderly Users
Alisa Frik, Postdoctoral Researcher, International Computer Science Institute (ICSI) & EECS, UC Berkeley; Samy Cherfaoui, Undergraduate Student, EECS, UC Berkeley; Julia Bernd, Staff Researcher, International Computer Science Institute (ICSI), UC Berkeley
Digital Tools for Decentralized Networks (in partnership with Build Belonging)
Nicole Chi, Master’s Student, School of Information, UC Berkeley; Ji Su Yoo, PhD Student, School of Information, UC Berkeley
Privacy Preserving Machine Learning for Autonomous Vehicles
Akshay Punhani, Master’s Student, School of Information, UC Berkeley; Alicia Tsai, Master’s Student, School of Information, UC Berkeley; Amrit Daswaney, Master’s Student, School of Information, UC Berkeley; Mugdha Bhusari, Master’s Student, School of Information, UC Berkeley
Understanding Online Reputation Damage and Repair for Student Activists
Emma Lurie, PhD Student, School of Information, UC Berkeley