Data-driven public policy depends on data. And, in the area of technology policy, access to data has been a significant barrier to research. Concerned about how online services might intrude on privacy, push hyper-partisan misinformation, or disadvantage their competitors? Those services aren’t sharing the relevant data with researchers.
A team at Princeton University, led by professor Jonathan Mayer, is working to solve this problem. Together with Mozilla — creator of the Firefox web browser — Mayer’s team has developed a novel research initiative for studying technology policy issues.
Mozilla Rally, which launched today, is a new platform that enables Firefox users to donate their data for the public good. Rally aims to build toward a future where users actively consent to the use of their data and can put it to work for society. The Rally research initiative, which is the first step toward that vision, enables academic researchers to run large-scale field studies on the web and examine how changes in user experiences might impact policy issues.
“Online services constantly experiment on users to maximize engagement and profit,” said Mayer, assistant professor at Princeton University, with appointments in the Department of Computer Science and the School of Public and International Affairs. “But for too long, academic researchers have been stymied when trying to experiment on online services. Rally flips the script and enables a new ecosystem of technology policy research.”
To date, data-driven research on online services has mostly relied on self-reported surveys, web crawls, and social media feeds. These methods have led to valuable insights, but they also have major limitations, Mayer said. Self-reported survey data can be unreliable, and web crawl data often does not represent real user experiences. Likewise, social media feeds are limited to specific platforms and user actions. These research shortcomings are especially problematic for technology policy issues that involve targeted, recommended, or personalized content, where user experience is a fundamental consideration.
“Rally offers researchers two valuable new perspectives,” said Ben Kaiser, a Ph.D. candidate in computer science at Princeton who worked on the Rally research initiative. “Studies can examine the internet from a user’s point of view, revealing how platforms and content appear to users. Rally also lets researchers study how users engage with platforms and content, yielding realistic, generalizable observations. These powerful capabilities will lead to leaps in understanding on a range of important technology policy issues.”
In addition to launching the Rally research initiative today, Mozilla and Mayer’s team are releasing WebScience, a new toolkit that enables researchers to build standardized browser-based studies. WebScience provides functionality that is commonly required for these types of studies, including measuring webpage navigation and attention, identifying exposure to linked content, observing social media shares, and extracting text from webpages for natural language processing.
“WebScience is made more powerful by being public: Researchers throughout the academic community benefit from the ability to quickly create studies using WebScience and can contribute new measurements for others to use,” said Anne Kohlbrenner, a Ph.D. candidate in computer science at Princeton, who worked on implementing WebScience.
Before Rally, conducting research with web browser data required starting from scratch: implementing a browser extension to measure user experiences, building a data collection and analysis backend, and identifying users who might be willing to participate. Rally and WebScience simplify this process, providing a ready-to-go template and toolkit for building extensions, a backend for data collection and analysis, and a central portal for announcing studies and recruiting participants.
Respect for users is foundational to the Rally research initiative. Every academic study on Rally requires specific informed consent, is subject to Institutional Review Board requirements, and is reviewed by Mozilla for privacy and security. WebScience encourages studies to minimize data collection, and researchers analyze Rally data in a secure environment maintained by Mozilla.
“Cutting people out of decisions about their data is an inequity that harms individuals, society, and the internet. We believe that you should determine who benefits from your data. We are data optimists and want to change the way the data economy works for both people and day-to-day business,” said Rebecca Weiss, Rally project lead.
The first academic study in the Rally research initiative, “Political and COVID-19 News,” was built by the Princeton team. An initial version launched in November 2020 as a pilot for the research initiative. The ongoing study examines how people engage with news and misinformation about politics and COVID-19 across online services. The next iteration of that study is launching on Rally today, and Mayer’s group has additional upcoming studies on misinformation and privacy.
The Princeton team’s work on Rally is supported by the Schmidt DataX Fund at Princeton University, made possible through a major gift from the Schmidt Futures Foundation.
Rally is currently available to all Firefox users in the United States. To join, go to rally.mozilla.org.