Dear Steve Huffman,
Last week, soon after Reddit announced plans to restrict free access to the Reddit API, the company cut off access to Pushshift, a data resource widely used by communities, journalists, and thousands of academics worldwide (see Pushshift’s official response).
We are writing to express concern about this sudden disruption to critical resources, and the uncertainty about the future it has created. We are asking for clarification and a meeting about the best ways to restore essential functionality for the communities that power your platform and the researchers who rely on your platform for essential public-interest work. To support that dialogue, we are coordinating a survey of the impact.
By preventing communities from accessing the very data they generate, Reddit has severely disrupted the safety and functionality of your platform. As you know, Reddit relies on volunteers to create moderation technologies and to do moderation labor that costs your competitors hundreds of millions of dollars per year. Tens of thousands of volunteers protect children’s safety, manage sensitive mental health support, and mediate some of the world’s largest conversation spaces for constructive civic discourse.
To succeed at their role, these unpaid leaders and workers need to access historical and contemporary community data to moderate a conversation space with over 1.5 billion active users. For many years, Reddit has relied on volunteer labor and computing infrastructure from Pushshift to provide communities with essential data services. You have now cut that off without warning to communities and haven’t offered alternatives, which will degrade safety protections across Reddit.
People who participate on Reddit have also contributed fundamental advances in science by donating data and participating in public conversations that researchers study in the public interest. Humanity is smarter and safer thanks to Reddit research on social media privacy, mental health treatments, child protection, crisis response, stock market governance, COVID-19 response, and democratic discourse. Just one Reddit dataset, Pushshift, has been cited in over 1,700 scholarly articles. By cutting off Pushshift and casting doubt on the future of data access, Reddit puts independent research at risk.
The Coalition for Independent Technology Research is organizing this letter with community moderators, academic researchers, and civil society groups who do research for the common good. While we are independent from Pushshift, many of us have relied on Pushshift data. We believe Reddit should commit to practical steps that will enable us to quickly and reliably restore the services we operate, and that communities and researchers should have a consequential voice in decisions about how that is achieved.
We know that online platforms face a complex legal and moral environment around data privacy. Balancing privacy and the common good is one of the most important challenges of our time. But it requires just that: balance. When these data systems degrade, Reddit will become a more dangerous place to be, moderators will burn out trying to protect their communities, and countless public-interest research projects will get snuffed out.
As a starting point, our group of moderators and researchers is organizing a collective survey on the disruptions to the platform and to research, which we expect to complete in the next week. We call on Reddit to meet with us to discuss how the company will commit to restoring, in a timely manner, the kind of data access needed for services and research that have been disrupted.
(affiliations for identification purposes only)
- J. Nathan Matias, u/natematias, Founder, Citizens and Technology Lab, Cornell University & Executive Committee, Coalition for Independent Technology Research
- Sarah Gilbert, u/SarahAGilbert, Research Director, Citizens and Technology Lab, Cornell University
- Brian C. Keegan, u/brianckeegan, Director, Colorado Laboratory for Users, Media, and Networks, University of Colorado Boulder
- Josephine Lukito, Media & Democracy Data Cooperative, Center for Media Engagement
- Jasmine Walker, Stanford Practitioner Fellow & Veteran Moderator
- Kai-Cheng Yang, Observatory on Social Media
- yellowmix, u/yellowmix
- Ethan Zuckerman, u/ethanz, Founder, Initiative for Digital Public Infrastructure, University of Massachusetts, Amherst
- Nathalie Marechal, u/privacyrschr, Co-Director, Privacy & Data Project, Center for Democracy & Technology & Board of Directors, Coalition for Independent Tech Research
- David Lazer, Director of the Lazer Lab, Northeastern University
- Rebekah Tromble, Director, Institute for Data, Democracy & Politics, George Washington University
- Brandi Geurkink, Senior Policy Fellow, Mozilla