It’s Time to Open the Black Box of Social Media
Social media companies need to give their data to independent researchers to
better understand how to keep users safe
Social media platforms are where billions of people around the world go to connect
with others, get information and make sense of the world. These companies,
including Facebook, Twitter, Instagram, TikTok and Reddit, collect vast amounts of
data based on every interaction that takes place on their platforms.
And despite the fact that social media has become one of our most important public
forums for speech, several of the most important platforms are controlled by a small
number of people. Mark Zuckerberg controls 58% of the voting share of Meta, the
parent company of both Facebook and Instagram, effectively giving him sole control
of two of the largest social platforms. Now that Twitter’s board has accepted Elon
Musk’s $44 billion offer to take the company private, that platform will likewise
soon be under the control of a single person. All these companies have a history of
sharing scant portions of data about their platforms with researchers, preventing us
from understanding the impacts of social media on individuals and society. Such
singular ownership of the three most powerful social media platforms makes us fear
this lockdown on data sharing will continue.
After two decades of little regulation, it is time to require more transparency from
social media companies.
In 2020, social media was an important mechanism for the spread of false and
misleading claims about the election, and for mobilization by groups that
participated in the January 6 Capitol insurrection. We have seen misinformation
about COVID-19 spread widely online during the pandemic. And today, social
media companies are failing to remove the Russian propaganda about the war in
Ukraine that they promised to ban. Social media has become an important conduit
for the spread of false information about every issue of concern to society. We don’t
know what the next crisis will be, but we do know that false claims about it will
circulate on these platforms.
Unfortunately, social media companies are stingy about releasing data and
publishing research, especially when the findings might be unwelcome (though
notable exceptions exist). The only way to understand what is happening on the
platforms is for lawmakers and regulators to require social media companies to
release data to independent researchers. In particular, we need access to data on the
structures of social media, like platform features and algorithms, so we can better
analyze how they shape the spread of information and affect user behavior.
For example, platforms have assured legislators that they are taking steps to counter
mis/disinformation by flagging content and inserting fact-checks. Are these efforts
effective? Again, we would need access to data to know. Without better data, we
can’t have a substantive discussion about which interventions are most effective and
consistent with our values. We also run the risk of creating new laws and regulations
that do not adequately address harms, or of inadvertently making problems worse.
Some of us have consulted with lawmakers in the United States and Europe on
potential legislative reforms like these. The conversation around transparency and
accountability for social media companies has grown deeper and more substantive,
moving from vague generalities to specific proposals. However, the debate still lacks
important context. Lawmakers and regulators frequently ask us to better explain why
we need access to data, what research it would enable and how that research would
help the public and inform regulation of social media platforms.
To address this need, we’ve created this list of questions we could answer if social
media companies began to share more of the data they gather about how their
services function and how users interact with their systems. We believe such
research would help platforms develop better, safer systems, and also inform
lawmakers and regulators who seek to hold platforms accountable for the promises
they make to the public.
Research suggests that misinformation is often more engaging than other types of
content. Why is this the case? What features of misinformation are most associated
with heightened user engagement and virality? Researchers have proposed that
novelty and emotionality are key factors, but we need more research to know if this
is the case. A better understanding of why misinformation is so engaging will help
platforms improve their algorithms and recommend misinformation less often.
Research shows that the delivery optimization techniques that social media
companies use to maximize revenue and even ad delivery algorithms themselves can
be discriminatory. Are some groups of users significantly more likely than others to
see potentially harmful ads, such as consumer scams? Are others less likely to see
useful ads, such as job postings? How can ad networks improve their delivery and
optimization to be less discriminatory?
Social media companies attempt to combat misinformation by labeling content of
questionable provenance, hoping to push users towards more accurate information.
Results from survey experiments show that the effects of labels on beliefs and
behavior are mixed. We need to learn more about whether labels are effective when
individuals encounter them on platforms. Do labels reduce the spread of
misinformation or attract attention to posts that users might otherwise ignore? Do
people start to ignore labels as they become more familiar?
Internal studies at Twitter show that Twitter’s algorithms amplify right-leaning
politicians and political news sources more than left-leaning accounts in six of seven
countries studied. Do other algorithms used by other social media platforms show
systemic political bias as well?
Because of the central role they now play in public discourse, platforms have a great
deal of power over who can speak. Minority groups sometimes feel their views are
silenced online as a consequence of platform moderation decisions. Do decisions
about what content is allowed on a platform affect some groups disproportionately?
Are platforms allowing some users to silence others through the misuse of
moderation tools or through systemic harassment designed to silence certain
viewpoints?
Social media companies ought to welcome the help of independent researchers to
better measure online harm and inform policies. Some companies, such as Twitter
and Reddit, have been helpful, but we can’t depend on the goodwill of a few
companies, whose policies might change at the whim of a new owner. We hope a
Musk-led Twitter will be as forthcoming as before, if not more so. In our
fast-changing information environment, we should not regulate and legislate by
anecdote. We need lawmakers to ensure our access to the data we need to help keep
users safe.