As digital data about people grows, journalists are increasingly turning to social media platforms such as Twitter and Facebook to find information. One trend in journalism, particularly in political reporting, is the compiling of publicly available social media data to infer and report on public opinion. Whether it’s reporting on what topics are trending, analyzing whether the Twittersphere is expressing positive or negative sentiment toward a political candidate or quoting Facebook users’ posts in a news report, journalists are using online comments and discussions to say something about public opinion and enrich their political reporting.

The problem is: What a journalist can discern from publicly available social media data is not exactly what is typically thought of as “public opinion.” Just because social media posts are freely available doesn’t mean they accurately represent what the public thinks or feels about political issues. And as fear of disinformation grows with every passing year, the way that public sentiment is compiled is more important than ever.

We explored the topic recently in a survey conducted by the Ryerson University Social Media Lab, and we found that this journalistic practice is not something all Canadians are comfortable with. That in itself brings ethical challenges.

When we say “publicly available social media data,” we mean social media posts that have been made from a public account or have been posted with a “public” privacy setting. This content is available to anyone and can be collected using an automated script, a social media application such as Tweetdeck, or simply by searching directly on the social media site. It is important to consider both the effectiveness and the ethics of using this data to infer public opinion.

Opinions derived from social media are not representative of the general public

Not everyone uses social media, and only a small percentage of social media users contribute their opinion to online discussions. Twitter is the “go-to” platform for opinion mining because it is often perceived as being public. But our survey of 1,500 Canadian Internet users (age 18+) found that only 42 percent of respondents report having a Twitter account. Of those Twitter users, fewer than half visit the platform daily, and an even smaller percentage ever contribute their opinions online. Considering how Internet accessibility and digital literacy vary across the country and among demographics, it is easy to see that a sample of opinions shared on social media is not necessarily representative of the general public.

Even many regular social media users in Canada do not have public accounts. Facebook is Canada’s most-used social media site, yet only 18 percent of Canadians we surveyed said they have a public account. Most Facebook users have private or semi-private accounts, which means individuals would have to specifically choose to share a post with a public audience for it to be accessible to journalists. By contrast, posts in public groups or on public pages on Facebook (a politician’s page, for example) are typically public by default.

Even when we look at social media sites where users’ accounts are public by default, such as Twitter, only about half of Canadians surveyed (48 percent) say they have a public Twitter account.

So, publicly available social media data represents a small slice of the Canadian population, which means it is not generalizable to all Canadians. Traditional opinion polls at least try to take random samples of the population; publicly available social media posts come from a self-selected group and do not approximate a random sample. To use social media data effectively, news reports need to be clear about these limitations.
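To make the sampling problem concrete, here is a minimal sketch in Python. The population and every number in it are hypothetical and chosen only for illustration; they are not from our survey. It shows how support for a policy, measured only among people who post publicly, can differ from support in the whole population.

```python
# Hypothetical population of 1,000 people (illustrative numbers only).
# Each person is a pair: (supports_policy, posts_publicly).
population = (
    [(True, True)] * 50        # supporters who post publicly
    + [(True, False)] * 450    # supporters who do not post publicly
    + [(False, True)] * 150    # opponents who post publicly
    + [(False, False)] * 350   # opponents who do not post publicly
)


def support_share(people):
    """Fraction of the given people who support the policy."""
    return sum(1 for supports, _ in people if supports) / len(people)


everyone = support_share(population)
public_posters = support_share([p for p in population if p[1]])

print(f"Support among everyone: {everyone:.0%}")
print(f"Support among public posters: {public_posters:.0%}")
```

In this made-up case, half of the population supports the policy, but only a quarter of public posters do. A report based only on public posts would understate support by a wide margin.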

What the public thinks about their data being used by journalists

Many Canadians don’t think journalists should, or even can, use social media to infer public opinion. In our study, we found that only 43 percent of Canadian Internet users think journalists should use publicly available social media data to infer public opinion, while 47 percent think they can. If people feel like certain journalistic practices are inappropriate and shouldn’t be done or if people do not trust the information being reported, they might well turn to other sources or tune out. Worse, people can feel their privacy has been invaded, which presents ethical challenges.

The way a journalist chooses to use social media data plays an important role. Reporting on general trends or sentiment is generally considered acceptable, but 32 percent of Canadians in our survey have a problem with journalists quoting social media posts. This is because quoted individuals are identifiable, as in a BBC report on BrewDog’s Mock Pink IPA. At least one Twitter user quoted in that story stumbled upon their tweet and felt uncomfortable about not being asked for consent.

There are at least two reasons people express discomfort with this practice. First, social media posts can feel very impermanent — even when they are public. In a sea of content, it is easy to feel like posts will disappear into the flood of opinions. The sheer volume of content can make people feel anonymous even when posting publicly or at least unlikely to be tracked down later.

Second, the expected audience and context on social media matter. People typically have a particular context in mind when they post on social media sites. People post as part of a specific conversation and assume a particular audience. If posts are collected and used in a different context, it can feel uncomfortable or, worse, like a violation of privacy.

It is worth noting, however, that 35 percent of social media users consider these practices to be beneficial. In particular, people who use social media to post political content tend to be more comfortable with journalists using their posts. This is likely because some people want to be heard.

Ultimately, there are questions about whether or not it is effective and ethical to use publicly available social media data, particularly at a time when trust in news sources is being called into question.

What can be done?

Journalists are being asked to do more for less. Newsrooms are shrinking, and journalists are stretched thin. But they are still being asked to do more than ever as they navigate a fast-expanding digital environment, under pressure to produce reporting that reflects up-to-the-minute sentiment. In this context, social media can be a helpful source. But it can’t be a simple copy-and-paste.

First, journalists should get permission before using social media posts that identify the person sharing their views. For example, unless consent is obtained, quoting or embedding specific posts should be avoided since they include names, usernames, profile photos, and/or links to profiles. One exception is if the posts come from a public figure.

Next, journalists should explain how social media data was obtained. For example, when reporting what is trending on Twitter, it is helpful to explain that Twitter has an algorithm designed to figure out which topics are newly popular in particular regions and that individuals and groups can promote certain hashtags and topics to increase their likelihood of trending.

If analytics software is used to determine the sentiment of a particular social media conversation, journalists can tell readers what software was used and whether its creators tested it for reliability and validity. Explaining the methodology, including what data was included in the analysis and what might be missing, can also go a long way toward helping readers understand the value of the data being put in front of them. For example, collecting one election hashtag on Twitter offers a sense of the issues that matter to some Twitter users, but only those who choose to include that hashtag. There will invariably be other election hashtags, some of which one party’s supporters will use more than another’s. Showing how data is compiled helps the audience contextualize the information.
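The hashtag example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the posts, the usernames and the hashtags (`#elxn`, `#debatenight`, `#climate`) are hypothetical, and `collect_by_hashtag` and `methodology_note` are helper names invented here, not part of any platform's tools. Real data would come from a platform's own search or export.

```python
# Hypothetical toy sample of posts (not real data).
posts = [
    {"user": "a", "text": "Housing is my top issue #elxn"},
    {"user": "b", "text": "Watching the debate tonight #debatenight"},
    {"user": "c", "text": "Climate plan looks weak #elxn #climate"},
    {"user": "d", "text": "No hashtag, but I care about health care"},
]


def collect_by_hashtag(posts, hashtag):
    """Return posts containing the hashtag as a whitespace-separated token.

    Matching is case-insensitive. A hashtag followed directly by
    punctuation (for example "#elxn,") would not match in this simple sketch.
    """
    tag = hashtag.lower()
    return [p for p in posts if tag in p["text"].lower().split()]


def methodology_note(posts, hashtag):
    """Build a one-sentence disclosure of what was and was not collected."""
    collected = collect_by_hashtag(posts, hashtag)
    missed = len(posts) - len(collected)
    return (
        f"Collected {len(collected)} of {len(posts)} posts using {hashtag}; "
        f"{missed} other posts in this sample were not captured."
    )


print(methodology_note(posts, "#elxn"))
```

In this toy sample, the posts that did not use the chosen hashtag can be counted. In practice, a journalist usually cannot know how many relevant posts a single hashtag misses, which is one more reason to tell readers exactly which hashtag was collected.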

This is a lot of additional content and work. It requires time, energy and expertise to get it right. But journalists need to get it right if they want to use social media data to report what the public thinks. This work helps to ensure that news stories are perceived as trustworthy and helps develop digital literacy amongst readers.

Crucially, for journalists to play this important educational role, they need support. Learning to evaluate this information and finding clear and concise ways to communicate it is neither easy nor cheap. Newsrooms should employ and rely on experts who understand how social media data is generated, collected and organized. These experts can help guide journalists’ use of social media in their reporting.

The role of journalism is more important than ever as disinformation spreads and digital information increasingly becomes a source for understanding public opinion. Journalism has always had a strong foundation in ethical professional practice, and this needs to be continued and strengthened in a social media age.

Photo: Shutterstock/ By SofiaV



Elizabeth Dubois
Elizabeth Dubois is an assistant professor at the University of Ottawa and a fellow at the Public Policy Forum.
Anatoliy Gruzd
Anatoliy Gruzd is a Canada Research Chair in Social Media Data Stewardship, associate professor at the Ted Rogers School of Management at Ryerson University, and director of research at the Social Media Lab.
Jenna Jacobson
Jenna Jacobson is an assistant professor at Ryerson University in the Ted Rogers School of Retail Management and a research fellow at the Social Media Lab in Toronto. Her research focuses on social media, branding, and user behaviour.

You are welcome to republish this Policy Options article online or in print periodicals, under a Creative Commons/No Derivatives licence.
