Why Do You Need Clean Data in Social Analytics?

January 23, 2017 / by Shiho Hashimoto


If you are analyzing your social media activities without cleaning your data first, you are wasting your time. Clean data is the key to getting your strategy right. 

We touched on what clean data is in an earlier blog post. Here is a deeper look at the topic.

In social media, conversations aren't always that clean.

By unclean conversations we mean irrelevant ones. Given the huge volume of social conversations, a large share of them will naturally be irrelevant.

For companies working with social data, irrelevant data means spam, ads, posts by the company itself or its employees, and posts that are not related to the brand. In other words, "noise". Also, not everything is posted by real humans. Some posts are created by social media robots, so-called bots.

Simply put - spam.
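To make the noise categories above concrete, here is a minimal sketch of a rule-based filter. Everything in it is an assumption made for illustration: the `Post` type, the account names, the spam phrases, and the `is_bot` flag (which a real tool would have to get from its own bot detection). Real analytics tools use far more sophisticated methods.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str
    is_bot: bool = False  # assumed to come from a separate bot detector


# Hypothetical values, for illustration only
BRAND_ACCOUNTS = {"acme_official"}
EMPLOYEE_ACCOUNTS = {"acme_jane"}
SPAM_MARKERS = ("buy now", "click here", "free followers")
BRAND_TERMS = ("acme",)


def is_noise(post: Post) -> bool:
    """Return True if the post falls into one of the noise categories."""
    text = post.text.lower()
    if post.is_bot:
        return True  # not written by a real human
    if post.author in BRAND_ACCOUNTS or post.author in EMPLOYEE_ACCOUNTS:
        return True  # posted by the company itself or its employees
    if any(marker in text for marker in SPAM_MARKERS):
        return True  # spam or ads
    if not any(term in text for term in BRAND_TERMS):
        return True  # not related to the brand
    return False


def clean(posts: list[Post]) -> list[Post]:
    """Keep only the posts that are not noise."""
    return [post for post in posts if not is_noise(post)]


sample_posts = [
    Post("acme_official", "New ACME rocket skates out today!"),
    Post("promo4u", "Click here for free ACME coupons"),
    Post("coyote_fan", "My ACME anvil arrived dented, not happy"),
    Post("newsbot", "ACME stock update", is_bot=True),
    Post("random_user", "Great weather today"),
]

for post in clean(sample_posts):
    print(post.author, "-", post.text)
```

In this made-up sample, only the post from a real customer talking about the brand survives the filter.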

Let's say that the ACME brand has received 12 500 messages during the past seven days. Of all those messages, 77 % were noise, which means only 23 % of the whole conversation is relevant and created by real humans. So ask yourself: would you rather base your decisions on all 12 500 messages, most of them spam, or on the 2 875 messages actually written by real users?
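The arithmetic behind the ACME example can be checked in a few lines. This is only a sketch; the variable names are ours.

```python
total_messages = 12_500
noise_percent = 77  # share of messages that are noise

# Integer arithmetic avoids floating-point rounding surprises
noise_messages = total_messages * noise_percent // 100
relevant_messages = total_messages - noise_messages
relevant_percent = 100 - noise_percent

print(f"Noise: {noise_messages} messages ({noise_percent} %)")
print(f"Relevant: {relevant_messages} messages ({relevant_percent} %)")
```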

Filtering out the noise is very time-consuming, yet vital. In marketing, as well as in product development, it’s crucial to base your decisions on reliable and relevant data in order to understand what your customers really want.

When evaluating social media analytics and business intelligence tools, look for a high-quality tool that does the data cleaning for you. The goal of your analysis is to get actionable insight that helps you make smarter strategic decisions. You should not spend a single moment trying to figure out how to clean data. And yet, clean and relevant data is the alpha and omega of everything.

The first step in validating whether your current analytics tools use clean data is to benchmark them against other analytics tools.

