Editor’s note: Dr. Michael Wu is the Principal Scientist of Analytics at Lithium where he is currently applying data-driven methodologies to investigate and understand the complex dynamics of the social Web.
Social media is a required avenue for brands to engage their customers. However, social media engagement is primarily based on conversations and personalized interactions that are difficult to scale. Influencer marketing provides brands with the leverage to reach many by engaging only a few elusive influencers. This strategy depends on the accurate measurement of people’s digital influence, so brands can figure out who they need to engage. Although this has spawned an exciting new industry around influence measurement, there are also many problems with influence measurement that brands need to understand.
One reason brands don’t understand digital influence is that they don’t seem to realize that no one actually has any measured “data” on influence (i.e. explicit data that says precisely who actually influenced whom, when, where, how, etc.). All influence scores are computed from users’ social activity data based on some models and algorithms of how influence works. However, anyone can create these models and algorithms. So who is right? How can we be sure your influence score is correct? In other words, how can we validate the models that vendors use to predict people’s influence?
To illustrate how statistical validation works, I will use a simpler and more tangible example, where we are trying to predict the stock price of some company, say Apple.
Build A Predictive Stock Model And Validate It
First, we need to build a model (or an algorithm) that takes various input data about Apple that might be predictive of its stock price. We can literally pick any data that we feel could potentially affect Apple’s stock price in any way as input: sales data, fundamental company data, social data, competitor data, and industry and market data.
Regardless of how much data we put into the model, and how complex and brilliant the model might be in combining this data, the final test for whether the model actually works is to see if it can predict the real stock price of Apple.
There are three requirements to validate any statistical model or algorithm:
1. A model or algorithm that computes some predicted outcome (e.g. the stock price of a company, the weather in San Francisco tomorrow, an earthquake, or someone’s influence)
2. An independent measure of the outcome that the model is trying to predict
3. A measure that compares and quantifies how closely the predicted outcome matches the independently measured outcome
The most important of these is No. 2: having an independent measure of the outcome. It is pretty obvious if you think about it. To validate whether your model can accurately predict the stock price for Apple, you must have the actual stock price of Apple, so you can compare the prediction against the actual stock price.
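As a minimal sketch of how the three requirements fit together, the Python snippet below compares a list of predicted prices against independently measured prices using root-mean-square error. The numbers are made up for illustration, and RMSE is only one possible comparison measure; the article does not prescribe a specific one.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predictions and measured outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical values, for illustration only.
predicted_prices = [101.0, 103.0, 99.0, 105.0]  # requirement 1: model output
actual_prices = [100.0, 104.0, 98.0, 107.0]     # requirement 2: independent measurement
error = rmse(predicted_prices, actual_prices)   # requirement 3: comparison measure

print(f"RMSE: {error:.4f}")
```

The smaller the error, the more closely the predictions track the measured outcome. The comparison only means something if the second list was obtained independently of the model.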
What Independently Measured Means And Why It Is So Important
However, many people don’t understand what it means to be “independent.” To be independently measured means the measured outcome is completely independent of the model. In the example of predicting Apple’s stock price, it means you cannot use any of the actual stock price data as part of the input to the model. If you do, then obviously the model will predict the stock price, because it already has information about the actual stock price. So, the actual stock price that you thought you measured independently is no longer truly independent of the model.
Hence the fact that this model is able to predict Apple’s stock price well is meaningless, because it didn’t actually predict anything. After all, it already has the actual stock price that it is trying to predict. This model is basically cheating because it’s based on circular reasoning.
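The circularity can be shown with a toy sketch. Both "models" below are hypothetical and so is the data: `leaky_model` receives the actual price as one of its inputs, while `honest_model` uses only a made-up sales figure. The leaky model scores a perfect zero error, which says nothing about its ability to predict.

```python
def leaky_model(features):
    # Cheats: one of its inputs is the very price it claims to predict.
    return features["actual_price"]

def honest_model(features):
    # Hypothetical toy rule that uses only inputs excluding the outcome.
    return 90.0 + 0.5 * features["units_sold_millions"]

def mean_abs_error(model, rows):
    """Average absolute gap between a model's output and the measured price."""
    return sum(abs(model(row) - row["actual_price"]) for row in rows) / len(rows)

# Made-up observations, for illustration only.
days = [
    {"units_sold_millions": 20.0, "actual_price": 101.0},
    {"units_sold_millions": 24.0, "actual_price": 104.0},
    {"units_sold_millions": 18.0, "actual_price": 98.0},
]

leaky_error = mean_abs_error(leaky_model, days)
honest_error = mean_abs_error(honest_model, days)

print("leaky model error:", leaky_error)
print("honest model error:", honest_error)
```

The honest model has a nonzero error, but that error is the only one of the two numbers that carries any information about predictive accuracy.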
Since all influence vendors use models to predict people’s influence, these models must be validated for accuracy. However, influence vendors do not properly validate their algorithm for the following reasons:
No Data: They don’t have an independent source of influence data. So they can only validate their algorithm by gut feelings and intuitions, which is often not good enough. Would you invest your money based on a stock model that is validated by gut feelings and intuitions?
Overgeneralization: They validate their algorithm based on a handful of known influencers and try to overgeneralize their algorithm to millions of users.
Invalid Circular Validation: They use reciprocity data, such as likes and retweets (which are decent proxies for one’s digital influence), for validation. But they also use these data in their algorithm. This is a common error in model validation, because this circular validation process doesn’t give you any information about the accuracy of the algorithm. To properly validate any model, you must have an independent measure of the outcome, and that means you cannot use any of it in your model.
So should you trust your influence scores? Just ask your influence vendor how they validate their model.
An Old Story Of Search Engine Optimization
Another serious problem with most influence scoring models is influence engine optimization (IEO), the counterpart of search engine optimization (SEO).
As the Web grew in the early 90s and turned into a big data problem, human-maintained Web directories were no longer a scalable solution for information retrieval from the Internet. Powerful search engines (e.g. Lycos, AltaVista, Excite, Yahoo, Inktomi, Google) were developed to index the Web in order to provide both scalable and efficient information retrieval. In order to present the retrieved information to the user in a more meaningful way, search engines needed to rank their search results in terms of relevance and show the most relevant pages first.
Google developed an innovative relevance ranking algorithm — PageRank — based on the hyperlink structure of the Web. The PageRank algorithm basically takes inputs (i.e. the hyperlink structures of the entire Web) and cranks out a score for every webpage that, in theory, represents its authority on the Web.
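To make the idea concrete, here is a minimal sketch of the textbook power-iteration form of PageRank on a four-page toy web. It is a simplified illustration, not Google’s production algorithm. The damping factor of 0.85, the iteration count, and the `toy_web` link structure are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Every linked page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical toy web: page "c" has the most inbound links, "d" has none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)

for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

In this sketch a page’s score depends only on which other pages link to it, not on anything the page says about itself.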
As we learn from behavioral economics, when we put a score on something, we create an incentive for some people to get a better score. This is human nature. Because people care about themselves, they care about any comparisons that concern them, whether it is their websites, cars, homes, their work, or just themselves. Some would go so far as to cheat the algorithm just to get a better score. In fact, Google’s PageRank algorithm has created an entire industry (i.e. SEO) around gaming its score. Although SEO specialists may deny that they are gaming the PageRank algorithm, they are constantly finding ways to artificially increase the PageRank of your websites. Is this cheating? Some SEO schemes may be acceptable to Google, but there are definitely some that are considered cheating (e.g. link farms and spamdexing).
The New Story Of Influence Engine Optimization
Today, the social Web has grown and gained massive adoption. And again, influence vendors are putting a number on something (i.e. people’s influence). So people will again find ways to artificially increase their influence score. But there are three aspects that are different this time.
Influence scoring algorithms are much more susceptible to gaming than PageRank because someone’s influence score depends heavily on his own behavior. This should be apparent from the fact that all influence scores are computed from people’s social media activities data.
Unlike the PageRank score of a page, someone’s influence score is fed back directly to him. This means no one needs an IEO expert to tell him how to behave in order to increase his influence score. A user can easily discover the effects of his own behaviors on his influence score all by himself. So not only are influence scoring algorithms more susceptible to gaming, they are also easier to game.
Finally, compared to Google, influence vendors have little to no mechanisms with which to discover and combat these cheating behaviors.
IEO is an inevitable consequence of scoring people’s influence. So, do influence scores still have any meaning? A score is definitely not a measure of someone’s influence, and it’s probably not even a measure of his potential influence anymore due to IEO.
An influence score is really just a measure of how well people game the influence scoring algorithm.
If you tweeted a lot yesterday and your influence score jumps up today, you’ve just discovered that you can increase your influence score by tweeting more. Knowing this, would you continue to tweet more? Most people probably would, especially if they care about their score. This has created a lot of loudmouths who are not actually influential in any meaningful way. Such a person’s influence score is merely a reflection of the fact that he has successfully gamed the algorithm into giving him a higher score simply by tweeting more, without actually doing anything truly influential.
Because behaviors that game the system are typically a lot simpler than behaviors that are truly influential, IEO will tend to change people’s behavior in a way that pushes them further away from being truly influential. Ironic, isn’t it? That’s why I call this the influence irony.
What does this mean? It means influence scores will become less accurate as a measure of someone’s potential influence, and more of a reflection of how successfully someone has gamed the influence scoring algorithm.
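A toy sketch shows why volume-based scoring is so easy to game. The scoring rule below is entirely hypothetical and is not any vendor’s actual formula. It simply assumes that raw posting volume contributes to the score, as in the tweeting example above.

```python
def naive_activity_score(tweets_posted, retweets_received):
    # Hypothetical rule: posting volume counts toward the score,
    # so a user can raise it through his own behavior alone.
    return tweets_posted + 2 * retweets_received

# Made-up users, for illustration only.
quiet_but_heard = naive_activity_score(tweets_posted=5, retweets_received=40)
loud_but_ignored = naive_activity_score(tweets_posted=200, retweets_received=0)

print("quiet but heard:", quiet_but_heard)
print("loud but ignored:", loud_but_ignored)
```

Under this assumed rule, the user whom nobody retweets outscores the user whose few tweets are widely shared, purely by posting more.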