Who's afraid of Cambridge Analytica?

Cunning marketers can predict and manipulate our behaviour. Facebook knows your soul. Google is hacking your brain.

I’ve noticed many people being scared of digital manipulation, Cambridge Analytica style, where companies track your digital footprints to show you microtargeted content that penetrates your brain and manipulates your mind. If such a thing works, it seems possible for powerful people to hire such firms to direct public opinion and election results through political campaign ads targeted specifically to our psychological profiles.

That does sound a bit scary.

But whether we should actually worry about it depends, it seems to me, on two factors: the accuracy of what Cambridge Analytica called “psychographic profiling” (predictive power) and the efficacy of the consequential microtargeting (persuasive power).

Let’s dive in.

Predictive power

There’s a famous paper, cited over 2,500 times, by Michal Kosinski, David Stillwell and Thore Graepel. It seeks to predict traits — such as sexual orientation, ethnic origin, political views, religion, personality, and intelligence — and attributes like age, gender, and relationship status from Facebook likes. The researchers asked 58,000 folks to provide access to their Facebook likes and also to fill out some questionnaires, which gave them each participant’s “true score” on these demographic and psychometric variables. This allowed them to test the accuracy of the likes-based prediction model by correlating its predictions about the participants’ personality with “the truth”.

For dichotomous attributes like gender and sexual orientation, the chance the likes-based model got the right answer was pretty decent, hovering around 70%:

For the continuous traits such as personality and happiness, the correlation between the model’s guesses and the truth hovered around only 0.3:

A word on the transparent bars. They indicate test-retest reliability, but what does that mean? Psychological traits cannot be measured directly. As a consequence, their values can only be measured approximately, for example, by evaluating responses to questionnaires. The test-retest reliability of a questionnaire is the correlation between the questionnaire scores obtained by the same respondent at two points in time. As you can see, the correlation between the predicted and actual Openness score (r = 0.43) was very close to the test–retest reliability for Openness (r = 0.50). This indicates that for the openness trait, observation of the user’s likes is roughly as informative as using their personality test score itself.

A correlation of ~0.3 between the model’s guess and the correct answer means that its predictions will often be wrong. So although the paper claims, in its abstract, to “show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes,” the correlations don’t strike me as establishing that at all.
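To get an intuition for what a correlation of ~0.3 actually buys a targeter, here’s a quick simulation — my own sketch, assuming (simplistically) that both the true trait and the model’s prediction are normally distributed. The question it asks: given two random people, how often does the model even rank them in the right order?

```python
import math
import random

def rank_two_people(r, trials=200_000, seed=42):
    """Monte Carlo: how often does a predictor that correlates r with
    the truth put two randomly chosen people in the right order?"""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        # Two people with true trait scores (standard normal).
        t1, t2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Predictions built to share correlation r with the truth.
        p1 = r * t1 + math.sqrt(1 - r * r) * rng.gauss(0, 1)
        p2 = r * t2 + math.sqrt(1 - r * r) * rng.gauss(0, 1)
        if (p1 - p2) * (t1 - t2) > 0:
            correct += 1
    return correct / trials

# For bivariate normals the exact answer is 1/2 + arcsin(r)/pi.
print(rank_two_people(0.3))             # ≈ 0.60
print(0.5 + math.asin(0.3) / math.pi)   # ≈ 0.597
```

About 60% — barely better than the 50% you’d get from a coin flip.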

In line with these low correlations, it has already been observed that audience segments from leading data brokers vary greatly in quality and are often inaccurate.

A 2018 meta-analysis on the predictive power of social media data for users’ personality noted that correlations between observed and predicted personality scores almost exclusively range from .30 to .40. The latest meta-analysis I could find, from 2020 by Davide Marengo and Christian Montag, finds a correlation of .34.

This indicates that the Kosinski study was fairly typical.

Which is perhaps why the abstract seems designed to convince readers that the shock value of psychographic profiling is greater than shown by the results, rather than to honestly report on the study’s mixed-to-weak outcomes.

Sure, perhaps my likes reveal my age, gender, race and relationship status with some reliability, but that does not seem nearly precise enough to press my buttons. And showing me an ad because I’m Caucasian and in my 20s seems a far cry from microtargeting.

Yet Facebook likes are what Cambridge Analytica relied on to, some say, target the right kind of people and turn some states red instead of blue. But these data suggest that, if that’s true, the firm probably wasn’t capable of identifying “the right kind of people” in the first place. Let alone influencing them.

As Slate’s Will Oremus concluded about Cambridge Analytica’s psychographic profiling, as “sinister” as it sounded, it was “an imprecise science at best and ‘snake oil’ at worst.”

I should note that there’s some disagreement about how scary a correlation of ~.34 is. Marengo and Montag emphasize that “because the overall meta-analytic effect size presented here is moderate, it appears that the analysis of digital footprints still falls short in predicting such characteristics with accuracy allowing for assessment at the individual level.” In other words, you can’t use it to microtarget.

But other researchers point out that .34 means these computer-based judgments correlate with participants’ self-ratings about as well as their friends’ ratings do. In other words, there’s something off about these personality tests, or we’re just really bad at filling them out on our friends’ behalf, or we don’t know our friends remotely as well as we think, or the AI knows us better than we might like.

In the end, the debate over how to interpret the accuracy of the .34 correlation is not so relevant. What matters more is whether the targeted advertising based on this (accurate or not) psychographic profiling works.

Persuasive power

Targeted advertising can, it seems, have some (very) limited effects on product purchases. For example, Matz et al. (2017) reported uplifts in conversions of up to 50% from personality-based targeting of ads.

One of the experiments — to get a feel for how these studies go — ran a beauty advertisement. Extroverts got the message “Dance like nobody’s watching.” The ad appearing on introverts’ timelines read “Beauty doesn’t have to shout.” These personalized ads led to extra clicks, which led to extra purchases for the retailer.

Once more, the abstract is not devoid of academic spin:

“In three field experiments that reached over 3.5 million individuals with psychologically tailored advertising, we find that matching the content of persuasive appeals to individuals’ psychological characteristics significantly altered their behavior as measured by clicks and purchases. Persuasive appeals that were matched to people’s extraversion or openness-to-experience level resulted in up to 40% more clicks and up to 50% more purchases than their mismatching or unpersonalized counterparts. Our findings suggest that the application of psychological targeting makes it possible to influence the behavior of large groups of people by tailoring persuasive appeals to the psychological needs of the target audiences.”

That sounds convincing, and, yes, even a bit scary, until you look at the actual numbers. These 40% and 50% increases hardly amount to anything, because the baseline is so low. (In the same way, I still have only 1.5 cookies if I increase my meager starting amount of one cookie by a whopping 50%.)

Way to go, guys: your targeted advertising increased the conversion rate to an amazing 0.015% and 0.8%! In other words, it added a few dozen purchases after millions of people had seen the ad. Their first study, for instance, got 390 conversions from ads shown to 3,129,993 users. So even though the relative uplift is impressive, the overall impact was still tiny.
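The back-of-envelope arithmetic, using the reach and conversion numbers from that first study (applying the abstract’s headline 50% uplift to this experiment is my simplifying assumption, for illustration):

```python
users = 3_129_993   # people reached in the first Matz et al. study
matched = 390       # purchases under personality-matched ads

print(f"conversion rate: {matched / users:.5%}")   # ≈ 0.01246%

# If matched ads convert 50% better than unmatched ones, the
# unmatched baseline would have been 390 / 1.5 = 260 purchases.
baseline = matched / 1.5
print(f"extra purchases from targeting: {matched - baseline:.0f}")  # 130
```

A headline-grabbing 50% uplift, and roughly 130 extra purchases out of three million people reached.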

Meta-lesson of the day: don’t just read abstracts, look at the data.

Such small effects are common in online advertising, which is why one of the more prominent papers in the field was titled On the near impossibility of measuring the returns to advertising. Minuscule results are tough to detect.

What about using personality targeting for political advertising?

Note that even if the influence of Cambridge Analytica’s campaign had been as large as that recorded in experiments on beauty products, it would only have swayed a few thousand voters.
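A back-of-envelope version of that claim — the 100 million reach is a figure I’m assuming for illustration, and the per-user effect is carried over, generously, from the beauty-product experiment:

```python
# Extra purchases per exposed user in the Matz et al. beauty
# experiment: ~130 extra purchases across ~3.1 million users.
extra_per_user = (390 - 390 / 1.5) / 3_129_993

# Suppose a political campaign reached 100 million voters (an assumed
# figure) and swayed them at that same per-user rate.
swayed = extra_per_user * 100_000_000
print(round(swayed))   # ≈ 4,150 voters — "a few thousand"
```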

In reality, its influence was likely nil.

In 2018, political scientists Joshua Kalla and David Broockman published a meta-analysis of all the well-designed field experiments studying the effects of political communication. That is, their meta-analysis included studies that sent flyers to a random subset of counties, canvassed a random subset of houses, called a random subset of potential voters, and so forth. By following up with opinion surveys, this design allowed the researchers to precisely estimate the effects of an intervention — the letter, the face-to-face discussion, the call — on participants who had been exposed to it, compared with otherwise similar participants who hadn’t.

The result:

The best estimate for the persuasive effects of campaign contact and advertising—such as mail, phone calls, and canvassing—on Americans’ candidate choices in general elections is zero. Our best guess for online and television advertising is also zero.

Studies looking into the efficacy of online ads for changing political preferences, however (im)precisely they are microtargeted, consistently find no evidence that such content changes voting preferences or behavior.

Does that mean political communication has no effect at all? Let’s just say that it functions more often as an agent of reinforcement than as an agent of change. Microtargeted ads most likely work by reinforcing people’s preconceived notions. It’s very unlikely that they actually changed anyone’s mind about whether to, for example, vote for Donald Trump or Hillary Clinton.

Realistically, advertising does something, but only a small something – and at any rate it does far less than most of us believe. So there’s no reason to be afraid of digital manipulators with the evil power to change hearts and minds. And every reason to be skeptical of reporting and documentaries that imbue microtargeting with Derren Brown-like abilities to tinker with perceptions and sway credulous masses.