Our “Ask a Samba Expert” series explores a variety of topics across data science, analytics and research to uncover behind-the-scenes details about the methodology and expertise of Samba’s ground-breaking Measurement Sciences team.
What is your background?
I have found that many people in the research, analytics and data science space have some quirky backgrounds and I’m no exception. I was first trained as a High School Math teacher but ended up going to Stanford to get a Master’s degree in Statistics with the idea of applying statistical methods in the health sciences.
I then earned another Master’s degree—this time in Health Sciences from Johns Hopkins University, which led me to a decade of work in the Biotech/Health Sciences industry. About 20 years ago, I made the leap to digital and joined Yahoo! as one of their very early data researchers (the term data science hadn’t been invented yet!). I’ve worked at large tech companies (Yahoo!, Microsoft), large research firms (Kantar Millward Brown and Ipsos) and some smaller, innovative companies. Now at Samba TV, I lead our Measurement Sciences division, overseeing more than sixty data scientists, engineers, analysts and research specialists.
How did the concept of synthetic control groups land on the Measurement Science team’s radar?
Control groups are extremely important when it comes to measuring effects. For example, suppose I want to test the effectiveness of a headache medicine. To do the test, I give 100 people this medicine when they have a headache, I wait an hour and ask them if they still have a headache. If only 20 still have a headache, do I declare an 80% success rate? No. Because I don’t know how many would have still had a headache had I NOT administered the medicine.
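The arithmetic behind this example can be sketched in a few lines. The numbers below are illustrative assumptions (including the hypothetical 70% of an untreated group still having a headache), not real study data:

```python
# Why a control group matters, using the headache example above.
# All counts are illustrative assumptions, not real study data.

treated_total = 100        # people given the medicine
treated_headache = 20      # still have a headache after an hour
control_total = 100        # hypothetical untreated control group
control_headache = 70      # still have a headache with no medicine

# The naive "success rate" ignores what would have happened anyway.
naive_success = 1 - treated_headache / treated_total      # 80%

# Lift over control isolates the medicine's actual effect.
treated_recovery = 1 - treated_headache / treated_total   # 80%
control_recovery = 1 - control_headache / control_total   # 30%
lift = treated_recovery - control_recovery                # 50%

print(f"Naive success rate: {naive_success:.0%}")
print(f"Lift over control: {lift:.0%}")
```

The 80% naive figure overstates the effect, because 30% of people would have recovered on their own; only the 50-point difference is attributable to the medicine.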
In the area of media and ad effectiveness research, the problem of control groups has existed in TV measurement since the inception of the medium. For decades the standard methods included econometric modeling that correlates GRPs delivered at a DMA/geo level to “brand tracking” survey results that measure brand awareness, purchase consideration and other metrics, also at a DMA/geo level. Multivariate statistical methods are used to tease out the effects of ad exposure on these brand-based survey results, but researchers knew that correlation does not equal causation.
Digital media researchers got excited because online ad exposure could be linked to a person or a household. This opened up the possibility of connecting ad exposure to sales for the first time. I’m proud to have been part of the team that created the first closed-loop sales lift system for digital advertising in the early 2000s called “Consumer Direct” which was applied to grocery store product sales. The system used randomized control groups.
With the advent of household-level TV data, scientists at Samba TV wondered if household-level closed-loop ad effectiveness measurement could be done the same way it had been developed for digital advertising. The problem was that in digital advertising it’s possible to create a random hold-out group to use as a control group, whereas in linear TV advertising the concept of a hold-out doesn’t exist: if you watch a particular show, and don’t skip the commercials, you will see the ads.
At a high level, how do synthetic control groups work?
If we were to test the effectiveness of linear TV advertising in the same way one can test headache medicine, it would be very challenging. You’d have to take a sample of people who intend to watch a certain TV show (and therefore be exposed to a certain commercial) and tell a random subset of those people not to watch the show. Then, wait and see if the outcome is different between the two groups. It’s clear this would be an impractical way to test TV ad effectiveness.
But there are some people who intend to watch a show and end up not watching. Perhaps they went to a school event with their kids, took a friend out for dinner, stayed late at work, etc. There are a variety of seemingly random reasons why a person might not watch a TV show that they intend to watch. We decided to closely examine this group hoping they could be used as a control group.
The first step in using this group as a control is to identify people with a high likelihood of watching a given show. This is fairly straightforward with modern machine learning. Second, we have to be sure that the control group’s characteristics closely match those of the exposed group. This is also straightforward because Samba TV has access to historical viewership data and demographic information.
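The two steps above can be sketched with toy data. This is a minimal illustration, not Samba TV’s actual pipeline: the household records and the threshold standing in for a trained likelihood-to-watch model are invented for the example.

```python
# Sketch of the two-step synthetic control construction, with toy data.
# A simple threshold on past viewing stands in for a real ML model;
# all household records and numbers are illustrative assumptions.

households = [
    # hours of this genre watched last month, watched the show?, converted?
    {"genre_hours": 30, "watched": True,  "converted": True},
    {"genre_hours": 28, "watched": True,  "converted": False},
    {"genre_hours": 27, "watched": False, "converted": False},  # likely viewer who missed it
    {"genre_hours": 25, "watched": False, "converted": False},  # likely viewer who missed it
    {"genre_hours": 2,  "watched": False, "converted": False},  # unlikely viewer: excluded
]

# Step 1: identify households with a high likelihood of watching.
# A real system would score households with a model trained on
# historical viewership; here a threshold stands in for that score.
likely = [h for h in households if h["genre_hours"] >= 20]

# Step 2: split the likely viewers into exposed (watched, hence saw
# the ad) and the synthetic control (expected to watch but, for
# incidental reasons, didn't). Because both groups were drawn from
# the same likely-viewer pool, their characteristics should match.
exposed = [h for h in likely if h["watched"]]
control = [h for h in likely if not h["watched"]]

def conversion_rate(group):
    return sum(h["converted"] for h in group) / len(group)

print(f"Exposed conversion rate: {conversion_rate(exposed):.0%}")
print(f"Control conversion rate: {conversion_rate(control):.0%}")
```

The difference between the two printed rates is the lift attributable to the ad, rather than to the kind of household that watches the show.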
What are the challenges or pain points that synthetic control groups solve?
The biggest challenge is that when one doesn’t use a proper control group, the conversion lift results look artificially high. If you are in the business of selling advertising, you might be happy that results look unusually high.
However, astute buyers of the research will quickly begin to question the scientific integrity of the findings. So, the key pain point that synthetic control groups solve is that they provide a much more accurate measure of the true advertising effect which enables brands to make more informed decisions about which ads are working best for which audiences.
How does the final solution (Causal TV Attribution) compare with the initial vision for the project?
I think the final solution is much more scaled and generalized than originally conceived. Samba TV now has an automated way to generate control groups for any TV ad campaign that we track.
Describe the impact you’ve seen on client decision-making where Causal TV Attribution measurement was implemented.
A simple example is given below. The grey bars represent conversion rates from synthetic control groups. In the absence of these control groups, a media buyer would conclude that TV network 1 is the best performer because the red bar is highest for network 1. This buyer might decide to stop buying ads on network 4. However, when compared to a synthetic control, a buyer would realize that networks 3 and 4 are actually performing very well.
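The same comparison can be written out numerically. The rates below are made up purely to mirror the pattern in the chart (they are not real campaign data): network 1’s raw conversion rate is highest, but so is its control baseline, while networks 3 and 4 show the largest true lift.

```python
# Illustrative per-network comparison of raw conversion rate vs. lift
# over a synthetic control. All rates are made-up numbers chosen only
# to mirror the pattern described in the chart.

# network: (exposed conversion rate, synthetic-control conversion rate)
networks = {
    "network 1": (0.060, 0.055),  # highest raw rate, but high baseline too
    "network 2": (0.045, 0.040),
    "network 3": (0.040, 0.015),  # large true lift
    "network 4": (0.040, 0.010),  # low raw rate, largest true lift
}

for name, (exposed, control) in networks.items():
    lift = exposed - control
    print(f"{name}: raw {exposed:.1%}, lift over control {lift:.1%}")

best_raw = max(networks, key=lambda n: networks[n][0])
best_lift = max(networks, key=lambda n: networks[n][0] - networks[n][1])
print("Best by raw rate:", best_raw)    # the buyer's naive conclusion
print("Best by true lift:", best_lift)  # the causal conclusion
```

Ranking by raw conversion rate and ranking by lift over control produce different answers, which is exactly the decision-making difference described above.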