Dunking on marketing attribution is popular on Twitter. I’ve worked on marketing attribution projects, which makes me highly sympathetic to the claims of people like Pedram Navid. I generally endorse them.
Pedram is expressing the bomber-plane-with-red-dots.jpg meme. Most marketing attribution projects account for only the observed journeys of users and ignore the unobserved. What if the unobserved parts of the journey are actually the most important ones?
Using some sophisticated, probabilistic techniques, we can actually account for the unobserved parts of the journey. With the unobserved demystified, we can gain some understanding of how to attribute the strength of marketing channels.
But we have reasons to remain skeptical of the whole endeavor. Pedram’s take is probably correct for 99.99+% of attempts at marketing attribution, and it seems likely that marketers aren’t incentivized to really care anyway.
What is marketing attribution
Let’s define marketing attribution as the attempt to quantify the influence of a growth activity on a consumer’s purchase decision. This is roughly how the academic literature defines it, but I want to expand the scope beyond just advertising.
In plain speak, a marketing department might spend $10M each month. Imagine they equally split that $10M budget across Google search, YouTube, Instagram, and TikTok. Meanwhile, the growth team is doing some SEO stuff. Salespeople are making cold calls. The company’s founders are tweeting about a fire new feature that just dropped.
Those are all growth activities.
Let's say in that one month the company brought in 10,000 new customers. Marketing attribution is an attempt to award each of those growth activities their proper due. At the end of the month, the marketing department wants to have a spreadsheet that expresses something like, “Our YouTube campaigns were responsible for 2,000 new customers, our TikTok campaign did 1,000…”
What most marketing attribution looks like
Let’s cover how attribution looks 99.99% of the time. These approaches all share the flaws Pedram pointed out.
All the major ad platforms give marketers tools to evaluate their ad campaigns. Surprise! They’re biased to make the platform’s efficacy look really good. I think Google Analytics’ default rule is something like, “If the person makes a purchase and clicked on one of your ads in the last 28 days, give conversion credit to the ad campaign.”
By default these platforms try to give all the credit to themselves, and most(?) marketers catch on to this fact.
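For intuition, here’s a minimal sketch of what such a platform-default rule amounts to. The 28-day click window matches the rule described above, but the function name and data are hypothetical:

```python
from datetime import datetime, timedelta

# Hypothetical sketch of a platform-style default rule: the platform
# credits itself if the purchase follows any ad click within 28 days.
ATTRIBUTION_WINDOW = timedelta(days=28)

def platform_credits_itself(ad_clicks, purchase_time):
    """ad_clicks: datetimes when the user clicked this platform's ads."""
    return any(
        timedelta(0) <= purchase_time - click <= ATTRIBUTION_WINDOW
        for click in ad_clicks
    )

clicks = [datetime(2023, 3, 1), datetime(2023, 3, 20)]
print(platform_credits_itself(clicks, datetime(2023, 3, 25)))  # True
```

Note what the rule never asks: whether any *other* channel touched the user in those 28 days. Every platform running this logic in parallel will happily claim the same conversion.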
Last Touch Attribution (LTA)
Most marketers realize generating readouts from the ad platform’s campaign manager isn’t the best idea. Instead, they generate a rule for their organization like, “The last growth activity that sent the customer here will get 100% of the credit for the customer.”
In practice, marketers can tinker with the settings of Google Analytics or whatever measurement tool they use and get last touch attribution out of the box.
In practice, LTA typically works by reading UTM parameters on inbound links. But you can’t possibly append a UTM to every growth activity (cold calls and word of mouth don’t produce tagged links), which means LTA will never credit those activities for the customer. Choosing the last thing that brought the customer to your site is also arbitrary: why not credit whatever activity initially exposed them to your brand? And what if the customer clicked on that final ad but was predetermined to purchase the item anyway? Why should the ad get 100% of the credit?
LTA is insufficient for many reasons, but I think those are probably the strongest arguments against it.
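To make the rule concrete, here’s a minimal sketch of LTA, assuming touchpoints are (timestamp, channel) pairs recovered from tagged sessions. The data and function name are hypothetical:

```python
def last_touch_attribution(touchpoints):
    """touchpoints: list of (timestamp, channel) pairs for one converting user.
    LTA gives 100% of the credit to the most recent touch before conversion."""
    if not touchpoints:
        return None  # untagged/offline journeys get no credit at all
    return max(touchpoints, key=lambda t: t[0])[1]

journey = [(1, "blog post"), (5, "facebook ad"), (9, "google search ad")]
print(last_touch_attribution(journey))  # "google search ad"
```

The blog post and the Facebook ad here contribute nothing to the readout, which is exactly the arbitrariness described above.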
Multi Touch Attribution (MTA)
An even more sophisticated marketer might realize that LTA is garbage. They want to understand the entire user journey and give credit to each growth activity the user touched.
They’ll work with data engineers to have the source of every user session sent to a data warehouse. Once in the data warehouse, analytics engineers will do their magic to join the list of the month’s new customers with the table of each user session.
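A toy sketch of that join-and-count step, using plain Python in place of a real warehouse query (the user IDs and source names are hypothetical):

```python
from collections import Counter

# Hypothetical data; real pipelines would pull these from warehouse tables.
new_customers = {"AAA", "BBB"}
sessions = [
    ("AAA", "youtube_ad"), ("AAA", "google_search"),
    ("BBB", "tiktok_ad"), ("CCC", "youtube_ad"),  # CCC isn't a new customer
]

# "Join" sessions to this month's new customers, counting touches per source.
touches = {user: Counter() for user in new_customers}
for user, source in sessions:
    if user in new_customers:
        touches[user][source] += 1

print(touches["AAA"])  # Counter({'youtube_ad': 1, 'google_search': 1})
```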
They end up with a table that looks partially like this:
| User ID of New Customers | YouTube Ad Sessions | Google Search Sessions | Snapchat Ad Sessions | Conferences Attended |
| --- | --- | --- | --- | --- |
| AAA | 3 | 3 | 3 | 1 |
An MTA approach would go something like the following:
User AAA was a new customer this month. They visited the site 3 times from YouTube Ads, 3 times from Google Search ads, 3 times from Snapchat Ads, and attended 1 brand activation event (a conference). That means YouTube gets 30% of the credit, Google gets 30% of the credit, Snapchat gets 30% of the credit, and the conference gets 10% of the credit.
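That splitting rule (often called “linear” MTA) is easy to sketch. The channel names mirror the example above; the function name is hypothetical:

```python
from collections import Counter

def linear_mta(touches):
    """touches: Counter of sessions/events per channel for one converting user.
    Splits credit proportionally to touch counts (a 'linear' MTA rule)."""
    total = sum(touches.values())
    return {channel: count / total for channel, count in touches.items()}

user_aaa = Counter({"youtube_ad": 3, "google_search_ad": 3,
                    "snapchat_ad": 3, "conference": 1})
print(linear_mta(user_aaa))
# {'youtube_ad': 0.3, 'google_search_ad': 0.3, 'snapchat_ad': 0.3, 'conference': 0.1}
```

Every touch is weighted identically here: the tenth retargeting impression counts as much as the conference that actually closed the deal.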
MTA still isn’t good enough because there’s no way to capture all of the possible offline activities. I was already incredibly generous in assuming the attendance of a conference was actually tracked and entered into the warehouse. But what if someone hears about the product from a friend, then clicks on a Facebook ad and purchases? How is that offline activity getting tracked?
Finally, MTA fails to recognize path dependency. What if the order in which the user was exposed to the growth activities is impactful? Maybe customers first need to read a blog post, hear a testimonial from a trusted influencer, and then see a Facebook ad. If it occurred in any other order, maybe it’s not as useful.
So where does this leave us? If none of these methods of crediting growth activities is sufficient, should we doubt the whole endeavor?
Some intuition that attribution should be possible
Marketers aren’t the only people that are trying to assign credit to actors in complex systems. I immediately think of sports analytics. Baseball analysts try to find the contribution of a player to the win total of a team. This feels very similar to marketing attribution!
It’s helpful to recast the purpose of marketing. Marketing activities try to take customers from the state of disengaged to the state of converted. Marketing gurus frame this journey in several different ways, but the academic literature most commonly breaks it into three stages: awareness, consideration, and purchase.
Therefore, we could model a customer’s journey and their interactions with marketing activities as moving through these three steps.
Unlike a baseball player, we can’t observe the customer rounding the bases toward home plate. How might we tease out these hidden movements through the funnel?
Enter the Hidden Markov Model
Marketing professors Vibhanshu Abhishek, Peter Fader, and Kartik Hosanagar used a "hidden Markov model" (HMM) to try and understand how a customer is impacted by marketing activities. They worked with a large car manufacturer to measure the impact of online ads for a release of a new car.
This approach represents a departure from the attribution attempts above. They’re not coming up with arbitrary rules in advance. Instead, they train the model on some data (“training data”) and see how well it performs against a validation set. Furthermore, it’s a way to think probabilistically about a very complex system: human decision-making. This all seems like a better way of thinking about the problem than the above approaches.
Admittedly, I’m a little underpowered to discuss the math here. But this illustration provided by the authors helps me stumble through it.
An HMM is an attempt to model the customer as they move from an unaware state (State 1) to a purchase state (State 3). At each step, there is some probability the customer stays in the current state, moves to the next state, or moves back a state. Marketing activities are attempts to move a customer toward State 3 and should increase the likelihood of moving forward.
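A toy version of the transition-probability idea, with three funnel states and made-up numbers (these are illustrative guesses, not the paper’s estimates): an ad exposure swaps in a transition matrix with higher forward probabilities.

```python
# Toy three-state funnel: index 0 = unaware, 1 = considering, 2 = purchased.
# Rows are the current state, columns the next state; each row sums to 1.
BASE = [
    [0.90, 0.10, 0.00],  # unaware: mostly stays unaware
    [0.20, 0.70, 0.10],  # considering: can regress, stall, or buy
    [0.00, 0.00, 1.00],  # purchased: absorbing state
]
WITH_AD = [
    [0.75, 0.25, 0.00],  # an ad exposure raises the odds of moving forward
    [0.10, 0.65, 0.25],
    [0.00, 0.00, 1.00],
]

def step(dist, matrix):
    """One period of the Markov chain: row vector `dist` times `matrix`."""
    return [sum(dist[i] * matrix[i][j] for i in range(3)) for j in range(3)]

dist = [1.0, 0.0, 0.0]                     # everyone starts unaware
for exposed in [True, False, True, True]:  # a hypothetical exposure sequence
    dist = step(dist, WITH_AD if exposed else BASE)

print(f"P(purchased) after 4 periods: {dist[2]:.3f}")  # 0.173
```

The “hidden” part of an HMM is that states 1 and 2 aren’t directly observed; the model infers them from observable signals (clicks, visits, purchases). Fitting those transition probabilities from data, rather than asserting them, is the hard part the authors tackle.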
The authors then layer on some more features to the HMM and validate it against other attempts at attribution. You can see that their “full-model” is much stronger than other attempts at attribution. Look at how poorly LTA performs.
One of their findings is that 42% of online purchases of the car model came from non-advertising growth activities.
So yay marketing attribution is solved, right?!
Marketing attribution is still mostly suss
Returning to the earlier analogy of baseball analytics: there are probably tens of thousands of baseball teams in the US across all levels. Out of all of those, there are probably at most 30 teams that take analytics seriously. That means 99.99%+ of baseball organizations rely on managerial genius (or gut, or luck, or whatever) to make personnel decisions.
It’s likely the same when it comes to marketing attribution. The above HMM attempt is hard! I can barely explain it, let alone code or implement it. Inevitably, some machine learning engineer is probably going to say that running an LSTM or something even fancier would be better.
That’s cool, but how many marketing orgs are working with MLEs and data scientists capable of this level of modeling? It’s probably like baseball: 99.99%+ of them aren’t.
The paper by Abhishek et al. also shows that LTA overstated online ads’ contribution by 88%. Yikes! Imagine telling a marketing org that you’re going to reduce the credit their online ads get by 88%. Do you think you’d have buy-in to switch from LTA to HMM? Your mileage may vary! Consider the structural incentives marketers have.
Yes, Pedram is mostly correct. Marketing attribution remains mostly suss, and might not even be the right thing to focus on when optimizing ad spend. But there are ways to account for the common critiques of attribution with attempts at probabilistic modeling like HMM.
Abhishek, Vibhanshu, Peter Fader, and Kartik Hosanagar. “Media Exposure through the Funnel: A Model of Multi-Stage Attribution” (August 17, 2012). Available at SSRN: https://ssrn.com/abstract=2158421 or http://dx.doi.org/10.2139/ssrn.2158421