Incrementality Testing vs. MMM: You Probably Need Both

Jan 10, 2026 · 9 min read

"Should we run an incrementality test or use MMM?"

Wrong question. The right question is "how should we use them together?" They solve different problems.

What Incrementality Testing Actually Is

Incrementality testing is an experiment. You deliberately turn off (or scale down) marketing in one group and keep it running in another group, then measure the difference in outcomes.

The most common format is a geo-lift test (also called a geo experiment or matched market test). Pick 5-10 markets where you run ads normally (control) and 5-10 similar markets where you turn ads off or increase spend (treatment). After 4-8 weeks, compare how revenue changed in each group. The gap between the two is your estimate of what the ads actually caused.
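As a rough illustration of that comparison, here is a minimal difference-in-differences sketch. The function name and every revenue figure are made up for the example, and a real geo-lift analysis would also report uncertainty around the estimate.

```python
from statistics import mean

def geo_lift(control_pre, control_test, treatment_pre, treatment_test):
    """Difference-in-differences estimate of lift, per market per week.

    Each argument is a list of average weekly revenue values, one per market.
    """
    control_change = mean(control_test) - mean(control_pre)
    treatment_change = mean(treatment_test) - mean(treatment_pre)
    return treatment_change - control_change

# Hypothetical figures: ads keep running in control, ads are off in treatment.
control_pre = [100.0, 120.0, 90.0]
control_test = [105.0, 126.0, 93.0]
treatment_pre = [98.0, 118.0, 94.0]
treatment_test = [90.0, 108.0, 87.0]

lift = geo_lift(control_pre, control_test, treatment_pre, treatment_test)
print(f"Estimated effect of turning ads off: {lift:.1f} per market per week")
```

A negative number here means the markets without ads fell behind the control markets, which is the revenue the ads were driving.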

There are also holdout tests (a randomly selected group of users is withheld from seeing ads on a specific platform) and ghost ads (the platform logs when a held-out user would have been shown your ad, serves them something else instead, and compares their behavior with users who saw the real ad). Each has tradeoffs.

What MMM Does Differently

MMM is observational, not experimental. It looks at historical data and uses statistical models to estimate what each channel contributed. You do not need to turn anything off. It measures all channels simultaneously.

The tradeoff: because it is observational, MMM can confuse correlation with causation if not properly specified. If you always increase spend in Q4 and Q4 always has higher revenue (because of holidays, not your ads), a naive model might over-credit marketing.

Good MMM implementations (like Google Meridian) control for seasonality, trends, and external factors to mitigate this. But they can still have blind spots.
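To make the Q4 problem concrete, here is a small toy simulation, not how Meridian or any real MMM works. Revenue is constructed so that each unit of spend is worth exactly 2 and a holiday week adds a flat 50. All numbers and helper names are assumptions for illustration.

```python
def ols_slope(x, y):
    """Slope of a one-variable least-squares fit of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def demean_by_group(values, groups):
    """Subtract each group's mean, which removes a flat group-level effect."""
    means = {
        g: sum(v for v, k in zip(values, groups) if k == g) / groups.count(g)
        for g in set(groups)
    }
    return [v - means[g] for v, g in zip(values, groups)]

# Hypothetical weekly data: spend is always higher in holiday weeks.
spend = [10, 12, 11, 13, 30, 32, 31, 33]
holiday = [0, 0, 0, 0, 1, 1, 1, 1]
revenue = [100 + 2 * s + 50 * h for s, h in zip(spend, holiday)]

naive = ols_slope(spend, revenue)  # ignores the holiday effect
controlled = ols_slope(
    demean_by_group(spend, holiday), demean_by_group(revenue, holiday)
)  # compares weeks only within the same season

print(f"naive: {naive:.2f}, controlled: {controlled:.2f}")
```

The naive fit credits spend with roughly 4.5 per unit because it absorbs the holiday bump. Once the holiday effect is controlled for, the estimate returns to the true value of 2.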

When to Use Which

	Incrementality Test	MMM
Question	"Does this specific channel cause incremental revenue?"	"How should I split budget across all channels?"
Scope	One channel at a time	All channels simultaneously
Revenue cost	You lose revenue in holdout markets	No revenue sacrifice
Timeline	4-8 weeks per test	Ongoing (model updates monthly or quarterly)
Confidence	High (causal)	Moderate (observational)
Cost to run	$5K-$50K in lost revenue + setup	$1K-$3K/mo (self-serve)

How They Work Together

The best measurement stack uses both. Here is how:

Step 1: Run MMM as your always-on measurement layer. It covers all channels and gives you a baseline understanding of performance.

Step 2: Identify channels where the MMM has wide uncertainty. If TikTok ROAS has a credible interval of 0.5x to 4.8x, the model is not confident. That is where you run an incrementality test.
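A minimal sketch of that triage step: rank channels by the width of their ROAS credible interval. The TikTok interval is the one from the example above. The other channels and their intervals are hypothetical, and the function name is our own.

```python
def rank_by_uncertainty(intervals):
    """Return channel names sorted from widest to narrowest credible interval."""
    return sorted(
        intervals,
        key=lambda channel: intervals[channel][1] - intervals[channel][0],
        reverse=True,
    )

# (lower, upper) bounds of the ROAS credible interval per channel.
roas_intervals = {
    "Search": (2.6, 3.4),   # hypothetical
    "TikTok": (0.5, 4.8),   # from the example above
    "Meta": (1.4, 2.9),     # hypothetical
}

test_candidates = rank_by_uncertainty(roas_intervals)
print(test_candidates)
```

In practice you might weight width by spend, since a wide interval on a tiny channel matters less than a moderately wide one on your biggest line item.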

Step 3: Feed incrementality results back into the MMM as calibration priors. If your geo-lift test shows TikTok ROAS is 2.1x, use that to set a Bayesian prior for the channel, centered near 2.1x and as wide as the test's own uncertainty. The MMM results tend to get more accurate with each calibration.
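To show why a test result tightens the estimate, here is a simplified sketch that treats both the MMM estimate and the test result as normal distributions and combines them by precision weighting. This is a textbook conjugate update, not the calibration routine of any particular MMM tool. The standard deviations are assumptions: 1.1 roughly matches a 0.5x to 4.8x interval, and 0.3 is a made-up test error.

```python
from math import sqrt

def combine_normal(mean_a, sd_a, mean_b, sd_b):
    """Precision-weighted combination of two independent normal estimates."""
    precision_a, precision_b = 1 / sd_a**2, 1 / sd_b**2
    total = precision_a + precision_b
    mean = (mean_a * precision_a + mean_b * precision_b) / total
    return mean, sqrt(1 / total)

mmm_mean, mmm_sd = 2.65, 1.1    # midpoint of 0.5x-4.8x, assumed spread
test_mean, test_sd = 2.1, 0.3   # geo-lift result, assumed spread

post_mean, post_sd = combine_normal(mmm_mean, mmm_sd, test_mean, test_sd)
print(f"calibrated ROAS: {post_mean:.2f} +/- {post_sd:.2f}")
```

Because the test is far more precise than the uncalibrated model, the combined estimate lands close to the test result and ends up tighter than either input.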

This creates a feedback loop. MMM tells you where to test. Incrementality tests make the MMM better. Over time, your measurement gets progressively more reliable.

Practical Setup

For geo-lift tests, you need geographic markets that are similar in size and demographics. In the US, common pairs include Denver/Portland, Austin/Nashville, or Minneapolis/Columbus. You need at least 3-5 markets per group for statistical power.
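One simple way to screen candidate pairs is to check how closely their revenue moved together before the test. The sketch below scores every pair of markets by the correlation of their pre-test weekly revenue. The market names and figures are placeholders, and correlation alone ignores size and demographics, so treat it as a first filter only.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )

# Hypothetical pre-test weekly revenue per market.
weekly_revenue = {
    "market_a": [100, 110, 105, 120, 115],
    "market_b": [98, 109, 104, 118, 116],
    "market_c": [200, 190, 210, 195, 205],
}

# Pick the pair whose pre-test revenue series track each other most closely.
best_pair = max(
    combinations(weekly_revenue, 2),
    key=lambda pair: pearson(weekly_revenue[pair[0]], weekly_revenue[pair[1]]),
)
print(best_pair)
```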

Run each test for a minimum of 4 weeks, ideally 6-8. Shorter tests lack statistical power. Longer tests mean more revenue sacrifice in holdout markets. It is a balance.
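The duration tradeoff can be sketched with a back-of-the-envelope power calculation. This assumes weekly revenue per market is independent with a known standard deviation, which real data rarely satisfies, so read the output as a direction rather than a forecast. The function name and all inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def minimum_detectable_effect(weekly_sd, markets_per_group, weeks,
                              alpha=0.05, power=0.8):
    """Smallest weekly revenue difference per market a two-group test can detect."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Standard error of the difference between two group means.
    standard_error = weekly_sd * sqrt(2 / (markets_per_group * weeks))
    return z * standard_error

# Hypothetical inputs: $20K weekly standard deviation, 5 markets per group.
mde_by_weeks = {
    weeks: minimum_detectable_effect(20_000, 5, weeks) for weeks in (2, 4, 6, 8)
}
for weeks, mde in mde_by_weeks.items():
    print(f"{weeks} weeks: detectable effect of about ${mde:,.0f} per market per week")
```

Under these assumptions, doubling the test length shrinks the detectable effect by a factor of about 1.4, not 2, which is why longer tests have diminishing returns against the revenue you give up.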

Budget for 1-2 incrementality tests per year on your most uncertain or highest-spend channels. Use MMM for everything else.

With Spendmix, your MMM runs continuously and flags which channels have the widest uncertainty, so you know exactly where to focus your next incrementality test. See the uncertainty visualization in our demo report.

Ready to see what your budget is actually doing?

Start your free trial. Upload your data, get your first MMM report in days, not months.
