Nov 9, 2007

"Increase AdWords Conversions 300%!" Errr, really?

Cruising around the blogosphere, I spotted this intriguing headline:

Learn what produced a massive 300% boost in conversions

and so I clicked over to check out the post at Mind Valley Labs.

They describe a 300% conversion increase from an AdWords copy test. Here's their data:

[Image: Mind Valley Labs AdWords test results]

Converting their conversion rates to conversion counts, Mind Valley reports their "40 tactics" copy drove 9 conversion events in 150 clicks (6%), versus their control copy, with 2 conversions in 130 clicks (1.5%).

We here at RKG are huge fans of testing. (Shameless plug: our consulting division now offers conversion improvement projects using Google's formidable and free MVT tool.) Winning at online marketing requires regular testing of keywords, bid strategies, destination URLs, and landing page design. But, when testing, you always need to keep a careful eye on statistical significance.

So, is 9/150 all that different from 2/130?

Using a standard test of binomial proportions, we get a p-value of 0.053. Using a 2x2 chi-squared test for independence (with Yates' continuity correction), we get a p-value of 0.108.
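For readers who want to check these figures themselves, here's a minimal sketch of both tests using only the Python standard library (the function names are mine; exact p-values will differ slightly from the post's depending on pooled vs. unpooled variance):

```python
from math import sqrt, erfc

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided pooled z-test for the difference of two binomial proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    return erfc(z / sqrt(2))  # two-sided p-value from the normal tail

def chi2_yates_p(a, b, c, d):
    """2x2 chi-squared test with Yates' continuity correction (1 df)."""
    n = a + b + c + d
    chi2 = n * (abs(a * d - b * c) - n / 2) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d))
    return erfc(sqrt(chi2 / 2))  # survival function of chi-squared, 1 df

# Mind Valley data: 9 conversions in 150 clicks vs. 2 in 130
p_z = two_proportion_p(9, 150, 2, 130)
p_chi = chi2_yates_p(9, 141, 2, 128)
print(round(p_z, 3), round(p_chi, 3))  # roughly 0.055 and 0.108
```

Either way, both p-values sit above the conventional 0.05 cutoff.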

Strictly speaking, at the 95% confidence level, no, 9/150 isn't statistically different from 2/130.

But let's not get hung up on precise p-value cutoffs. The real issue here is that the test is simply way too small.

Different experts give different rules of thumb for sizing tests. In the catalog industry, I was taught the rule that each cell (each mailing list, each catalog version, whatever) needed at least 100 responses (conversions) before you could say much about its observed response rate with any certainty. Doing some quick web searching, some folks agree with this 100 rule (Gordon Bell, Jonathan Mendez), while others suggest 25 to 50 (Jonathan Miller).

Having long since traded my catalog marketing beret for a web marketing propeller beanie, I'll concur with the "25 to 50 conversions" rule of thumb. Why? Unlike the 4-mailings-a-year catalog industry where a bad mistake can sink your ship, the web is more forgiving of mistakes. Even if you roll out the incorrect version after a web test, no worries; once discovered, you can swap in a better version tomorrow. You don't need as much data to act because the price of a mistake is so much smaller. (Yet another opportunity to hat tip to Mike Moran's new book and great mantra: do it wrong quickly.)

Let's be permissive and take the lower end of the "25 to 50 conversions" rule-of-thumb range. Applying that to the Mind Valley data, I'd suggest waiting about 12 times longer -- about 1800 clicks, or 180,000 impressions -- before running the victory flag up the flagpole.
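The arithmetic behind that rule of thumb is simple: divide the target conversion count by the observed control conversion rate. A quick sketch (my own back-of-the-envelope framing, not the post's exact math):

```python
# Rough sizing: clicks needed per cell so the control cell reaches a
# target conversion count, given its observed conversion rate.
def clicks_needed(target_conversions, conv_rate):
    return target_conversions / conv_rate

control_rate = 2 / 130  # observed control rate, ~1.5%
for target in (25, 50, 100):
    print(f"{target} conversions -> ~{round(clicks_needed(target, control_rate))} clicks per cell")
```

At 25 conversions that works out to roughly 1600-1800 clicks per cell, which is where the "wait about 12 times longer" figure comes from.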

If forced to choose today, of course I'd go with the current winner, the "40 Tactics" copy version. But I'd place practically zero faith in that preference.

In fact, I'm comfortable wagering that this initial 300% lift degrades to nothing -- or at best, to a single digit percentage difference -- within two weeks. Mind Valley, up for a friendly lunch bet?

For anyone interested in marketing statistical significance, we provide a free calculator for sizing marketing test cells. Also, I'd highly recommend Gordon's excellent page on calculating sample size. Math warning: it may take you a bit of time to work through Gordon's page, but it will be time well spent. As an aside, one nice thing about modern web tools (like Google's Website Optimizer MVT tool) is that all these significance calcs are baked in, with significance levels presented visually.

Leaving the stats behind and returning to the AdWords angle: we here at RKG believe copy testing is usually a third-order effect. After literally hundreds of formal copy tests, we've nearly always found copy to be a far smaller performance driver than keywords, bidding, and match types. (This assumes your copy is decent -- you probably can achieve meaningful lifts if your control is weak to begin with.)

Perhaps I can cajole my colleague George Michie into expanding his thoughts about the value of PPC copy testing -- when it makes sense and when it doesn't. Hint: we typically expect and obtain meaningful lifts on copy tests related to free shipping, sales, and alternate payment methods. And I'll mention here an SEL article I wrote in August on Eight Essentials For Crafting Killer Paid Search Ad Copy.

Test fast, test often -- but do keep one eye on significance levels.



10 Responses to ""Increase AdWords Conversions 300%!" Errr, really?"
Alan -- Thanks for the mention and for calling attention to this. This isn't the only case study Mind Valley uses that has insignificant data sets. I commented on this very same topic on their blog last year. It's unfortunate (and ironic) that they promote testing in a manner that renders their own conclusions invalid. Cheers, Jonathan
Dave Davis says:
Smashing "call out". In fact, if you look at a lot of their numbers they are the same. You will also notice that they are all in the same industry. Their sample size is way too small and in a way, it is misleading. We go by the 30 conversion rule in general but for some clients in some markets AND GEOGRAPHIC LOCATIONS it is significantly larger. "you probably can achieve meaningful lifts if your control is weak to begin with" - You couldn't have said it better. In most cases novice advertisers create their ads themselves. They know their product inside-out. They fail to realize that their customer doesn't.
Ken Savage says:
I don't see how just changing the two by-lines would convert so much better. Am I missing something here? Take away the science of it all for a minute and there's no difference in the two ads as someone quickly browses over the right-side paid ads. Both the headline and URL are the same. Only the phrase '40 more proven marketing tactics' stands out to me. But hey, if multivariate testing proves this fact, it must be better. We humans are a strange bunch, ain't we?
Hi Ken -- I don't think there's any "proof" here, and certainly no MVT -- just a too-small sample size! :) Cheers Alan
Great post! There are far too many 'expert' companies out there not engaging in accurate testing. Sure, optimizing your titles, headlines, display URLs, etc. can have a dramatic effect on click-through and conversion, both negatively and positively, but folks often get too impatient. It is imperative when running any testing to allow for an accurate sample size! The 'sample size' is debatable, but I typically shoot for 500 - 1,000 click-throughs depending on the industry.
I enjoyed reading your article(s). It appears the common thread that both you and Mind Valley agree on is that continuous, intelligent testing and knowing how to tweak is not a precondition to successful results but an ongoing requirement of any successful business website, without exception. Thank you for your insights.
Richard Kraneis says:
Alan, I can't even remember how I found this article, but I can say I have added your blog to my favorites... As for Mind Valley Labs and their claim of increased conversions: 1) Their CTR indicates to me that this is a Google content network ad. Very low CTR and, in my testing, lower conversion rates. I think MVL should have disclosed this; they didn't. 2) It would be helpful to know what Mind Valley Labs considers a "conversion". When you go to their web page currently, the only conversion possible that I see is a free newsletter signup. So they're testing a "free" newsletter? Again, touting one's increased conversion rates on a free signup is a little tacky. I'm sure MVL thought they had a winner in this advertisement. Perhaps it worked for them. But it's nice to know there are blogs and consultants like your group that set the record straight. Thanks for the fine blog. Richard Kraneis
I am not convinced that it's OK to make mistakes on the web. Your rationale for reducing a response size from 100 to 25 sounds reasonable on its face. However, mistakes are not so easily corrected. Once a test is completed, it is very unlikely that it will be corrected. More likely, the error will be declared a fact (having been tested) and a subpar version of the campaign will be carried on. Rerunning tests is unlikely to happen in most organizations. Confidence is confidence. Once a low hurdle for confidence is established, it will become the norm to accept all kinds of error-prone conclusions just because it is expedient. It's best to "do it right the first time." This is a case where errors are rarely discovered after the fact. You don't know you are using a bad version; the wheels don't come off the campaign.


Check out what others are saying...
[...] Should the MarketingSherpa’s guide to landing pages use an effective landing page? Should a company touting the value of statistics use statistically relevant datasets? [...]
[...] RKG on finding statistical significance in two Adwords tests  Interesting commentary on the value (or lack-of-value) in copy testing in PPC Avinash Kaushik on separating signal from noise with statistical significance [...]

Leave A Comment