
There are three kinds of lies: lies, damned lies, and statistics. — Benjamin Disraeli

I nearly choked on my Sunday morning bagel reading an article in yesterday’s Washington Post titled “State of the Household”.

Staff writer Neil Irwin wanted to show how macroeconomic trends are affecting the average American household’s financial statement.

[Image: financial statement of the average American household]

The key word here is “average.”

“Average” often isn’t the same as “typical”, even though the article’s subhead — “Translating Big Economic Trends Into Something a Real Family Might Face”, emphasis mine — suggests they are one and the same.

According to the Post, here are some facts about the average U.S. household, along with my comments in italics:

  • Income: Household income of $55k.

    High but plausible. Median US household income in 2006 was $48k, so stating $55k for “typical” income is high, but in the ballpark.

  • Income: Interest income of $10k.

    Huh? Assuming a 5% money market yield, that would mean having $200k in cash lying around (!!!). No, that isn’t typical.

  • Asset: Stocks and mutual funds totaling $96k

    Nope, not typical.

  • Asset: Business equity of $70k

    With our national population around 303 million and roughly 27 million businesses, fewer than one in ten Americans owns a business. Not typical.

  • Expense: Heating oil costs of $205

    When I lived in New England, our heating bills were at least five-fold that. So not typical for heating oil users. Now I live in Virginia, and heat with a heat pump. So now my oil costs are exactly zero. So not typical for non-heating-oil-users, either.
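
The back-of-envelope checks in the comments above can be reproduced in a few lines. This is a rough sketch, not a rigorous analysis: the 5% yield is the assumption stated above, and treating each business as having exactly one owner is a simplification added here.

```python
# Back-of-envelope checks on the Post's "average household" figures.
interest_income = 10_000      # dollars per year, from the article
assumed_yield = 0.05          # 5% money market yield (the assumption made in the text)

# Cash needed to generate that much interest at the assumed yield.
implied_cash = interest_income / assumed_yield

population = 303_000_000      # approximate U.S. population cited in the text
businesses = 27_000_000       # approximate number of U.S. businesses cited in the text

# Rough proxy for the ownership rate, assuming one owner per business
# (a simplification, not a figure from the article).
businesses_per_person = businesses / population

print(f"Implied cash: ${implied_cash:,.0f}")
print(f"Businesses per person: {businesses_per_person:.1%}")
```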

I am sure these averages are correct, but they’re also utterly meaningless.

Averages do a lousy job of characterizing skewed distributions, as averages are very sensitive to outliers. Bill Gates, Warren Buffett, et al. play a large role in moving those “typical” interest income numbers reported above.

And averages might not be the best characterization of a population. As Arnie Barnett, one of my doctoral advisors, used to say, “The average (the centroid) of a donut is right in the middle, exactly right where there is no donut.”
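
The donut remark is easy to check numerically. The sketch below is only an illustration: it models the donut as 360 points spaced evenly around a ring of radius 1 (both choices are arbitrary), averages the coordinates, and measures how far the nearest actual point is from that average.

```python
import math

# Place points evenly around a ring of radius 1 (the "donut").
n = 360
points = [
    (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
    for k in range(n)
]

# The centroid is the coordinate-wise average of the points.
cx = sum(x for x, _ in points) / n
cy = sum(y for _, y in points) / n

# Distance from the centroid to the nearest actual point on the ring.
nearest = min(math.hypot(x - cx, y - cy) for x, y in points)

print(f"centroid is within {math.hypot(cx, cy):.1e} of the origin")
print(f"nearest point is {nearest:.3f} away from the centroid")
```

The centroid lands at the center of the ring (up to floating-point error), a full radius away from every point it is supposed to summarize.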


Averages are popular because they have great theoretical properties, and thus form the foundation of basic and advanced statistics.

Medians often perform better at characterizing a distribution. (The median of a distribution or sample is the value at which at most half the population is above and at most half is below.)
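
To see how differently the two statistics behave on skewed data, here is a small sketch using Python's standard `statistics` module. The ten household figures are invented for illustration and are not taken from the Post article; they are chosen so that one very wealthy household dominates the total.

```python
from statistics import mean, median

# Hypothetical annual interest income for ten households, in dollars.
# These numbers are made up for illustration, not taken from the article.
households = [0, 0, 20, 50, 80, 120, 200, 350, 600, 98_580]

print(f"mean:   ${mean(households):,.0f}")
print(f"median: ${median(households):,.0f}")
```

With these made-up numbers the mean is $10,000 while the median is $100: a single outlier sets the average, and nine of the ten households look nothing like it.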

OK, so averages can deceive. What’s the relevance to paid search?

In an earlier post on calculating optimal PPC bids (Computing Optimal Pay-Per-Click Bids In 19 Easy Steps), I gave an example along these lines:

“Assuming an average click to your site generates an average sales of $7.88, and assuming your margin and financial goals dictate an advertising-to-sales ratio of 28.6%, then you should bid on average 7.88 x .286 = $2.25 per click.” (post)
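
The arithmetic in that example is simply sales per click multiplied by the target advertising-to-sales ratio. A minimal sketch, where the function name `max_bid` is a hypothetical helper introduced here and not something from the earlier post:

```python
def max_bid(sales_per_click: float, ad_to_sales_ratio: float) -> float:
    """Bid per click = expected sales per click x target advertising-to-sales ratio."""
    return sales_per_click * ad_to_sales_ratio


# Figures from the quoted example: $7.88 sales per click, 28.6% A/S target.
bid = max_bid(7.88, 0.286)
print(f"${bid:.2f}")
```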

In the footnotes, I point out that

In real life, conversion and SPC are the most important thing NOT to suppose. Averages and assumptions will kill you here. You need good data and careful statistics. Also, in real life, AOV and conversion vary widely by term and engine.

and

Correctly estimating conversion, AOV, and SPC on medium-traffic “body” terms and on very low volume “tail” terms is very, very important. At our firm that’s one of the most important ingredients to our secret sauce. Correctly estimating low-probability events has been an interest of mine since my doctoral research on the topic at MIT. (post)

The take-away?

In PPC bidding as in newspaper columns, use caution when using averages to characterize highly skewed or highly dispersed distributions.
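
A small sketch of what can go wrong: the keywords and sales-per-click (SPC) values below are hypothetical, chosen so that their simple average matches the $7.88 figure from the earlier example. It compares a single average-based bid with per-term bids. It ignores click volume and the gap between bid and actual cost per click, so treat it as an illustration only.

```python
# Hypothetical sales-per-click (SPC) by keyword; values are illustrative only.
spc_by_term = {
    "brand name": 24.00,
    "blue widgets": 6.00,
    "widgets": 1.50,
    "free widgets": 0.02,
}
target_ratio = 0.286  # advertising-to-sales ratio from the quoted example

# One flat bid derived from the unweighted average SPC across terms.
avg_spc = sum(spc_by_term.values()) / len(spc_by_term)
flat_bid = avg_spc * target_ratio

for term, spc in spc_by_term.items():
    term_bid = spc * target_ratio
    # Advertising-to-sales ratio we would actually run at if we paid
    # the flat, average-based bid on this term.
    effective_ratio = flat_bid / spc
    print(
        f"{term:14s} per-term bid ${term_bid:5.2f}   "
        f"flat bid ${flat_bid:.2f}   A/S at flat bid {effective_ratio:.0%}"
    )
```

Under these assumptions the flat bid badly underbids the strongest term and pays more per click than the weakest terms bring in per click, even though it is "right on average".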

Trackback

http://www.rimmkaufman.com/rkgblog/2007/12/17/averages-can-deceive/trackback/

Comments

  1. Andrew, December 19, 2007:

    Another interesting Wikipedia article concerns Anscombe’s quartet, the group of four sets of data which all share the same mean and regression line, but which are in reality very different (which becomes clear when graphed).
