Chris Anderson has gotten a lot of mileage out of a concept he refers to as the long tail. The long tail comprises the numerous products in a category (think songs, book titles, search words) that sell in small numbers. The products in Anderson’s long tail are at the opposite end of the spectrum from the rare blockbuster hit. If the top 20% of products account for 80% of sales, then the remaining 80% of products make up Anderson’s long tail. The idea is that when everything is digital and networked, the long tail will get longer and become profitable to serve.
My beef is not so much with the concept as with the name. In order for the numerous, small-quantity products to fall in the tail of a curve, you have to draw the curve like the one above (a replication of the chart on the jacket of Anderson's book). This is a chart of the sales of each product after sorting the products from highest to lowest selling.
And although you can draw the curve this way, it is clearly not the usual way to show the distribution of unit sales across products. No, the usual way would be to construct a histogram or frequency chart. The Y-axis would be the number of products that had sales in a given range. The X-axis would be the number of sales.
That chart would look like the one below:
(You can see that all I’ve done is switch the labels of the two axes.)
Both charts describe a distribution of sales across products in which a few products sell a lot and lots of products sell a little. The long tail in the top graph reflects the numerous products that sold in increasingly smaller amounts. In the bottom chart, those same products occupy the tall head: the large number of products with a low number of sales. The bottom chart counts the number of products at each level of sales, while the top chart counts the number of sales for each product, with products ranked by sales.
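To make the two views concrete, here is a minimal Python sketch that builds both from the same data. The sales figures are made up for illustration, and the function names `rank_view` and `frequency_view` are my own labels, not anything from Anderson's book.

```python
from collections import Counter

# Hypothetical unit sales for ten products (made-up numbers).
sales = [1, 1, 1, 1, 2, 2, 3, 5, 20, 100]


def rank_view(sales):
    """Top chart: unit sales of each product, sorted from best to worst seller."""
    return sorted(sales, reverse=True)


def frequency_view(sales):
    """Bottom chart: (sales level, number of products at that level) pairs."""
    return sorted(Counter(sales).items())


print(rank_view(sales))       # the slow sellers trail off at the end of the list
print(frequency_view(sales))  # the same slow sellers pile up at the low sales levels
```

In the rank view the four products that sold a single unit sit at the far end of the list. In the frequency view those same four products form the tallest count, at a sales level of 1.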
Now, neither chart is right and the other wrong. But the bottom chart is more typical. It is the probability distribution of unit sales for a randomly chosen product. There is a large probability the product will sell a small number of units and a very low probability the product will be a hit and sell a high number of units.
To see that the bottom curve is more typical, simply ask yourself whether the original 1977 Star Wars movie belongs in the head or the tail of the distribution of all movies.
Clearly, most regular folk (and all statisticians) would say that Star Wars belongs in the tail. It was unusual in its popularity. Unusual events are in the tail (in this case, the upper tail) of a distribution. Just like really smart, tall, or athletic individuals, blockbuster hits are several standard deviations above the mean. They are way out in the tail. If all this makes sense to you, then you intuitively use the bottom curve when talking about distributions.
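The "standard deviations above the mean" idea can be checked with a quick sketch. It assumes a made-up set of ten products and uses the population standard deviation. The helper name `z_score` is mine.

```python
import statistics

# Hypothetical unit sales for ten products (made-up numbers).
sales = [1, 1, 1, 1, 2, 2, 3, 5, 20, 100]

mean = statistics.mean(sales)
spread = statistics.pstdev(sales)  # population standard deviation


def z_score(units):
    """How many standard deviations a product's sales sit from the mean."""
    return (units - mean) / spread


print(z_score(100))  # the hit: far above the mean, out in the upper tail
print(z_score(1))    # a typical slow seller: close to the mean, in the head
```

With these numbers the hit lands more than two standard deviations above the mean, while the slow sellers sit within one standard deviation of it.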
And if you use the bottom curve, Anderson’s long tail is actually a tall head.
If we use the bottom probability-distribution curve to make Anderson’s main point, the chart might look like this:
For physical products with scarce physical distribution, the really small ones never make it to market, creating the short, almost truncated, left tail in blue. Anderson’s main point is that if products and distribution are digital, the economics allow for all products to be sold. The short left tail for physical goods becomes a tall head when things are digital. When things are digital, all products (not just the popular ones) reach the consumers, and what we find are lots and lots of products that sell in very small quantities. You can think of the red curve above as the natural extension of the blue curve once all products see the light of day. Digital economics adds a tall red head to the right tail of the probability distribution of sales.
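Here is a rough sketch of that truncation, under the assumption that a physical retailer stocks only products that clear some minimum sales level. The cutoff `MIN_UNITS_TO_STOCK` and the sales figures are invented for illustration.

```python
from collections import Counter

# Hypothetical unit sales for ten products (made-up numbers).
sales = [1, 1, 1, 1, 2, 2, 3, 5, 20, 100]

# Assumed cutoff: a physical store stocks only products selling at least
# this many units. The value 3 is arbitrary.
MIN_UNITS_TO_STOCK = 3


def products_per_sales_level(sales):
    """(sales level, number of products at that level) pairs, lowest level first."""
    return sorted(Counter(sales).items())


physical = [s for s in sales if s >= MIN_UNITS_TO_STOCK]  # blue curve: truncated
digital = list(sales)                                     # red curve: everything sells

print(products_per_sales_level(physical))
print(products_per_sales_level(digital))
```

The blockbuster is the same in both lists. What the digital list adds is the pile of products at the lowest sales levels, which is the tall head.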
But the blockbusters are in the tail, and Anderson’s long-tail items are in the tall head.
This post guest-blogged by Prof. Phil Pfeifer of UVA's Darden School.