I enjoy stumbling onto new things, and so changed my default Firefox homepage from Google’s personalized homepage to Yahoo’s redirect to a random URL (random.yahoo.com/bin/ryl) just to shake things up. After randomly hitting content spam pages (made-for-AdSense, or MFA) a few times when opening the browser in the morning, I began to wonder about their prevalence. After all, the web is a huge haystack, and those bogus pages must be occasional needles, right?

Curious, I tried 50 random pages from random.yahoo.com/bin/ryl. I’m assuming (big assumption) that Y! isn’t filtering all that much, save for language. That’s my guess because (a) all the results were in English, (b) three of the 50 were broken links, and (c) three of the 50 were porn sites.

Of the 50, four were clearly junk pages solely designed to generate search revenue. These four URLs were all concatenations of two common dictionary words which didn’t make much sense together, clearly suggesting they were purchased by a bot. (The most amusing of the four was dochunter.com, which can’t seem to decide if the page is about hunting moose, choosing an MD, or, gasp, hunting doctors.)

This survey is decidedly unscientific, is based on a tiny sample, and depends critically on the randomness of random.yahoo.com/bin/ryl, which isn’t known.

But still, 4 in 50 is 8%, which is amazingly high, in my opinion. The web is well over 11.5 billion pages (and that estimate is over 18 months stale); 8% of 11.5 billion is over 900 million junk pages.

Even if this estimate is off on the high side by an order of magnitude, that still suggests roughly 90 million bogus content pages siphoning value from advertisers to spammers. Scary.
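For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate. The point estimate uses only the numbers above (4 junk pages in 50, 11.5 billion pages). The Wilson score interval is my own addition, not part of the original tally, and it assumes the 50 pages are a truly random sample, which (as noted) isn't known. The function names are made up for illustration.

```python
import math


def point_estimate_pages(junk, sample, total_pages):
    """Scale the junk share of the sample up to the whole web (integer math)."""
    return total_pages * junk // sample


def wilson_interval(junk, sample, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage.

    Assumes independent random draws, which is a big assumption here.
    """
    p_hat = junk / sample
    z2 = z * z
    denom = 1 + z2 / sample
    center = (p_hat + z2 / (2 * sample)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / sample + z2 / (4 * sample * sample))
    ) / denom
    return center - half_width, center + half_width


JUNK = 4                      # junk pages seen in the sample
SAMPLE = 50                   # pages sampled
TOTAL_PAGES = 11_500_000_000  # the (stale) web size estimate quoted above

estimate = point_estimate_pages(JUNK, SAMPLE, TOTAL_PAGES)
low, high = wilson_interval(JUNK, SAMPLE)

print(f"Point estimate: {estimate:,} junk pages")  # 920,000,000
print(f"Plausible junk share: {low:.1%} to {high:.1%}")
print(f"Plausible junk pages: {low * TOTAL_PAGES:,.0f} to {high * TOTAL_PAGES:,.0f}")
```

With a sample this small the interval is wide, roughly 3% to 19%, so the 8% figure is best read as a ballpark rather than a measurement.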

Trackback

http://www.rimmkaufman.com/rkgblog/2006/07/03/content-spam-at-8/trackback/

Blogs Citing This Post

  1. Pingback: Quack, Quack: Made-For-AdSense Spam on January 25, 2008