THE RKGBLOG

O’Reilly: Data is the Intel Inside

A somewhat stale but interesting post from Tim O’Reilly on Google’s free 411 service.

Tim suggests the angle isn’t advertising but data collection. Emphasis mine:

There’s a hidden story here about the speech recognition itself… speech recognition took a huge leap in capability when automated speech recognition started being used for directory assistance. All of a sudden, there were millions of voices, millions of accents to train speech recognition systems on, and much less need for the individual user to train the system.

This is reminiscent of a comment that Peter Norvig, Director of Research at Google, made to me last year about automated translation, and why it’s getting better. “We don’t have better algorithms. We just have more data.”

In short, I’m speculating that the 1-800-GOOG-411 service is designed to harvest voice data to build Google’s own speech database, rather than licensing from Nuance or another player.

If I’m right about this, we see here another demonstration of my Web 2.0 principle that “data is the Intel Inside”, and that many of the future battles between industry giants will be around who owns data, rather than who controls software APIs.


  • Alan Rimm-Kaufman
    Alan Rimm-Kaufman founded the Rimm-Kaufman Group...