Tellme acquisition by Microsoft

Microsoft paid over $800 million for Tellme Networks in 2007, one of its largest acquisitions of a private company at the time. Tellme was not a consumer brand, and most people had never heard of it, but it was the infrastructure behind automated phone systems at companies like FedEx and Domino’s Pizza, and behind directory assistance services such as 800-555-TELL. These were voice interfaces at scale, built on speech recognition, before smartphones made the interface familiar.

The acquisition was interesting for two reasons. The obvious one: Microsoft wanted voice technology, and buying a proven commercial operation was faster than building one. The less obvious one was what Tellme CEO Mike McCue said about the journey.

McCue gave an unusually candid interview to Fast Company about the pressures of being a public company. His observation deserves to sit outside the business school framing it usually gets: “Going public feels great—and then the next day, you realize you have all these expectations.” What he was describing was the structural tension between the quarterly reporting cycle and the kind of long-horizon investment that genuine innovation requires. Public markets reward consistent performance. Breakthrough work is rarely consistent—it is lumpy, ambiguous, and uncomfortable to explain on earnings calls.

His projection for the voice market, $15 to $20 billion, was not unreasonable given what was visible in 2007. What was invisible was how radically the interface layer would shift. The speech recognition in Tellme’s systems worked because the grammar space was constrained: a customer calling FedEx is asking about a package, not asking about the meaning of life. The recognizer only had to choose among a small set of expected phrases. Free-form voice interaction at the scale that Siri, Alexa, and Google Assistant later demonstrated required language modeling of a different order of magnitude.
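
To make the constrained-grammar point concrete, here is a minimal sketch in Python. It is not Tellme's implementation. The phrase list, the intent labels, and the `recognize` helper are all hypothetical, and fuzzy text matching from the standard library's `difflib` stands in for the acoustic side of speech recognition. The only thing it illustrates is that when the set of acceptable utterances is small, picking the nearest one is easy, and anything far from the grammar can simply be rejected.

```python
import difflib

# Hypothetical grammar for a package-tracking line: every phrase the
# system is prepared to hear, mapped to the intent it triggers.
GRAMMAR = {
    "track a package": "TRACK",
    "track my package": "TRACK",
    "where is my package": "TRACK",
    "schedule a pickup": "PICKUP",
    "find a drop off location": "LOCATE",
    "speak to an agent": "AGENT",
}


def recognize(transcript, cutoff=0.75):
    """Return the intent of the closest in-grammar phrase, or None.

    The cutoff is an assumed similarity threshold (0 to 1); input that
    resembles no phrase in the grammar is rejected rather than guessed at.
    """
    # Normalize case and whitespace before comparing.
    cleaned = " ".join(transcript.lower().split())
    matches = difflib.get_close_matches(cleaned, list(GRAMMAR), n=1, cutoff=cutoff)
    return GRAMMAR[matches[0]] if matches else None


if __name__ == "__main__":
    print(recognize("Where is my packige"))          # TRACK
    print(recognize("what is the meaning of life"))  # None
```

A slightly garbled in-grammar request still resolves, while the out-of-grammar question falls through to `None`, which a phone system would typically handle with a reprompt. A free-form assistant has no such list to fall back on, which is where the jump in difficulty comes from.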

The advice embedded in McCue’s story (think carefully about the expectations you accept when you take other people’s capital) has not aged. Neither has the underlying insight about voice: the interface that feels most natural to humans is also the hardest to engineer. VoiceXML, the markup standard that phone platforms like Tellme’s were built on, was an early bet on that difficulty. The bets people are placing now are just bigger.