R7 Speech Sciences is an AI company focused on understanding spoken conversations. We believe in a voice-first future, and R7 was born out of our frustration with existing voice products and their inability to capture nuances in conversations. By pushing the state of the art in machine learning and speech science, harnessing massive datasets, and assembling a strong team, we are building the next generation of voice products that go beyond just words. We are also introducing this blog, which will be mostly science and engineering focused, apart from the occasional product-related announcement.
“We are all prodigious olympians in perceptual and motor areas, so good that we make the difficult look easy.” — Hans Moravec (c.f. Moravec Paradox)
Understanding the intricacies and nuances of spoken conversation is an innately human ability, yet it has been one of the hardest challenges for AI agents to overcome despite stellar advances in AI in recent years. At its core, this is a difficult problem because understanding speech requires more than simply understanding the words.
Watch Jerry demonstrate how conversation is more than words
Current approaches to understanding spoken conversations, which use Automatic Speech Recognition (ASR) to convert speech into transcripts and then apply standard Natural Language Processing (NLP) techniques to those transcripts, just won’t cut it. Human conversations carry a trove of latent signals, such as prosodic and paralinguistic cues, that add to the meaning of a conversation. Most of this information is dropped on the floor during ASR. This is why we, at R7, are motivated to understand spoken conversations from scratch: starting with raw audio signals.
Enter, R7 Speech Sciences
At R7, our goal is to achieve a near human-like understanding of conversations, which begins with good listening. Our research is focused on this singular goal: building agents that understand human conversations better. We do this by building novel models, better datasets, and a strong team. In future blog posts and announcements, we will share more about our models, datasets (as much as our lawyers allow us!), and approaches. The current seed team has a collective 30 years of research experience in speech and language processing, and growing. We are currently a small yet passionate team of researchers and hackers based in San Francisco.
Why the name R7?
There’s one way to find out: [email protected]
Follow this blog and our Twitter for future product, science, and engineering announcements. Our next post will summarize new research results.