From Insights To Robots, Speech AI Use Cases Have Exploded

  • aster.cloud
  • May 3, 2022
  • 5 minute read

It’s been five years since we launched the Google Cloud Speech-to-Text (STT) API, and we’re awed by the things our customers have done. From powering voice-controlled apps to generating captions for videos, the API processes more than 1 billion minutes of spoken language each month—enough to transcribe the entirety of the Oxford English Dictionary more than half a million times (including obsolete words), assuming normal speaking speeds.

“With voice poised to become the next major disruption in human-computer interaction, technologies like Google’s Cloud Speech API are becoming increasingly important to enterprises looking to keep pace with changing consumer behaviors and expectations. In partnership with DeepMind and Google Brain, Google continues to invest in this space and bring new innovations to the market that enable organizations to quickly and easily add voice components to their consumer-facing applications,” says Ritu Jyoti, group vice president, AI and Automation Research Practice at IDC.


Familiar use cases, like giving instructions to a smartphone assistant or watching text appear as someone speaks during a video meeting, are just the beginning, with customers making more advanced and creative uses of these AI technologies each day. Once you can accurately transcribe and understand spoken language at scale, you can layer on a variety of other AI services and applications to create more engaging experiences or deeper insights from this data.

 

To explore new frontiers in this technology, and illustrate how your business might do more with voice, let’s examine some of the novel ways Google Cloud customers are using the Speech API, from creating better sales experiences to building friendly robots.

Moving from speech to insights and sales: InteractiveTel

Phone calls are a significant source of leads and sales for automobile dealers, but historically, dealers have struggled to collect and act on call data, even failing in some cases to call back the majority of would-be buyers. Leaders at InteractiveTel, a provider of cloud-based telephony applications that help improve customer service and sales, recognized that AI could erase these challenges.


They envisioned voice data as an opportunity to provide dealers with real-time insights for more productive conversations, more reliable follow up, and ultimately, more robust sales. Early in its history, however, InteractiveTel relied on speech recognition technologies that produced inconsistent results.

This led the company to become one of the first STT API customers when the product was released in 2017. The company almost immediately enjoyed a 30% improvement in transcription accuracy, and its platform has grown more advanced and reliable ever since.

“The biggest KPI that speaks to our platform’s power is retention,” said co-founder Gary Graves. “We have a 96% retention rate.”

Graves noted that the Google Cloud Speech API is central to this success. “Without it, we’re just vanilla ice cream,” he stated. “When we first started, we baked the Cloud Speech API into our core. Every discussion has to be transcribed with the API, and generating that data in near real-time creates a foundation for richer services.”

For example, if a customer calls about a specific vehicle that is not available, InteractiveTel surfaces alerts for the dealer as the conversation happens, helping them to know if a similar vehicle will soon be in stock. The platform also knows if the customer has had past interactions, such as appointments at the dealership, and even includes sentiment analysis to detect events like disagreements between a customer and salesperson that may require a sales manager to join the call.
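The escalation idea described above can be illustrated with a small sketch. It is not InteractiveTel's implementation; it assumes each transcribed utterance already carries a sentiment score in the range -1.0 to 1.0 from some upstream model, and the `Segment` type, the `first_escalation_index` helper, the threshold, and the sample call are all hypothetical. The rule simply flags the point where several consecutive utterances score below a threshold, which is one plausible trigger for inviting a manager to join.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Segment:
    """One transcribed utterance; sentiment is assumed to lie in [-1.0, 1.0]."""
    speaker: str
    text: str
    sentiment: float


def first_escalation_index(segments: Iterable[Segment],
                           threshold: float = -0.5,
                           run_length: int = 2) -> Optional[int]:
    """Return the index of the segment that completes the first run of
    `run_length` consecutive segments scoring below `threshold`, else None."""
    streak = 0
    for i, seg in enumerate(segments):
        # A score at or above the threshold resets the streak.
        streak = streak + 1 if seg.sentiment < threshold else 0
        if streak >= run_length:
            return i
    return None


# Hypothetical call transcript with made-up sentiment scores.
call = [
    Segment("customer", "Is the blue sedan still available?", 0.2),
    Segment("salesperson", "It sold yesterday.", -0.1),
    Segment("customer", "That's not what your website says.", -0.6),
    Segment("customer", "This is really frustrating.", -0.8),
]
alert_at = first_escalation_index(call)
```

Requiring a run of negative utterances rather than a single one is a deliberate choice in this sketch: it avoids alerting on one sharp remark while still reacting within the same call.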

“The API is pretty low maintenance,” according to Graves. “It has scaled with the company, keeping up with velocity and never causing a bottleneck.”

“I’m data driven. We tested everything out there at the time,” he added. “Google works best. Other providers reach out every six months or so, and I always tell them, ’Try again in six months.’ That’s been happening for years.”


Fostering childhood development with a robot friend: Embodied

While InteractiveTel’s platform speaks to trends in the business world, Embodied’s Moxie robot shows how Speech AI can impact social-emotional learning, from hospitals to the home. Designed for continuous conversations, not just predefined prompts and responses, Moxie encourages children to interact with it as they might with a friend. For example, if a child says, “I like space,” Moxie can automatically shift into a conversation filled with astronomical facts, or if a child reads a book from Moxie’s Book Club, the robot can lead a targeted question and discussion session after reading.

Embodied’s Moxie

 

Though a fun way for all children to work on social, emotional, and critical thinking skills, Moxie has been particularly promising for children facing adversity, from social isolation to difficulty making friends. Some parents of children with developmental disorders have shared promising feedback about their children’s social-emotional development after spending time with Moxie. The robot can discern whom to address and how to proactively engage, using subtle eye gaze signals, facial expressions, and body language as part of its response to create a lifelike, believable AI friend that can build rapport with a child.

“We want to empower parents to help children with technology,” said Paolo Pirjanian, Embodied’s founder and CEO. A former NASA scientist who previously served as CTO of iRobot, Pirjanian noted that though the market for interactive robots is in “early innings,” the company is encouraged by the reception to Moxie. The robot “provides a non-judgmental space that helps kids to share hard feelings and encourages engagement with friends and family and the world around them,” he said.

A number of AI technologies enable Moxie’s multi-modal interactions, as well as the accompanying app for parents. Computer vision technologies, for example, help to decipher a child’s body language. But as with InteractiveTel, the Cloud Speech API is the starting place for interactions, as the robot cannot tap into resources appropriate to the situation if it cannot accurately understand the child in the first place.


When Speech meets CRM: HubSpot

HubSpot is also using speech-derived data for insights, through its Conversation Intelligence products. HubSpot customers can use AI to automatically take notes in meetings, for example, and connect voice data to CRM data to measure trends, identify changes in market dynamics, and even unlock coaching opportunities.

To offer Conversation Intelligence, HubSpot uses a proprietary stack of several models built atop the STT API. HubSpot leverages a variety of the API’s features, from contextual biasing to speaker tagging, said Ian Leaman, Senior Product Manager, AI, at HubSpot.

“It had the best word error rate, and it was plug and play while still giving us the freedom to mess around and find the best configurations, as we figured out which models work best for different segments of our customer base,” he added. “It’s helped us to support happy customers, achieve faster dev times, and support more languages.”
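To make the two features named above more concrete, here is a minimal sketch of what a request using contextual biasing and speaker tagging might look like. It is not HubSpot's stack. The field names follow the JSON shape of the Speech-to-Text v1 REST API as best understood here (`speechContexts` for phrase hints, `diarizationConfig` for speaker diarization, `speakerTag` on each returned word) and should be checked against the current API reference. The helper names, the bucket URI, the phrases, and the sample words are hypothetical, and no network call is made.

```python
from itertools import groupby
from typing import Dict, List, Sequence


def build_recognize_request(gcs_uri: str,
                            phrases: Sequence[str],
                            min_speakers: int = 2,
                            max_speakers: int = 2) -> dict:
    """Build a JSON-style request body (assumed v1 REST shape) that asks for
    phrase hints (contextual biasing) and per-word speaker tags."""
    return {
        "config": {
            "languageCode": "en-US",
            # Contextual biasing: phrases the recognizer should favor.
            "speechContexts": [{"phrases": list(phrases)}],
            # Speaker tagging: label each word with a speaker number.
            "diarizationConfig": {
                "enableSpeakerDiarization": True,
                "minSpeakerCount": min_speakers,
                "maxSpeakerCount": max_speakers,
            },
        },
        "audio": {"uri": gcs_uri},
    }


def turns_by_speaker(words: List[Dict]) -> List[Dict]:
    """Collapse consecutive words with the same speakerTag into turns."""
    turns = []
    for tag, group in groupby(words, key=lambda w: w["speakerTag"]):
        turns.append({"speakerTag": tag,
                      "text": " ".join(w["word"] for w in group)})
    return turns


# Hypothetical inputs for illustration only.
request = build_recognize_request(
    "gs://example-bucket/sales-call.wav",
    phrases=["HubSpot", "Conversation Intelligence"],
)
sample_words = [
    {"word": "Thanks", "speakerTag": 1},
    {"word": "for", "speakerTag": 1},
    {"word": "joining", "speakerTag": 1},
    {"word": "Happy", "speakerTag": 2},
    {"word": "to", "speakerTag": 2},
    {"word": "Sure", "speakerTag": 1},
]
turns = turns_by_speaker(sample_words)
```

Grouping words into turns like this is one way speaker-tagged output could feed downstream note-taking or CRM analysis, since each turn can be attributed to a participant.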

Conversations enable richer AI experiences and services

As these stories attest, speech AI technologies are powerful in and of themselves, but they’re also an important starting point for many more advanced and ambitious use cases that combine many AIs for never-before-seen experiences. Five years ago, many of the customer stories we see today would have seemed more aspirational than feasible, and we expect that half a decade from now, we’ll continue to be humbled by the ways AI changes how we interact with machines and even one another. To learn more about Google Cloud’s Speech API, see the product documentation.

 

By: Calum Barnes (Product Manager, Speech, Google Cloud)
Source: Google Cloud Blog



Related Topics
  • Artificial Intelligence
  • Google Cloud
  • Machine Learning
  • Google Cloud Speech-to-Text
  • Robots
  • Speech AI