Addressing the Cocktail Party Problem: AudioTelligence's Novel Approach to Improve Hearing in Noise

AudioTelligence is a UK-based audio startup that is developing and leveraging audio technologies for social good within the consumer technologies and hearing care markets.

In this episode, Rick Radia, the Product & Partnerships Manager at AudioTelligence, shares the company’s mission and motivation in disrupting these markets with their software solution, aiso™. This software solution has the capability to improve speech understanding in noise through the use of multi-microphone devices including smartphones and hearing solutions.

For more information on aiso™, you can visit the company’s website here.

Full Episode Transcript

Amyn Amlani 0:10
Welcome to This Week in Hearing. The ability to hear in noise affects all listeners, those with normal hearing and especially those with hearing loss. The folks at AudioTelligence, a startup in the UK, have been grappling with solutions to this issue. With me today is Rick Radia. Rick, welcome. And let’s begin by you sharing a little bit about yourself.

Rick Radia 0:33
Thanks, Amyn, for inviting me to be part of This Week in Hearing’s webcast. I’m the product and partnerships manager at AudioTelligence, and I’m responsible for the product and commercial strategy, as well as business development. I’ve been working in innovation and business transformation within the healthcare industry for over 10 years. And I’m really delighted to be here today to talk to you about the cocktail party problem, speech in noise, and how we at AudioTelligence are working to solve this.

Amyn Amlani 1:03
Yeah, we’re really looking forward to hearing some of the solutions and opportunities that you’re going to share with us today. But before we do that, let’s talk a little bit about AudioTelligence and the motivation behind your startup.

Rick Radia 1:18
So AudioTelligence is an audio startup based in Cambridge in the UK, and we have strong ties to Cambridge University. The team here includes the original inventors of our patented algorithms. And we actually spun out from a company called CEDAR Audio, a company which has over 30 years of experience in professional audio products for the film and recording industry. AudioTelligence was actually formed so that we can further develop and leverage our audio technologies for social good within both the consumer tech and the hearing market.

Amyn Amlani 1:51
Yeah, so your company is trying to solve the speech in noise issue, or the cocktail party effect. And so as we start having discussions about the solutions that you have, we think of the cocktail party issue not just as a hearing issue, but also as a cognitive issue. Can you talk a little bit about that, please?

Rick Radia 2:15
Sure. So, age-related hearing loss is gradual. And the first signs of hearing loss can be really subtle and hard to detect. One of these signs is the inability to follow a conversation in a noisy place. This is often referred to as the Cocktail Party Problem. This means that in the presence of background noise, people with hearing loss are unable to discern the speaker of interest from the mixture of sound. So people who suffer from this Cocktail Party Problem might think that their hearing is fine, because they can hear clearly at home. But when they’re in a crowded bar or restaurant, they find themselves struggling to follow the conversation. Now, this difficulty in following the conversation and understanding speech in noisy environments gives rise to quite a number of issues. So when speech intelligibility decreases, you need more effort to understand the speech, and this increases cognitive load and causes fatigue. And in the long term, people begin to avoid those situations. And this can lead to social isolation.

Amyn Amlani 3:18
Yeah, you know, as we’re starting to learn, the whole issue of social isolation has huge implications as it relates to the cognitive issue, which is another conversation. But I’m also understanding that you have data on the percentage of individuals that these problems affect, this cocktail party effect. Can you share some of that research with us, that data with us?

Rick Radia 3:40
Absolutely. So we actually conducted research in five key global markets: the US, UK, Germany, China and Japan. And we interviewed over 5000 people. And the results even surprised us, to be honest. We actually found that 81% of 40 to 64 year olds struggle to hear in busy and noisy environments. And this is a high percentage of the population. And a huge number of those are looking for a solution. Now, not all of these people will need hearing aids. They may only suffer from mild hearing loss and need an alternative intervention at this stage. This could be the start of their hearing loss journey, essentially. But the research also showed that 69% of people with a hearing solution, whether that was a cochlear implant, or a hearing aid, or even a remote microphone, were looking for an alternative for social situations, because the solutions out there currently fall short in those more complex acoustic scenes, in terms of social situations and busy and noisy environments.

Amyn Amlani 4:46
Yeah, and you know, the data shows, at least industry data shows, that one of the continued areas of dissatisfaction is that people who are aided still have issues hearing against background noise. And so now we come up to the issue of speech intelligibility. Right, so we’re trying to improve speech intelligibility. And you guys have done some work here in looking at this. Can you share that information with our viewers?

Rick Radia 5:17
Sure. So, speech intelligibility can be defined as how clearly a person speaks, so that his or her speech is comprehensible to a listener. So it can be really difficult, when we look at technology on the market, to compare that and quantify how much they improve speech intelligibility. But one of the methods that we can use to measure speech intelligibility is the speech reception threshold. The speech reception threshold is defined essentially as the sound pressure level necessary for 50% correct sentence recognition. So it allows experimental results to be compared with one another, and it allows the end user and professionals to compare different technologies. And this measure can easily be expressed in terms of the speech-to-noise ratio, and it gives a good baseline for evaluations of different solutions and systems. So environments with a poor speech-to-noise ratio significantly affect speech intelligibility. And some of the studies that we’ve seen have found that the SNR required for 50% correct sentence recognition in noise is about 1.6 dB for people who suffer from hearing loss. Now, to give you a picture of what that’s like, and the impact it can have, a meal at home will typically have an SNR of between 1.5 dB and 3.1 dB. So people with hearing loss, without intervention, are already at that 50% correct sentence recognition. These scenarios, even at home, become really difficult experiences.
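Editor’s sketch: the arithmetic behind the figures Rick quotes can be written out in a few lines. This is only an illustration, assuming the usual definition of SNR as ten times the base-10 logarithm of the speech-to-noise power ratio. The function names and constants below are hypothetical labels for the numbers mentioned in the answer above, not part of any AudioTelligence software.

```python
import math


def snr_db(speech_power: float, noise_power: float) -> float:
    """Speech-to-noise ratio in decibels from two average powers."""
    return 10.0 * math.log10(speech_power / noise_power)


def db_to_power_ratio(db: float) -> float:
    """Inverse of snr_db: how many times stronger the speech is than the noise."""
    return 10.0 ** (db / 10.0)


# Figures quoted in the interview (hypothetical constant names).
SRT_HEARING_LOSS_DB = 1.6          # SNR needed for 50% correct sentences
HOME_MEAL_SNR_DB = (1.5, 3.1)      # typical range for a meal at home

# Margin between the scene and the listener's threshold, in dB.
# A margin near zero means the listener sits at roughly 50% sentence recognition.
margins = [round(scene - SRT_HEARING_LOSS_DB, 1) for scene in HOME_MEAL_SNR_DB]
print("margin over threshold (dB):", margins)
print("speech/noise power ratio at threshold:", db_to_power_ratio(SRT_HEARING_LOSS_DB))
```

The margins show why a meal at home is already borderline: at the quiet end of the quoted range the scene sits slightly below the 1.6 dB threshold, and at the other end only about 1.5 dB above it.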

Amyn Amlani 7:03
Yeah, so these individuals are struggling, and sometimes they’re wearing these devices. How do these devices, you know, your cochlear implants and your hearing assistive technologies and of course your traditional hearing aids, how are they overcoming these limitations in signal-to-noise ratio?

Rick Radia 7:20
So most hearing aids and cochlear implants, or even assistive listening devices, rely on directional microphones, beamforming, or even AI-based noise suppression. And these audio processing techniques do have benefits, but they also have some limitations in those complex environments. So if we were to take beamforming, for example, in principle it is a spatial filter, so it uses the physics of wave propagation to focus on a particular direction or sound source. This means that a beamformer can extract a signal from a specific direction and reduce interference of signals from other directions. Beamformers have the advantage of being mathematically simple and relatively easy to implement in real-time systems, which is why we see them in hearing aids and other devices in the market. However, traditional beamformers need to know lots of information about the acoustic scene, such as the target source direction and the microphone geometry. And even more sophisticated systems will have to have calibrated microphones. So from our research, one of the limitations of beamforming, which is probably quite well known, is that it cannot separate two sound sources in the same direction, because ultimately it’s amplifying everything in the direction of interest. So if we were to look at AI-based solutions, they actually analyze the signal in the time-frequency domain, and try to identify which components are due to the signal and which components are due to noise. The advantage of this approach is that it can work with just a single microphone. But the big problem with this technique is that it extracts a signal by dynamically gating the time-frequency content. And this gating can lead to unpleasant artifacts at poor signal-to-noise ratios. And so as a result, voices can seem unnatural and distorted, and these are the observations that are seen from the end users. So AI solutions often use deep neural networks as well to separate speech from noise. 
And this high-level approach consists of three distinct steps. We have data collection, training and inference. Data collection involves generating a big dataset of synthetic noisy speech by mixing clean speech with noise. And the system is trained by feeding this data into the deep neural networks. Then a mask, which will leave the human voice but filter out noise, is produced, and that’s why we are seeing the artifacts. But also with the deep neural networks, we see an introduction of latency, so we get lip sync issues between the lips of the person speaking and what the end user is hearing.
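Editor’s sketch: the spatial-filter idea behind beamforming, and the same-direction limitation Rick mentions, can be illustrated with a toy delay-and-sum beamformer. This is a minimal sketch under strong assumptions: two microphones, whole-sample delays known in advance, and sinusoidal test signals chosen so the cancellation is exact. The function and variable names are hypothetical, and nothing here represents a hearing aid’s actual processing.

```python
import math


def delay_and_sum(channels, delays):
    """Advance each channel by the target's arrival delay (in samples) and average.

    Sound from the steered direction adds up in phase; sound from other
    directions is misaligned and partly cancels.
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        total = 0.0
        for samples, delay in zip(channels, delays):
            i = t + delay
            if 0 <= i < n:
                total += samples[i]
        out.append(total / len(channels))
    return out


def power(x):
    return sum(v * v for v in x) / len(x)


def target(t):
    return math.sin(2 * math.pi * t / 8)


def interferer(t):
    return 0.8 * math.cos(2 * math.pi * t / 8)


n = 64
mic0 = [target(t) + interferer(t) for t in range(n)]

# Case 1: the target reaches mic 1 two samples late, while the interferer
# comes from the opposite side and reaches mic 1 two samples early.
mic1_opposite = [target(t - 2) + interferer(t + 2) for t in range(n)]
steered = delay_and_sum([mic0, mic1_opposite], [0, 2])

# Case 2: the interferer comes from the same direction as the target,
# so it has the same two-sample delay at mic 1.
mic1_same = [target(t - 2) + interferer(t - 2) for t in range(n)]
stuck = delay_and_sum([mic0, mic1_same], [0, 2])

valid = range(n - 2)  # skip the last samples, where the shifted channel runs out
residual_opposite = power([steered[t] - target(t) for t in valid])
residual_same = power([stuck[t] - target(t) for t in valid])
print("leftover interference, opposite side:", residual_opposite)
print("leftover interference, same direction:", residual_same)
```

In the first case the interferer is removed almost perfectly, but only because its period was picked to cancel; real interference is merely attenuated. In the second case the interferer passes through untouched, which is the limitation described above. Note that the beamformer also needed the delays, that is, the target direction and microphone spacing, to be supplied in advance.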

Amyn Amlani 10:03
Yeah, so to overcome these limitations, you guys have come up with a software solution. Can you share that solution with us, please?

Rick Radia 10:12
aiso for Hearing is a culmination of 26 person-years of research. And it takes inspiration from the way our brain works by using these algorithms to separate different sources of sound, as well as suppressing background noise and babble. Now, this technology is not AI, it’s not beamforming, and it’s not machine learning. The technology at the heart of aiso for Hearing is actually Blind Source Separation. And we’ve combined that with our low latency noise suppression. This technology delivers a significant improvement in speech-to-noise ratio, even in extremely noisy and busy environments. And it outperforms traditional interventions that rely on beamforming or AI. It actually improves the speech recognition threshold by 16 dB, and provides up to 26 dB of noise suppression. aiso for Hearing is a software solution that’s ready to be integrated. And it can be integrated into multi-microphone arrays, whether that’s a hearing accessory or remote microphones, smartphones and tablets. And this gives them enhanced assisted listening capabilities, and helps the end user in complex acoustic environments like busy restaurants and cafes.

Amyn Amlani 11:27
Wow. So this sounds really, really promising. But I have to be honest with you, I don’t know much about blind source separation. Can you expand on that a little bit for us please?

Rick Radia 11:38
Sure. So, as I said, aiso for Hearing combines blind source separation and low latency noise suppression, which together solve the speech in noise problem, or the Cocktail Party Problem. Blind Source Separation is a data-driven approach based on Bayesian statistics. Blind source separation analyzes the raw signal data provided by the microphones to locate the sound sources in a scene. So it first separates the sources into channels. We then have different methodologies in order to select the channel that best corresponds to the source of interest, so that the interference of the other channels is rejected, and only the selected channel can be heard. Now, one of those methodologies that we use, and we can implement, is based on conversational dynamics and signal content. You know, we often get asked how this differs from beamforming found in other hearing solutions, including remote microphones. Well, blind source separation does not need to know the acoustic scene, so it works in any environment without the need to calibrate. Additionally, blind source separation works with off-the-shelf microphones, which don’t require calibration or training. And we use all the audio data from the microphones rather than any prior knowledge of the geometry or the acoustic scene. So we don’t need that huge data set that you see in AI, and we don’t need to train. So alongside the blind source separation, and to further enhance the performance of the technology, we’ve combined this with our low latency noise suppression. Now, the low latency noise suppression reduces background noise, including babble. And this, combined with the BSS, not only improves intelligibility, but it helps with that listening fatigue we talked about earlier. So our noise suppression differs from other techniques in two ways. Firstly, traditional noise suppression techniques take a single-channel approach, using single-channel audio. 
Whereas our technique utilizes multi-channel audio to further enhance the noise suppression. So we’re using all the data that we have from the microphones to provide that degree of noise suppression. And also, there is the way in which we’re working. As I said, we’re not using AI, and we’re not using deep neural networks. And so this means that when we’re doing the noise suppression, we’re not introducing any artifacts or distortion, and the voices remain really natural. And this is actually something we’ve confirmed through testing of products with people with mild to severe hearing loss, who reported, you know, the natural-sounding element of the voices that they heard.
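Editor’s sketch: to give a feel for what “blind” means here, the toy example below separates two mixed signals using only the two microphone recordings, with no knowledge of directions, geometry, or training data. It is emphatically not AudioTelligence’s algorithm, which Rick describes as Bayesian and which must cope with real rooms where sounds arrive with echoes and delays. This is a textbook-style independent component analysis for an instantaneous two-channel mixture: whiten the recordings, then search for the rotation that makes the outputs least Gaussian, measured by kurtosis. All names, the mixing weights, and the synthetic sources are assumptions made for illustration.

```python
import math
import random


def centre(x):
    m = sum(x) / len(x)
    return [v - m for v in x]


def excess_kurtosis(y):
    """Zero for a Gaussian signal; expects zero-mean input."""
    n = len(y)
    m2 = sum(v * v for v in y) / n
    m4 = sum(v ** 4 for v in y) / n
    return m4 / (m2 * m2) - 3.0


def whiten(x1, x2):
    """Decorrelate and normalise two zero-mean channels (2x2 Cholesky)."""
    n = len(x1)
    a = sum(v * v for v in x1) / n
    b = sum(u * v for u, v in zip(x1, x2)) / n
    d = sum(v * v for v in x2) / n
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    z1 = [v / l11 for v in x1]
    z2 = [(v - l21 * u) / l22 for u, v in zip(z1, x2)]
    return z1, z2


def rotate(z1, z2, theta):
    c, s = math.cos(theta), math.sin(theta)
    y1 = [c * u + s * v for u, v in zip(z1, z2)]
    y2 = [-s * u + c * v for u, v in zip(z1, z2)]
    return y1, y2


def separate(x1, x2, steps=90):
    """Grid-search the rotation that maximises total non-Gaussianity."""
    z1, z2 = whiten(centre(x1), centre(x2))
    best = None
    for k in range(steps):
        y1, y2 = rotate(z1, z2, (math.pi / 2) * k / steps)
        score = abs(excess_kurtosis(y1)) + abs(excess_kurtosis(y2))
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]


def correlation(a, b):
    a, b = centre(a), centre(b)
    num = sum(u * v for u, v in zip(a, b))
    return num / math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))


# Two independent synthetic sources standing in for two talkers.
rng = random.Random(0)
n = 4000
s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
s2 = [rng.choice((-1.0, 1.0)) for _ in range(n)]

# Each "microphone" hears a different blend; the weights are unknown to separate().
x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]
x2 = [0.5 * a + 1.0 * b for a, b in zip(s1, s2)]

y1, y2 = separate(x1, x2)
match_s1 = max(abs(correlation(s1, y1)), abs(correlation(s1, y2)))
match_s2 = max(abs(correlation(s2, y1)), abs(correlation(s2, y2)))
print("best match for source 1:", match_s1)
print("best match for source 2:", match_s2)
```

The recovered channels come back in arbitrary order and with arbitrary sign, which is why a separate step is needed to choose the channel of interest, as described in the answer above.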

Amyn Amlani 14:25
So you have this, what I would consider, a novel approach. But here’s the main question: what’s the improved outcome? Does it work?

Rick Radia 14:36
Well, since the company was established, we’ve been creating these algorithms, but made sure they’re integration ready. We wanted to ensure that this was not just a theory, and actually wanted to prove our technology works in the real world. We therefore integrated our aiso for Hearing software into a remote microphone prototype, which I can show you here. This is an eight-microphone device, and this prototype was developed to validate the technology in the real world. With this prototype, we’ve managed to run user trials in busy, noisy cafes and restaurants, with people with mild to severe hearing loss. And the outcome for us as a company was incredible. All of the participants asked if they could keep the device due to the significant benefit it had. They felt they could finally join the conversation and actually socialize in those environments. Again, this we were really delighted to hear, because it actually shows that we have something demonstrable to our future partners. So aiso for Hearing offers 30 dB improvement in the speech-to-noise ratio, and is uniquely valuable when there are overlapping speech signals, a problem that cannot be addressed effectively with noise suppression technology alone. So if you’re in a social setting where there are multiple conversations happening around you, you can focus in on one conversation, and you can reject the other conversation. So it makes those social situations a lot easier to manage. Now, we recently benchmarked our technology to see how we can improve speech intelligibility in different conditions. We actually tested it from -15 dB SNR to 15 dB SNR. And you can see this data in our white paper as well. And we use an objective measure called STOI, which has a high correlation to the speech recognition ratio and can be easily mapped. So to calculate STOI, we use unprocessed microphone data and processed microphone data at different SNRs. 
So if we were to look at -5 dB SNR, similar to a noisy restaurant, the top-of-the-range hearing aid would improve speech understanding to up to 50%. The top-of-the-range assistive listening device, a remote microphone, would improve speech understanding to 80%. Whereas our aiso for Hearing technology improves speech understanding up to 90%. And we’ve tested that, to clarify, in the real world with diffuse background noise.

Amyn Amlani 17:04
Wow, that’s incredible. I mean, those are huge gains compared to the technology that’s available today. So hats off to you guys for essentially helping to solve this issue. But I’m also understanding that you guys are not only promoting this with assistive technologies for those with hearing loss, it’s also going into the consumer electronics market. Am I correct?

Rick Radia 17:29
So we’re actually in discussion with major players in the hearing market, including hearing aid companies and consumer brands. And we’re looking to integrate our software into hearing solutions and consumer tech devices. And there are many applications, including remote microphones, potential integration into earbud charging cases, mobile phone accessories, and even apps. And we hope to see this technology in the market soon, because we see there’s a benefit not only to those people with diagnosed hearing loss, but also to those people who want to start their hearing loss journey, or who need to start their hearing loss journey and are struggling in those social situations to begin with.

Amyn Amlani 18:10
So Rick, great concept. We’re really looking forward to seeing this hit the market at some point here in the near future. So best wishes to you all at AudioTelligence as you move in that direction. Now that you’ve shared this information with us, what are some future things that you all are considering?

Rick Radia 18:31
At the moment, we have software that is ready to be integrated. And so we’re in discussions with major players in the hearing market, including hearing aid and consumer tech brands. And we’re looking to integrate our software into their devices. But as a company, we are also continuously testing, researching and refining our core algorithms to enhance their performance. So the next big step for us is to integrate this technology into different form factors, including the earbud charging case and mobile accessories. And we also want to continue working with the end users and testing the product with them. And we have some user trials planned with them in the next few months to test new features and get their feedback, to ensure we’re always developing with them in mind.

Amyn Amlani 19:18
Well that sounds fantastic. So Rick, last question. What are some final thoughts for our viewers?

Rick Radia 19:25
So at AudioTelligence, we are really passionate about improving the lives of people with hearing loss. And that’s why, throughout the development, we’ve engaged the hearing loss community, and we’ve ensured our solution has been tested by real people in the real world. And the outcome is our integration-ready software that’s had a tremendous amount of positive feedback from the end user. The aiso for Hearing technology improves intelligibility and helps the end user with situational hearing loss. And we believe this technology could have a big impact on the market, not just for those people who have hearing aids that need additional intervention, but also for those people who are looking to start their hearing loss journey and manage their hearing health. And we really hope that we can get this into the market in the coming months with the right partner.

Amyn Amlani 20:16
Well Rick, we wish you guys all the best. We look forward to hearing an update from you down the road and want to thank you so much for your time today.

Rick Radia 20:26
Brilliant. Thank you so much for having us on the show.

Be sure to subscribe to the TWIH YouTube channel for the latest episodes each week and follow This Week in Hearing on LinkedIn and Twitter.

Prefer to listen on the go? Tune into the TWIH Podcast on your favorite podcast streaming service, including Apple, Spotify, Google and more.

About the Panel

Rick Radia is the product and partnerships manager at AudioTelligence. He is responsible for the product and commercial strategy, as well as business development for the company. He has worked in innovation and business transformation within the healthcare industry for over 10 years.

Amyn M. Amlani, PhD, is President of Otolithic, LLC, a consulting firm that provides competitive market analysis and support strategy, economic and financial assessments, segment targeting strategies and tactics, professional development, and consumer insights. Dr. Amlani has been in hearing care for 25+ years, with extensive professional experience in the independent and medical audiology practice channels, as an academic and scholar, and in industry. Dr. Amlani also serves as section editor of Hearing Economics for Hearing Health Technology Matters (HHTM).
