Alexa, do I have an irregular heart rhythm? Using smart speakers to contactlessly monitor heart rhythms
We developed an artificial intelligence (AI) system that transforms commodity smart speakers into short-range active sonar hardware to measure heart rate and inter-beat intervals in a contactless manner.
A quarter of U.S. households have a smart speaker like Google Home or Amazon Alexa. These devices are used predominantly as voice assistants and currently serve as glorified alarm clocks and music players. However, they have sophisticated computing capability, storage, and sensors that often rival those of clinical-grade medical hardware.
My lab at the University of Washington has been pursuing the following question over the past few years: how can we efficiently use these ubiquitous smart speakers as medical diagnostic tools? In particular, one of our goals is to contactlessly monitor vital signs like breathing and heart rate and to diagnose various medical conditions using smart speakers. The key idea is to transform smart speakers into active sonar devices that perform contactless wireless sensing. At a high level, we emit inaudible 18–22 kHz frequency-modulated continuous-wave (FMCW) sound signals from the smart speaker. These signals are reflected off the human body and received by a microphone array. When the body moves (e.g., due to breathing), the reflections arrive a little earlier or later depending on whether the person is inhaling or exhaling. By running signal processing algorithms on the reflections at the microphone array, we can extract information about this motion in a contactless manner. Active sonar is a natural fit for smart speakers, since these devices have high-quality speakers, and some of them, like the Apple HomePod and Amazon Echo, have arrays of six to seven microphones that are used for sophisticated acoustic beamforming.
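To make the FMCW sonar idea concrete, here is a minimal simulation sketch (not our actual implementation): we generate an 18–22 kHz chirp, simulate an echo from a reflector, and recover its distance from the beat frequency of the dechirped signal. The sampling rate, chirp duration, and target distance are illustrative assumptions.

```python
import numpy as np

fs = 48_000              # sampling rate (Hz); assumed, typical for audio hardware
f0, f1 = 18_000, 22_000  # inaudible FMCW band used by the system
T = 0.1                  # chirp duration (s); illustrative choice
B = f1 - f0              # sweep bandwidth (4 kHz)
c = 343.0                # speed of sound in air (m/s)

t = np.arange(int(fs * T)) / fs
# Linear FMCW chirp: instantaneous frequency sweeps f0 -> f1 over T seconds.
tx = np.cos(2 * np.pi * (f0 * t + (B / (2 * T)) * t ** 2))

# Simulate an echo from a reflector 0.5 m away (round-trip delay ~2.9 ms).
true_range = 0.5
delay = int(round(2 * true_range / c * fs))
rx = np.zeros_like(tx)
rx[delay:] = 0.3 * tx[:-delay]       # attenuated, delayed copy of the chirp

# Dechirp: multiplying tx by rx produces a constant "beat" tone whose
# frequency is proportional to the round-trip delay.
mix = tx * rx
spectrum = np.abs(np.fft.rfft(mix * np.hanning(len(mix))))
freqs = np.fft.rfftfreq(len(mix), 1 / fs)

band = (freqs > 20) & (freqs < 2000)             # beat tone lives at low frequencies
f_beat = freqs[band][np.argmax(spectrum[band])]
est_range = f_beat * c * T / (2 * B)             # invert beat frequency to distance
print(f"estimated range: {est_range:.3f} m")     # close to the true 0.5 m
```

A breathing or heartbeat motion shows up as a tiny periodic change in this estimated delay over successive chirps.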
Building on this idea, in prior work we showed how this technique can monitor breathing motion in a contactless manner on commodity devices like phones and smart speakers. We used this breathing motion signal to demonstrate the feasibility of diagnosing medical conditions such as sleep apnea and opioid overdose, and for infant monitoring.
However, extracting the sub-millimeter heartbeat motion using smart speakers remained a hard problem for multiple reasons. First, heartbeats produce a 0.3–0.8 mm motion on the surface of the human body, which is an order of magnitude smaller than the wavelength of sound at our operational frequencies. Second, commodity smart speakers are designed primarily to transmit at audible frequencies; the inaudible frequencies (>18 kHz) they support have limited bandwidth and a non-ideal frequency response. Third, breathing creates a much larger motion than heartbeats on the surface of the body. Though respiration rates are typically lower than heart rates, respiration is not a perfect sinusoidal motion, since inhalation and exhalation durations can differ. This creates high-frequency components in the breathing motion that interfere with the minute heartbeat motion and, at low signal-to-noise ratios, prevent the two from being reliably separated in the frequency domain using filtering. When the heart signal is weak and overwhelmed by interference from breathing motion, it becomes challenging to extract the individual heartbeats of the irregular rhythms that are common in cardiac patients.
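A small numerical experiment illustrates why simple filtering fails: an asymmetric (non-sinusoidal) breathing waveform places harmonics squarely inside the heart-rate band, at an amplitude that is a sizeable fraction of the heartbeat motion itself. The waveform shape, amplitudes, and sampling rate below are illustrative assumptions, not measured data.

```python
import numpy as np

fs = 100.0                     # sampling rate of the extracted motion signal (assumed)
t = np.arange(0, 60, 1 / fs)   # one minute of data

# Asymmetric breathing at 15 breaths/min (0.25 Hz): a faster inhale followed
# by a slower exhale, ~5 mm surface displacement (illustrative assumption).
phase = (0.25 * t) % 1.0
breathing = 5.0 * np.where(phase < 0.3, phase / 0.3, (1 - phase) / 0.7)

heart_amp = 0.5                # heartbeat surface motion in mm, per the scales above

spectrum = np.abs(np.fft.rfft(breathing)) / len(t)   # per-bin half-amplitudes (mm)
freqs = np.fft.rfftfreq(len(t), 1 / fs)

# How much breathing energy leaks into the 0.8-2.0 Hz heart-rate band?
band = (freqs >= 0.8) & (freqs <= 2.0)
ratio = spectrum[band].max() / (heart_amp / 2)
print(f"strongest breathing harmonic in the heart band: {100 * ratio:.0f}% "
      f"of the heartbeat tone")
```

For a pure sinusoid the breathing energy would sit entirely at 0.25 Hz; the asymmetry alone pushes a non-trivial fraction of it on top of the heartbeat frequencies, so no frequency-domain filter can cleanly separate the two.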
Notably, when we first started working on this problem in 2019, we could within a few months extract the heart rate of healthy participants. However, the algorithm, tuned on healthy participants with regular heart rhythms, failed to identify any heartbeats when tested with cardiac patients from the acute care general cardiology unit at the University of Washington Medical Center.
To address this problem and design a system that works with both regular and irregular heart rhythms, we introduced an adaptive learning-based beamforming algorithm that maximizes the signal-to-interference-plus-noise ratio (SINR) by aligning heartbeat signals across microphones while minimizing the interference from breathing motion and noise. The adaptive beamformer uses complex weights to combine the signals from different microphones across frequencies. To compute the weights, we formulate an optimization problem that we solve using gradient descent, an algorithm commonly used in machine learning frameworks. We then designed a segmentation algorithm that runs on the resulting heart rhythm signal to identify individual heartbeats and compute the beat-to-beat (R-R) intervals.
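The flavor of the beamforming step can be sketched with a toy narrowband model. Here we assume the heartbeat's relative phases across a six-microphone array are known, and learn complex weights by gradient descent that suppress breathing interference while keeping the heartbeat response fixed. Our actual algorithm estimates everything from data, so treat this only as an illustration of the SINR-maximization idea; all the numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 6  # microphone count of an Echo-class array

# Toy narrowband model (an assumption, not the paper's exact formulation):
# each source reaches the array with fixed per-microphone phases.
a_heart = np.exp(1j * rng.uniform(0, 2 * np.pi, M))   # heartbeat steering vector
a_breath = np.exp(1j * rng.uniform(0, 2 * np.pi, M))  # breathing interference

# Interference-plus-noise covariance: strong breathing plus sensor noise.
R_n = np.outer(a_breath, a_breath.conj()) + 0.1 * np.eye(M)
sig_heart = 0.01  # heartbeat power, far below the interference

def sinr(w):
    """Output signal-to-interference-plus-noise ratio for weights w."""
    s = sig_heart * np.abs(a_heart.conj() @ w) ** 2
    return s / np.real(w.conj() @ R_n @ w)

# Gradient descent on the interference-plus-noise output power, holding the
# heartbeat response at 1 (re-imposed after every step). This mirrors the
# idea of learning complex weights that keep heartbeat energy aligned while
# nulling breathing; the real optimization is data-driven.
w = a_heart / M                      # start from simple delay-and-sum weights
sinr_before = sinr(w)
for _ in range(800):
    grad = R_n @ w                   # gradient of w^H R_n w w.r.t. conj(w)
    w = w - 0.25 * grad
    w = w + a_heart * (1 - a_heart.conj() @ w) / M   # re-impose unit response

print(f"SINR before adaptation: {sinr_before:.4f}")
print(f"SINR after adaptation:  {sinr(w):.4f}")
```

In this toy setting the learned weights converge to the classic minimum-variance solution, steering a null toward the breathing interference so the weak heartbeat component dominates the beamformer output.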
While COVID-19 delayed our evaluation by more than six months, we were eventually able to evaluate the system with both healthy participants and cardiac patients with diverse structural and arrhythmic cardiac abnormalities, including atrial fibrillation, atrial flutter, and congestive heart failure. Our results showed that the smart speaker system computed average heart rate and heart rate variability with accuracies comparable to ground-truth clinical-grade electrocardiogram (ECG) sensors, and that it was able to identify patients with irregular heart rhythms such as atrial fibrillation.
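Once individual beats are segmented, computing average heart rate and standard heart rate variability statistics from the R-R intervals is straightforward. The intervals below are made-up illustrative numbers; SDNN and RMSSD are standard HRV metrics, and real arrhythmia diagnosis of course rests on clinical criteria.

```python
import numpy as np

# Hypothetical beat-to-beat (R-R) intervals in seconds, as produced by a
# segmentation step. An irregular rhythm shows a much larger spread.
rr_regular = np.array([0.84, 0.86, 0.85, 0.83, 0.86, 0.84, 0.85])
rr_afib = np.array([0.62, 1.05, 0.71, 0.93, 0.55, 1.12, 0.78])

def hr_and_hrv(rr):
    """Average heart rate (bpm) and two standard HRV metrics (ms)."""
    hr = 60.0 / rr.mean()                             # average heart rate
    sdnn = rr.std(ddof=1) * 1000                      # std of intervals
    rmssd = np.sqrt(np.mean(np.diff(rr) ** 2)) * 1000 # successive differences
    return hr, sdnn, rmssd

for name, rr in [("regular", rr_regular), ("afib-like", rr_afib)]:
    hr, sdnn, rmssd = hr_and_hrv(rr)
    print(f"{name}: HR {hr:.0f} bpm, SDNN {sdnn:.0f} ms, RMSSD {rmssd:.0f} ms")
```

Note that the two rhythms can have nearly identical average heart rates; it is the interval-to-interval variability that flags the irregular rhythm, which is why recovering individual beats rather than just an average rate matters.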
This, I believe, is a significant technical step towards realizing the vision of transforming smart speakers into medical diagnostic tools. Looking to the future, a device like this could monitor a patient over extended periods and identify patterns that are individualized to that patient. For example, we could figure out when arrhythmias are happening for each specific patient and then develop care plans tailored to when patients actually need them. This technology could also change how doctors conduct telemedicine appointments by providing vital sign data, including breathing and heartbeat signals, that would otherwise require in-person clinical visits. Given the growing ubiquity of smart speakers, we are excited to see how this technology can be part of the future of cardiology and, more generally, medical care.