How Accurate Are Consumer Sleep Trackers?

Not all sleep trackers are created equal, according to West Virginia University (WVU) neuroscientists.

Prompted by a lack of independent, third-party evaluations of these devices, a research team led by Joshua Hagen, director of the Human Performance Innovation Center at the WVU Rockefeller Neuroscience Institute, tested the efficacy of eight commercial sleep trackers.

Fitbit and Oura came out on top in measuring total sleep time, total wake time and sleep efficiency, the results indicate. All other devices, however, either overestimated or underestimated at least one of those sleep metrics, and none of the eight could quantify sleep stages (REM, non-REM) accurately enough to be useful when compared to an electroencephalogram, or EEG, which records electrical activity in the brain.

The study was published in the journal Nature and Science of Sleep.

“The biggest takeaway is that not all consumer devices are created equal, and for the end user to take care in selecting the technology to suit their application based on the data,” Hagen said in a statement. “Some devices are currently performing well for total sleep time and sleep efficiency, but the community at large seems to still struggle with sleep staging (deep, REM, light). This is not surprising, since typically brain waves are needed to properly measure this. However, when thinking about what you generally have control over with your sleep – time to bed, time in bed, choices before bed that impact sleep efficiency – these can be accurately measured in some devices.”

Researchers observed five healthy adults – two males, ages 26 and 41, and three females, ages 22, 23 and 27 – who participated by wearing the sleep trackers for a combined total of 98 nights.

The commercial sleep technologies showed lower error and bias when quantifying sleep/wake states than when quantifying sleep stage durations. Still, the findings revealed a remarkably high degree of variability in the accuracy of commercial sleep technologies, the researchers stated.
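
For readers curious what "error" and "bias" mean in a comparison like this, the short Python sketch below shows one common way to compute them. It is an illustration only, not the WVU team's actual analysis: the function names and the nightly numbers are hypothetical, and the study's exact statistical methods are not described here. Bias is taken as the average signed difference between a device and the EEG reference (positive means the device overestimates), error as the average absolute difference, and sleep efficiency as total sleep time divided by time in bed.

```python
from statistics import mean


def sleep_efficiency(total_sleep_min, time_in_bed_min):
    """Sleep efficiency as a percentage: time asleep divided by time in bed."""
    return 100.0 * total_sleep_min / time_in_bed_min


def bias_and_error(device_values, reference_values):
    """Return (bias, mean absolute error) of a device against a reference.

    bias  = mean of (device - reference); positive means overestimation.
    error = mean of |device - reference|.
    """
    diffs = [d - r for d, r in zip(device_values, reference_values)]
    return mean(diffs), mean([abs(x) for x in diffs])


# Hypothetical total sleep time in minutes for three nights (not study data).
device_tst = [430, 415, 400]   # what a wearable might report
eeg_tst = [420, 400, 410]      # what the EEG reference might report

bias, error = bias_and_error(device_tst, eeg_tst)
print(f"bias: {bias:.1f} min, mean absolute error: {error:.1f} min")
print(f"sleep efficiency: {sleep_efficiency(420, 480):.1f}%")
```

A device can have a bias near zero and still have a large error if it overestimates on some nights and underestimates on others, which is one reason both figures are usually reported together.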

“While technology, both hardware and software, continually advances, it is critical to evaluate the accuracy of these devices in an ongoing fashion,” Hagen said. “Updates to hardware, firmware and algorithms happen continuously, and we must understand how this affects accuracy.”

Research in this area will evolve with the technology, added Hagen, who himself uses four to five sleep devices to monitor his own ZZZs.

“I’m a big believer in living the research,” he said. “I need to understand what the consumer sees in the smartphone apps, what the usability of the devices is, etc. Without that objective sleep data, you can only rely on how you feel when you wake up – and while that is important, that doesn’t tell the whole story. If your alarm goes off and you happen to be in a deep sleep stage, you will wake up very groggy, and could feel as though that sleep was not restorative, when in fact it could have been. It’s just not subjectively noticeable right at that moment.”

In the end, however, which product is best suited to a given user depends on that person’s needs, Hagen added.

“After accuracy, it comes down to logistics. Do you prefer a watch with a display? A ring? A mattress sensor? What is the price of each? Which smartphone app is most appealing? But again, that is if all accuracies are close to equal. If the price is right and the form factor is ideal, but the data accuracy is extremely poor, then those factors don’t matter.”

The Human Performance Innovation Center works with members of the military along with collegiate and professional athletes to better understand and optimize human performance, resiliency, and recovery, applying these findings to solutions for the general and clinical populations.