Survey data shows sleep lab leaders prioritize validation and certification when implementing sleep study autoscoring software.

By Sree Roy

A new Sleep Review survey of sleep lab supervisors reveals ambivalence surrounding sleep study autoscoring software. Case in point: When asked about the biggest positive of the technology, 30% of respondents declared that it offers “no significant positives.” An equal 30% cited its ability to standardize scoring quality as its best feature; 22% said its best feature is being faster than human scoring; and 9% each responded that it costs less than hiring more sleep techs and that it’s available 24/7.

*We asked: What do you consider the biggest positive of autoscoring software? (Select one.)*

While sleep lab owners and managers appreciate the ease of sleep staging with software assistance, as well as the many auto-calculated metrics, they also share concerns about overreliance on autoscoring, its impact on sleep tech jobs, and mislabeled artifacts and other discrepancies that a human must fix.

Most Labs Don’t Purchase Standalone Autoscoring Software

Although many sleep labs have access to autoscoring, only 17% reported paying for dedicated software, and 21% are actively evaluating options. Meanwhile, about 46% of respondents use free autoscoring software that is included with their polysomnography equipment, and nearly 30% of sleep labs report having no plans to adopt the technology. None reported discontinuing use after implementing it.

*We asked: How does your sleep lab currently incorporate sleep study autoscoring software, if at all? (Select all that apply.)*

Labs that use autoscoring software find it helpful for specific parameters, particularly sleep staging and metrics related to oxygenation. When asked to select the most helpful use cases from a list, the supervisors selected the following as the top 5 (they could select up to 3 from a list of 17):

1. Nadir oxygen saturation (41.7%)

2. Sleep staging (37.5%)

3. (tie) Oxygen saturation trends (33.3%)

3. (tie) Snore events (33.3%)

5. (tie) Mean oxygen saturation (29.2%)

5. (tie) Periodic limb movement analysis (29.2%)

Supervisor Concerns

But some sleep lab supervisors also voiced concerns. “The overscoring is ridiculous because they catch so much wrong that it would save money just having the person score it themselves from the start,” one respondent commented. “If AI could get scoring down and AASM [American Academy of Sleep Medicine] approves it, and we would NOT need an ‘overscorer,’ then this all becomes beneficial.”

Asked to rate eight potential concerns, lab supervisors ranked the following five highest (scores are weighted averages on a 5-point scale):

Staff relying too heavily on auto-generated results (3.83)
Reduced review of raw data (3.71)
Job displacement for technologists (3.58)
The software not performing as well as a human scorer (3.52)
Technical problems with the software (3.46)

*We asked: How concerned are you about the following?*
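For readers unfamiliar with the metric, a weighted average on a 5-point scale multiplies each rating by the number of respondents who chose it, sums the results, and divides by the total number of responses. The minimal Python sketch below illustrates the arithmetic. The response counts are hypothetical and are not the survey's data, and the assumption that 1 means least concerned and 5 means most concerned is ours, since the scale labels are not given here.

```python
def weighted_average(counts):
    """Return the weighted average rating.

    counts maps each rating on the scale (assumed here to run 1-5)
    to the number of respondents who chose that rating.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses to average")
    return sum(rating * n for rating, n in counts.items()) / total


# Hypothetical response counts for one concern (NOT actual survey data).
example_counts = {1: 2, 2: 3, 3: 5, 4: 6, 5: 4}

score = weighted_average(example_counts)
print(f"{score:.2f}")
```

Under that assumed scale, a score above 3 would mean that responses lean toward the "more concerned" end.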

Validation, Certification Wanted

In 2023, the AASM launched a 2-year pilot program to certify autoscoring software. It certified Philips Respironics Sleepware G3 with Somnolyzer versions 4.2.0.0 and 4.0.2.0, EnsoData EnsoSleep v6.26.0, and SOMNOmedics DOMINO v3.0 for sleep staging accuracy. The pilot is no longer accepting applications, but the AASM has expressed plans to develop a full PSG autoscoring certification program.

According to our survey, this type of third-party certification is exactly what sleep lab supervisors want to see when making their purchasing decisions. When asked what factor weighs most heavily in a purchasing decision, “AASM certification” was the top choice for a third of respondents (33.3%). “Validation studies” ranked second, chosen by 25% of lab leaders. Cost and compatibility with existing systems tied for third, at 16.7% each.

Further emphasizing this point, nearly half of respondents (46%) said that a software product’s recognition in the AASM Autoscoring Certification Pilot Program would influence their purchasing decision either “significantly” or “somewhat.”