For the last decade, technology has driven sleep diagnostics. The development of autoscoring, or computer-assisted scoring, technology within the field of polysomnography has been one such driver. While most sleep labs in the United States continue to rely on manual scoring, automated scoring has begun to gain a modest foothold in the industry.
Autoscoring programs use a mathematical algorithm to distinguish between sleep stages, respiratory events, and periodic limb movements. While manufacturers tout the technology’s time-saving and efficiency benefits, the reality is that it is a tool that can assist sleep technologists in scoring, not replace them.
First and foremost, automated scoring technology is a computer application, and like the most successful computer applications, it is designed to save time and money. But the difference between, for example, a word processing application and an autoscoring application is standardization.
“When you’re typing a document, it is a standardized, very well-defined thing that you’re doing over and over again,” says Max Hirshkowitz, PhD, DABSM, associate professor in the Department of Medicine and Psychiatry at Baylor College of Medicine. “But when you get into areas like analyzing waveform, it may not be that well defined.”
A key requirement of any automated scoring program is well-defined detection criteria. Without scoring standards, a programmer is going to be hard pressed to write a useful scoring algorithm.
Until recently, the sleep field had relied on the scoring parameters laid out in the 1968 scoring manual of Allan Rechtschaffen and Anthony Kales (R&K). While good, according to Hirshkowitz, it lacked definition.
“I know a lot of people bash it, but it was actually amazingly good; however, it didn’t define things. It defined things by example,” says Hirshkowitz, who is also director of the Houston VAMC Sleep Disorders and Research Center. More importantly, he adds, R&K defined normal sleep, while the primary application of automated scoring technology these days is abnormal sleep.
The abnormal patient is where the differences and difficulty lie, according to Mark Rizk, RPSGT, manager of Nihon Kohden America’s sleep products business unit. “Respiratory autoscoring is generally accurate as long as the quality of the recording is reasonable. We study patients who have all different types of records—people with complex apnea are less accurate; and people with straightforward obstructive apnea are more accurate in terms of the auto analysis,” says Rizk.
The 2007 publication of the American Academy of Sleep Medicine’s Manual for the Scoring of Sleep and Associated Events: Rules, Terminology, and Technical Specifications has been a step forward in the development of less ambiguous parameters and better autoscoring technology. Still, the AASM scoring manual is not the gold standard sleep professionals may have hoped for.
“When you look at autostaging, you have to consider that even though we’re using the new rules for AASM guidelines—N1, N2, N3—they are still based on the same basic rules of Rechtschaffen and Kales that were done 40 years ago and that were based primarily on normal people,” says Rizk.
Even with the AASM manual, sleep scoring is still plagued by experts who don’t agree, says Hirshkowitz. While one expert may look at an event and see a hypopnea, another will disagree and identify the event as a respiratory effort-related arousal; or where one expert identifies the event as stage 3, another will argue it doesn’t have enough delta.
“So here’s the dilemma: the expert opinion is the standard; but the standard is inconsistent. You’re trying to make a computer agree with two people who disagree. You can’t agree with both of them. In engineering, that’s called the ‘rubber ruler problem,’ because the standard is not standard—it’s rubber. It stretches this way and that and bends,” says Hirshkowitz.
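The inter-scorer disagreement Hirshkowitz describes is commonly quantified with Cohen’s kappa, which corrects raw percent agreement for the agreement two scorers would reach by chance. A minimal sketch (the epoch labels below are invented for illustration, not real scoring data):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two scorers' epoch labels."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of epochs both scorers labeled the same.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each scorer's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_expected = sum(freq_a[s] * freq_b[s] for s in freq_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Two hypothetical scorers staging the same ten 30-second epochs.
scorer_1 = ["W", "N1", "N2", "N2", "N3", "N3", "N2", "R", "R", "W"]
scorer_2 = ["W", "N2", "N2", "N2", "N3", "N2", "N2", "R", "R", "W"]
print(round(cohens_kappa(scorer_1, scorer_2), 2))  # prints 0.73
```

The “rubber ruler” shows up in exactly this statistic: because neither scorer’s labels can be treated as ground truth, a kappa well below 1.0 between two experts caps how closely any algorithm can be judged to “agree with the standard.”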
So, where does autoscoring work best? Its best attribute may be its ability to save time.
“The positives of [autoscoring technology] are that certain parameters can be reasonably reliably scored, but they still have to be validated and looked at to make sure that the computer has not run amok. But that does save time,” says Hirshkowitz.
“What a technician gets back [from autoscoring technology] is a study that looks exactly like it would if a human scorer had done it; only it takes advantage of the automated algorithms that ensure the highest accuracy and consistency of that scoring. The technician can apply what we call the expert review, and the software indicates to the technician portions of the study where [they should] focus their time,” says Tim Murphy, senior director/general manager of Philips Home Healthcare Solutions’ Diagnostic & Clinical Software Applications. He estimates that by using Philips’ Somnolyzer 24/7 autoscoring software, technicians can save about 75% of the time they would normally take if they were scoring a study manually.
Joel Porquez, BS, RPSGT, from Children’s Healthcare of Atlanta and the Atlanta School of Sleep Medicine and Technology, agrees that autoscoring technology saves time, allowing the technician to better serve the patient. “I’d say what works with autoscoring is the ability to produce an immediate report. Although I have a conservative view of autoscoring, I can see the benefit if the sleep tech notices severe sleep apnea and that morning the sleep lab has the ability to give the patient an autotitration CPAP machine. In this scenario, autoscoring might help expedite the treatment process,” he says.
For Martin Moomaw, director of scoring and data analysis at MD-Sleep, Carmel, Ind, the current technology does a good job with EEG and respiratory. “It usually does very good on moderately severe patients, or patients with moderate alpha intrusion. If it gets into a situation where there is severe alpha intrusion, or a very high AHI [apnea/hypopnea index] patient where the arousals are so frequent that the effects are always wake, that can be a problem. There’s got to be some human intervention to come in and remark on some of that stuff.”
According to Moomaw, MD-Sleep sites that use computer-assisted scoring technology send all their studies through it. And as is true with a human scorer, the cleaner the study, the better and more accurate the report. “It’s just like a human; if the study is clean, it does a very good job. If there are problems in the study that are going to give a human a problem, then it’s going to give the software a problem,” says Moomaw.
Still, the computer-assisted scoring technology can provide a level of consistency to scoring sleep studies. “If you take two good scorers and they score the same study today and tomorrow, they’re going to be off—as much as 10% to 15% usually. So, what we find is that we can provide better consistency in our scoring. We do see some increased efficiency, but our focus is to have better patient care, better consistency,” says Moomaw.
According to Ron Fligge, RRT, senior global product manager for sleep diagnostic software applications at Philips Home Healthcare Solutions, the company’s Somnolyzer 24/7, which has FDA clearance for the purpose of autoscoring, excels at this reliability. “Human scoring is known to have variability. If you have the most experienced scorer do a study in the morning versus the afternoon, you’re going to get somewhat different results. With Somnolyzer, you’re going to get exactly the same results all the time and then you apply the expert review,” he says.
To advance autoscoring technology, the field needs a common file format for sharing scored studies. While the European Data Format (EDF) can be used to read raw data, there is no standard file format that allows users to read shared scoring data.
“If I have a person on the other side of town or the other side of the country and they want me to look at a record, they can send me the raw data in EDF and I can load that with a public domain EDF reader. But I cannot see what they scored on that record. So they can’t ask, ‘Do you think this is scored properly?’, because there’s no way for me to [read it] unless I have an identical system to theirs,” says Hirshkowitz.
Philips’ Fligge agrees that the lack of a standard format is a problem, but points out that Somnolyzer 24/7, which Philips acquired from Siesta Group last year, uses the European Data Format and can work with certain competitors’ systems. Still, he admits there is a need for a more “seamless solution.”
While human physiology entails complex processes that can be difficult to relegate to a computer, there are advancements that could be made to improve the technology and make it a more effective tool for sleep technologists. For users like Porquez, chief among the advancements he’d like to see are ways to better determine abnormal brain activity and breathing patterns.
Manufacturers concede improvements are necessary and are continually updating their software packages. Nihon Kohden, whose Polysmith software is currently in its eighth generation, has spent the last 12 years modifying its algorithms for staging, respiration, microarousals, and leg movements. Now, according to Rizk, the company is working with a task force to improve the reliability of its software in certain circumstances, like split night decisions. In addition, it is working on improvements to the user interface that will allow techs to spend less time scoring and increase accuracy.
“If you want computers to do a better job, rules need to be set up that are conducive to automated analysis for sleep staging. Respiratory analysis is already done well by computers, but event type distinctions are where some improvement can take place,” says Rizk. Other improvements, according to Rizk, include assisted autoscoring, which allows users to interact with the results while at the same time allowing them to spend less time getting through the records.
While Philips’ recent updates to its Somnolyzer 24/7 software have included making it compliant with both R&K and the AASM scoring manual and improving ease of use, the company is now focusing on how to better integrate the Somnolyzer 24/7 study back into the lab’s workflow to enhance the overall workflow and enable the software to work in “real time.”
“As the study is actually being compiled and captured, the autoscoring tool could begin the process. In the current approach, the study in its entirety is captured and then that study is run through the Somnolyzer post-acquisition. [With] the post-acquisition [analysis,] you’re talking about 10 to 20 minutes per study [currently],” says Murphy.
Manufacturers admit that changes to Medicare reimbursement are a key consideration when making improvements to their autoscoring technology today.
“Enhancing productivity and efficiency is going to become increasingly important as labs continue to work into the future and understand potentially the possible reimbursement impacts as the business moves forward,” says Murphy.
For Cadwell, which manufactures the Easy II and Easy III PSG system and currently releases two or three software updates a year, user feedback has been key to improving its software and refining its detection algorithms. “[With Medicare], there are requirements where the users need to know if the patient has spent a percentage or a certain amount of time below certain Spo2 levels. So, [users] are asking that we as accurately as possible summarize that information to help them qualify patients who need treatment,” says Bill Antilla, RPSGT, senior product manager at Cadwell Laboratories, Kennewick, Wash.
THE HUMAN ELEMENT
Even with advancements in autoscoring technology and better accuracy, a good sleep report will always require the human element to ensure quality.
Hirshkowitz contends that even if autoscoring technology proved to be 95% accurate, technicians would still have a role to play. “You still need to have some quality control and you have to have some way of manually intervening. And that is true of any computer application,” he says.
And Nihon Kohden’s Rizk agrees. “I think you have to have sleep techs review the data all the time and ensure quality of recordings. At some level, there has to be some review. Experienced techs are vital,” he says.
The main benefit of autoscoring technology may be its ability to allow sleep technicians to better use their expertise for those parts of the sleep study that require it. “We believe that Somnolyzer enables [technicians] to focus their time on those portions of the study that require that expert talent, then enabling the software in essence to do the other parts of the study that are able to be done very easily without the human element,” says Murphy.
For Cadwell’s Antilla, the autoscoring technology allows technicians to dig deeper and do more. “We’re not trying to create a black box that eliminates them. We’re simply trying to find tools that allow them to work smarter and do more with patients to more accurately diagnose them.”
Alison Werner is associate editor of Sleep Review. She can be reached at [email protected].