A sleep physician reviews limitations of the apnea-hypopnea index and considers inclusion of additional variables to develop measures of OSA severity that may be more useful.

By Edward D. Michaelson, MD, FACP, FCCP, FAASM

The apnea-hypopnea index (AHI) is the number of apneas and hypopneas that occur per hour of sleep. The term AHI has become ingrained in the language of sleep medicine stakeholders including healthcare providers, equipment manufacturers, third-party payors, commercial transportation, regulators, patients, and others. As a tool to indicate the severity of sleep apnea, AHI has important implications and consequences for the diagnosis, treatment type and effectiveness, payment for products and services, disability ratings, and employability for patients with or suspected of having sleep disorders.

While most sleep medical professionals and some stakeholders are aware that AHI has limitations as an indicator of sleep apnea severity, the index is regularly used with impunity. Few studies offer specific solutions addressing these limitations, although there has been increasing interest in this issue recently.

In this article, the “true” index calculated using total sleep time (TST) determined by EEG will be designated as AHI, and the index determined using indirect methods of estimating sleep time will be referred to as “AHI.”

The question of what AHI is actually measuring has been addressed with an in-depth mathematical analysis by Eyal Shahar, MD, MPH, whose paper in Nature and Science of Sleep1 also suggests that AHI has taken on a life of its own. Hopefully my article will clarify the potential and actual effects of the multiple variables that affect AHI and “AHI,” provide food for thought, and help to design studies that will develop more clinically and administratively useful methods to evaluate the severity of sleep apnea.

Lumpers and Splitters

Much of the sleep medicine industry falls into the category of Lumpers, that is, those who use a broad view of this single index, a mixed basket of multiple variables, to make important decisions regarding the care of patients with sleep-disordered breathing. Is AHI better than nothing? Maybe; it depends on the situation. The devil is in the details.

A 40-second respiratory event has different implications than a 10-second one. O2 desaturations have different magnitudes, durations, and frequencies. And both respiratory events and desaturations may be associated with different sleep stages and/or body positions.

For the Splitters among us (including yours truly), which variables are more important, or the most important, for determining sleep apnea severity? Are they the same for diagnosis and for therapy (including its effectiveness and follow-up), particularly when the additional variables of different patient types (for example, BMI, phenotype, craniofacial structure), coexisting medical conditions, and medications are added to the mix? Clearly, additional studies are needed.

Other uncertainties could include hand scoring versus automatic scoring algorithms, the percent of REM and/or supine sleep, the method used to score hypopneas, the number of central apneas included in the AHI, and the inclusion of EEG arousals, whether non-specific or related to other parameters such as limb movements and respiratory effort-related arousals (RERAs). Also, most studies cover a single night, so the reported AHI/“AHI” and/or oxygen saturations may be the best that we see, whereas during subsequent nights these parameters may vary individually or in any combination with body position, sleep environment, sleep stage, alcohol, medication use, and other variables. Given these uncertainties, the severity of sleep apnea could be significantly worse than the AHI indicates, which is a reasonable argument for initial APAP and follow-up overnight oximetry.

But we have to start somewhere. The good news is that many of us are aware of at least some of these issues and would-be Splitters are already considering all of the data on a quasi-quantitative basis and making our best clinical judgment.

Home Sleep Testing and ‘AHI’

The myriad issues and limitations affecting the usefulness of AHI as an indicator of sleep apnea severity also apply to “AHI,” which adds still more limitations. The limited-channel polysomnogram (referred to as HST, HSAT, or type II PSG; HST is used in this article for convenience) is a moving target as the measurement of additional parameters continues to evolve with technology.2

Generally, an HST is likely to underestimate AHI. Most testing devices do not record EEG (and hence actual sleep time), and most patients do not sleep 100% of the device recording time. For example, if actual sleep occurs for 50% of the night and the reported “AHI” based on recording time is 10, the actual AHI would be 20, because the same number of events is divided by half as many hours.
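The correction in this example can be sketched in a few lines of Python. The function name and parameters are hypothetical and are not taken from any device or scoring software, and the sketch assumes that every scored event occurred during actual sleep.

```python
def ahi_from_recording_index(reported_index: float,
                             recording_hours: float,
                             sleep_hours: float) -> float:
    """Rescale an index reported per hour of recording to per hour of sleep.

    Assumption (not from the article): all scored events happened during
    actual sleep, so only the denominator changes.
    """
    if sleep_hours <= 0 or recording_hours < sleep_hours:
        raise ValueError("need 0 < sleep_hours <= recording_hours")
    total_events = reported_index * recording_hours
    return total_events / sleep_hours


# The article's example: an "AHI" of 10 over 8 hours of recording,
# with sleep during only 50% of that time (4 hours).
print(ahi_from_recording_index(10.0, recording_hours=8.0, sleep_hours=4.0))  # 20.0
```

The further actual sleep time falls below recording time, the larger the gap between the reported “AHI” and the AHI becomes.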

Not infrequently, an HST may report a low “AHI” for the night together with a lower than expected oxygen saturation, for example, an “AHI” ≤ 5 or slightly > 5 with a minimum SpO2 of, say, ≤ 85%. In such cases, closer inspection may show more frequent respiratory events occurring during shorter time frames, such as during REM and/or supine sleep, so that the whole-night “AHI” underestimates OSA severity or, worse, results in a false negative. Coexisting conditions such as pulmonary and/or cardiac disease may also contribute to O2 desaturation.

What’s more, some HST devices use different indirect methods of estimating sleep time (such as actigraphy or peripheral arterial tonometry), adding more variability and uncertainty. There is disagreement among sleep medicine professionals as to the reliability of indirect measurements.

Unfortunately, the terminology used in HST reports is inconsistent. Definitions and calculations of “AHI” are not standardized and vary among HST device manufacturers, payors, and PAP machine download data, making comparisons difficult.

The term respiratory disturbance index (RDI), correctly defined as apneas + hypopneas + RERAs per hour of actual sleep time, is often used incorrectly. With HSTs the abbreviations “AHI” and RDI are sometimes used interchangeably. Some payors define RDI based on actual sleep time, and others use recording time or sleep time estimated by other (non-EEG) methods. Some facilities have used the term RDI to describe a calculation that includes other sleep disturbances, such as snoring or flow limitation based on waveform shape. Also, in the case of PSG, snoring with EEG arousals is sometimes counted as a RERA.

Various payors, professional societies, health care providers, and manufacturers’ PAP data downloads may use different definitions of hypopneas and different types of sensors and/or combinations of sensors.

“AHI” (as currently reported with many HST devices) in most instances is the respiratory event index (REI) using total recording time (TRT) or other indirect methods of estimating sleep time as the denominator. Clearly, there remains a need for standardized terminology. Some clarification is provided in the sidebar.

Clarification of terminology:

  • AHI = apneas + hypopneas per hour of actual sleep time (via EEG).
  • RERA = respiratory effort-related arousal (via EEG).
  • RDI = apneas + hypopneas + RERAs per hour of actual sleep time.
  • REI (respiratory event index) = apneas + hypopneas per hour of actual sleep time (via EEG) or recording time.3

This definition of REI may introduce confusion, as there could be three possible denominators (EEG-determined sleep time, recording time, or a non-EEG estimate of sleep time) if non-EEG estimates of sleep time are used.
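As a rough illustration of how the choice of denominator changes the reported number, the following Python sketch computes the three indices from the same event counts. The class and function names are hypothetical, and the counts are invented for illustration, not taken from any study.

```python
from dataclasses import dataclass


@dataclass
class StudyCounts:
    """Hypothetical container for scored events and study durations."""
    apneas: int
    hypopneas: int
    reras: int
    sleep_hours: float      # total sleep time determined by EEG
    recording_hours: float  # total recording time


def ahi(c: StudyCounts) -> float:
    # Apneas + hypopneas per hour of EEG-determined sleep.
    return (c.apneas + c.hypopneas) / c.sleep_hours


def rdi(c: StudyCounts) -> float:
    # Apneas + hypopneas + RERAs per hour of EEG-determined sleep.
    return (c.apneas + c.hypopneas + c.reras) / c.sleep_hours


def rei(c: StudyCounts, denominator_hours: float) -> float:
    # Apneas + hypopneas per hour of whichever denominator the report uses:
    # recording time, EEG sleep time, or a non-EEG sleep-time estimate.
    return (c.apneas + c.hypopneas) / denominator_hours


# Illustrative (made-up) night: 60 apneas and hypopneas, 12 RERAs,
# 6 hours of sleep within 8 hours of recording.
night = StudyCounts(apneas=30, hypopneas=30, reras=12,
                    sleep_hours=6.0, recording_hours=8.0)
print(ahi(night))                         # 10.0
print(rdi(night))                         # 12.0
print(rei(night, night.recording_hours))  # 7.5
```

The same night yields 7.5, 10, or 12 depending on which definition a report applies, which is why the label alone is not enough to compare results.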

Why We Need Better Ways to Assess Severity of Sleep Apnea

Oxygen desaturations may be as important as, or more important than, AHI. We have all seen cases where portions of a recording show more frequent respiratory events associated with more significant oxygen desaturations. For example, a study may show an AHI/“AHI” ≤ 5 or slightly > 5, but a particular portion of the recording (say 30 minutes during supine and/or REM sleep) has 20 respiratory events, equivalent to an AHI/“AHI” of 40 for that segment, which might be associated with an SpO2 of 80% or less. Such a patient may need treatment more than a patient with an AHI/“AHI” of 15 with only mild O2 desaturations.

Parameters that might be used to develop more clinically useful measures of sleep apnea severity include:

  • Respiratory events: duration and type (obstructive, mixed, central, and unclassified apneas), Cheyne-Stokes respiration, hypopneas, and RERAs and their contribution (particularly centrals) to the AHI/“AHI,” as well as their frequency, timing, and relationship to sleep stage and/or body position.
  • Oxygen desaturations: minimum, maximum, and average SpO2, duration, frequency (index), and duration at different SpO2 levels, timing in relationship to sleep stage and/or body position, possibly an estimate of a total O2 desaturation burden, and the association of these parameters with cardiovascular events.
  • Intended use: diagnostic (for different patient demographics, phenotypes, and comorbidities), treatment type (for example, PAP, ASV, oral appliances, or neurostimulation) and effectiveness, adherence, contribution of alcohol and/or medication, and correlation with sleep log/diaries.
  • Other variables to consider include visual versus autoscoring, first-night effect, results of previous studies, snoring, heart rate, arousals, comorbidities, and known or suspected coexisting sleep disorders.
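Several of the oxygen desaturation parameters listed above can be derived from a plain SpO2 trace. The Python sketch below is one simple possibility, not a validated or standardized measure: the 90% threshold, the one-second sampling interval, and the "area below threshold" figure used as a stand-in for a total desaturation burden are all assumptions made for illustration.

```python
def desaturation_summary(spo2_percent, sample_seconds=1.0, threshold=90.0):
    """Summarize an SpO2 trace (a list of percent values at a fixed interval).

    Returns the minimum and mean SpO2, the minutes spent below `threshold`,
    and the area below `threshold` in percent-minutes (a crude burden
    estimate assumed here for illustration only).
    """
    if not spo2_percent:
        raise ValueError("SpO2 trace is empty")
    below = [s for s in spo2_percent if s < threshold]
    return {
        "min_spo2": min(spo2_percent),
        "mean_spo2": sum(spo2_percent) / len(spo2_percent),
        "minutes_below": len(below) * sample_seconds / 60.0,
        "area_below": sum(threshold - s for s in below) * sample_seconds / 60.0,
    }


# Made-up 10-minute trace: 9 minutes at 95% and 1 minute at 85%.
trace = [95] * 540 + [85] * 60
summary = desaturation_summary(trace)
print(summary["min_spo2"], summary["minutes_below"], summary["area_below"])
```

Repeating the calculation at several thresholds, or separately for each sleep stage and body position, would give the "duration at different SpO2 levels" breakdown mentioned in the list.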

I suggest that, in addition to AHI, which most scored reports give for total sleep time (TST), for REM and non-REM sleep, and for different body positions, a “maximum density index” be considered. This index could be calculated for all periods of the recording during which variables such as respiratory events, O2 desaturations, snoring, heart rate, arousals, and leg movements are more frequent than the average for the study.
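The article does not specify how such an index would be computed. One possible reading, sketched below in Python, is the highest event rate found in any window of fixed length, expressed per hour. The function name, the 30-minute window, and the use of event onset times in minutes are all assumptions made for illustration.

```python
def max_density_index(event_minutes, window_minutes=30.0):
    """Peak events-per-hour over any window of `window_minutes`.

    `event_minutes` holds event onset times in minutes from the start of
    the study. Windows are anchored at event onsets, which is sufficient
    to find the maximum count for a fixed window length.
    """
    times = sorted(event_minutes)
    best = 0
    start = 0
    for end, t in enumerate(times):
        # Slide the window start forward until it spans < window_minutes.
        while t - times[start] >= window_minutes:
            start += 1
        best = max(best, end - start + 1)
    return best * 60.0 / window_minutes


# Made-up night: 5 scattered events plus a cluster of 20 events within
# 30 minutes (for example, during supine REM sleep), over 7 hours of sleep.
scattered = [30.0, 90.0, 150.0, 300.0, 360.0]
cluster = [200.0 + 1.5 * i for i in range(20)]
events = scattered + cluster

print(len(events) / 7.0)          # whole-night index, about 3.6
print(max_density_index(events))  # 40.0
```

Here the whole-night figure falls below 5 while the densest half hour corresponds to 40 events per hour, mirroring the kind of case described earlier in this section. The same windowing could be applied to desaturations, arousals, or leg movements.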

Future Directions

Substantial additional research is needed. It will take a long time to design and complete these studies, not to mention the difficulties in finding enough patients of each type to address the many variables.

Any single (even if improved) index for the assessment of sleep apnea severity likely would still be insufficient to address the many clinical variables or every specific clinical use for the parameter.

Full overnight attended polysomnography already contains a large amount of data. The number of patients and the time it would take to analyze the relationship of each parameter for different patient types and various clinical situations is daunting. Reviewing a massive number of previously published studies using methodology that has become known as Big Data may identify trends not otherwise easily observed and speed the process by showing ways to design more focused and applicable studies.

As technology advances and the measurement of additional parameters (particularly EEG) and artificial intelligence-based automatic scoring can be economically incorporated into HST (and ultimately into wearables and other consumer-based apps), I predict that many of the differences between facility-based attended diagnostic PSG and HST will disappear. There will no longer be a need for the terms HST or limited-channel PSG, although unattended or remotely attended will probably remain in the vocabulary. Additionally, these technological developments will extend to evaluation of other therapies for sleep apnea and the diagnosis of other sleep disorders. There will of course be cost, reimbursement, and regulatory issues and it will take some time for other stakeholders in the industry to catch up.


1. Shahar E. Apnea-hypopnea index: time to wake up. Nat Sci Sleep. 2014 Apr 5;6:51-6.

2. Roy S. Type II PSG enters patients’ homes. Sleep Review. 2022 June/July:14-8.

3. Collop NA, Tracy SL, Kapur V, et al. Obstructive sleep apnea devices for out-of-center (OOC) testing: technology evaluation. J Clin Sleep Med. 2011 Oct 15;7(5):531-48.
