Ultrasonography indicators for predicting difficult intubation: a systematic review and meta-analysis

Background Ultrasonography (US) has recently come into frequent use as a tool for airway assessment prior to intubation (endotracheal tube (ETT) placement), and several indicators have been proposed in studies that report different performances. This systematic review and meta-analysis reviewed the performance of US in difficult airway assessment. Methods This systematic review and meta-analysis was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline and the Cochrane Handbook. All studies that had carried out difficult airway assessments using US, had compared the indicators in difficult and easy groups, and had been published in English by the time of our search on April 28, 2020, were included. Results In the initial search, 17,156 articles were retrieved. After deleting the duplicate articles retrieved from multiple databases, 7578 articles remained for screening based on the abstracts and titles. Finally, the full text of 371 articles was assessed and the data from 26 articles were extracted, which had examined a total of 45 US indicators for predicting difficult intubation. The most common US index was the “thickness of anterior neck soft tissue at the vocal cords level”. Also, “skin to epiglottis” and “anterior neck soft tissue at the hyoid bone level” were among the most common indicators examined in this area. Conclusion This systematic review showed that US can be used for predicting difficult airway. Of note, “skin thickness at the epiglottis and hyoid levels”, “the hyomental distance”, and “the hyomental distance ratio” were correlated with difficult laryngoscopy in the meta-analysis. Many other indicators, including some ratios, have also been proposed for accurately predicting difficult intubation, although there have been no external validation studies on them.

Background
Preparation is a key step in rapid sequence intubation (RSI) in emergency departments (EDs), and the assessment of difficult airways is an integral part of this step. Although physical examination and clinical criteria are used frequently for this purpose, there is still a 1.5% chance of difficult intubation (endotracheal tube (ETT) placement), with an increased rate in some populations, such as obese patients [1,2]. This rare but ominous event is partially due to test flaws and variable inter-observer agreement [3,4]. Furthermore, some rules might be difficult to apply in settings with uncooperative patients, such as EDs and critical care units [5]. Bedside ultrasonography (US) has recently been used in this regard. This safe, portable, and widely available tool has been proposed for the assessment of airways [6]. Several indicators have been suggested in studies with different reported performances. This systematic review and meta-analysis reviewed the performance of US in difficult airway assessment.

Methods
This study was conducted to systematically review studies that had assessed and compared US indicators between patients in difficult and easy intubation groups. The methods adopted for this systematic review and meta-analysis were consistent with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline and the Cochrane Handbook.

Search strategy
A comprehensive search was performed in international bibliometric databases including PubMed, ISI's Web of Science, Scopus, and Embase. The search terms were categorized and combined in two groups: ultrasonography and airway evaluation. In the ultrasonography group, we used all possible keywords, such as sonograph, ultrasonic, Cormack-Lehane, hyposmia, and hypoxia. In the airway evaluation group, we used all possible keywords, such as airway evaluation, airway management, airway investigation, difficult laryngoscopy, difficult airway, difficult intubation, endotracheal tube, endotracheal intubation, tracheal intubation, and orotracheal intubation. The keywords were combined with the Boolean operator "OR" within each group and with "AND" between the groups. No limitations were imposed on publication time, and any article published by the time of the search on April 28, 2020, was included. The search strategy used in PubMed is presented in Appendix I. We completed our search by reviewing the references of the retrieved studies and contacting experts in this field in order to access further studies.
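The OR-within-group, AND-between-groups combination described above can be sketched in a few lines of Python. This is only an illustration: the keyword lists are a small subset of the terms named in the text, the quoting of multi-word phrases is an assumption, and the function name `build_query` is hypothetical. The actual PubMed strategy is the one given in Appendix I.

```python
def build_query(groups):
    """Join keywords with OR inside each group and AND between groups."""
    clauses = []
    for keywords in groups:
        # Assumption: multi-word phrases are quoted so they are searched as a unit.
        terms = [f'"{k}"' if " " in k else k for k in keywords]
        clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)


# Illustrative subsets of the two keyword groups named in the text.
ultrasonography = ["sonograph", "ultrasonic"]
airway = ["airway evaluation", "difficult intubation", "tracheal intubation"]

query = build_query([ultrasonography, airway])
print(query)
```

Keeping each concept as its own list makes it easy to add synonyms to one group without touching the other.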

Selection of studies and data extraction
The inclusion criteria for the studies were: 1) Having performed difficult airway assessment based on ultrasonography indicators, 2) Having compared the indicators in difficult and easy groups, and 3) Being published in English. The studies were excluded if they had used data from another included study or if their full text could not be accessed.
The identified documents were screened in two stages: 1) Screening the titles and abstracts to exclude the irrelevant studies and 2) Assessing the full texts for eligibility and inclusion criteria. Both stages were carried out independently by two reviewers, and discrepancies between the reviewers were resolved by a third reviewer. The full text was then reviewed to confirm that the eligibility criteria were met and to extract the required information. Two researchers independently extracted the data of the included studies using an Excel-based data extraction sheet. The extracted data of each study were checked by two reviewers and discussed in the case of disagreements. The data extraction sheet included basic information (first author's name, year of publication and country, design, participants, sampling method, and the demographic characteristics of the participants, such as age and BMI), difficult intubation, and sonographic indicators. All data on the US indicators, presented for the easy and difficult groups, were extracted by two of the researchers. The data included the range, mean, and standard deviation (SD) of sonographic indicators in the two groups. Also, measures of the accuracy of US indicators for predicting or discriminating difficult intubation, such as the area under the ROC curve, best cut-off point, odds ratio (OR), predictive values, and likelihood ratios, were extracted if they were reported. The quality of the studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies, version 2 (QUADAS-2).
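For readers less familiar with the accuracy measures extracted here, the following minimal sketch shows how sensitivity, specificity, predictive values, and likelihood ratios follow from a 2x2 table of index test result against laryngoscopy outcome. The function name and the counts are hypothetical and do not come from any included study.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table.

    tp: difficult intubations flagged by the US index
    fp: easy intubations flagged by the US index
    fn: difficult intubations missed by the US index
    tn: easy intubations correctly not flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        # Likelihood ratios are undefined at the extremes; report infinity there.
        "lr_pos": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        "lr_neg": (1 - sensitivity) / specificity if specificity > 0 else float("inf"),
    }


# Hypothetical counts for illustration only.
result = diagnostic_accuracy(tp=18, fp=12, fn=2, tn=68)
print(result)
```

Predictive values depend on the prevalence of difficult intubation in the sample, which is one reason the purposefully enriched samples in some studies are hard to generalize.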

Statistical analysis
All the eligible studies were included in the synthesis after their systematic review. We re-analyzed the raw data presented in one study (Wojtczak, J.A.) to obtain the mean of the US indicators in the easy and difficult groups. The mean differences of the US indicators between the easy and difficult intubation groups were combined. The meta-analysis was conducted based on the random-effects model. The forest plot and pooled mean difference were presented for all the US indicators assessed in at least two studies. The heterogeneity of the primary studies was evaluated using the I-squared and tau-squared statistics and Cochran's Q test. The meta-analysis was performed in STATA statistical software, version 16 (StataCorp, College Station, TX).
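The pooling step can be illustrated with a minimal Python sketch of a random-effects meta-analysis of mean differences. The text does not state which between-study variance estimator was used in STATA, so the DerSimonian-Laird estimator below is an assumption chosen for simplicity. The function name and the input values are hypothetical.

```python
import math


def pool_mean_differences(studies, z=1.96):
    """Random-effects pooling of mean differences (difficult minus easy).

    studies: list of (mean_d, sd_d, n_d, mean_e, sd_e, n_e) tuples.
    Assumption: DerSimonian-Laird estimate of the between-study variance.
    """
    effects, variances = [], []
    for mean_d, sd_d, n_d, mean_e, sd_e, n_e in studies:
        effects.append(mean_d - mean_e)
        variances.append(sd_d ** 2 / n_d + sd_e ** 2 / n_e)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance (tau squared) and I squared, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau squared to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled_md": pooled,
        "ci": (pooled - z * se, pooled + z * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical study summaries (cm), for illustration only.
example = [
    (2.5, 0.5, 20, 2.0, 0.5, 80),
    (3.0, 0.3, 50, 2.0, 0.3, 50),
]
print(pool_mean_differences(example))
```

With only two or three studies per indicator, as is the case for several indicators in this review, the tau-squared estimate is imprecise, so the pooled interval should be read with caution.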

Screening and article selection
The initial search yielded 17,156 articles. After the duplicates were deleted, 7578 articles entered screening based on their abstracts and titles. Finally, the full text of 371 articles was assessed and the data of 26 articles were extracted (Fig. 1). These articles had assessed at least one US index for patients with difficult intubation, laryngoscopy, or for different Cormack-Lehane grade groups.

Characteristics of the included studies
In the 26 studies we reviewed, the prevalence of difficult laryngoscopy and/or intubation was between 6.3 and 50%. In the high-prevalence studies, the patients had been purposefully selected to compare US indicators, and their high prevalence was therefore not generalizable (Table 1). We used QUADAS-2 for assessing the quality of the studies (Table 2). Patient selection bias was the most common bias encountered in the reviewed studies. Four studies were judged to have a high risk for patient selection bias due to their convenience sampling method [9], unclear recruitment method and small sample size [29], unknown recruitment method, too many exclusion criteria, and undisclosed excluded cases [31], or unknown recruitment method and including only parturient women [10]. Sixteen studies were deemed to have an unclear risk of patient selection bias due to their unclear recruitment strategy [8, 11-18, 20-22, 24, 27, 28, 30]. All of the studies were judged to have a low risk of index test bias. Seven studies were deemed to have an unclear risk of reference test bias because they did not disclose data about intubator blinding [5,14,17,20,27,29,31]. One study was judged to have a high risk of flow and timing bias due to performing airway ultrasonography 5-10 days after the intubation [23]. All of the studies were deemed to have low applicability concerns, except for one study that was judged to have high applicability concerns due to performing laryngoscopy in sedated non-paralyzed patients [10].

Predictive ultrasonography indicators for difficult intubation
A total of 45 US indicators for predicting difficult intubation were examined in the 26 reviewed studies. The most commonly assessed ultrasound criterion was "thickness of the anterior neck soft tissue at the vocal cords", which was examined in nine studies between 2003 and 2020. Also, "skin to epiglottis" and "anterior neck soft tissue at the hyoid bone" were among the most common US indicators, each examined in eight studies. The following are the results of each of the indicators.

Distance from the skin to the epiglottis
This criterion was examined in eight studies (Table 3). The mean of this index was assessed in six studies. In five of these studies, the mean distance from the skin to the epiglottis was significantly higher in the difficult group (p < 0.05), and in the other study, it was higher in the easy group, although the difference was not significant. The pooled mean difference of distance from the skin to the epiglottis based on the meta-analysis results was 6.15 mm higher in the difficult group than the easy group, and the difference was statistically significant (p < 0.001) (Fig. 2). The reported AUC ranged from 0.79 to 0.91 in four studies. In three of these studies, the optimal cut-off point calculated was 1.62 (sensitivity = 89.7 and specificity = 64.8), 2.54 (sensitivity = 82.0 and specificity = 91.0) and 3.0 (sensitivity = 56.3 and specificity = 88.2). In the other one, accuracy indicators were reported, but the cut-off point was not.
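The studies report optimal cut-off points without the review stating how each was derived. One common approach, assumed here purely for illustration, is to choose the cut-off that maximizes Youden's J (sensitivity + specificity - 1). The sketch below uses hypothetical measurements and a hypothetical function name, and it treats a measurement at or above the cut-off as predicting difficult intubation.

```python
def best_cutoff_youden(difficult, easy):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    Assumption: larger measurements indicate difficulty, so a value at or
    above the cut-off counts as a positive test.
    """
    best = None
    for cutoff in sorted(set(difficult) | set(easy)):
        sensitivity = sum(x >= cutoff for x in difficult) / len(difficult)
        specificity = sum(x < cutoff for x in easy) / len(easy)
        j = sensitivity + specificity - 1
        # Keep the first cut-off that reaches the highest J.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical skin-to-epiglottis distances, for illustration only.
difficult_group = [2.6, 2.8, 3.0, 2.2]
easy_group = [1.8, 2.0, 2.3, 1.9, 2.1]
print(best_cutoff_youden(difficult_group, easy_group))
```

Because the chosen cut-off depends on the sample, different studies can arrive at quite different values for the same index, as seen with the three cut-offs reported above.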

Thickness of the anterior neck soft tissue at the vocal cords level
This criterion was examined in nine studies (Table 3). The mean of this index was assessed in eight of these nine studies, while the other one did not present the raw mean (±SD) data in the two groups and only reported that the difference between the difficult and easy groups was not significant. As for the remaining eight studies, the mean thickness of the anterior neck soft tissue at the vocal cords was significantly higher in the difficult than the easy group in three studies. In three other studies, the mean of the index was higher in the easy group, and in one study, it was not statistically significant. The pooled mean difference of the anterior neck soft tissue at the vocal cords based on the meta-analysis results was 0.27 cm higher in the difficult than the easy group, but this difference was not statistically significant (p = 0.150) (Fig. 3). The AUC was reported as 0.47, 0.54 and 0.85 in three studies. In one study with an unknown cut-off point, the sensitivity and specificity were reported as 53 and 66%, respectively.

Anterior neck soft tissue at the hyoid bone level
This index was assessed in eight studies (Table 3). Seven studies assessed the mean difference and five of them showed that the mean of the anterior neck soft tissue at the hyoid bone was significantly higher in the difficult intubation group compared to the easy group. The pooled mean difference of this index based on the meta-analysis was 0.20 cm higher in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 4). The AUC of the anterior neck soft tissue at the hyoid bone was reported as 0.559 to 0.92 in five studies. Two studies reported the sensitivity and specificity of this index but their cutoff point was unknown. In two studies, the optimal cut-off point was calculated as 0.66 (sensitivity = 68.0 and specificity = 69.0) and 0.99 (sensitivity = 48.0 and specificity = 82.0).

Hyomental distance (HMD) with the neck extended
This index was assessed in three studies (Table 3). All three studies assessed the mean difference and none of them showed a significant mean difference between the difficult and easy intubation groups. The pooled mean difference of this index based on the meta-analysis was 0.70 cm higher in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 5).

Hyomental distance ratio (HMDR)
This index was assessed in five studies (Table 3), and in all of them, the mean HMDR was significantly lower in the difficult group compared to the easy group. The pooled mean difference of HMDR based on the meta-analysis was 0.07 lower in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 6). The AUC of this index was reported as 0.71,

Ratio of the pre-epiglottic space (pre-E) and epiglottis vocal cord (E-VC) distances
This index was assessed in four studies (Table 3) and all of them showed a significantly higher mean Pre-E/E-VC in the difficult than the easy group. The pooled mean difference of the ratio of Pre-E and E-VC distances based on the meta-analysis was 0.73 higher in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 7). The AUC of this index was reported as 0.868 and 0.871 in two studies. In two studies, the optimal cut-off point was 1.77, with the same reported sensitivity (82.0) and specificity (80.0).

Anterior neck soft tissue at the thyroid isthmus
This index was examined in three studies (Table 3). One of these studies, however, did not present the raw mean (±SD) data in the two groups and only reported that the groups were not significantly different in this regard. Also, the two remaining studies did not show a significant mean difference between the two groups in anterior neck soft tissue at the thyroid isthmus. The pooled mean difference of this index based on the meta-analysis was not significant (p = 0.880) (Fig. 8).

Anterior neck soft tissue at the suprasternal notch
This index was assessed in three studies (Table 3). One of these studies did not present the raw mean (±SD) data in the two groups and only reported that the groups were not significantly different in this regard. As for the two remaining studies, the mean of this index was significantly higher in the difficult intubation group than the easy group in one study [33.0 (4.3) vs. 27.4 (6.6) mm; p = 0.013], while the other study did not show any significant differences between the groups in this regard (p = 0.931). The pooled mean difference of this index based on the meta-analysis was 0.24 cm higher in the difficult than the easy group, although this difference was not significant (p = 0.440) (Fig. 9).

Tongue volume
This index was assessed in three studies (Table 3). The mean difference in tongue volume between the two groups was reported with inconsistent results across studies. The pooled mean difference of this index based on the meta-analysis of two studies was 6.29 cm³ lower in the difficult than the easy group, although this difference was not significant (p = 0.760) (Fig. 10). The AUC of tongue volume was reported as 0.626 in one study, and in the other study, with a cut-off point of 100 cm³, the sensitivity and specificity were reported as 66.7 and 62.7%, respectively.

Floor of the mouth muscle volumes
This index was assessed in three studies (Table 3). The mean difference in the floor of the mouth muscle volumes between the two groups was not significant. The pooled mean difference of this index based on the meta-analysis of two studies was also not significant (p = 0.460) (Fig. 11). The AUC of the floor of the mouth muscle volumes was reported as 0.559 in one study. Two studies reported a sensitivity and specificity for this index but their cut-off point was unknown.

Hyomental distance in the head positions (HMDE)
This index was assessed in two studies (Table 3), and in both of them, the mean HMDE was significantly lower in the difficult intubation group than the easy group. The pooled mean difference of this index based on the meta-analysis was 0.87 cm lower in the difficult than the easy group and the difference was significant (p = 0.020) (Fig. 12). The AUC of HMDE was reported as 0.758. The sensitivity and specificity of this index for an unknown cut-off point were 38.1 and 97.7% (Table 2).

Hyomental distance in the neutral positions (HMDN)
This index was assessed in two studies (Table 3), and in both of them, the mean HMDN was significantly lower in the difficult intubation group than the easy group. The pooled mean difference of this index based on the meta-analysis was 0.36 cm lower in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 13). The AUC of HMDN was reported as 0.66. The sensitivity and specificity of this index for an unknown cut-off point were 28.6 and 94.4% (Table 2).

Length of the thyrohyoid membrane
This index was assessed in two studies (Table 3), and in both of them, the mean length of the thyrohyoid membrane was lower in the difficult intubation group than the easy group. The pooled mean difference of this index based on the meta-analysis was 0.24 cm lower in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 14).

Tongue thickness
This index was assessed in three studies (Table 3), and in all three, the mean (two studies) or median (one study) of tongue thickness was higher in the difficult intubation group than the easy group. The pooled mean difference of this index based on the meta-analysis was 0.59 cm higher in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 15). The AUC of this index was reported as 0.72, 0.78 and 0.93. In two studies, the optimal cut-off point was calculated as 5.87 (sensitivity = 85.0 and specificity = 91.0) and 6.1 (sensitivity = 75.0 and specificity = 72.0).

Condylar translation
This index was assessed in two studies (Table 3) and both of them showed that the mean condylar translation was significantly lower in the difficult intubation group than the easy group. The pooled mean difference of this index based on the meta-analysis was 3.41 cm lower in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 16). The AUC of this index was reported as 0.77 in one study, and its sensitivity and specificity with an 11.05-mm cut-off point were 0.70 and 0.81. In another study, the sensitivity and specificity were 0.81 and 0.91, respectively.

Anterior neck soft tissue at the thyrohyoid membrane
This index was assessed in three studies (Table 3). One of them revealed the mean anterior neck soft tissue at the thyrohyoid membrane to be significantly higher in the difficult intubation group than the easy group and one study showed the opposite; the other study showed non-significant differences between the two groups. The pooled mean difference of this index based on the meta-analysis was 0.20 cm lower in the difficult than the easy group and this difference was significant (p < 0.001) (Fig. 17). The AUC of this index was reported as 0.73 in one study. Its sensitivity and specificity with a 2.03 cut-off point were 65.0 and 69.0% (Table 2).

HMD distance between the hyoid bone and the posterior border of the symphysis menti
This index was assessed in one study (Table 3) in the form of three US indicators: Neutral, ramped, and the ratio between HMD in the ramped position to HMD in the neutral position (HMDR). Of the three main indicators, the ramped criterion showed a significant difference between the difficult and easy intubation groups, such that the mean value of this criterion was significantly lower in the difficult intubation group than the group with easy intubation [4.53 (0.1) vs. 5.17 (0.28); p = 0.03]. Also, the ratio between the HMD in the ramped position to that in the neutral position was significantly lower in the difficult intubation group (p = 0.02) ( Table 2).

Other indicators
A total of 28 US indicators were examined in 11 studies (Table 3). However, only one study was found for each of the following 28 indicators.  Out of a total of 28 US indicators, the mean difference for 24 was assessed. The mean of seven indicators (numbers 1 to 7 of the "other indicators" list) was significantly higher in the difficult intubation group than the easy group (p < 0.05). The mean of two indicators (8: Distance from the skin to the hyoid bone (SHB) in sniffing, and 9: Distance from the skin  to the thyrohyoid membrane (STM) in sniffing) was significantly lower in the difficult intubation group than the easy group, and the mean difference was not significant between the difficult and easy intubation groups for 15 indicators (numbers 10 to 24 of the "other indicators" list).
The AUC was reported for 12 indicators. The AUC was between 0.589 and 0.622 and non-significant for four indicators (2, 10, 11, and 12 of the "other indicators" list) and was between 0.648 and 0.930 and statistically significant (p < 0.05) for eight indicators (1, 4, 5, 7, 8, 9, 25, and 26 of the list). Sensitivity and specificity were reported for 12 indicators (1, 2, 4, 5, 7 to 12, 25, and 26 of the list), although the cut-off point was unknown for four indicators (2, and 10 to 12 of the list).
The ability to view the hyoid bone on sublingual US merits separate mention, as this index was reported as the frequency with which the bone was seen in each group. The distribution differed significantly between the groups: the hyoid bone was seen less often in the difficult group than in the easy intubation group (27.3% vs. 96.6%, p < 0.001). Also, the sensitivity and specificity of this index were 70 and 97%, respectively (Table 2).
No raw data were presented for either group for the base of the tongue index, and only a non-significant difference between the difficult and easy groups was reported for this index (Table 2).

Discussion
This systematic review showed that US can be used for predicting difficult airway. The skin thickness at the epiglottis and hyoid levels, HMD, and HMDR were found to be correlated with difficult laryngoscopy in the meta-analysis. Many other indicators, including many ratios, have also been proposed to accurately predict difficult intubation, although there are no external validation studies on them.
A clear view during direct laryngoscopy depends on many factors, such as mouth opening, oropharyngeal anatomy, mandibular space, neck mobility, and the operator's skill [32]. Many bedside assessment methods are used to evaluate these aspects. For example, the Mallampati score evaluates the tongue size relative to the oropharyngeal space [33]. While these methods are widely used, their performance is in question. The use of other tools, such as imaging studies (computed tomography (CT) scan and X-ray), is also limited by radiation hazards, costs, and logistics [34].
US is used in various aspects of airway management and for many purposes, such as the prediction of pediatric ETT size, confirmation of correct placement of ETT, guidance of percutaneous tracheostomy and cricothyroidotomy, and confirmation of proper laryngeal mask airway position [35,36]. In addition, US has been proposed for measuring some indicators that reflect the intubation-related anatomy directly or indirectly. This widely available instrument can be easily learned and applied by clinicians. A meta-analysis showed a similar performance between imaging studies and US in predicting difficult airways. It reported the overall accuracy of US as 0.89. That study did not assess different indicators separately, and the studies it included were heterogeneous in design and implementation [34].
It has been hypothesized that increased anterior neck soft tissue thickness can impair the mobility of the pharyngeal structures during laryngoscopy [30]. This thickness can be measured with US at various levels, including the vocal cords, thyroid isthmus, suprasternal notch, and hyoid bone. The results are conflicting among studies in this regard. While the pooled results were not significant at the vocal cords, thyroid isthmus, and suprasternal notch levels, they were significant at the hyoid bone and epiglottis levels. The pooled difference was 0.2 cm at the hyoid bone level and about 0.6 cm (6.15 mm), with different cut-offs, in the epiglottic region. In a study by Yadav et al., many of the aforementioned distances were assessed in different positions, such as the sniffing position. The authors stated that the results were significant at many of the sites, though with low accuracy [13]. From another perspective, this soft tissue thickness could also reflect a high BMI, which itself complicates orotracheal intubation.
Hyomental distance is considered an important factor for displacing the tongue during laryngoscopy [23]. Intubation might be more difficult with shorter distances, which may reflect a large mandible and its proximity to the hyoid bone. This distance can be assessed in different positions. Petrișor et al. proposed that, among obese patients, this distance is most accurate in the hyperextended position in comparison with the neutral or ramped positions. While the study reported a sensitivity of 100% for all the positions, specificity was 71.4% in hyperextension [37]. Our study showed that the difference in distance is less than 4 mm in the neutral position and increases to 8 mm in the extended position. Therefore, its applicability should be further tested before clinical recommendations can be made. HMDR was first described among rheumatoid arthritis patients [38]. The distance between the hyoid bone and occipital bone remains constant during extension/flexion of the neck due to the stylohyoid ligament. With neck extension just below the occiput, the mentum moves away from the hyoid bone, which increases the hyomental distance. It has been proposed that without this increase, lower cervical spine extension alone would take the larynx and glottis out of the line of view by displacing the laryngeal structures forward [29]. According to the results, the ratio has a good specificity and fair accuracy.
Gupta et al. developed an oblique view for airway sonography [28]. By tilting the probe midline in the submandibular area caudally, they obtained a view that bisected the epiglottis and posterior-most part of the vocal folds. Using this view, they calculated the Pre-E and E-VC ratio. With an acceptable accuracy, both of the studies proposed 1.77 as the optimal cut-off and the meta-analysis showed significant differences between the difficult and easy intubation groups.
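As a worked illustration of how such a ratio index is applied, the sketch below computes Pre-E/E-VC from two measured depths and compares it with the 1.77 cut-off mentioned above. The function names and example measurements are hypothetical, and treating a ratio at or above the cut-off as a positive test is an assumption, since the text reports only that the ratio was higher in the difficult group.

```python
def pre_e_evc_ratio(pre_e_cm, e_vc_cm):
    """Ratio of the pre-epiglottic space depth to the epiglottis-to-vocal-cord distance."""
    if e_vc_cm <= 0:
        raise ValueError("E-VC distance must be positive")
    return pre_e_cm / e_vc_cm


def predicts_difficult(ratio, cutoff=1.77):
    """Assumption: a ratio at or above the cut-off is read as a positive test."""
    return ratio >= cutoff


# Hypothetical measurements in cm, for illustration only.
ratio = pre_e_evc_ratio(pre_e_cm=1.8, e_vc_cm=0.9)
print(ratio, predicts_difficult(ratio))
```

Because the index is a ratio of two lengths measured in the same view, it is unitless and does not depend on whether the depths are recorded in mm or cm.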
During the introduction of the laryngoscope into the oral cavity, the blade is positioned on the tongue, so the tongue anatomy is important for a better view of the glottis [33]. In addition to tongue thickness, it has been suggested to also calculate the tongue volume by multiplying the midsagittal cross-sectional area of the tongue by its width [29]. This method overestimates the tongue volume because the tongue is measured along its entire length from the mentum to the hyoid bone. Two studies on tongue volume yielded conflicting results; in the general population, tongue volume and area were predictors of difficult intubation [23]. Meanwhile, in a study with a small sample of obese patients, there was no significant difference in this regard [29]. Pooling the data showed that while the difference between the two groups was 6 cm³, the accuracy was not acceptable. Concerning tongue thickness, while there is a fair accuracy, the measurements in the difficult intubation group failed to demonstrate differences in comparison with the easy ones. This study also showed that measuring the floor of the mouth muscle volume is inaccurate for difficult intubation prediction. Further studies are required on the measurement of the tongue width and thickness and their ratios to other parts.
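The tongue volume estimate described above (midsagittal cross-sectional area multiplied by width) amounts to a single multiplication, sketched here with a hypothetical function name and hypothetical measurements.

```python
def tongue_volume_cm3(midsagittal_area_cm2, width_cm):
    """Approximate tongue volume as midsagittal cross-sectional area times width."""
    if midsagittal_area_cm2 < 0 or width_cm < 0:
        raise ValueError("measurements must be non-negative")
    return midsagittal_area_cm2 * width_cm


# Hypothetical measurements, for illustration only.
volume = tongue_volume_cm3(midsagittal_area_cm2=25.0, width_cm=4.0)
print(volume)
```

The product treats the tongue as a slab of uniform cross-section across its full width, which is one way to see why the estimate tends to run high.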

Limitations
There are several reasons for the conflicting results and the heterogeneity among the reviewed studies. On the one hand, the patients had different baseline characteristics (e.g., BMI and ethnic background). On the other hand, while the gold standard among studies is usually the Cormack and Lehane score, this objective index is assessed in different conditions (e.g., with or without the Backward, Upward, Rightward Pressure (BURP) maneuver) [30] and by different assessors. In addition, US is operator-dependent, and there is some variability due to the level of operator expertise and machine properties.

Conclusion
To conclude, this systematic review and meta-analysis showed that US can be used to predict difficult airways. Nonetheless, its application should be carefully assessed in other settings before making any recommendations.