The accuracy and influencing factors of Doppler echocardiography in estimating pulmonary artery systolic pressure: comparison with right heart catheterization: a retrospective cross-sectional study

Background Noninvasive assessment of pulmonary artery systolic pressure by Doppler echocardiography (sPAP ECHO) has been widely adopted to screen for pulmonary hypertension (PH), but sPAP ECHO is still overestimated or underestimated in a high proportion of patients. We therefore aimed to explore the accuracy and influencing factors of sPAP ECHO with right heart catheterization (RHC) as a reference. Methods A total of 218 patients with highly suspected PH who underwent RHC and echocardiography within 7 days were included. The correlation and consistency between tricuspid regurgitation (TR)-related methods and RHC results were tested by Pearson and Bland-Altman methods. TR-related methods included peak velocity of TR (TR Vmax), TR pressure gradient (TR-PG), TR mean pressure gradient (TR-mPG), estimated mean pulmonary artery pressure (mPAP ECHO), and sPAP ECHO. With mPAP ≥ 25 mm Hg measured by RHC as the standard diagnostic criterion of PH, the ROC curve was used to compare the diagnostic efficacy of sPAP ECHO with other TR-derived parameters. The ratio (sPAP ECHO − sPAP RHC)/sPAP RHC was calculated and patients were divided into three groups as follows: patients with an estimation error between −10% and +10% were defined as the accurate group; patients with an estimation error greater than +10% were classified as the overestimated group; and patients with an estimation error less than −10% were classified as the underestimated group. The influencing factors of sPAP ECHO were analyzed by ordinal regression analysis. Results sPAP ECHO had the highest correlation coefficient (r = 0.781, P < 0.001), best diagnostic efficiency (AUC = 0.98), and lowest bias (mean bias = 0.07 mm Hg; 95% limits of agreement, −32.08 to +32.22 mm Hg) compared with other TR-related methods. Ordinal regression analysis showed that TR signal quality, sPAP RHC level, and pulmonary artery wedge pressure (PAWP) affected the accuracy of sPAP ECHO (P < 0.05). Relative to good signal quality, the OR values of medium and poor signal quality were 0.26 (95% CI: 0.14, 0.48) and 0.23 (95% CI: 0.07, 0.73), respectively. Compared with high sPAP RHC level, the OR values of low and medium sPAP RHC levels were 21.56 (95% CI: 9.57, 48.55) and 5.13 (95% CI: 2.55, 10.32), respectively. The OR value of PAWP was 0.94 (95% CI: 0.89, 0.99). TR severity and right ventricular systolic function had no significant effect on the accuracy of sPAP ECHO. Conclusions In this study, we found that all TR-related methods, including sPAP ECHO, had comparable and good efficiency in PH screening. To make the assessment of sPAP ECHO more accurate, attention should be paid to TR signal quality, sPAP RHC level, and PAWP.


Background
Right heart catheterization (RHC) is recognized as the gold standard for measuring pulmonary artery pressure, but its invasiveness limits its general applicability. Doppler echocardiography (DE) can noninvasively assess pulmonary artery pressure by peak velocity of tricuspid regurgitation (TR Vmax) and its derived parameters, including TR pressure gradient (TR-PG), TR mean pressure gradient (TR-mPG), estimated mean pulmonary artery pressure (mPAP ECHO), and pulmonary artery systolic pressure (sPAP ECHO). The current guidelines recommend TR Vmax to avoid additional error from the estimated right atrial pressure (RAP) [1]. However, mPAP ECHO has been found to be superior to TR Vmax in identifying pulmonary hypertension (PH) [2]. As the most widely adopted approach in PH screening, sPAP ECHO has also been shown to be a reliable method [3]; however, it has not yet been examined whether sPAP ECHO is superior to other parameters in determining the probability of PH. sPAP ECHO can also provide valuable information for evaluating treatment response and even predicting prognosis [4,5]; however, sPAP ECHO is still overestimated or underestimated in a high proportion of patients [6]. To evaluate the condition of PH patients appropriately and avoid overly invasive examination, we need to understand the situations in which sPAP ECHO is underestimated or overestimated. Based on clinical experience and a review of previous literature, we assumed that right ventricular systolic function, pulmonary artery pressure level, TR severity, and signal quality would affect the accuracy of sPAP ECHO. In addition, as an important parameter to distinguish pre- and post-capillary PH, pulmonary artery wedge pressure (PAWP) was also included in the analysis to examine whether it would affect the accuracy of sPAP ECHO.
Therefore, the first aim of this study was to compare the efficiency of sPAP ECHO and other parameters in PH screening, while the second aim was to find influencing factors that account for the inaccuracy of sPAP ECHO.

Methods
Between October 2015 and October 2020, a total of 430 patients admitted to our center with known or suspected PH were evaluated. The inclusion criteria were age ≥ 18 years and the interval between echocardiography and RHC ≤ 7 days. The exclusion criteria were as follows: lack of TR, pulmonary artery stenosis or right ventricular outflow tract stenosis, poor image quality not suitable for analysis, ventricular septal defect, or patent ductus arteriosus. Patients' demographic and clinical data were obtained from the electronic medical records. The institutional review board of the China-Japan Friendship Hospital waived the need for written informed consent as the study involved the retrospective analysis of clinically acquired data. The data underlying this article will be shared upon a reasonable request to the corresponding author.

Clinical data
Baseline assessment of the eligible patients included WHO functional class, the level of N-terminal pro B-type natriuretic peptide (NT-proBNP), and a 6-min walk test (6MWT).

RHC
Hemodynamic measurements were performed with a 7F Swan-Ganz catheter (Baxter Inc.) under a Philips Allura X-PER FD20 flat-plate angiography system. The system was zeroed and referenced at the patient's heart level as previously described [7]. Right atrial pressure (RAP), pulmonary artery systolic pressure (sPAP RHC), and PAWP were recorded at end-expiration at baseline over at least three heart cycles. Cardiac output (CO) was obtained using Fick's method. Pulmonary vascular resistance (PVR), cardiac index, stroke volume, pulse pressure, and diastolic pressure gradient were calculated using standard formulas. Pulmonary artery pressure was classified into low, medium, and high levels according to the tertiles of sPAP RHC.
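The text refers only to "standard formulas" for the derived hemodynamic variables. As a minimal sketch, the commonly used textbook definitions can be written as below; these are assumed here and are not confirmed by the source as the exact formulas the authors applied. All function and parameter names are hypothetical.

```python
# Sketch of commonly used hemodynamic definitions (assumed, not taken from the source).
# Pressures in mm Hg, cardiac output in L/min, heart rate in beats/min, BSA in m^2.

def pulmonary_vascular_resistance(mpap, pawp, cardiac_output):
    """PVR in Wood units: (mPAP - PAWP) / CO."""
    if cardiac_output <= 0:
        raise ValueError("cardiac output must be positive")
    return (mpap - pawp) / cardiac_output

def cardiac_index(cardiac_output, body_surface_area):
    """Cardiac index in L/min/m^2: CO / BSA."""
    if body_surface_area <= 0:
        raise ValueError("body surface area must be positive")
    return cardiac_output / body_surface_area

def stroke_volume_ml(cardiac_output, heart_rate):
    """Stroke volume in mL: CO (converted to mL/min) / heart rate."""
    if heart_rate <= 0:
        raise ValueError("heart rate must be positive")
    return cardiac_output * 1000.0 / heart_rate

def pulse_pressure(spap, dpap):
    """Pulmonary artery pulse pressure: systolic minus diastolic pressure."""
    return spap - dpap

def diastolic_pressure_gradient(dpap, pawp):
    """Diastolic pressure gradient: diastolic PAP minus PAWP."""
    return dpap - pawp

# Illustrative values only (not patient data from the study).
pvr = pulmonary_vascular_resistance(mpap=45, pawp=10, cardiac_output=4.0)
sv = stroke_volume_ml(cardiac_output=4.0, heart_rate=80)
```

With these illustrative inputs, a transpulmonary gradient of 35 mm Hg at a cardiac output of 4 L/min corresponds to 8.75 Wood units.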

Echocardiography
Echocardiographic images were acquired using a GE Vivid E95 machine (GE Healthcare) equipped with M5S phased-array transducers. Analysis was performed independently by two blinded investigators using EchoPAC software (version 201, GE Healthcare). Two-dimensional echocardiography and Doppler echocardiography (DE) were performed based on current guidelines. TR-PG was calculated from the TR Vmax obtained from continuous-wave Doppler by the simplified Bernoulli equation: $\text{TR-PG} = 4 \times (\text{TR Vmax})^2$, with TR Vmax in m/s and TR-PG in mm Hg. TR-mPG was obtained by tracing the time-velocity integral of TR. sPAP ECHO and mPAP ECHO were calculated by adding the estimated RAP to TR-PG and TR-mPG, respectively. RAP was divided into three categories (3, 8, and 15 mm Hg) based on the inferior vena cava (IVC) diameter and its respiratory variation [1]. The ratio (sPAP ECHO − sPAP RHC)/sPAP RHC was calculated and patients were divided into three groups as follows: patients with an estimation error between −10% and +10% were defined as the accurate group; patients with an estimation error greater than +10% were classified as the overestimated group; and patients with an estimation error less than −10% were classified as the underestimated group. The severity of TR was classified into three grades by comprehensively evaluating the regurgitation jet area and vena contracta (VC) width. The mild group was defined as jet area < 5 cm², VC TR ≤ 3 mm; the moderate group as jet area 5-10 cm², 3 mm < VC TR < 7 mm; and the severe group as jet area > 10 cm², VC TR ≥ 7 mm. TR signal quality was classified into three types according to two criteria: extension of the signal for more than half of systole, and a well-defined border. Good signal quality met both criteria, medium signal quality met only one of them, and poor signal quality met neither [8] (Fig. 1). RV systolic function was assessed using multiple parameters, including RV wall thickness (RV WT), tricuspid annular plane systolic excursion (TAPSE), systolic annular tissue velocity of the lateral tricuspid annulus (S'), and RV fractional area change (FAC). All of these parameters were repeatedly measured and averaged. To determine the reproducibility of sPAP ECHO measurements, a total of 34 randomly selected examinations were analyzed twice by the first investigator at a 1-week interval and once by the second investigator.
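The calculation described above (simplified Bernoulli equation, addition of the estimated RAP, and grouping by relative estimation error) can be sketched as follows. This is an illustrative sketch only; the function names are hypothetical, and the handling of values exactly at ±10% (counted as accurate) is an assumption because the text does not specify it.

```python
# Sketch of the sPAP ECHO calculation and accuracy grouping described in the text.
# Function names are hypothetical; boundary handling at exactly +/-10% is assumed.

def tr_pressure_gradient(tr_vmax):
    """Simplified Bernoulli equation: TR-PG (mm Hg) = 4 * (TR Vmax in m/s)^2."""
    return 4.0 * tr_vmax ** 2

def spap_echo(tr_vmax, rap):
    """Estimated sPAP (mm Hg): TR-PG plus the estimated RAP (3, 8, or 15 mm Hg)."""
    return tr_pressure_gradient(tr_vmax) + rap

def accuracy_group(spap_echo_value, spap_rhc_value, tolerance=0.10):
    """Classify the relative error (sPAP ECHO - sPAP RHC) / sPAP RHC."""
    if spap_rhc_value <= 0:
        raise ValueError("sPAP RHC must be positive")
    relative_error = (spap_echo_value - spap_rhc_value) / spap_rhc_value
    if relative_error > tolerance:
        return "overestimated"
    if relative_error < -tolerance:
        return "underestimated"
    return "accurate"

# Illustrative input: TR Vmax of 3.5 m/s with an estimated RAP of 8 mm Hg.
estimate = spap_echo(tr_vmax=3.5, rap=8)   # 4 * 12.25 + 8 = 57 mm Hg
group = accuracy_group(estimate, 61)       # relative error of about -6.6%
```

Because velocity enters the equation squared, a small error in reading TR Vmax produces a proportionally larger error in TR-PG, which is why signal quality matters for the estimate.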

Statistical analysis
Standard statistical software (SPSS version 26 for Windows, SPSS, Chicago, IL, USA) was used for the statistical analysis. Data are expressed as mean ± standard deviation for quantitative variables with normal distribution, or as median (interquartile range) for variables not complying with normal distribution. The correlation and consistency between TR-derived parameters and RHC results were tested by Pearson and Bland-Altman methods. With mPAP ≥ 25 mm Hg measured by RHC as the standard diagnostic criterion of PH, a receiver operating characteristic (ROC) curve was used to compare the diagnostic efficacy of sPAP ECHO and other TR-related methods. The influencing factors of sPAP ECHO were analyzed by ordinal regression analysis. The intraclass correlation coefficient was used to determine inter- and intra-observer reproducibility for sPAP ECHO from 34 randomly selected patients using an identical cine-loop for each view. For all statistical tests, a P value < 0.05 was used to indicate significance.
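For readers who want to reproduce the agreement analysis outside SPSS, a minimal sketch of the Pearson correlation and the Bland-Altman bias with 95% limits of agreement is given below. It assumes the conventional definition of the limits as bias ± 1.96 × SD of the paired differences; the source does not state which variant was used. The function names and sample values are hypothetical.

```python
# Sketch of Pearson correlation and Bland-Altman agreement statistics.
# Assumes the conventional limits of agreement: bias +/- 1.96 * SD of the differences.
from math import sqrt
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def bland_altman(echo, rhc):
    """Return (bias, lower limit, upper limit) for paired echo and RHC values."""
    diffs = [e - r for e, r in zip(echo, rhc)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Illustrative paired measurements in mm Hg (not study data).
echo_values = [50, 60, 70, 80]
rhc_values = [52, 58, 73, 77]
r = pearson_r(echo_values, rhc_values)
bias, lower, upper = bland_altman(echo_values, rhc_values)
```

A bias close to zero with wide limits, as reported in the Results, means that the method is unbiased on average but can differ substantially from RHC in an individual patient.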

Patients' characteristics
A total of 218 patients were finally identified and analyzed, as shown in Fig. 2. Baseline demographic and clinical characteristics are provided in Table 1. The mean age of the patients was 50.9 ± 13.3 years; 40.3% of them were men; 197 (90.4%) patients had PH. None of the patients experienced major cardiac events between DE and RHC examinations. Table 2 lists the DE and RHC variables grouped by estimated accuracy.

Association between invasively determined parameters and TR-derived parameters
All of the TR-derived parameters, including TR Vmax, TR-PG, TR-mPG, mPAP ECHO, and sPAP ECHO, showed a positive correlation with the corresponding RHC results (Fig. 3). sPAP ECHO had the highest correlation coefficient (r = 0.782, P < 0.001). Bland-Altman analysis demonstrated low bias between RHC and echocardiographic results, with wide limits of agreement (Fig. 4). The bias of sPAP ECHO (mean bias = 0.1 mm Hg; 95% limits of agreement: −32.1 to +32.2 mm Hg) was lower than that of TR-PG (mean bias = 5.9 mm Hg; 95% limits of agreement: −26.5 to +38.2 mm Hg). The mean deviations of mPAP ECHO and TR-mPG from mPAP RHC were −2.6 mm Hg (95% limits of agreement: −26.3 to +21.1 mm Hg) and 3.3 mm Hg (95% limits of agreement: −20.1 to +26.7 mm Hg), respectively.
Fig. 1 Classification of the TR signal quality using continuous-wave Doppler. Good signal quality, complete envelope; Medium signal quality, partial envelope; Poor signal quality, unreliable envelope or no signal

Performance of different TR methods for predicting PH
The ROC analysis showed that sPAP ECHO had numerically better predictive efficiency and sensitivity for determining the possibility of PH than the other TR-related methods, including TR Vmax, TR-PG, TR-mPG, and mPAP ECHO (Table 3), but the differences were not significant (P > 0.05). Using the Youden index, the optimal cutoff value for our cohort was 49.5 mm Hg for the sPAP ECHO method, with a sensitivity of 94.9% and a specificity of 85.7%.
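The cutoff selection by the Youden index can be illustrated with a short sketch. It scans each observed value as a candidate threshold, treats a value at or above the threshold as a positive test, and keeps the threshold that maximizes J = sensitivity + specificity − 1. The function name and sample data are hypothetical, and the exact candidate thresholds used by the authors' software are not stated in the source.

```python
# Sketch of choosing a diagnostic cutoff by the Youden index (J = sens + spec - 1).
# A value >= cutoff counts as test-positive; candidate cutoffs are the observed values.

def youden_optimal_cutoff(values, has_ph):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index."""
    positives = sum(1 for d in has_ph if d)
    negatives = len(has_ph) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one patient with and one without PH")
    best = None
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, d in zip(values, has_ph) if d and v >= cutoff)
        true_neg = sum(1 for v, d in zip(values, has_ph) if not d and v < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]

# Illustrative sPAP ECHO values (mm Hg) and RHC-confirmed PH status (not study data).
spap_values = [30, 45, 52, 40, 55, 60, 70]
ph_status = [False, False, False, True, True, True, True]
cutoff, sens, spec = youden_optimal_cutoff(spap_values, ph_status)
```

In this toy data set the selected cutoff is 55 mm Hg, which misses one PH patient but excludes every patient without PH.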

Factors affecting the accuracy of sPAP ECHO estimation
There were 79 patients (36.2%) in the overestimated group, 81 patients (37.2%) in the accurate group, and 58 patients (26.6%) in the underestimated group. sPAP RHC was divided into three levels according to its tertiles (63 mm Hg, 85 mm Hg). The low-level group was defined as sPAP RHC less than 63 mm Hg. Patients with sPAP RHC between 63 mm Hg and 85 mm Hg were considered the medium-level group, while patients with sPAP RHC higher than 85 mm Hg were classified as the high-level group. The univariate ordinal analysis demonstrated that RV WT, FAC, TR signal quality, sPAP RHC level, RAP, PVR, PAWP, and mPAP were associated with the inaccuracy of sPAP ECHO estimation (Table 2). After multivariate ordinal regression analysis, we found that TR signal quality, PAWP, and sPAP RHC level significantly affected the accuracy of sPAP ECHO (P < 0.05). Relative to good signal quality, the OR values of medium and poor signal quality were 0.26 (95% CI: 0.14, 0.48) and 0.23 (95% CI: 0.07, 0.73), respectively. Compared with high sPAP RHC level, the OR values of low and medium sPAP RHC levels were 21.56 (95% CI: 9.57, 48.55) and 5.13 (95% CI: 2.55, 10.32), respectively. The OR value of PAWP was 0.94 (95% CI: 0.89, 0.99). In contrast, TR severity and RV systolic function parameters (such as TAPSE, S', and FAC) did not remain in the final equation.
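The tertile-based grouping of sPAP RHC described above amounts to a simple threshold rule, sketched below. The cutoffs of 63 and 85 mm Hg come from the text; the function name is hypothetical, and assigning values of exactly 63 or 85 mm Hg to the medium level follows the wording "between 63 mm Hg and 85 mm Hg" but remains an assumption.

```python
# Sketch of the sPAP RHC level grouping using the tertile cutoffs reported in the text.
# Values of exactly 63 or 85 mm Hg are assigned to "medium" (an assumption).

def spap_rhc_level(spap_rhc, lower_cutoff=63.0, upper_cutoff=85.0):
    """Return 'low', 'medium', or 'high' for an sPAP RHC value in mm Hg."""
    if spap_rhc < lower_cutoff:
        return "low"
    if spap_rhc <= upper_cutoff:
        return "medium"
    return "high"

# Illustrative values in mm Hg.
levels = [spap_rhc_level(v) for v in (61, 73, 90)]
```

In the regression reported above, the high level serves as the reference category against which the low and medium levels are compared.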

Discussion
Key findings of our study are as follows: (1) All TR-related methods, including sPAP ECHO, have comparable and good efficiency in PH screening. (2) The assessment of sPAP ECHO would be more reliable after taking TR signal quality, sPAP RHC levels, and PAWP into account.

Performance of sPAP ECHO in PH screening
In our study, sPAP ECHO showed comparable efficiency to other TR-related methods in PH screening. As a derived variable of TR Vmax, sPAP ECHO did not amplify measurement errors in assessing pulmonary artery pressure, contrary to the concern raised in the current guidelines; instead, it showed better sensitivity while maintaining similar specificity. Relative to TR Vmax, TR-PG, and TR-mPG, sPAP ECHO contains additional information from RAP, which may account for its better accuracy and lower bias. RAP rises as RV overload increases [9], so it is an important measurement that provides heart failure and prognostic information [10]. Hellenkamp's study [2] on mPAP ECHO also supported that RAP is of additional diagnostic value in predicting PH. Compared with mPAP ECHO, sPAP ECHO has the advantage of being simpler and more convenient to measure. Taken together, sPAP ECHO can be a convenient and effective measurement for clinical application in PH screening.

Reasons for the inaccuracy in sPAP ECHO estimation
First, our findings confirmed previous reports that TR signal quality affects the accuracy of sPAP ECHO [11]. Poor signal quality leads to the underestimation of sPAP ECHO, because any error in reading the peak velocity is further amplified by the squared velocity term in the Bernoulli equation. We also found that sPAP ECHO could still be overestimated in some cases despite good signal quality. In our cohort, sPAP ECHO was overestimated in 41% of the patients with good TR signal quality. After further analysis, we found that lower sPAP RHC level and lower PAWP were significantly associated with the overestimation of sPAP ECHO in patients with good signal quality. This phenomenon suggests that we cannot simply rely on good signal quality, and attention should also be paid to sPAP RHC level and PAWP because they both affect the accuracy of sPAP ECHO.
Second, regarding the effect of sPAP RHC level on the accuracy of sPAP ECHO, Groh et al. [12] found that echocardiography inaccurately estimated right ventricular pressure in children with elevated right heart pressure. Our results provided further evidence that sPAP ECHO tends to be underestimated at a high sPAP RHC level. We assumed that the coupling mechanism between RV contractility and its load may account for this phenomenon. When sPAP RHC is mildly elevated during the initial phase of PH, RV coupling can be maintained by enhanced RV contractility [13,14], and the estimation of sPAP RHC by DE is relatively reliable. However, as PH progresses and RV uncoupling occurs, CO decreases and RV preload increases, along with elevated RAP, so the right atrioventricular pressure gradient decreases and DE underestimates sPAP RHC. This finding suggests that we should integrate more echocardiographic signs when evaluating treatment efficacy in PH, because a decrease in sPAP ECHO at this stage is not necessarily a result of disease improvement but may instead reflect underestimation of sPAP ECHO caused by RV uncoupling.
Third, we found that echocardiography tended to underestimate pulmonary artery pressure when PAWP increased. We speculated that the underestimation of sPAP ECHO due to the higher PAWP may be related to the lower threshold in post-capillary PH patients. Amsallem et al. [15] found that higher PAWP was associated with a lower sPAP ECHO threshold for PH diagnosis, which is consistent with our findings. The optimal cutoff value of our cohort was 49.5 mm Hg, which is higher than the cutoff values in previous studies that had focused on post-capillary PH patients with higher PAWP [16,17]. Pre-capillary PH patients with lower PAWP accounted for 85.8% of the cohort, which may explain this phenomenon. Finkelhor et al. [18] also found that PAWP had a strong inverse correlation with the difference between sPAP RHC and sPAP ECHO. They speculated that elevated left atrial pressure can be transmitted to the right atrium via the shared interatrial septum as well as through pericardial constraint and limit TR velocities, thereby also affecting the accuracy of sPAP ECHO. The mechanism by which PAWP affects the accuracy of sPAP ECHO is still unclear, so more multicenter studies are needed to validate this deduction. Based on the above findings, we think that the accuracy of sPAP ECHO would be improved if combined with the assessment of left ventricular filling pressure by echocardiography. Although RHC is the gold standard for PAWP or left ventricular filling pressure, whether PAWP is elevated can be assessed by indirect echocardiographic signs, such as the ratio of mitral E peak velocity to averaged e' velocity (E/e'm ratio), TR Vmax, and left atrium volume index. Echocardiologists can integrate such information to determine whether patients have PAWP elevation, and thus assess sPAP ECHO and the possibility of PH more reasonably.
Furthermore, there is no consensus as to how TR severity interferes with the accuracy of sPAP ECHO. Hioka et al. [19] reported that echocardiography increasingly overestimated the TR-PG with the advance of TR severity, as was theoretically predicted by the pressure recovery phenomenon associated with the laminar regurgitant flow. However, Parasuraman et al. [20] reported that severe TR could cause equalization of right atrial and ventricular pressures, which may cause the TR Doppler envelope to be cut short, thereby leading to underestimation of sPAP ECHO. Our study differed from other studies in that TR severity did not significantly affect the accuracy of sPAP ECHO. It should be noted that only 9.6% of patients in our cohort had severe TR, which was in line with the actual clinical situation that severe TR appears only in a minority of patients. However, in patients with mild or moderate TR, we could also obtain good signal quality and estimate sPAP ECHO appropriately (Fig. 5). TR severity was also affected by RV contractility and dimension. Thus, the overall impact of TR severity on the accuracy of sPAP ECHO is not as significant as that of TR signal quality.
We found that none of the RV systolic parameters had a significant impact on the accuracy of sPAP ECHO. Theoretically, RV systolic function would gradually decrease as PH progresses [21], but the RV can remain coupled despite a large increase in load by increasing contractility until heart failure occurs [13]. Therefore, RV systolic parameters are relatively stable before the end stage of PH. In addition, cardiac motion and the angle dependence of the measurements also affect the accuracy of these parameters. Although RV systolic parameters had clinical significance for the assessment of PH, they did not have a significant effect on the accuracy of sPAP ECHO.
Finally, incorporating TR signal quality, sPAP RHC level, and PAWP into the assessment of sPAP ECHO would improve its accuracy and avoid overly invasive examination.

Limitations
This study has several limitations. First, this was a retrospective study with a small sample size. Although 90.4% of our patients had PH, and in 47.7% of them PH was due to chronic pulmonary thromboembolism, the sample size of other types of PH was relatively small. Thus, we could not give specific suggestions for each type of PH. Second, we included patients who had undergone RHC and echocardiography within 7 days because of practical clinical constraints. However, the average interval was 3 days in this study, and the majority of patients had pre-capillary PH, which indicated that the patients' hemodynamics were relatively stable and did not change dramatically during this short time. Third, contrast microbubbles were not used to enhance the tricuspid regurgitation jet in patients with mild regurgitation or poor signal quality. Finally, the single-center nature of the present study limits its generalizability.
Fig. 5 Examples of different severity of TR with good signal quality and accurate sPAP ECHO. The upper image presents a 40-year-old female with mild TR whose sPAP ECHO and sPAP RHC were 59 and 61 mm Hg, respectively. The middle image shows a 50-year-old female with moderate TR whose sPAP ECHO and sPAP RHC were 60 and 60 mm Hg, respectively. The lower image demonstrates a 34-year-old female with severe TR whose sPAP ECHO and sPAP RHC were 71 and 73 mm Hg, respectively

Conclusions
In this study, we found that all TR-related methods, including sPAP ECHO, had comparable and good efficiency in PH screening. To make the assessment of sPAP ECHO more accurate, attention should be paid to TR signal quality, sPAP RHC level, and PAWP.