Observer variation in chest radiography of acute lower respiratory infections in children: a systematic review
© Swingler; licensee BioMed Central Ltd. 2001
Received: 17 September 2001
Accepted: 12 November 2001
Published: 12 November 2001
Knowledge of the accuracy of chest radiograph findings in acute lower respiratory infection in children is important when making clinical decisions.
I conducted a systematic review of agreement between and within observers in the detection of radiographic features of acute lower respiratory infections in children, and described the quality of the design and reporting of studies, whether included or excluded from the review.
Included studies were those of observer variation in the interpretation of radiographic features of lower respiratory infection in children (neonatal nurseries excluded) in which radiographs were read independently and a clinical population was studied. I searched MEDLINE, HealthSTAR and HSRPROJ databases (1966 to 1999), handsearched the reference lists of identified papers and contacted authors of identified studies. I performed the data extraction alone.
Ten studies of observer interpretation of radiographic features of lower respiratory infection in children were identified. Seven of the studies satisfied four or more of the seven design and reporting criteria. Six studies met the inclusion criteria for the review. Inter-observer agreement varied with the radiographic feature examined. Kappa statistics ranged from around 0.80 for individual radiographic features to 0.27–0.38 for bacterial vs viral etiology.
Little information was identified on observer agreement on radiographic features of lower respiratory tract infections in children. Agreement varied with the features assessed from "fair" to "very good". Aspects of the quality of the methods and reporting need attention in future studies, particularly the description of criteria for radiographic features.
Chest radiography is a very common investigation in children with lower respiratory infection, and knowledge of the diagnostic accuracy of radiograph interpretation is consequently important when basing clinical decisions on the findings. Inter- and intra-observer agreement in the interpretation of the radiographs are necessary components of diagnostic accuracy. Observer agreement is, however, not sufficient for diagnostic accuracy. The key element of such accuracy is the concordance of the radiological interpretation with the presence or absence of pneumonia. Unfortunately, there is seldom a suitable reference standard for pneumonia (such as histological or gross anatomical findings) against which to compare radiographic findings. Diagnostic accuracy thus needs to be examined indirectly, including by assessing observer agreement.
Observer variation in chest radiograph interpretation in acute lower respiratory infections in children has not been systematically reviewed.
The purpose of this study was to quantify the agreement between and within observers in the detection of radiographic features associated with acute lower respiratory infections in children. A secondary objective was to assess the quality of the design and reporting of studies of this topic, whether or not the studies met the quality inclusion criteria for the review.
Studies meeting the following criteria were included in the systematic review:
1. An assessment of observer variation in interpretation of radiographic features of lower respiratory infection, or of the radiographic diagnosis of pneumonia.
2. Studies of children aged 15 years or younger or studies from which data on children 15 years or younger could be extracted. Studies of infants in neonatal nurseries were excluded.
3. Data presented that enabled the assessment of agreement between observers.
4. Independent reading of radiographs by two or more observers.
5. Studies of a clinical population with a spectrum of disease in which radiographic assessment is likely to be used (as opposed to separate groups of normal children and those known to have the condition of interest).
Studies were identified by a computerized search of MEDLINE from 1966 to 1999 using the following search terms: observer variation, or intraobserver (text word), or interobserver (text word); and radiography, thoracic, or radiography or bronchiolitis/ra, or pneumonia, viral/ra, or pneumonia, bacterial/ra, or respiratory tract infections/ra. The search was limited to human studies of children up to the age of 18 years. The author reviewed the titles and abstracts of the identified articles in English or with English abstracts (and the full text of those judged to be potentially eligible). A similar search was performed of HealthSTAR, a former on-line database of published health service research, and the HSRPROJ (Health Services Research Projects in Progress) database. Reference lists of articles retrieved from the above searches were examined. Authors of studies of agreement between independent observers on chest radiograph findings in acute lower respiratory infections in children were contacted with an inquiry about the existence of additional studies, published or unpublished.
Data collection and analysis
Characteristics of study design and reporting
Validity eligibility criteria:
- Independent assessment of radiographs
- Relevant clinical population (not case-control design)
Other validity characteristics:
- Description of study population (3 of: age, M:F ratio, clinical features and eligibility criteria)
- Description of criteria for radiological signs
- Presentation of indeterminate results
- Meaningful measures of agreement (kappa or equivalent)
- Confidence intervals for measures of agreement
- Assessment of intra-observer variability
In studies meeting all the inclusion criteria for the review, the author extracted the following additional information: number and characteristics of the observers and children studied, and measures of agreement. When no measures of agreement were reported, data were extracted from the reports and kappa statistics were calculated using the method described by Fleiss. Kappa is a measure of the degree of agreement between observations, over and above that expected by chance. If agreement is complete, kappa = 1; if there is only chance concordance, kappa = 0.
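As an illustration of the definition above, the following minimal sketch computes kappa for two observers from a square agreement table. It uses the standard two-observer (Cohen) form, kappa = (observed agreement - chance agreement) / (1 - chance agreement); it is not claimed to reproduce the exact calculations performed for this review. The function name and the counts in the example are hypothetical.

```python
def cohen_kappa(table):
    """Kappa for two observers.

    table[i][j] = number of radiographs placed in category i by
    observer A and in category j by observer B (square table).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("table contains no observations")

    # Proportion of radiographs on which the observers agree.
    observed = sum(table[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts (not taken from any included study):
# rows = observer A (feature present, absent), columns = observer B.
example = [[40, 10],
           [5, 45]]
print(round(cohen_kappa(example), 2))  # 0.7
```

In this hypothetical table the observers agree on 85 of 100 radiographs, chance agreement is 0.50, and kappa is therefore 0.35 / 0.50 = 0.70.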
Characteristics of included studies
- Simpson et al 1974 a: 330 children under 14 years hospitalized with acute lower respiratory infection
- McCarthy et al 1981: 128 of 1566 children seen in a pediatric emergency room with a pulmonary infiltrate in chest radiography (as judged by the duty radiologist)
- Crain et al 1991: 230 of 242 febrile infants under 8 weeks evaluated in an emergency room and who received a chest radiograph
- Kramer et al 1992: 287 unreferred febrile children, aged 3–24 months, in an emergency unit. Observers: 1 duty radiologist, 1 "blind" pediatric radiologist
- Davies et al 1996 b: 40 children under 6 months, 25 with pneumonia and 15 with bronchiolitis, admitted to a tertiary care pediatric hospital. Observers: 3 pediatric radiologists
- Coakley et al 1996: 113 previously well children under 3 years hospitalized with acute respiratory infections and no focal abnormality on radiography
Observer agreement: kappa statistics (95% confidence intervals)
0.46 (0.34–0.58) 0.47 (0.35–0.60)
Peribronchial/bronchial wall thickening
Perihilar linear opacities
Bacterial vs. viral etiology
Peribronchial/bronchial wall thickening
Perihilar linear opacities
The quality of the methods and reporting of studies was not consistently high. Only six of ten studies satisfied the inclusion criteria for the review. The absence of any of the validity criteria used in this study (independent reading of radiographs, the use of a clinical population with an appropriate spectrum of disease, description of the study population and of criteria for a test result) has been found empirically to be associated, on average, with overestimation of test accuracy when a test is compared with a reference standard. A similar effect may apply to the estimation of inter-observer agreement: two observers may agree with each other more often when aware of each other's assessment, and radiographs drawn from separate populations of normal and known affected children will exclude many of the equivocal radiographs found in a usual clinical population, thereby possibly falsely increasing agreement. Only four of ten studies described criteria for the radiological signs, with potential negative implications for both the validity and the applicability of the remaining studies.
The data from the included studies suggest a pattern of kappas in the region of 0.80 for individual radiographic features and 0.30–0.60 for composite assessments of features. A kappa of 0.80 (i.e. 80% of the maximum possible agreement beyond chance) is regarded as "good" or "very good" and 0.30–0.60 as "fair" to "moderate". The small number of studies in this review, however, makes the detection and interpretation of patterns merely speculative. Only two radiographic features were examined by more than one study. There is thus insufficient information to comment on heterogeneity of observer variation in different clinical settings.
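The verbal labels quoted above can be made explicit with a small lookup. The sketch below assumes the cut-offs commonly attributed to Altman (at or below 0.20 "poor", 0.21–0.40 "fair", 0.41–0.60 "moderate", 0.61–0.80 "good", 0.81–1.00 "very good"); these bands are an assumption that is consistent with the labels used in the text, and the function name is hypothetical.

```python
def agreement_label(kappa):
    """Map a kappa value to a verbal label (assumed Altman-style bands)."""
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Values reported in this review.
for value in (0.27, 0.38, 0.46, 0.80):
    print(value, agreement_label(value))
```

A value of exactly 0.80 sits on the boundary between "good" and "very good", which is why the text describes it with both labels.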
The range of kappas overall is similar to that found by other authors for a range of radiographic diagnoses [7]. However, "good" and "very good" agreement does not necessarily imply high validity (closeness to the truth). Observer agreement is necessary for validity, but observers may agree and nevertheless both be wrong.
Little information was identified on inter-observer agreement in the assessment of radiographic features of lower respiratory tract infections in children. When available, it varied from "fair" to "very good" according to the features assessed. Insufficient information was identified to assess heterogeneity of agreement in different clinical settings.
Aspects of the quality of methods and reporting that need attention in future studies are independent assessment of radiographs, the study of a usual clinical population of patients and description of that population, description of the criteria for radiographic features, assessment of intra-observer variation and reporting of confidence intervals around estimates of agreement. Specific description of criteria for radiographic features is particularly important, not only because of its association with study validity but also to enable comparison between studies and application in clinical practice.
Financial support from the University of Cape Town and the Medical Research Council of South Africa is acknowledged.
- Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, Bossuyt PM: Empirical evidence of design-related bias in studies of diagnostic tests. JAMA. 1999, 282: 1061-1066. 10.1001/jama.282.11.1061.
- Jaeschke R, Guyatt G, Sackett DL, for the Evidence-Based Medicine Working Group: Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid?. JAMA. 1994, 271: 389-391. 10.1001/jama.271.5.389.
- Greenhalgh T: How to read a paper. Papers that report diagnostic or screening tests. BMJ. 1997, 315: 540-543.
- Reid MC, Lachs MS, Feinstein AR: Use of methodological standards in diagnostic test research. Getting better but still not good. JAMA. 1995, 274: 645-651. 10.1001/jama.274.8.645.
- Cochrane Methods Working Group on Systematic Reviews of Screening and Diagnostic Tests: Recommended methods. 6 June 1996, [http://wwwsom.fmc.flinders.edu.au/FUSA/COCHRANE/cochrane/sadtdoc1.htm]
- Fleiss JL: Statistical methods for rates and proportions, 2nd edn. New York, John Wiley & Sons. 1981, 212-225.
- Coblentz CL, Babcook CJ, Alton D, Riley BJ, Norman G: Observer variation in detecting the radiographic features associated with bronchiolitis. Invest Radiol. 1991, 26: 115-118.
- Coakley FV, Green J, Lamont AC, Rickett AB: An investigation into perihilar inflammatory change on the chest radiographs of children admitted with acute respiratory symptoms. Clin Radiol. 1996, 51: 614-617.
- Crain EF, Bulas D, Bijur PE, Goldman HS: Is a chest radiograph necessary in the evaluation of every febrile infant less than 8 weeks of age?. Pediatrics. 1991, 88: 821-824.
- Davies HD, Wang EE, Manson D, Babyn P, Shuckett B: Reliability of the chest radiograph in the diagnosis of lower respiratory infections in young children. Pediatr Infect Dis J. 1996, 15: 600-604. 10.1097/00006454-199607000-00008.
- Kiekara O, Korppi M, Tanska S, Soimakallio S: Radiographic diagnosis of pneumonia in children. Ann Med. 1996, 28: 69-72.
- Kramer MM, Roberts-Brauer R, Williams RL: Bias and "overcall" in interpreting chest radiographs in young febrile children. Pediatrics. 1992, 90: 11-13.
- Norman GR, Brooks LR, Coblentz CL, Babcook CJ: The correlation of feature identification and category judgments in diagnostic radiology. Mem Cognit. 1992, 20: 344-355.
- Simpson W, Hacking PM, Court SDM, Gardner PS: The radiographic findings in respiratory syncitial virus infection in children. Part I. Definitions and interobserver variation in assessment of abnormalities on the chest x-ray. Pediatr Radiol. 1974, 2: 155-160.
- McCarthy PL, Spiesel SZ, Stashwick CA, Ablow RC, Masters SJ, Dolan TF: Radiographic findings and etiologic diagnosis in ambulatory childhood pneumonias. Clin Pediatr (Phila). 1981, 20: 686-691.
- Stickler GB, Hoffman AD, Taylor WF: Problems in the clinical and roentgenographic diagnosis of pneumonia in young children. Clin Pediatr (Phila). 1984, 23: 398-399.
- Altman DG: Practical statistics for medical research. London, Chapman & Hall. 1991, 404-
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2342/1/1/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.