Reproducibility of the STARD checklist: an instrument to assess the quality of reporting of diagnostic accuracy studies

Smidt, Nynke; Rutjes, Anne WS; van der Windt, Daniëlle AWM; Ostelo, Raymond WJG; Bossuyt, Patrick M; Reitsma, Johannes B; Bouter, Lex M; de Vet, Henrica CW

doi:10.1186/1471-2288-6-12

Table 1 Number of articles reporting each item of the STARD statement at the first and second assessments, with the percentage agreement and Cohen's kappa between the two assessments for each item.*

Item		First assessment (n = 32), n (%)	Second assessment (n = 32), n (%)	Inter-assessment agreement (%)	Cohen's kappa
Title/abstract/keywords
1	Identify the article as a study of diagnostic accuracy (recommend MeSH heading 'sensitivity and specificity').	3 (9)	1 (3)	94	0.48
Introduction
2	State the research questions or study aims, such as estimating diagnostic accuracy or comparing accuracy between tests or across participant groups.	27 (84)	31 (97)	88	0.30
Methods
3	The study population: The inclusion and exclusion criteria, setting and locations where data were collected.	17 (53)	10 (31)	78	0.57
4	Participant recruitment: Was recruitment based on presenting symptoms, results from previous tests, or the fact that the participants had received the index tests or the reference standard?	28 (88)	32 (100)	88	NA
5	Participant sampling: Was the study population a consecutive series of participants defined by the selection criteria in items 3 and 4? If not, specify how participants were further selected.	20 (63)	25 (78)	84	0.64
6	Data collection: Was data collection planned before the index test and reference standard were performed (prospective study) or after (retrospective study)?	25 (78)	26 (81)	84	0.52
7	The reference standard and its rationale.	14 (44)	14 (44)	69	0.37
8	Technical specifications of material and methods involved including how and when measurements were taken, and/or cite references for a) index tests and b) reference standard.
	a) Index tests	28 (88)	29 (91)	97	0.84
	b) Reference standard	19 (59)	19 (59)	75	0.48
9	Definition of and rationale for the units, cut-offs and/or categories of the results of the a) index tests and the b) reference standard.
	a) Index tests	26 (81)	26 (81)	81	0.39
	b) Reference standard	20 (63)	23 (72)	78	0.51
10	The number, training and expertise of the persons executing and reading the a) index tests and the b) reference standard.
	a) Index tests	13 (41)	13 (41)	94	0.87
	b) Reference standard	11 (34)	10 (31)	84	0.65
11	Whether or not the readers of the a) index tests and b) reference standard were blind (masked) to the results of the other test and describe any other clinical information available to the readers.
	a) Index tests	9 (28)	10 (31)	84	0.63
	b) Reference standard	8 (25)	12 (38)	88	0.71
12	Methods for calculating or comparing measures of diagnostic accuracy, and the statistical methods used to quantify uncertainty (e.g. 95% confidence intervals).	4 (13)	4 (13)	94	0.71
13	Methods for calculating test reproducibility, if done a) for the index test b) for the reference standard.
	a) Index test	4 (13)	8 (25)	88	0.60
	b) Reference standard	2 (6)	2 (6)	94	0.47
Results
14	When study was performed, including beginning and end dates of recruitment.	17 (53)	17 (53)	100	1.00
15	Clinical and demographic characteristics of the study population (at least information on age, gender, spectrum of presenting symptoms).	14 (44)	16 (50)	81	0.63
16	The number of participants satisfying the criteria for inclusion who did or did not undergo the index tests and/or the reference standard, describe why participants failed to undergo either test (a flow diagram is strongly recommended).	20 (63)	19 (59)	66	0.28
17	Time-interval between the index tests and the reference standard, and any treatment administered in between.	7 (22)	9 (28)	81	0.50
18	Distribution of severity of disease (define criteria) in those with the target condition, other diagnoses in participants without the target condition.	9 (28)	15 (47)	63	0.23
19	A cross tabulation of the results of the index tests (including indeterminate and missing results) by the results of the reference standard, for continuous results, the distribution of the test results by the results of the reference standard.	24 (75)	24 (75)	75	0.33
20	Any adverse events from performing the index tests or the reference standard.	5 (16)	5 (16)	100	1.00
21	Estimates of diagnostic accuracy and measures of statistical uncertainty (e.g. 95% confidence intervals).	13 (41)	14 (44)	91	0.81
22	How indeterminate results, missing data and outliers of the index tests were handled.	20 (63)	21 (66)	66	0.25
23	Estimates of variability of diagnostic accuracy between subgroups of participants, readers or centers, if done.	14 (44)	17 (53)	91	0.81
24	Estimates of test reproducibility, if done.
	a) Index test	8 (25)	10 (31)	81	0.54
	b) Reference standard	1 (3)	2 (6)	97	0.65
Discussion
25	Discuss the clinical applicability of the study findings.	31 (97)	31 (97)	94	-0.032

* The data extraction form for assessing the 25 items of the STARD statement and the references of the 32 included articles are available on request from the first author; NA = not able to calculate.
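
The agreement and kappa columns can diverge sharply when almost every article falls in the same category, as in item 25 (94% agreement, kappa -0.032). The following is a minimal sketch of the standard two-category Cohen's kappa calculation, not the authors' own analysis code. The function name `cohen_kappa_2x2` is hypothetical, and the cell counts are reconstructed from the table's marginal counts on the assumption that 94% agreement corresponds to 30 of 32 articles.

```python
def cohen_kappa_2x2(both_yes, first_only, second_only, both_no):
    """Return (observed agreement, Cohen's kappa) for two binary assessments."""
    n = both_yes + first_only + second_only + both_no
    p_observed = (both_yes + both_no) / n
    # Proportion of articles scored "reported" at each assessment.
    p_first = (both_yes + first_only) / n
    p_second = (both_yes + second_only) / n
    # Agreement expected by chance, given only the two marginal proportions.
    p_expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    if p_expected == 1:
        return p_observed, None  # kappa is undefined when chance agreement is total
    return p_observed, (p_observed - p_expected) / (1 - p_expected)


# Item 25: 31/32 reported at each assessment; 30 concordant articles assumed.
agreement_25, kappa_25 = cohen_kappa_2x2(30, 1, 1, 0)
# Item 14: 17/32 reported at each assessment with 100% agreement.
agreement_14, kappa_14 = cohen_kappa_2x2(17, 0, 0, 15)

print(round(agreement_25 * 100), round(kappa_25, 3))  # 94 -0.032
print(round(agreement_14 * 100), round(kappa_14, 2))  # 100 1.0
```

With 31 of 32 articles scored as reported at both assessments, chance agreement alone is already about 94%, so the two discordant articles are enough to push kappa just below zero.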

ISSN: 1471-2288

BMC Medical Research Methodology