Cochrane Handbook for Systematic Reviews
of Diagnostic Test Accuracy
Chapter 10 Analysing and Presenting Results
Petra Macaskill, Constantine Gatsonis, Jonathan Deeks,
Roger Harbord, Yemisi Takwoingi.
Version 1.0 Released December 23rd 2010.
©The Cochrane Collaboration
Please cite this version as: Macaskill P, Gatsonis C, Deeks JJ, Harbord RM, Takwoingi Y. Chapter
10: Analysing and Presenting Results. In: Deeks JJ, Bossuyt PM, Gatsonis C (editors), Cochrane
Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 1.0. The Cochrane
Collaboration, 2010. Available from: http://srdta.cochrane.org/.
Contents
10.1 Introduction
10.1.1 Aims of meta-analysis for DTA reviews
10.1.2 When not to use a meta-analysis in a review
10.1.3 How does meta-analysis of diagnostic test accuracy differ from meta-analysis of interventions?
10.1.4 Questions which can be addressed in DTA analyses
10.1.4.1 What is the accuracy of a test?
10.1.4.2 How does the accuracy vary with clinical and methodological characteristics?
10.1.4.3 How does the accuracy of two or more tests compare?
10.1.5 Planning the analysis
10.1.6 Writing the analysis section of the protocol
10.2 Key concepts
10.2.1 Disease status
10.2.2 Types of test data
10.2.3 Analysis of a primary test accuracy study
10.2.3.1 Sensitivity and Specificity
10.2.3.2 Predictive values
10.2.3.3 Likelihood ratios
10.2.3.4 Diagnostic odds ratios
10.2.4 Positivity thresholds
10.2.5 ROC curves
10.2.6 Relationships between ROC curves, diagnostic odds ratios and Q*
10.3 Graphical and tabular presentation
10.3.1 Summary ROC plots
10.3.2 Linked ROC plots
10.3.3 Coupled forest plots
10.3.4 Example 1: Anti-CCP for the diagnosis of rheumatoid arthritis - Descriptive Plots
10.3.5 Tables of results
10.4 Meta-analytical summaries
10.4.1 Should I estimate a SROC curve or a summary point?
10.4.2 Meta-analytical methods not routinely used in Cochrane Reviews
10.4.3 Heterogeneity
10.5 Model fitting
10.5.1 Moses-Littenberg SROC curves (RevMan)
10.5.1.1 Properties of the curve
10.5.1.2 Choice of weights
10.5.2 Hierarchical models
10.5.2.1 Bivariate model
10.5.2.2 Example 1 continued: Anti-CCP for the diagnosis of rheumatoid arthritis
10.5.2.3 The Rutter and Gatsonis HSROC model
10.5.2.4 Example 2: Rheumatoid Factor as a marker for Rheumatoid Arthritis
10.5.3 Investigating heterogeneity
10.5.3.1 Heterogeneity and Regression Analysis using the Bivariate model
10.5.3.2 Example 1 (cont.): Investigation of heterogeneity in diagnostic performance of anti-CCP
10.5.3.3 Heterogeneity and Regression Analysis using the Rutter and Gatsonis HSROC model
10.5.3.4 Criteria for model selection
10.5.3.5 Example 2 (cont.): Investigating heterogeneity in diagnostic accuracy of Rheumatoid Factor
10.5.4 Comparing Index Tests
10.5.4.1 Test comparisons based on all available studies
10.5.4.2 Test comparisons using the Bivariate model
10.5.4.3 Example 3: CT versus MRI for the diagnosis of coronary artery disease
10.5.4.4 Test comparisons using the Rutter and Gatsonis HSROC model
10.5.4.5 Test comparison based on studies that directly compare tests
10.5.4.6 Example 3 (cont.): CT versus MRI for the diagnosis of coronary artery disease
10.5.5 Computer software
10.5.6 Approaches to analysis with small numbers of studies
10.6 Special topics
10.6.1 Sensitivity analysis
10.6.2 Investigating and handling verification bias
10.6.3 Investigating and handling publication bias
10.6.4 Developments in meta-analysis for DTA reviews
Appendix
Data and SAS file for Example 1 - Anti-CCP for the diagnosis of rheumatoid arthritis
Data and SAS file for Example 2 - Rheumatoid Factor as a marker for Rheumatoid Arthritis
Data and SAS file for Example 3 - CT versus MRI for the diagnosis of coronary artery disease
References
10 Analysing and Presenting Results
10.1 Introduction
The statistical aspects of a systematic review of diagnostic test accuracy are more challenging than
for reviews of interventions, and it is recommended that review teams include an individual with the
statistical expertise needed to understand and implement the hierarchical models required for
meta-analysis. This chapter has been written with this recommendation in mind. It aims first to
provide guidance to the key researchers in the review team on the purpose, possibilities and
interpretation of methods of meta-analysis, and second to provide the technical detail to assist a
statistical expert in applying the methods recommended for Cochrane Reviews. Sections 10.1 to
10.4 and 10.6 outline the conceptual approach to meta-analysis, how analysis is undertaken for a
single test accuracy study, and the graphical presentations and meta-analysis methods that are
recommended. Section 10.5 is the more technical guide to the meta-analytical models, intended to
assist an informed statistician in applying them in commercial statistical software programs, and is
necessarily written presuming a level of familiarity with statistical hierarchical modelling. It includes
examples with datasets, computer code and resulting output. Section 10.5 is therefore unlikely to be
understood by all readers.
10.1.1 Aims of meta-analysis for DTA reviews
Health professionals (mainly physicians) use diagnostic tests to ascertain whether an individual
(usually a patient) does or does not have a particular disease or condition. Cochrane diagnostic test
accuracy reviews provide information on how well tests distinguish patients with the disease from
those without. Most tests are imperfect, and errors will occur. Hence, the statistical methods focus
on two statistical measures of diagnostic accuracy, the sensitivity of the test (the proportion of those
with the disease who have an abnormal test result) and the specificity of the test (the proportion of
those without the disease who have a normal test result). A Cochrane DTA review aims to quantify
and compare these statistics for one or more diagnostic tests to describe how well each test
classifies individuals, and estimate and compare the likely error rates (false positive and false
negative diagnoses) that may be encountered. Publishing such reviews in the Cochrane Library aims
to assist decision makers in rationally choosing and using tests by providing good evidence about
their likely error rates.
Meta-analysis is a set of statistical techniques for combining results from two or more separate
studies. Meta-analysis of diagnostic test accuracy studies provides summaries of the results of
relevant included studies: providing an estimate of the average diagnostic accuracy of a test or tests,
the uncertainty of this average, and the variability of study findings around the estimates.
Meta-analytical regression models can statistically compare the accuracy of two or more different
diagnostic tests and describe how test accuracy varies with test thresholds and other study
characteristics.
Meta-analysis helps to make sense of apparently conflicting study results, as it identifies which
differences are likely to be real, which are explicable by chance, and which can be explained by
known differences in study characteristics. As the precision of estimates typically increases with the
quantity of data, meta-analysis may have more power to detect real differences in test accuracy
between tests than single studies, and may yield more precise estimates of expected sensitivity and
specificity. Also, by quantifying the variability of test accuracy across many settings, meta-analysis
may provide insights into the consistency of test results. Meta-analysis models also provide a
framework for comparing the accuracy of tests which have not directly been compared in individual
studies.
10.1.2 When not to use a meta-analysis in a review
Meta-analysis is a powerful tool for summarising study findings, provided that the estimates of test
accuracy in the individual studies are both relevant and unlikely to be biased.
A common criticism of meta-analyses of studies of interventions is that ‘they combine apples with
oranges’, implying that they may mix together estimates from studies which differ in important
ways. This is one of the reasons why Cochrane reviews emphasise the importance of carefully
defining inclusion criteria to identify studies which directly address the review question. In any
analysis it is important to ensure that there are no differences between the studies in terms of the
participants they recruit and the tests which they evaluate which would make the results of the
meta-analysis uninformative. This is particularly important in reviews of test accuracy, as changes
to patient selection criteria will alter the spectrum of disease and non-disease in the population,
which can strongly impact on test accuracy as discussed in Chapter 9.
In addition it is important that the studies that are being combined in an analysis are
methodologically rigorous. Meta-analysis of studies at risk of bias may be seriously misleading. If
bias is present in individual studies, meta-analysis may compound the errors and produce an
erroneous result which may be inappropriately interpreted as having credibility. Meta-analysis
involving regression modelling (see 10.5.3) may be useful to investigate how poor methodological
quality can lead to bias in results.
10.1.3 How does meta-analysis of diagnostic test accuracy differ from meta-analysis of
interventions?
The format of Cochrane DTA reviews allows for greater flexibility for structuring and reporting
meta-analysis than is available in Cochrane Intervention reviews, and requires use of external
statistical software. These differences arise for five main reasons:
1) Diagnostic test accuracy reviews can have diverse aims and address different types of question
(as outlined in 10.1.4 below). Different comparisons and multiple aims may be addressed in a
single review, often using data from the same studies in several analyses. To provide the
flexibility needed RevMan requires separate steps of organising data entry and specifying
analyses, unlike in Cochrane reviews of interventions where the two stages are combined. Thus
there is a need to develop both an appropriate data structure and a clear analysis plan.
2) Evaluating test accuracy requires knowledge of two quantities, the test sensitivity and specificity.
Meta-analysis methods for diagnostic test accuracy thus have to deal with two summary
statistics simultaneously rather than one (as is the case for reviews of interventions).
3) A meta-analysis of diagnostic test accuracy has to allow for the trade-off between sensitivity and
specificity that occurs between studies that vary in the threshold value used to define test
positives and test negatives (see 10.2.4). Meta-analysis methods have been devised to enable
studies to be combined that have used a test(s) at different thresholds, a common occurrence in
many diagnostic test systematic reviews.
4) Heterogeneity is to be expected in results of test accuracy studies, thus random effects models
are required to describe the variability in test accuracy across studies (see 10.4.3).
5) Methods for undertaking analyses which account for both sensitivity and specificity, the
relationship between them, and the heterogeneity in test accuracy, require fitting hierarchical
random effects models, which is beyond the analytical abilities of RevMan. Although
exploratory analyses can be undertaken in RevMan, the definitive analyses need to be
undertaken in commercial software packages and sophisticated statistical programming
environments such as SAS, Stata, S-Plus, R, MLwiN or WinBUGS/OpenBUGS, for which
collaboration with a statistical expert is highly recommended.
10.1.4 Questions which can be addressed in DTA analyses
There are three main types of question that can be addressed in a Cochrane DTA analysis concerning
the accuracy of a test. The question types are mirrored as different options in the DTA module in
RevMan when creating analysis definitions.
10.1.4.1 What is the accuracy of a test?
Such an analysis is restricted to characterising the accuracy of a single test, and aims either to
estimate an average summary value of sensitivity and specificity or to describe how sensitivity and
specificity vary with changing threshold by estimating a summary ROC curve. Which approach is
used will depend on the nature of the test, and the variability in thresholds across the studies, which
is discussed in more detail in 10.4.1.
10.1.4.2 How does the accuracy vary with clinical and methodological characteristics?
Planned investigations of heterogeneity examine whether the observed test accuracy varies
between studies according to characteristics associated with the tests, settings, participants or
methodology of the studies. For purposes of graphical presentation it is best for the characteristic
variable to group studies in categories. However, meta-regression models allow investigation of the
relationship of accuracy to both categorical and continuous covariates, such as disease prevalence or
test threshold. Differences both in key parameters of summary ROC curves and in summary
sensitivity-specificity points can be investigated.
10.1.4.3 How does the accuracy of two or more tests compare?
Comparison of the accuracy of tests is an important part of a Cochrane DTA review, as it identifies
which test (or tests) yields superior test accuracy. It is possible to compare multiple tests in a single
analysis; there is no general restriction to comparing only pairs of tests, although it is often helpful
to structure comparisons of multiple tests as a series of pairwise comparisons (bearing in mind
problems caused by making excessive numbers of multiple comparisons). Methodologically,
comparing two tests can be considered as a form of subgroup analysis, with the studies evaluating
each test placed in a separate subgroup, so the same statistical modelling techniques are used as for
investigating sources of heterogeneity. However, there is an important consideration to be made
about the studies to be included in each pairwise comparison of two tests: whether all studies
should be included, or whether the comparison should be restricted to only those which make direct
comparisons themselves, either by testing all patients using all tests or by randomizing patients to
different tests.
10.1.5 Planning the analysis
Undertaking meta-analyses for a Cochrane DTA review involves first developing an analysis plan
and creating a series of analysis definitions in RevMan. Some of these decisions can be made at
protocol stage (see 10.1.6), others only after the data have been extracted from the papers. The
planning stages can be organised as follows:
• Clearly specifying the main questions which need answering, concerning which tests require
estimates of test accuracy, and which tests should be compared with each other.
• Detailed planning of the way in which comparisons will be made, identifying the different tests
or groups of tests which can be compared, the multiple and pairwise comparisons that will be
made, and the studies and data that will be included in each analysis. A decision to consider
here is whether comparative analyses should include all studies, or be restricted to those studies
that evaluate both tests.
• Covariates for any heterogeneity analyses similarly need to be specified and coded.
• From these a list of the planned main analyses, test comparisons and heterogeneity analyses will
be produced. The quantity of data that are available for each analysis should be determined to
guide the choice of analysis method, and to assess whether adequate data are available for
planned heterogeneity analyses. An analysis definition can be created in RevMan for each, and
outlines of major results tables created.
• Plotting the results on forest plots and ROC plots using the functions in RevMan will familiarise
the review author with the location and variability of the study results.
• A strategy needs to be specified to deal with the mixed reporting of thresholds that may occur
across studies. A key issue is deciding whether an analysis should be restricted to studies that
share a common threshold value (which allows estimation of the summary sensitivity and
specificity of a test at that threshold) or to include all studies regardless of threshold value
(which allows estimation of summary ROC curves but compromises the interpretation of
sensitivity and specificity points). This will be informed by information about the thresholds at
which the tests were evaluated in the primary studies, and knowledge of how the tests are
applied in clinical practice.
• Once this analysis plan has been determined data must be exported from RevMan to the chosen
statistics package, and appropriate models fitted. Results must be collated and tabulated as
required, and parameter estimates for average sensitivity and specificity points and summary
ROC curves copied back into the RevMan graphics to produce final graphical output.
10.1.6 Writing the analysis section of the protocol
As the analysis will to some extent depend on the type and quantity of data that are located through
the literature search, it is often not possible to fully specify the analysis at the protocol stage.
However, certain aspects must be pre-defined, and analysis strategies included where full details
cannot be provided. Developing a protocol prior to reviewing the studies adds scientific credibility
to the review, aiming to ensure that decisions made during the analysis are not data driven, in that
analytical options are not selected in order to manipulate the findings. It also ensures that there is
a clear plan for the collection and processing of data, which will inform the data extraction process
and ensure that the analyses done will address the aims of the review.
A level of familiarity with key statistical summary measures should be presumed when writing a
protocol. For example, it is not necessary to define summary measures such as sensitivity and
specificity, likelihood ratios, etc. Similarly, it is not necessary to include explanations of the
meta-analytical methods used if they are those described in the Handbook. This chapter of the
Handbook should be cited if any definitions and explanations are thought necessary. Where
non-standard methods are required, these should be described and their use justified.
Key issues which need to be stated are:
• Definitions of key criteria, such as disease (specifying any binary classifications required) and
categorisations of positive and negative test results. Where there are several possible options a
strategy needs to be provided as to how a definition will be made, and plans for sensitivity
analyses (10.6.1) included in order to investigate the robustness of the decisions made. Rules
for handling known categories of indeterminate test results should be pre-stated where
possible.
• A strategy needs to be included for handling multiple thresholds for test positivity,
pre-specifying, if possible, any common thresholds at which summary estimates of sensitivity and
specificity will be obtained (see 10.4.1).
• Approaches to modelling need to be outlined. In some cases, it may not be possible to specify in
advance whether the modelling will focus on summary points and/or curves as this will be
determined by how studies report their results. In this situation, reviewers should make it clear
how they will make this decision once the data are available (see 10.4.1). The software that will
be used for analysis should be stated (see 10.5.5).
• It needs to be stated clearly whether all studies will be included in test comparisons, whether
comparisons will be based on paired data only, or whether both will be presented. If both, it
needs to be clear which will be the primary analysis. Again, numbers of studies may affect the
original intent (see 10.5.4).
• Planned investigations of heterogeneity should be outlined, stating covariate codings if known,
and the approaches used for building models (see 10.5.3).
• Plans, if any, for investigating reporting biases should be outlined (see 10.6.3).
Any deviations from the protocol should be documented in the ‘Differences between protocol and
review’ section at the end of the review.
10.2 Key concepts
10.2.1 Disease status
For the purposes of this Handbook, the accuracy of a diagnostic or screening test will be assessed by
measures of the test’s ability to detect the presence of disease. The true disease status of each
individual will be considered as binary (dichotomous), diseased and not diseased. Although this
represents a simplification of the reality of diagnosis, the vast majority of available methodology for
the assessment of diagnostic and screening tests is predicated on the assumption of a dichotomous
true disease status. Where there are alternatives for dichotomisation of disease status, binary
categorisations which relate to decision-making options used in clinical practice should be chosen to
ensure that the review will inform clinical decision-making. Where no consensus exists,
alternative categorisations may be investigated in sensitivity analyses (10.6.1).
Statistical methodology for modelling test accuracy for multiple disease categories is being
developed, but it is currently at a developmental stage and not ready for inclusion in Cochrane
reviews of diagnostic test accuracy (see 10.6.4).
10.2.2 Types of test data
Systematic reviews of diagnostic and screening test accuracy involve test results of one or more of
the following three data types:
• Binary (dichotomous), in which the test result is reported as a yes or no, positive or negative.
• Ordinal, in which the test result is reported on a set of ordered categories, often with verbal
descriptors, such as 1=definitely normal, 2=presumably normal, 3=equivocal, 4=presumably
abnormal, 5=definitely abnormal.
• Continuous or Count, in which the test result is reported on a continuous scale or as a count,
such as the concentration of a substance or the number of features observed.
Many ordinal and binary categorizations arise, or can be conceptualized as arising, from underlying
continuous measurements by application of one or more thresholds. For example, laboratory tests
that report results as positive or negative typically involve a numerical measurement which is
categorized according to a pre-stated threshold, whereas imaging tests may report an ordinal grade
for the certainty of the presence of a feature or the stage of disease progression.
To be included in a meta-analysis, ordinal, count or continuous test results need to be recategorized
as binary by selecting a threshold and presenting the data as a 2x2 table. The issue of choice of such
positivity thresholds and examination of accuracy at several thresholds is discussed in 10.2.4 and
10.4.1.
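As an illustration of this recategorization, the following Python sketch cross-classifies ordinal test
results against the reference standard at a chosen threshold. It is a minimal sketch rather than a
recommended implementation: the function name dichotomise, the example grades, and the convention
that a result at or above the threshold counts as positive are all assumptions made for the example.
The cell labels a, b, c and d follow Table 10.1 in section 10.2.3.

```python
from typing import NamedTuple, Sequence


class TwoByTwo(NamedTuple):
    """Cell counts of a 2x2 table, labelled as in Table 10.1."""
    a: int  # true positives
    b: int  # false positives
    c: int  # false negatives
    d: int  # true negatives


def dichotomise(results: Sequence[float], diseased: Sequence[bool],
                threshold: float) -> TwoByTwo:
    """Cross-classify test results against disease status at one threshold.

    Assumption: a result at or above the threshold is test positive.
    """
    if len(results) != len(diseased):
        raise ValueError("results and diseased must have the same length")
    a = b = c = d = 0
    for value, has_disease in zip(results, diseased):
        positive = value >= threshold
        if positive and has_disease:
            a += 1
        elif positive:
            b += 1
        elif has_disease:
            c += 1
        else:
            d += 1
    return TwoByTwo(a, b, c, d)


# Hypothetical ordinal grades (1 = definitely normal ... 5 = definitely abnormal)
grades = [5, 4, 4, 2, 3, 1, 2, 4, 1, 1]
status = [True, True, True, True, False, False, False, False, False, False]

table_at_4 = dichotomise(grades, status, threshold=4)
table_at_3 = dichotomise(grades, status, threshold=3)
print(table_at_4)
print(table_at_3)
```

In this hypothetical data set, lowering the threshold from grade 4 to grade 3 moves one non-diseased
individual from test negative to test positive, so the same study yields a different 2x2 table at each
threshold.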
10.2.3 Analysis of a primary test accuracy study
This section defines summary statistics for test accuracy commonly used in reports of primary
studies.
Having chosen a particular threshold for test positivity, the data from a primary study can be
presented in a 2x2 table showing the cross-classification of disease status (result of the reference
standard) and test outcome (result of the index test) as in Table 10.1. For simplicity, throughout this
chapter we refer to those with and without the target condition as defined by the reference
standard as diseased and non-diseased, accepting that those without the target condition may well
have other diseases.
Table 10.1 2x2 cross-classification of test results and disease status
Test outcome (index test)    Disease status (reference standard result)
                             Diseased (D+)             Non-diseased (D-)         Total
Index test positive (T+)     True positives (a)        False positives (b)       Test positives (a+b)
Index test negative (T-)     False negatives (c)       True negatives (d)        Test negatives (c+d)
Total                        Disease positives (a+c)   Disease negatives (b+d)   N (a+b+c+d)
Study-specific as well as summary measures of test accuracy are then computed either as
proportions of those disease positive or negative (in statistical terms, these are statistics that are
conditional on the disease status) or test positive or negative (these are statistics that are conditional
on the index test result) as described below.
10.2.3.1 Sensitivity and Specificity
Sensitivity and specificity are measures defined conditional on the disease status as they are
computed as proportions of the number diseased and the number non-diseased respectively.
The sensitivity of a test is defined as the probability that the index test result will be positive in a
diseased case. Formally, sensitivity = P(T+|D+) and is estimated using the numbers from the table as
a/(a+c). Sensitivity is sometimes referred to as Detection Rate (DR), True Positive Rate (TPR) or True
Positive Fraction (TPF). It is expressed either as a proportion or a percentage.
The specificity of a test is defined as the probability that the index test result will be negative in a
non-diseased case. Formally, specificity = P(T-|D-) and is estimated using the numbers from the table
as d/(b+d). Specificity is occasionally referred to as the True Negative Rate (TNR) or True Negative
Fraction (TNF). More often, the terms False Positive Rate (FPR) and False Positive Fraction (FPF) are
used for the complement of specificity (computed as 1 - specificity or b/(b+d)). Again, both
proportions and percentages are used.
Although the terms true positive fraction and false positive fraction are both technically more
correct because sensitivity and specificity are fractions and not rates, true positive rate and false
positive rate are the terms in most common usage and will be used in this Handbook.
The values of sensitivity and specificity are occasionally combined in a measure known as Youden’s
Index computed as sensitivity + specificity - 1. Youden’s Index has no direct probabilistic
interpretation but provides a general index of test accuracy which gives equal weight to test errors
(false negatives and false positives). Values close to 1 indicate high accuracy; a value of zero is
equivalent to uninformed guessing and indicates that a test has no diagnostic value.
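The definitions in this section can be checked numerically. The short Python sketch below uses
hypothetical cell counts (not taken from any study in this Handbook), labelled a, b, c and d as in
Table 10.1. The function names are illustrative assumptions and do not correspond to RevMan or any
other package.

```python
def sensitivity(a: int, c: int) -> float:
    """Estimate P(T+|D+) as a / (a + c)."""
    if a + c == 0:
        raise ValueError("no diseased individuals in the table")
    return a / (a + c)


def specificity(b: int, d: int) -> float:
    """Estimate P(T-|D-) as d / (b + d)."""
    if b + d == 0:
        raise ValueError("no non-diseased individuals in the table")
    return d / (b + d)


def youden_index(a: int, b: int, c: int, d: int) -> float:
    """Youden's Index: sensitivity + specificity - 1."""
    return sensitivity(a, c) + specificity(b, d) - 1.0


# Hypothetical study: 100 diseased and 200 non-diseased individuals
a, b, c, d = 90, 20, 10, 180

sens = sensitivity(a, c)            # 90 / 100
spec = specificity(b, d)            # 180 / 200
false_positive_rate = 1.0 - spec    # equivalently b / (b + d)
youden = youden_index(a, b, c, d)

print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
print(f"false positive rate = {false_positive_rate:.2f}, Youden's Index = {youden:.2f}")
```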
10.2.3.2 Predictive values
Predictive values are measures defined conditional on the index test results as they are computed as
proportions of the total with positive and negative index test results.
The positive predictive value of a test is defined as the probability that a case with a positive index
test result is diseased. Formally, positive predictive value = P(D+|T+) and is estimated using the
numbers from the table as a/(a+b). Again, positive predictive values are reported either as
proportions or percentages.
The negative predictive value of a test is defined as the probability that a case with a negative index
test result is non-diseased. Formally, negative predictive value = P(D-|T-) and is estimated using the
numbers from the table as d/(c+d). Again, negative predictive values are reported either as
proportions or percentages.
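For comparison with the measures that condition on disease status, the following Python sketch
estimates the two predictive values from the rows of a 2x2 table. The counts are hypothetical and the
function names are assumptions made for the illustration only.

```python
def positive_predictive_value(a: int, b: int) -> float:
    """Estimate P(D+|T+) as a / (a + b)."""
    if a + b == 0:
        raise ValueError("no test positives in the table")
    return a / (a + b)


def negative_predictive_value(c: int, d: int) -> float:
    """Estimate P(D-|T-) as d / (c + d)."""
    if c + d == 0:
        raise ValueError("no test negatives in the table")
    return d / (c + d)


# Hypothetical counts, labelled as in Table 10.1
a, b, c, d = 90, 20, 10, 180

ppv = positive_predictive_value(a, b)   # 90 / 110
npv = negative_predictive_value(c, d)   # 180 / 190

print(f"positive predictive value = {ppv:.3f}")
print(f"negative predictive value = {npv:.3f}")
```

Note that these estimates use the row totals (a+b and c+d) as denominators, whereas sensitivity and
specificity use the column totals (a+c and b+d).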
10.2.3.3 Likelihood ratios
Likelihood ratios can be used to update the pre-test probability of disease using Bayes’ theorem,
once the test result is known. The updated probability is referred to as the post-test probability. For
a test that is informative, the post-test probability should be higher than the pre-test probability if
the test result is positive, whereas the post-test probability should be lower than the pre-test
probability if the test result is negative. Considerations about the use of likelihood ratios in
systematic reviews of test accuracy are explained in Chapter 11.
The positive likelihood ratio describes how many times more likely positive index test results were in
the diseased group compared to the non-diseased group. The positive likelihood ratio, which should
be greater than 1 if the test is informative, is defined as:
LR+ = P(T+|D+)/P(T+|D-) = sens/(1 - spec), and is estimated as (a/(a+c))/(b/(b+d)).
The negative likelihood ratio describes how many times less likely negative index test results were in
the diseased group compared to the non-diseased group. The negative likelihood ratio, which should
be less than 1 if the test is informative, is defined as:
LR- = P(T-|D+)/P(T-|D-) = (1 - sens)/spec, and is estimated as (c/(a+c))/(d/(b+d)).
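The updating of a pre-test probability can be sketched in Python using the odds form of Bayes’
theorem, in which post-test odds equal pre-test odds multiplied by the likelihood ratio. The cell
counts, the pre-test probability of 0.20 and the function names below are assumptions chosen for the
illustration; tables with zero cells are not handled.

```python
def likelihood_ratios(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Return (LR+, LR-) estimated from the cells of a 2x2 table."""
    sens = a / (a + c)
    spec = d / (b + d)
    lr_positive = sens / (1.0 - spec)
    lr_negative = (1.0 - sens) / spec
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Apply Bayes' theorem in odds form and convert back to a probability."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical counts, labelled as in Table 10.1
a, b, c, d = 90, 20, 10, 180
lr_pos, lr_neg = likelihood_ratios(a, b, c, d)

pre_test = 0.20  # assumed pre-test probability of disease
post_pos = post_test_probability(pre_test, lr_pos)
post_neg = post_test_probability(pre_test, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
print(f"post-test probability after a positive result = {post_pos:.3f}")
print(f"post-test probability after a negative result = {post_neg:.3f}")
```

With these hypothetical values the test is informative in the sense described above: the post-test
probability rises above the pre-test probability after a positive result and falls below it after a
negative result.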
10.2.3.4 Diagnostic odds ratios
The diagnostic odds ratio (DOR) summarises the diagnostic accuracy of the index test as a single number that describes how many times higher the odds are of obtaining a test positive result in a diseased rather than a non-diseased person. The fact that it summarises test accuracy in a single number makes it easy to use this measure for meta-analysis as described in 10.5.1, but expressing accuracy in terms of ratios of odds means the measure has little direct clinical relevance, and it is rarely used as a summary statistic in primary studies. In fact, the clinician is usually interested in the sum of the number of false negative and false positive results whereas the DOR reflects their product. The DOR does, however, remain an important element in meta-analytic model building (see 10.5). It is formally defined as:
DOR = LR+/LR- = (sens × spec)/((1 - sens) × (1 - spec)), and is estimated as (ad)/(bc).
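The estimators given in this section can be collected into one small function. The sketch below is an illustration only: it assumes the cell labelling implied by the formulas above (a = true positives, b = false positives, c = false negatives, d = true negatives), the function name is hypothetical, the counts are invented, and no correction is made for zero cells.

```python
def accuracy_measures(a: int, b: int, c: int, d: int) -> dict:
    """Accuracy measures from a 2x2 table.

    Assumed cell labelling: a = true positives, b = false positives,
    c = false negatives, d = true negatives. Zero cells are not handled.
    """
    sens = a / (a + c)
    spec = d / (b + d)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": a / (a + b),               # P(D+|T+)
        "npv": d / (c + d),               # P(D-|T-)
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
        "dor": (a * d) / (b * c),
    }


# Invented counts for illustration: 100 diseased and 100 non-diseased participants.
measures = accuracy_measures(a=70, b=10, c=30, d=90)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the ratio LR+/LR- and the cross-product (ad)/(bc) agree, as the definition of the DOR requires.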
10.2.4 Positivity thresholds
Binary test outcomes are defined on the basis of a threshold for test positivity and change if the threshold is altered. This dependence on threshold is a fundamental aspect of diagnostic test evaluation. In the case of test sensitivity and specificity, the dependence induces a trade-off between the two quantities, one value increasing whilst the other decreases as the threshold for positivity is moved. This is illustrated in the panels in Figure 10.1, which each show the same hypothetical distributions of test results for diseased and non-diseased individuals on a continuous scale. The panels vary in the numerical value of the disease threshold used to define test positive. At each threshold, the sensitivity of the test is measured by the proportion of the area under the 'diseased' curve to the right of the threshold. Similarly, the specificity is measured by the proportion of the area under the 'non-diseased' curve to the left of the threshold. As the threshold decreases from panel (a) to panel (e), the proportion of those with disease who are above the threshold and hence have a positive test increases from 69% to 99%. These figures give the sensitivity of the test. At the same time the proportion of those without disease who are below the threshold and hence have a negative test result decreases from 99% to 69%. These figures give the specificity of the test.
Throughout this chapter relationships of test performance are described presuming that higher test results are consistent with disease being present and lower test results are consistent with disease being absent. If lower measures of the test quantity indicate disease, the relationships would be reversed.
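The trade-off described above can be reproduced numerically. The sketch below assumes, purely for illustration, that test results are normally distributed with the same standard deviation in both groups; the means (60 and 90), the standard deviation (10) and the five thresholds are hypothetical choices, not values stated in the chapter, picked so that the percentages match those quoted for Figure 10.1.

```python
from statistics import NormalDist

# Hypothetical distributions of the test measurement (assumed, not from the chapter).
non_diseased = NormalDist(mu=60, sigma=10)
diseased = NormalDist(mu=90, sigma=10)


def sensitivity(threshold: float) -> float:
    """Proportion of the diseased distribution above the threshold."""
    return 1.0 - diseased.cdf(threshold)


def specificity(threshold: float) -> float:
    """Proportion of the non-diseased distribution below the threshold."""
    return non_diseased.cdf(threshold)


# Thresholds decreasing from panel (a) to panel (e).
thresholds = [85, 80, 75, 70, 65]
table = [(t, round(100 * sensitivity(t)), round(100 * specificity(t))) for t in thresholds]
for t, sens_pct, spec_pct in table:
    print(f"threshold {t}: sensitivity {sens_pct}%, specificity {spec_pct}%")
```

Lowering the threshold moves area from the false negative to the true positive region of the diseased distribution, and from the true negative to the false positive region of the non-diseased distribution, which is why the two measures move in opposite directions.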

Figure10.1Relationshipbetweensensitivity,specificityandthepositivitythreshold
(a)
(b)
(c)
(d)
(e)
TN FN FP TP
specificity=99% sensitivity=69%
diseasednon-diseased
0 40 80 120 160
test measurement
TN FN FP TP
specificity=98% sensitivity=84%
diseasednon-diseased
0 40 80 120 160
test measurement
TN FNFP TP
specificity=93% sensitivity=93%
diseasednon-diseased
0 40 80 120 160
test measurement
TN FN FP TP
specificity=84% sensitivity=98%
diseasednon-diseased
0 40 80 120 160
test measurement
TN FN FP TP
specificity=69% sensitivity=99%
diseasednon-diseased
0 40 80 120 160
test measurement
10.2.5 ROC curves
Primary studies that evaluate a test at several thresholds sometimes present results as ROC curves. The ROC curve of a test is the graph of the values of sensitivity and specificity that are obtained by varying the positivity threshold across all possible values. The graph plots sensitivity (true positive rate) against 1 - specificity (false positive rate). The curve for any test moves from the point where sensitivity and 1 - specificity are both 1 (the upper right corner), which is achieved for a threshold at the lower end of its range (classifying all participants as test positive, so there are no false negatives but many false positives), to a point where sensitivity and 1 - specificity are both zero (the lower left corner), which is achieved when the threshold moves to the upper end of its range (and all participants are classified as test negative, giving no false positives but many false negatives). The shape of the curve between these two fixed points depends on the discriminatory ability of the test.
Figure 10.1 shows idealised distributions of test results for populations of diseased and non-diseased individuals, with shaded areas showing how the false negative rate (red) and the false positive rate (green) change as the positivity threshold varies. Figure 10.2(a) shows the resulting ROC curve. In practice, the ROC curve is estimated from a finite sample of test results and hence will not necessarily be a smooth curve as shown below. Note that the horizontal axis for each ROC plot in Figure 10.2 is labelled in terms of specificity decreasing from 1.0 to 0.0. This style of labelling is used in RevMan, and is equivalent to the usual labelling (1 - specificity ranging from 0.0 to 1.0).
The position of the ROC curve depends on the degree of overlap of the distributions of the test measurement in diseased and non-diseased. Where a test clearly discriminates between diseased and non-diseased such that there is no or little overlap of distributions, the ROC curve will indicate that high sensitivity is achieved with a high specificity, that is, the curve approaches the upper left hand corner of the graph where sensitivity is 1 and specificity is 1 (Figure 10.2(a)). If the distributions of test results in diseased and non-diseased coincide, the test would be completely uninformative and its ROC curve would be the upward diagonal of the square (Figure 10.2(c)).
The ROC curves shown in Figure 10.2(a)-(c) are all symmetrical about the sensitivity = specificity line (the downward diagonal of the square). It is also possible to get ROC curves which are not symmetrical, as in Figure 10.2(d). Asymmetrical curves typically occur when the distribution of the test measurement in those with disease has more or less variability than the distribution in non-diseased people. Increased variability might occur, for example, where disease may cause a biomarker both to rise and become more erratic; reduced variability might occur where disease may lower biomarker values to a bounding level such as a lower level of detection.
The comparison of tests on the basis of their ROC curves takes into consideration their accuracy across a range of thresholds, and is aided by single summary statistics. Several such measures have been proposed in the literature. Most commonly used among them is the area under the curve (AUC), which equals 1 for a perfect test and 0.5 for a completely uninformative test. The AUC is equal to the probability that if a pair of diseased and non-diseased individuals is selected at random, the diseased individual will have a higher test result than the non-diseased individual. The AUC can also be interpreted as an average sensitivity for the test, taken over all specificity values (or equally as the average specificity over all sensitivity values). Other summaries include partial areas under the curve, values of sensitivity corresponding to selected values of specificity (and vice versa), and optimal operating points, defined according to specified criteria.
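The two interpretations of the AUC can be checked against each other numerically. The sketch below again assumes hypothetical normal distributions with equal standard deviations (means 60 and 90, standard deviation 10; none of these values come from the chapter). It traces the ROC curve by sweeping the threshold, integrates it with the trapezium rule, and compares the result with the closed-form probability that a randomly chosen diseased individual has a higher result than a randomly chosen non-diseased individual, which under this normal assumption is Φ((μ_D - μ_N)/(σ√2)).

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical distributions (assumed for illustration only).
MU_NON_DISEASED, MU_DISEASED, SIGMA = 60.0, 90.0, 10.0
non_diseased = NormalDist(MU_NON_DISEASED, SIGMA)
diseased = NormalDist(MU_DISEASED, SIGMA)


def roc_points(n_thresholds: int = 2001, low: float = 0.0, high: float = 160.0):
    """(1 - specificity, sensitivity) pairs obtained by sweeping the threshold."""
    points = []
    for k in range(n_thresholds):
        t = low + (high - low) * k / (n_thresholds - 1)
        false_positive_rate = 1.0 - non_diseased.cdf(t)
        sens = 1.0 - diseased.cdf(t)
        points.append((false_positive_rate, sens))
    return sorted(points)


def trapezium_auc(points) -> float:
    """Area under a curve given as points sorted by their x-coordinate."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


auc_numeric = trapezium_auc(roc_points())
# P(diseased result > non-diseased result) for two independent normal variables.
auc_closed_form = NormalDist().cdf((MU_DISEASED - MU_NON_DISEASED) / (SIGMA * sqrt(2.0)))
print(f"AUC by integration: {auc_numeric:.4f}; AUC as a probability: {auc_closed_form:.4f}")
```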
Figure10.2ExamplesofROCcurves
(a)
(b)
(c)
(d)
10.2.6 Relationships between ROC curves, diagnostic odds ratios and Q*
There is a useful link between ROC curves and diagnostic odds ratios which is important to appreciate to understand the way in which meta-analytical models are constructed. For the symmetric ROC curves displayed in Figure 10.3, all points on each curve have a common diagnostic odds ratio. This property arises when the test results in the diseased and non-diseased groups have a particular mathematical distribution known as a logistic distribution with equal variance in both groups. For example, a ROC curve with a diagnostic odds ratio of 21 would go through the (sensitivity, specificity) points of (0.70, 0.90), (0.82, 0.82) and (0.90, 0.70). Thus one way of summarising a symmetric ROC curve is by the value of the diagnostic odds ratio. Where ROC curves are asymmetric, the diagnostic odds ratio is not constant across the whole length of the curve but increases (or decreases) systematically with increasing threshold, and the curve can be mathematically described by noting how the diagnostic odds ratio changes with threshold, or a quantity related to threshold.
[Figure 10.2, panels (a) to (d): each panel shows the distributions of the test measurement (0 to 120) in non-diseased and diseased individuals, together with the corresponding ROC curve plotting sensitivity (0.0 to 1.0) against specificity.]
These relationships are not used in primary studies of tests, but form the basis of the ROC-based meta-analytical models of test accuracy described in 10.4 and 10.5 below.
ROC curves are sometimes described by quoting a point known as Q*, where the ROC curve intersects the downward diagonal shown in Figure 10.3. By definition, at this point the sensitivity and specificity values are equal. The use of Q* values is discouraged in Cochrane reviews as they often give the wrong impression of the accuracy, particularly if SROC curves are asymmetric, or the study points lie away from the downward diagonal of the sensitivity = specificity line.
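The constant-DOR property of a symmetric ROC curve can be made concrete with a short calculation. Because the DOR is the odds of a positive result in the diseased divided by the odds of a positive result in the non-diseased, fixing the DOR and the specificity determines the sensitivity; and at Q*, where sensitivity equals specificity, the odds of sensitivity equal the square root of the DOR. The function names below are hypothetical, and the sketch applies only to symmetric curves.

```python
from math import sqrt


def sensitivity_on_symmetric_roc(specificity: float, dor: float) -> float:
    """Sensitivity at a given specificity on the symmetric ROC curve with constant DOR."""
    false_positive_odds = (1.0 - specificity) / specificity
    true_positive_odds = dor * false_positive_odds
    return true_positive_odds / (1.0 + true_positive_odds)


def q_star(dor: float) -> float:
    """Point on a symmetric ROC curve where sensitivity equals specificity."""
    root = sqrt(dor)
    return root / (1.0 + root)


# Reproduce the example in the text for a diagnostic odds ratio of 21.
for spec in (0.90, 0.70):
    print(f"specificity {spec:.2f} -> sensitivity {sensitivity_on_symmetric_roc(spec, 21):.2f}")
print(f"Q* for DOR = 21: {q_star(21):.2f}")
```

The Q* value reported here describes only one point of the curve, which is one reason it can mislead when the study points lie away from the sensitivity = specificity line.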
Figure10.3RelationshipbetweenDORandROCcurves
10.3 Graphical and tabular presentation
A Cochrane review of diagnostic test accuracy uses two main forms of graphical display: summary ROC plots and forest plots. Review authors create these figures within RevMan for each analysis that is specified.
10.3.1 Summary ROC plots
Summary ROC plots display the results of individual studies in ROC space, with each study plotted as a single sensitivity-specificity point. The size of points can be controlled to depict the precision of the estimate (typically scaled according to the inverse of the standard error of the logit(sensitivity) and logit(specificity)) or according to their sample sizes. In RevMan it is possible to mark studies as rectangles, with their height relating to the number of diseased (and hence the precision of the sensitivity estimate) and width relating to the number of non-diseased (and hence the precision of the specificity estimate).
Summary ROC plots depict the scatter of the study results. Occasionally 'cross-hairs' are added to each study point to indicate confidence limits for sensitivity and specificity, but this can make the plot very cluttered should there be many studies. This is not implemented in RevMan. Even if they depict the precision of the estimates from individual studies, it is difficult to gauge visually a sense of random variability versus heterogeneity.
[Figure 10.3: symmetric ROC curves for DOR = 1 (uninformative test), 2, 5, 16, 81 and 361, with the line of symmetry marked; sensitivity (0 to 1.0) is plotted against specificity (1.0 to 0).]
Two types of meta-analytical summary can be added to the graph: summary ROC (SROC) curves and summary sensitivity and specificity points. Confidence regions for the summary sensitivity and specificity points can be included, as can prediction regions which give an indication of between-study heterogeneity (see also 10.5.2.1).
Studies can also be plotted using different symbols or colours to indicate attribution to different subgroups for investigations of heterogeneity or for test comparisons.
10.3.2 Linked ROC plots
Linked ROC plots are used in analyses of pairs of tests, where both tests have been evaluated in each study. The points are plotted as in a normal summary ROC plot, but the two estimates (one for each test) from each study are joined by a line. It is thus possible to get a sense of the change in accuracy within study between the tests, and to note the degree of consistency in this change. Summary estimates of sensitivity and specificity for each test, as well as summary ROC curves obtained from meta-analysis, can be added to these plots (see 10.5.4.5 for an example plot).
10.3.3 Coupled forest plots
Forest plots for diagnostic test accuracy report the number of true positives and false negatives in diseased and true negatives and false positives in non-diseased participants in each study, and the estimated sensitivity and specificity, together with confidence intervals. The plots are known as coupled forest plots as they contain two graphical sections: one depicting sensitivity, and one specificity. The order of the studies can be varied; often they are presented sorted by values of sensitivity, or grouped by test type or covariate values. Whilst it is possible to observe heterogeneity in sensitivity and specificity individually on such plots, it is not as easy to visualise whether there are threshold-like relationships. Summary statistics computed from meta-analyses are rarely added to coupled forest plots. In Cochrane DTA reviews an archive of coupled forest plots for all the tests for which data are entered into RevMan is published with the review to make the 2×2 tables widely accessible.
10.3.4 Example 1: Anti-CCP for the diagnosis of rheumatoid arthritis - Descriptive Plots
These data are taken from a review (Nishimura 2007) of anti-cyclic citrullinated peptide antibody (anti-CCP). The reference standard was based on the 1987 revised American College of Rheumatology (ACR) criteria or clinical diagnosis. Thirty-seven studies were included in the meta-analysis and their sensitivities and specificities are shown on the forest plot, and the study-specific estimates are also shown in a scatterplot in ROC space below.
The forest plot below shows the studies in alphabetical order. The figure gives the numbers for the 2×2 table (TP, FP, FN, TN) for each study which will form the basis for statistical analyses. Study-specific estimates of sensitivity and specificity are shown, with their 95% confidence intervals. These estimates (and confidence intervals) are also shown graphically. The most striking feature of this figure is the greater uncertainty (indicated by the confidence interval width) and variability (indicated by the scatter of point estimates) in sensitivity than specificity. The studies can be ordered in different ways (e.g. in increasing order of sensitivity) to provide a visual representation of any association between sensitivity and specificity. The figure also includes information on a covariate, the CCP generation, which may be associated with heterogeneity in test accuracy. (This will be explored in 10.5.3.1).
The ROC scatterplot shown below also shows the greater variability in estimated sensitivity than specificity across studies. Covariate information (e.g. generation of CCP) can be used to distinguish between studies in different subgroups (e.g. CCP1 vs CCP2), and ROC curves can be superimposed for a descriptive analysis. However, a more formal statistical analysis is required to provide summary estimates of test accuracy and to explore heterogeneity. These will be covered in 10.5.3.2 and 10.5.3.5. Before proceeding to these statistical analyses, the review author must decide whether it is appropriate to focus on a summary point(s) or a summary curve(s) in the statistical analyses that follow. This will be determined by the threshold(s) used by the studies to define a positive test result (see 10.5.2).
10.3.5 Tables of results
Review authors need to construct additional tables to report results from their meta-analytical models. Unlike for Cochrane intervention reviews, this output is not automatically included in the review document. Authors might consider creating tables for the following purposes:
• To report the numbers of studies and individuals available for each of the key analyses.
• To report the estimates of diagnostic accuracy for each of the tests.
• To report statistics of comparative accuracy and tests of statistical significance for the pairwise comparisons between tests (a half-matrix display of all possible pairwise comparisons may be useful). Separate tables for direct and uncontrolled comparisons may be needed (see 10.5.4).
• To report results of investigations of heterogeneity, including estimates of test accuracy in subgroups, summary statistics of comparative accuracy and tests of statistical significance (see 10.5.3).
• To report results of sensitivity analyses (see 10.6.1).
This list is not exhaustive, and authors should use their inspiration to identify the best ways of communicating the results of their analyses.
Cochrane DTA reviews also include Summary of Results tables which are described in Chapter 11.
10.4 Meta-analytical summaries
Meta-analysis aims to compute and compare estimates of the expected diagnostic accuracy of a test and investigate the variability of results between studies. A choice needs to be made of which summary statistics are to be computed. In Cochrane reviews the choice is between estimating expected values of sensitivity and specificity for the test at a common threshold (referred to as the average operating point), or estimating the expected ROC curve for a test across many thresholds (referred to as the summary ROC curve or SROC curve). Other summary statistics (such as likelihood ratios at the summary point and area(s) under the curve) can be computed from these summaries should they be required to assist interpretation and application of the results (see Chapter 11).
10.4.1 Should I estimate a SROC curve or a summary point?
In a systematic review it is likely that the collected data will be at a mixture of different positivity thresholds. Whilst for some tests there is consensus on what value the positivity threshold should take, more often tests are evaluated at different thresholds in different studies. Presentation of results at multiple thresholds within a single study is also encountered, with some studies presenting estimates of ROC curves (see 10.2.5) which depict the accuracy of the test at all possible thresholds. In addition, selective reporting of thresholds identified to optimise test accuracy can introduce bias if they are selected in a data-driven manner (Leeflang 2008).
A key principle underlying the choice of statistical summary in meta-analysis of test accuracy is that the sensitivity and specificity of a test will vary as the positivity threshold varies, as graphically depicted using a ROC curve (see 10.2.5). It is important to note that the hierarchical models recommended for meta-analysis for Cochrane DTA reviews account for correlation between sensitivity and specificity observed across studies which is due to the functional relationship between sensitivity and specificity as the threshold varies within each study. This occurs regardless of whether a summary ROC curve or a summary point is the output of choice.
A review author needs to decide whether they will use all the studies available to estimate the curve (in which case the meta-analysis will estimate the summary ROC curve) or to estimate a summary sensitivity and specificity point on this curve at a chosen threshold. Estimating summary sensitivity and specificity by pooling studies which mix thresholds will produce an estimate that relates to some notional unspecified average of the thresholds that occur in the included studies, which is clinically unhelpful and must be avoided.
Variation in threshold is highly likely where there is no explicit numerical cut-point and definitions of a test positive are based on judgement rather than measurement. But even when it is possible to define a common cut-point on the basis of a numerical value or a point on a rating scale, it must be acknowledged that there will still remain some variability in the actual threshold between studies through calibration differences between equipment, differences between raters or observers, as well as variation in the implementation of tests. The consequence of such variability will be additional heterogeneity in test results observed at the common cut-point. The summary sensitivity and specificity point will reflect the average observed accuracy, whilst the prediction region will reflect the heterogeneity in how it is applied (see example 10.5.2.2).
Thus the two main strategies to handle mixed and variable thresholds in an analysis are:
• Estimating summary sensitivity and specificity of the test for a common threshold, or at each of several different common thresholds. Each study can contribute to one or more analyses depending on what thresholds it reports. Studies which do not report at any of the selected thresholds are excluded.
• Estimating the underlying ROC curve which describes how sensitivity and specificity trade off with each other as thresholds vary. In this case one threshold per study is selected to be included in the analysis.
The choice of analytical approach will be influenced by the variation of thresholds in the available studies. For example, if there is little consistency in the thresholds used, meta-analyses which restrict to common thresholds will contain very little data, and estimating a summary ROC curve may be preferred. If there is little variation in threshold between studies, attempting to fit a summary ROC curve will be difficult as the points are likely to be too tightly clustered in ROC space.
It is reasonable to estimate both SROC curves and average operating points in a review, as they may complement each other in providing clinically useful summaries, and powerful ways of detecting effects. For example, separate analyses of test data at different thresholds may be used to provide clinically informative estimates of sensitivity and specificity, whereas including all studies to estimate how summary ROC curves depend on covariates or test type will be the most powerful way to test hypotheses and investigate heterogeneity.
10.4.2 Meta-analytical methods not routinely used in Cochrane Reviews
Methods that are not routinely included in Cochrane reviews are commonly encountered in the literature for diagnostic meta-analysis. Separate pooling of sensitivity and specificity estimates fails to account for the trade-off between sensitivity and specificity, which may lead to underestimates of test accuracy (Deeks 2001). Similarly, separate pooling of likelihood ratios ignores correlations between positive and negative likelihood ratios, and theoretically can produce estimates which are impossible (Zwinderman 2008).
Pooling of predictive values is possible using the Bivariate method, but is not recommended as it is known that predictive values depend on prevalence, which is likely to vary between studies. The consequences of this are twofold: firstly, that between-study variation in prevalence may induce greater heterogeneity than is observed for sensitivity and specificity, and secondly, that the average predictive values will relate to use of the test at some average, but unknown, prevalence.
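The dependence of predictive values on prevalence follows directly from Bayes' theorem and can be illustrated with a short calculation. In the sketch below the sensitivity, specificity and prevalences are hypothetical values chosen only to show the size of the effect; the function name is also an assumption.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Positive and negative predictive values implied by sensitivity, specificity and prevalence."""
    true_positive = sensitivity * prevalence
    false_negative = (1.0 - sensitivity) * prevalence
    true_negative = specificity * (1.0 - prevalence)
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_positive / (true_positive + false_positive)
    npv = true_negative / (true_negative + false_negative)
    return ppv, npv


# The same hypothetical test (sensitivity = specificity = 0.90) at three prevalences.
for prevalence in (0.10, 0.30, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

Even with identical sensitivity and specificity in every study, studies conducted at different prevalences would report quite different predictive values, which is the source of the additional heterogeneity noted above.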
10.4.3 Heterogeneity
Heterogeneity is to be expected in meta-analyses of diagnostic test accuracy. A consequence of this is that meta-analyses of test accuracy studies tend to focus on computing average rather than common effects. In systematic reviews of interventions it is sometimes noted that the estimates of the effect of the intervention in the different studies are very similar, the differences between them being small enough to be explicable by chance. In such situations it is appropriate to use a fixed-effect approach to meta-analysis that estimates the underlying common effect (and is interpreted as the actual effect of the intervention). In test accuracy reviews large differences are commonly noted between studies, too big to be explained by chance, indicating that actual test accuracy varies between the included studies, or that there is heterogeneity in test accuracy. Random-effects meta-analysis methods are recommended when data are heterogeneous; these focus on providing an estimate of the average accuracy of the test, and describing the variability in this effect. In Cochrane DTA reviews, heterogeneity is presumed to exist and random-effects models are fitted by default, only simplified to fixed-effect models where there are too few studies to estimate between-study variability, or analysis demonstrates that fixed effects are appropriate.
Univariate tests for heterogeneity in sensitivity and specificity and the estimates of the I² statistic (Higgins 2003) are not routinely used in Cochrane DTA reviews as they do not account for heterogeneity explained by phenomena such as positivity threshold effects. If in a meta-analysis there is variation in threshold, what is of importance is the degree to which the observed study results lie close to the summary ROC curve, not how scattered they are in ROC space. The magnitude of observed heterogeneity is best depicted graphically, where such relationships can be observed by the scatter of points and from the prediction ellipse. The numerical estimates of the random-effect terms in the hierarchical models do quantify the amount of heterogeneity observed, but are not easily interpreted as they represent variation in parameters expressed on log odds scales.
10.5 Model fitting
10.5.1 Moses-Littenberg SROC curves (RevMan)
The Moses-Littenberg method (Moses 1993) (Littenberg 1993) provides a simple model for deriving a SROC. It was one of the earliest models to be proposed and has been used extensively in meta-analyses of diagnostic test accuracy. It is more akin to a fixed-effect than a random-effects model, as it does not provide estimates of the heterogeneity between studies. Even though it has been superseded by more complex hierarchical models that properly allow for random effects in diagnostic test accuracy, the Moses-Littenberg model is used in RevMan to provide reviewers with the facility to undertake purely exploratory analyses based on SROC curves without needing to export data out of RevMan. Because of the limitations of the Moses-Littenberg method, RevMan does not provide parameter estimates or standard errors from this model, as inferences should be based on hierarchical models that take separate account of within-study sampling error and additional unexplained heterogeneity between studies.
A brief description of the Moses-Littenberg method is provided here to explain how the SROC curves produced by RevMan are derived. The method proceeds in three steps:
(i) The pairs of sensitivity and specificity estimates from each study are transformed onto the log odds (logit) scale to compute
$D = \mathrm{logit}(\text{sensitivity}) - \mathrm{logit}(1 - \text{specificity})$, and
$S = \mathrm{logit}(\text{sensitivity}) + \mathrm{logit}(1 - \text{specificity})$,
where D is the natural logarithm of the diagnostic odds ratio (lnDOR) and S is a quantity related to the overall proportion of positive test results. S can be considered as a proxy for test threshold since S will increase as the overall proportion of test positives, in the diseased and non-diseased groups, increases. The relationship between D and S is expected to be linear.
(ii) The simple linear regression model
$D = \alpha + \beta S + \text{error}$
characterizes how test accuracy, as measured by the diagnostic log odds ratio (D), varies with S, a proxy of the positivity threshold across studies.
(iii) The estimates of $\alpha$ and $\beta$ are then used to obtain the estimated sensitivity across a chosen range of possible values of specificity using
$E(\text{sensitivity}) = \left[1 + \exp\left(-\dfrac{\alpha + (1 + \beta)\,\mathrm{logit}(1 - \text{specificity})}{1 - \beta}\right)\right]^{-1}$.
This will provide the estimated SROC curve in the original ROC coordinates. The range of specificities over which the curve is drawn is usually confined to the range observed in the data to avoid extrapolation.
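The three steps can be sketched in Python as follows. This is a minimal, unweighted illustration intended only to show the mechanics; it is not the RevMan implementation, the function names are hypothetical, and the three (sensitivity, specificity) pairs are constructed to lie exactly on a symmetric curve with DOR = 21 so that the expected answer (α = ln 21, β = 0) is known in advance. The curve formula in step (iii) follows from solving D = α + βS for logit(sensitivity).

```python
from math import exp, log


def logit(p: float) -> float:
    return log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + exp(-x))


def moses_littenberg_fit(pairs):
    """Steps (i) and (ii): unweighted least squares regression of D on S.

    pairs: iterable of (sensitivity, specificity), all strictly between 0 and 1.
    Returns (alpha, beta).
    """
    d = [logit(se) - logit(1.0 - sp) for se, sp in pairs]
    s = [logit(se) + logit(1.0 - sp) for se, sp in pairs]
    n = len(d)
    s_mean, d_mean = sum(s) / n, sum(d) / n
    sxx = sum((x - s_mean) ** 2 for x in s)
    sxy = sum((x - s_mean) * (y - d_mean) for x, y in zip(s, d))
    beta = sxy / sxx
    alpha = d_mean - beta * s_mean
    return alpha, beta


def sroc_sensitivity(specificity: float, alpha: float, beta: float) -> float:
    """Step (iii): expected sensitivity at a given specificity (requires beta != 1)."""
    return expit((alpha + (1.0 + beta) * logit(1.0 - specificity)) / (1.0 - beta))


# Constructed example: three points that all have a diagnostic odds ratio of 21.
pairs = [(0.70, 0.90), (0.90, 0.70), (0.50, 21.0 / 22.0)]
alpha, beta = moses_littenberg_fit(pairs)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
print(f"sensitivity at specificity 0.80: {sroc_sensitivity(0.80, alpha, beta):.3f}")
```

In real data D and S are computed from estimated proportions, so studies with sensitivity or specificity of exactly 0 or 1 would need special handling that is not shown here.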
10.5.1.1 Properties of Moses-Littenberg SROC curve
Figure 10.4 illustrates three possible SROC curves that could arise from the Moses-Littenberg model. All share the same value of α (taken to be 3 for each curve), but with varying β (taken to be -0.35, 0 and 0.35). The point of intersection of all three curves lies on the diagonal where sensitivity = specificity (S = 0). The sensitivity and specificity of the test at this point is also referred to as Q* (see 10.2.6). When β = 0, the curve is symmetric about the diagonal line given by S = 0. The lnDOR is the same (and equal to α) at every point on this symmetric curve since there is no association between accuracy (D) and threshold (S) in the model. However, when β ≠ 0, the curve is not symmetric and the expected accuracy (lnDOR) increases (or decreases) with threshold.
It is possible in some datasets for the estimated value of β to lead to improper SROC curves which do not go through the bottom left (sensitivity = 0, specificity = 1) and top right (sensitivity = 1, specificity = 0) corners of the SROC plot. If β > 1 or β < -1 the estimated SROC curve has the unintuitive property that sensitivity decreases as 1 - specificity (false positive rate) increases. Such situations may arise if there are outlying studies that are influential in determining the slope of the regression line. Excluding the outlier study allows assessment of its influence on the fitted SROC curve. Extreme values of β may also result if there is heterogeneity in test accuracy between subgroups of studies. Such heterogeneity can be explored through subgroup analyses when there are sufficient studies to allow for this.
Figure10.4.SROCcurvesforalternativevaluesofmodelparameters
10.5.1.2 Choice of weights
The regression line can be fitted using the method of weighted least squares (WLS) to account for differences in the sampling error in D between studies by weighting each study by the inverse variance of lnDOR for that study (estimated as $\mathrm{var}(\ln \mathrm{DOR}) = 1/a + 1/b + 1/c + 1/d$, where a, b, c and d represent the cells of the 2×2 table shown in Table 10.1). An alternative approach is to assign equal weight to all studies on the basis that the unexplained heterogeneity in test accuracy between studies is likely to be large compared with the variability due to sampling error (Moses 1993) (Irwig 1995). Both weighted and unweighted (equally weighted) least squares are implemented in RevMan. In practice, both weighting schemes often lead to similar curves.
Neither approach addresses the issue of sampling error in the explanatory variable (S) (violating a basic assumption of linear regression), and neither deals appropriately with additional unexplained heterogeneity in D. Consequently the Moses-Littenberg method for SROC analysis described above is used only for preliminary exploratory analyses and should not be used to compute confidence intervals for summary estimates of test accuracy, or to establish whether differences between subgroups are within the bounds of what we expect to see by chance alone.
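The two weighting schemes differ only in the weights passed to the regression, as the following sketch shows. It is an illustration under stated assumptions rather than the RevMan code: the function names are hypothetical, the cell labelling follows Table 10.1 as used in this chapter, zero cells are not handled, and the (S, D) values are invented so that they lie exactly on a straight line.

```python
def ln_dor_variance(a: int, b: int, c: int, d: int) -> float:
    """Estimated variance of lnDOR from the four cells of a 2x2 table (no zero cells)."""
    return 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d


def weighted_least_squares(s, d, weights):
    """Fit D = alpha + beta * S by weighted least squares; returns (alpha, beta)."""
    w_total = sum(weights)
    s_bar = sum(w * x for w, x in zip(weights, s)) / w_total
    d_bar = sum(w * y for w, y in zip(weights, d)) / w_total
    sxx = sum(w * (x - s_bar) ** 2 for w, x in zip(weights, s))
    sxy = sum(w * (x - s_bar) * (y - d_bar) for w, x, y in zip(weights, s, d))
    beta = sxy / sxx
    return d_bar - beta * s_bar, beta


# Invented study-level values lying exactly on D = 3 + 0.5 * S.
s_values = [-1.0, 0.0, 2.0]
d_values = [2.5, 3.0, 4.0]
# Invented 2x2 tables (a, b, c, d) used only to derive inverse-variance weights.
tables = [(40, 5, 10, 45), (80, 20, 20, 80), (30, 30, 3, 10)]
inverse_variance_weights = [1.0 / ln_dor_variance(*t) for t in tables]

alpha_w, beta_w = weighted_least_squares(s_values, d_values, inverse_variance_weights)
alpha_u, beta_u = weighted_least_squares(s_values, d_values, [1.0] * len(s_values))
print(f"weighted:   alpha = {alpha_w:.3f}, beta = {beta_w:.3f}")
print(f"unweighted: alpha = {alpha_u:.3f}, beta = {beta_u:.3f}")
```

Because the invented points are exactly collinear, both schemes recover the same line; with real data the two fits differ to the extent that imprecise studies depart from the trend.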
10.5.2 Hierarchical models
More statistically rigorous approaches based on hierarchical models have been proposed that overcome the limitations of the Moses-Littenberg method. In this section, the Bivariate model (Reitsma 2005) and the hierarchical SROC (HSROC) model of Rutter and Gatsonis (Rutter 2001) are described and discussed.
[Figure 10.4: three SROC curves, for β = 0, β < 0 and β > 0, intersecting on the line S = 0; sensitivity (0.0 to 1.0) is plotted against 1 - specificity (0.0 to 1.0).]
Both hierarchical models involve statistical distributions at two levels. At the lower level, they model the cell counts in the 2×2 tables extracted from each study using binomial distributions and logistic (log odds) transformations of proportions. At the higher level, random study effects are assumed to account for heterogeneity in diagnostic test accuracy between studies beyond that accounted for by sampling variability at the lower level. The Bivariate model and the Rutter and Gatsonis HSROC model are mathematically equivalent when no covariates are fitted (Harbord 2007), (Arends 2008), but differ in their parameterizations. The Bivariate parameterization models sensitivity, specificity and the correlation between them directly, whereas the Rutter and Gatsonis HSROC parameterization models functions of sensitivity and specificity to define a summary ROC curve.
Parameter estimates from either the Bivariate model or the Rutter and Gatsonis HSROC model can be input to RevMan to produce:
• the summary ROC curve,
• the summary operating point (i.e. summary values for sensitivity and specificity),
• a 95% confidence region around the summary operating point, and
• a 95% prediction region.
This prediction region is one way of illustrating the extent of statistical heterogeneity by depicting a region within which, assuming the model is correct, we have 95% confidence that the true sensitivity and specificity of a future study should lie (Harbord 2007).
From the summary ROC curve the expected sensitivity at a given value of specificity (or vice versa) can be computed. In addition, summary values and confidence intervals can also be derived for the positive and negative likelihood ratios or the diagnostic odds ratio at the summary point.
Not all of these possible summary measures will be relevant or appropriate for a given analysis. The choice of summary measure(s) must be informed by the research question and also the variability in thresholds used across studies for defining test positivity.
The motivation for choosing one of these two alternative hierarchical models becomes clear when covariates are to be added to explore heterogeneity in test accuracy. Ultimately, the choice of method will be determined by the focus one wishes to adopt, and which of the two directly addresses the research question (see 10.4.1).
Both models require the use of external statistical software, as fitting them requires methods that are too complex to implement within RevMan. However, publication-ready graphical output can be created in RevMan by entering parameter estimates from either model to add model summaries to summary ROC plots.
Alternative specifications for summary curves based on functions of the Bivariate model parameters have recently been proposed (Arends 2008), (Chappell 2009). These require further evaluation and are not supported currently in RevMan. This chapter will focus on the Rutter and Gatsonis model as it is the most established of the HSROC specifications.
10.5.2.1 Bivariate model
The Bivariate method models the sensitivity and specificity directly. The model can be regarded as having two levels corresponding to variation within and between studies. At the first level, the within-study variability for both sensitivity and specificity is assumed to follow a binomial distribution. For sensitivity (denoted by A), the number testing positive is $y_{Ai} \sim B(n_{Ai}, \pi_{Ai})$, where $n_{Ai}$ and $\pi_{Ai}$ respectively represent the total number of diseased individuals tested and the probability of a positive test result in that group in study i. Similarly, for specificity (denoted by B), the number testing negative is $y_{Bi} \sim B(n_{Bi}, \pi_{Bi})$, where $n_{Bi}$ and $\pi_{Bi}$ respectively represent the total number of non-diseased individuals tested and the probability of a negative test result in that group in study i.
The sensitivity-specificity pair for each study must be modelled jointly within study at level one of the analysis because they are linked by shared study characteristics including the positivity threshold. At the higher level, the logit-transformed sensitivities are assumed to have a normal distribution with mean $\mu_A$ and variance $\sigma_A^2$, while the logit-transformed specificities have a normal distribution with mean $\mu_B$ and variance $\sigma_B^2$. Their correlation is included by modelling both at once by a single bivariate normal distribution:
$\begin{pmatrix} \mu_{Ai} \\ \mu_{Bi} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix}, \Sigma_{AB} \right)$ with $\Sigma_{AB} = \begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix}$
where $\mu_{Ai}$ and $\mu_{Bi}$ denote the logit-transformed sensitivity and specificity in study i, $\sigma_A^2$ and $\sigma_B^2$ describe the between-study variability in true logit sensitivity and specificity respectively, and $\sigma_{AB}$ is the covariance between logit sensitivity and specificity. The model may also be parameterized using the correlation $\rho_{AB} = \sigma_{AB}/(\sigma_A \sigma_B)$, which may be more interpretable than the covariance. The Bivariate model therefore has five parameters when no covariates are included: $\mu_A$, $\mu_B$, $\sigma_A^2$, $\sigma_B^2$ and $\sigma_{AB}$. (Note: we follow Harbord (Harbord 2007) in using μ where Reitsma (Reitsma 2005) used θ, in order to avoid confusion with the notation of the HSROC model which follows).
The inclusion of a correlation parameter in the model allows for the expected trade-off in sensitivity and specificity as the test positivity threshold across studies varies. Where variation between studies arises through such a trade-off this correlation is expected to be negative, but the correlation may be positive if there are other sources of heterogeneity.
Reitsma (Reitsma 2005) originally proposed fitting these models by approximating the binomial within-study distributions by normal distributions. Although this allows the model to be fitted in a slightly larger range of software (e.g. the MIXED procedure in SAS), Chu (Chu 2006) later demonstrated that the approximation can perform poorly and recommended that software be used that can explicitly model the binomial within-study distributions.
10.5.2.2 Example 1 continued: Anti-CCP for the diagnosis of rheumatoid arthritis.
We now undertake the first stage of a formal statistical analysis of the data from a review (Nishimura 2007) of anti-cyclic citrullinated peptide antibody (anti-CCP). If it can be presumed that the anti-CCP test is deemed positive if any anti-CCP antibody is detected and that detection can be considered a common threshold, it makes sense to focus on summary estimates for sensitivity and specificity.
As noted in the descriptive analyses of these data, there appears to be greater variability in estimated sensitivity than specificity across studies, which could arise either through heterogeneity or through estimates of sensitivity being based on smaller samples than estimates of specificity. The parameter estimates from the Bivariate model are shown below.
Fit Statistics
-2 Log Likelihood           545.6
AIC (smaller is better)     555.6
AICC (smaller is better)    556.4
BIC (smaller is better)     563.6

Parameter Estimates
Parameter  Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha    Lower     Upper   Gradient
msens        0.6534          0.1275  35     5.13    <.0001   0.05   0.3946    0.9122   3.959E-6
mspec        3.1090          0.1459  35    21.31    <.0001   0.05   2.8128    3.4052   3.473E-8
s2usens      0.5426          0.1463  35     3.71    0.0007   0.05   0.2455    0.8397   -6.62E-6
s2uspec      0.5717          0.1873  35     3.05    0.0043   0.05   0.1914    0.9520    1.36E-6
covsesp     -0.2704          0.1199  35    -2.26    0.0304   0.05  -0.5137  -0.02710   -1.59E-6

Covariance Matrix of Parameter Estimates
Row  Parameter     msens     mspec   s2usens   s2uspec   covsesp
1    msens       0.01625  -0.00741  0.000890  -0.00004  -0.00004
2    mspec      -0.00741   0.02128  -0.00006  0.004286  -0.00116
3    s2usens    0.000890  -0.00006   0.02142  0.003997  -0.00874
4    s2uspec    -0.00004  0.004286  0.003997   0.03509  -0.01184
5    covsesp    -0.00004  -0.00116  -0.00874  -0.01184   0.01436
The parameter estimates in the boxes above can be input to RevMan to produce the summary point, 95% confidence region, and 95% prediction region shown in the Figure. The Bivariate output box in RevMan requires: the summary estimate for logit(sensitivity), which is 0.6534; the summary estimate for logit(specificity), which is 3.1090; and the variances of the random effects for logit(sensitivity) and logit(specificity) and their covariance, which are 0.5426, 0.5717 and -0.2704 respectively (all of these estimates appear in the red box). Computation of confidence and prediction regions also requires the standard errors of the summary estimates for logit(sensitivity) and logit(specificity) and their covariance, which are 0.1275, 0.1459 and -0.00741 respectively (shown in the blue boxes).
The variance coefficients indicate similar heterogeneity in sensitivities and specificities. The magnitude of the heterogeneity is also evident in the size of the prediction region on the SROC plot. The summary estimate of sensitivity and specificity is shown by the solid black dot. The sensitivity and specificity at this point can be computed by inverse transformation of the logit estimates to give a sensitivity and specificity of 0.66 and 0.96 respectively. Confidence intervals can be computed by inverse transformation of intervals computed on the logit scale.
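The inverse transformation can be sketched in a few lines. The example below takes the logit-scale estimates and confidence limits for msens and mspec, and the random-effects variances and covariance, from the output above; the helper name `expit` is introduced here for illustration. It also converts the covariance to the correlation parameterization described in 10.5.2.1.

```python
import math


def expit(x):
    """Inverse logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


# Logit-scale estimates and 95% limits copied from the output above
logit_sens, sens_lower, sens_upper = 0.6534, 0.3946, 0.9122
logit_spec, spec_lower, spec_upper = 3.1090, 2.8128, 3.4052

# Random-effects variances and covariance from the same output
var_sens, var_spec, cov_sens_spec = 0.5426, 0.5717, -0.2704

sens = expit(logit_sens)
spec = expit(logit_spec)
sens_ci = (expit(sens_lower), expit(sens_upper))
spec_ci = (expit(spec_lower), expit(spec_upper))

# Correlation between logit sensitivity and logit specificity
rho = cov_sens_spec / math.sqrt(var_sens * var_spec)

print(f"sensitivity {sens:.2f} (95% CI {sens_ci[0]:.2f}, {sens_ci[1]:.2f})")
print(f"specificity {spec:.2f} (95% CI {spec_ci[0]:.2f}, {spec_ci[1]:.2f})")
print(f"correlation {rho:.2f}")
```

Because the logit is monotonic, transforming the interval limits preserves their coverage; the resulting interval is not symmetric about the point estimate on the probability scale.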
The plot shows a potential outlier, Hitchon 2004, with a sensitivity of 0.63 and specificity of 0.65. A sensitivity analysis can be performed to assess the influence of this study on the summary estimates.
10.5.2.3 The Rutter and Gatsonis HSROC model
The HSROC model proposed by Rutter and Gatsonis (Rutter 1995), (Rutter 2001) is based on a latent scale logistic regression model (McCullagh 1980), (Tosteson 1988). The HSROC model assumes that there is an underlying ROC curve in each study with parameters α and β that characterize the accuracy and asymmetry of the curve, in a similar (though technically distinct) way to the α and β parameters in the linear regression method of Moses and Littenberg. Unlike the Moses-Littenberg model, the Rutter and Gatsonis model is constrained to provide a ROC curve where sensitivity cannot decrease as 1 – specificity increases.
Accuracy, defined in terms of the lnDOR, determines the position of the summary curve relative to the top left corner of the ROC axes. As with the SROC regression method, each study contributes data at a single threshold to the analysis. The 2×2 table for each study then arises from dichotomizing at a positivity threshold denoted by θ. The parameters α and θ are assumed to vary between studies: both are assumed to have normal distributions as in conventional random-effects meta-analysis.
The HSROC model can also be regarded as having two levels corresponding to variation within and between studies. At the first level, the number of diseased individuals who test positive is denoted by $y_{i1}$ for the i-th study, and the corresponding number of non-diseased who test positive is denoted by $y_{i2}$. For each study (i), the number testing positive in each disease group (j) is assumed to follow a binomial distribution such that $y_{ij} \sim B(n_{ij}, \pi_{ij})$, $j = 1, 2$, where $n_{ij}$ and $\pi_{ij}$ respectively represent the total number tested and the probability of a positive test result. The number testing positive in each diseased and non-diseased pair is analysed jointly within each study at level one of the analysis.
The model takes the form
$$
\text{logit}(\pi_{ij}) = (\theta_i + \alpha_i \, dis_{ij}) \exp(-\beta \, dis_{ij})
$$
where $dis_{ij}$ represents the 'true' disease status (coded as -0.5 for the non-diseased and 0.5 for the diseased), thereby taking into account the within-study variability at level one. In the usual terminology for this model, $\theta_i$ represents the proxy for positivity threshold, calculated as the mean of the log-odds of a positive test result for the diseased and the log-odds of a positive test result for the non-diseased groups in study i (equivalent to $S_i/2$ in the Moses-Littenberg model). $\alpha_i$ (the lnDOR for study i) represents a measure of diagnostic accuracy in the i-th study that incorporates both sensitivity and specificity for that study. The scale parameter (β) provides for asymmetry in the SROC by allowing accuracy to vary with threshold. Since each study contributes only one estimate of sensitivity and specificity at a single threshold, it is necessary to assume that the shape of the true underlying ROC curve in each study is the same, and hence β is fitted as a fixed effect.
The threshold and diagnostic accuracy for each study are specified as random effects and are assumed to be independent (uncorrelated) and normally distributed. The accuracy parameter has mean Λ (capital lambda) and variance $\sigma_\alpha^2$, while the positivity (threshold) parameter has mean Θ (capital theta) and variance $\sigma_\theta^2$. The shape parameter (β) is estimated using data from the studies considered jointly, assuming normally distributed random effects for test accuracy. When no covariates are included, the HSROC model also has five parameters: Λ, Θ, β, $\sigma_\alpha^2$ and $\sigma_\theta^2$.
A summary ROC curve can be constructed from the HSROC model by choosing a range of values of 1 – specificity and using the estimated average location parameter (Λ) and scale parameter (β) to compute the corresponding values for sensitivity. The expected sensitivity at a chosen false positive fraction (1 – specificity) is given by
$$
\text{sensitivity} = \frac{1}{1 + \exp\left(-\left(\Lambda e^{-0.5\beta} + e^{-\beta}\, \text{logit}(1 - \text{specificity})\right)\right)}
$$
When β = 0, test accuracy can be summarized by Λ, which represents the expected accuracy (log DOR), and the resulting summary curve will be symmetric.
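The link between the model equation and the summary curve can be made explicit. The following is a sketch of the algebra, assuming the model is evaluated at the mean accuracy Λ with disease status coded ±0.5, and writing θ for an arbitrary value of the threshold parameter that is then eliminated.

```latex
\begin{align*}
\text{logit}(\text{sensitivity}) &= (\theta + 0.5\Lambda)\, e^{-0.5\beta}, \\
\text{logit}(1 - \text{specificity}) &= (\theta - 0.5\Lambda)\, e^{0.5\beta} \\
\Rightarrow\quad \theta &= e^{-0.5\beta}\, \text{logit}(1 - \text{specificity}) + 0.5\Lambda, \\
\text{logit}(\text{sensitivity}) &= \Lambda e^{-0.5\beta} + e^{-\beta}\, \text{logit}(1 - \text{specificity}).
\end{align*}
```

Setting β = 0 in the last line gives logit(sensitivity) − logit(1 − specificity) = Λ at every threshold, that is, a constant log DOR and hence a symmetric curve.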
10.5.2.4 Example 2: Rheumatoid Factor as a marker for Rheumatoid Arthritis.
In this example we will investigate the diagnostic performance of Rheumatoid factor (RF) as a marker for rheumatoid arthritis (RA). The 50 studies included in the analysis are taken from the same review as Example 1 (Nishimura 2007). The reference standard was again based on the 1987 revised American College of Rheumatology (ACR) criteria or clinical diagnosis.
The cut-off for test positivity for RF varied between studies and ranged from 3 to 100 U/ml. The variability in threshold used to define test positivity between studies is reflected in the variability in study-specific estimates of sensitivity and specificity shown in the SROC plot in the Figure. Because of the variation in threshold across studies, a summary ROC curve is appropriate to summarise these data. The HSROC model was used to estimate a summary curve using Proc NLMIXED in SAS.
Proc NLMIXED Output:

Fit Statistics
-2 Log Likelihood           806.9
AIC (smaller is better)     816.9
AICC (smaller is better)    817.6
BIC (smaller is better)     826.5

Parameter Estimates
Parameter  Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha     Lower    Upper   Gradient
alpha        2.6016          0.1862  48    13.97    <.0001   0.05    2.2273   2.9759   2.227E-6
theta       -0.4370          0.1469  48    -2.98    0.0046   0.05   -0.7323  -0.1417   4.573E-6
beta         0.2267          0.1624  48     1.40    0.1691   0.05  -0.09978   0.5532   -1.16E-6
s2ua         1.3014          0.3046  48     4.27    <.0001   0.05    0.6890   1.9137   -6.42E-7
s2ut         0.5423          0.1237  48     4.39    <.0001   0.05    0.2937   0.7909   -6.99E-6
The parameter estimates highlighted above can be input to RevMan to draw the summary curve as shown in the Figure: 2.6016 estimates the mean of the random effects for accuracy (i.e. Λ, lambda), -0.4370 estimates the mean of the random effects for threshold (Θ, theta), 0.2267 estimates the shape parameter (β, beta), 1.3014 estimates the variance of the random effects for accuracy, and 0.5423 estimates the variance of the random effects for threshold. The resulting curve shows the expected trade-off between sensitivity and specificity across thresholds.
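The summary curve can also be traced directly from the accuracy and shape estimates using the expression for expected sensitivity given in 10.5.2.3. The sketch below uses the estimates 2.6016 and 0.2267 from the output above; the function name `sroc_sensitivity` and the grid of specificity values are assumptions made for illustration, and this is not a description of how RevMan draws the curve.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def sroc_sensitivity(specificity, accuracy_mean, shape):
    """Expected sensitivity on the HSROC summary curve at a given specificity."""
    fpr = 1.0 - specificity
    linear = (accuracy_mean * math.exp(-0.5 * shape)
              + math.exp(-shape) * logit(fpr))
    return expit(linear)


# Estimates of Lambda (alpha in the output) and beta copied from the output
LAMBDA_HAT = 2.6016
BETA_HAT = 0.2267

# Illustrative grid of specificity values
curve = [(spec, sroc_sensitivity(spec, LAMBDA_HAT, BETA_HAT))
         for spec in (0.95, 0.90, 0.85, 0.80, 0.70)]
for spec, sens in curve:
    print(f"specificity {spec:.2f} -> expected sensitivity {sens:.3f}")
```

Points should only be read off the curve within the range of specificities observed in the included studies.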
When interpreting the results of the analysis, it is important to note that RF constitutes part of the ACR criteria. Hence, there is risk of bias in the estimated curve since the index test is incorporated in the reference standard. This could result in an overestimation of the diagnostic accuracy of RF, and could give a distorted picture of the expediency of using RF as a first test for resolving uncertainty in a suspected case of rheumatoid arthritis.
10.5.3 Investigating heterogeneity
In diagnostic reviews it is usual to observe variability in test accuracy between studies that is considerably greater than would be expected from within-study sampling error alone. This is reflected in the model specifications for the Bivariate and HSROC models, which both allow for random study effects. For the Bivariate model, the summary estimates of sensitivity and specificity represent an average operating point across studies. Similarly, the estimated summary ROC curve represents an average ROC curve across studies on the assumption that the true underlying ROC curve in each study has the same shape.
Some of this heterogeneity in test accuracy between studies is likely to arise due to differences in patient characteristics, test methods, study design and other factors. Exploratory analyses can be conducted in RevMan to investigate whether such study characteristics appear to be associated with test accuracy using the Moses-Littenberg SROC method, but this method cannot be used to provide valid statistical evidence of such associations. A separate SROC curve is fitted for each subgroup, and the results can be compared graphically across subgroups. The feasibility of such analyses will obviously be influenced by the number of available studies in each subgroup.
Statistically, it is generally more efficient to make use of all of the data available across studies when investigating heterogeneity by adding study-level covariates to a hierarchical model to identify factors associated with diagnostic test accuracy. This meta-regression approach also allows statistical inferences to be made. It is usually assumed that each covariate has a fixed effect when added to the model. This approach is also applicable to test comparisons, as discussed in 10.5.4.
The Bivariate and HSROC models differ in how study-level covariates are included. Published accounts of the Bivariate method focus on the estimation of a summary estimate of sensitivity and specificity, and how the expected values of these may vary with study-level covariates. Published accounts of the HSROC approach, by contrast, focus on the estimation of the summary ROC curve as the basis for assessing test accuracy, and how the position and shape of the curve may vary with study-level covariates.
Both models allow the use of categorical and continuous covariates. In practice, covariates relating to study characteristics are usually categorical and indicator variables are created as is done in standard regression modelling. For continuous covariates, particular care should be taken to check that the assumption of a linear association is valid. For the Bivariate model, this refers to association with logit(sensitivity) and/or logit(specificity). For the HSROC model, this refers to association with the accuracy parameter (lnDOR) and/or the threshold parameter.
The discussion of the uses and limitations of investigating heterogeneity using subgroup analysis and meta-regression in Section 9.6 of the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2008) applies equally to diagnostic studies.
10.5.3.1 Heterogeneity and Regression Analysis using the Bivariate model
The Bivariate model allows covariates to affect summary sensitivity or summary specificity, or both. Using the notation of Harbord (Harbord 2007), and assuming that we have a single study-level covariate Z that may affect both sensitivity and specificity, the model can be extended as follows:
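$$
\begin{pmatrix} \mu_{Ai} \\ \mu_{Bi} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_A + \nu_A Z_i \\ \mu_B + \nu_B Z_i \end{pmatrix}, \Sigma_{AB} \right)
$$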
As before, $\Sigma_{AB}$ represents the covariance matrix for the random effects for logit sensitivity and logit specificity. If the covariate does explain some of the heterogeneity in sensitivity and/or specificity then we would expect the estimated variance for one or both random effects to be reduced. The estimated covariance (correlation) parameter may also change.
Assuming that we have a binary study-level covariate (Z) coded as 0 or 1 to represent the two groups of studies, then $\mu_A$ estimates the logit sensitivity at the expected summary operating point for the referent group (Z=0), and $\mu_A + \nu_A$ estimates the logit sensitivity at the expected summary operating point for the other group (Z=1). Hence, $\exp(\nu_A)$ estimates the odds ratio for sensitivity in group 1 relative to the referent group. The expected sensitivity is estimated as $\exp(\mu_A)/(1 + \exp(\mu_A))$ for the referent group of studies, and as $\exp(\mu_A + \nu_A)/(1 + \exp(\mu_A + \nu_A))$ for the other group. Comparisons of specificity between the two groups of studies follow the same approach as described above based on $\mu_B$ and $\nu_B$. The fit of the model, with and without the additional parameters $\nu_A$ and $\nu_B$, can be used to test whether the covariate is associated with sensitivity and/or specificity. This joint test will have 2 degrees of freedom if Z is binary. Separate tests of statistical significance of the covariate with sensitivity and specificity can also be conducted, first to assess whether $\nu_A$ differs from 0 (a significant result indicates that there is evidence that sensitivity differs between the two groups of studies) and secondly whether $\nu_B$ differs from 0 (a significant result indicates that there is evidence that specificity differs between the two groups of studies). See also 10.5.3.4 relating to criteria for model selection.
The standard error of a new estimate based on a function of the model parameter estimates can be obtained using the delta method on the assumption that the error distribution of the new estimate is approximately normal. The delta method is implemented in standard statistical software such as SAS and Stata.
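As an illustration of the calculations involved, consider the logit sensitivity for the second group of studies and its back-transformed value. The first line below is the exact variance of a sum of two estimates; the second is the usual first-order delta-method approximation for a smooth function g, shown here for the inverse logit. This is a sketch of the standard result rather than a description of any particular software implementation.

```latex
\begin{align*}
\operatorname{Var}(\hat\mu_A + \hat\nu_A) &= \operatorname{Var}(\hat\mu_A) + \operatorname{Var}(\hat\nu_A) + 2\operatorname{Cov}(\hat\mu_A, \hat\nu_A), \\
\operatorname{SE}\{g(\hat\eta)\} &\approx \left| g'(\hat\eta) \right| \operatorname{SE}(\hat\eta), \qquad
g(\eta) = \frac{e^{\eta}}{1 + e^{\eta}}, \quad g'(\eta) = g(\eta)\{1 - g(\eta)\}.
\end{align*}
```

In practice, confidence limits are often computed on the logit scale and then back-transformed, which keeps them within the interval from 0 to 1.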
The model is easily extended to allow for more than one covariate. However, this may not be feasible in practice if the number of studies is not large. Also, it is important to note that a covariate may only be associated with sensitivity and not specificity, or vice versa. It is not required that the same covariates are fitted for both sensitivity and specificity, although this may commonly be the case. Where a covariate (or covariates) is allowed to affect both the sensitivity and the specificity, the Bivariate model is equivalent to an HSROC model in which the covariate or covariates are allowed to affect both the accuracy and the positivity threshold but not the shape parameter. However, using the estimates from the Bivariate model to test for the effect of covariates on the shape and position of the summary ROC curve is not straightforward. Using the HSROC model parameterization allows this to be done in a more direct and straightforward manner.
It is usually assumed that the variances of the random effects (and their correlation in the case of the Bivariate model) are not associated with the covariate. This is probably a reasonable assumption in most analyses investigating heterogeneity in test accuracy for a single index test. However, for analyses that compare different index tests, this assumption is less likely to hold. See 10.5.4.
10.5.3.2 Example 1 (cont.): Investigation of heterogeneity in diagnostic performance of anti-CCP
The studies included in the review to assess the diagnostic performance of anti-CCP used two different generations of the assay: first generation (CCP1, 8 studies) and second generation (CCP2, 29 studies). A binary covariate for generation of CCP, with coefficient se2 for sensitivity and coefficient sp2 for specificity, was added to the model. The covariate was coded as 0 for CCP1 and 1 for CCP2. Allowing test performance to vary by generation of CCP in the model resulted in a -2 Log Likelihood of 533.4, a reduction of 12.2 compared with the model that contained no covariates. Hence, there is statistical evidence (chi-square = 12.2, 2 df, P = 0.002) that test performance is associated with generation of CCP, but further investigation is required to ascertain whether this association is for sensitivity, specificity, or both.
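The likelihood ratio test quoted above can be checked by hand. The sketch below uses the two -2 Log Likelihood values reported for the models without and with the covariate; the function names are hypothetical, and the closed-form tail probability is used only because it is exact for an even number of degrees of freedom (a general-purpose statistics library would normally be used instead).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variate, exact for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(m2ll_reduced, m2ll_full, df):
    """Chi-squared statistic and p-value from two -2 log likelihood values."""
    stat = m2ll_reduced - m2ll_full
    return stat, chi2_sf_even_df(stat, df)


# -2 Log Likelihood without covariates (545.6) and with se2 and sp2 (533.4)
stat, p_value = likelihood_ratio_test(545.6, 533.4, df=2)
print(f"chi-square = {stat:.1f}, 2 df, P = {p_value:.3f}")
```

The degrees of freedom equal the number of additional parameters in the larger model, here the two coefficients se2 and sp2.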
The parameter estimates required to draw the summary points and regions in RevMan can again be extracted from the Proc NLMIXED output (see Appendix for SAS program). The variances of the random effects for logit(sensitivity) and logit(specificity), and their covariance, are common for both generations of CCP (see blue box in output). For the referent group (CCP1 in this case) the summary estimates for logit(sensitivity), logit(specificity), the corresponding standard errors and covariance are shown in the red boxes in the output. The logit(sensitivity) for CCP2 is estimated by msens+se2, and the logit(specificity) is estimated by mspec+sp2. The standard errors and covariance of these additional estimates can be obtained using the ESTIMATE command in Proc NLMIXED as shown in the program in the Appendix. Alternatively, a simple way of getting these results is to refit the model using CCP2 as the referent group (coded as 0) and CCP1 as the other group (coded as 1). The fit of the model and results for the random effects will be the same, but the estimates for 'msens' and 'mspec' will now be for CCP2 and hence the required estimates can then be extracted from the standard output (revised output not shown). The resulting plot is shown in the Figure below.
Fit Statistics
-2 Log Likelihood           533.4
AIC (smaller is better)     547.4
AICC (smaller is better)    549.1
BIC (smaller is better)     558.6

Parameter Estimates
Parameter  Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha    Lower     Upper   Gradient
msens      -0.09653          0.2203  35    -0.44    0.6640   0.05  -0.5438    0.3507   0.000317
mspec        3.4467          0.2982  35    11.56    <.0001   0.05   2.8412    4.0522   -0.00005
s2usens      0.3598          0.1022  35     3.52    0.0012   0.05   0.1524    0.5673   8.325E-6
s2uspec      0.5399          0.1802  35     3.00    0.0050   0.05   0.1742    0.9057   0.000159
covsesp     -0.1969         0.09836  35    -2.00    0.0531   0.05  -0.3965  0.002824   -0.00004
se2          0.9626          0.2513  35     3.83    0.0005   0.05   0.4523    1.4728   0.000319
sp2         -0.4302          0.3377  35    -1.27    0.2111   0.05  -1.1158    0.2554   -0.00004

Covariance Matrix of Parameter Estimates
Row  Parameter     msens     mspec   s2usens   s2uspec   covsesp       se2       sp2
1    msens       0.04854  -0.02464  -0.00012  -0.00001  -0.00003  -0.04855   0.02465
2    mspec      -0.02464   0.08895  -0.00002  0.004771  -0.00065   0.02463  -0.08834
3    s2usens    -0.00012  -0.00002   0.01044  0.002118  -0.00440  0.000693  -0.00005
4    s2uspec    -0.00001  0.004771  0.002118   0.03246  -0.00860  -0.00006  -0.00039
5    covsesp    -0.00003  -0.00065  -0.00440  -0.00860  0.009674  0.000100  -0.00091
6    se2        -0.04855   0.02463  0.000693  -0.00006  0.000100   0.06317  -0.03160
7    sp2         0.02465  -0.08834  -0.00005  -0.00039  -0.00091  -0.03160    0.1141
Based on the confidence regions in the figure it is clear that the sensitivity varies by generation, but not specificity. The summary estimates of specificity were 0.97 (95% CI 0.95, 0.98) for CCP1 and 0.95 (95% CI 0.94, 0.97) for CCP2. The summary estimates of sensitivity were 0.48 (95% CI 0.37, 0.58) for CCP1 and 0.70 (95% CI 0.65, 0.75) for CCP2. These results indicate an improvement in sensitivity, without loss of specificity, for generation 2 compared with generation 1 CCP. Further models may be fitted to formally test the effect of removing the covariate for specificity from the model.
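These summary estimates can be reproduced from the parameter estimates and their covariance matrix. The sketch below copies the relevant values from the output above, forms the variance of msens+se2 and mspec+sp2 from the variances and covariances, and back-transforms. The helper names are hypothetical, and the normal quantile 1.96 is an assumption; it gives intervals that agree with those quoted in the text to two decimal places.

```python
import math


def expit(x):
    """Inverse logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def summary_with_ci(logit_estimate, variance, z=1.96):
    """Back-transformed estimate with lower and upper limits (z is assumed)."""
    se = math.sqrt(variance)
    return (expit(logit_estimate),
            expit(logit_estimate - z * se),
            expit(logit_estimate + z * se))


# Estimates and (co)variances copied from the output above
msens, se2 = -0.09653, 0.9626
var_msens, var_se2, cov_msens_se2 = 0.04854, 0.06317, -0.04855
mspec, sp2 = 3.4467, -0.4302
var_mspec, var_sp2, cov_mspec_sp2 = 0.08895, 0.1141, -0.08834

# Variance of a sum of two estimates
var_sens_ccp2 = var_msens + var_se2 + 2.0 * cov_msens_se2
var_spec_ccp2 = var_mspec + var_sp2 + 2.0 * cov_mspec_sp2

sens_ccp1 = summary_with_ci(msens, var_msens)
sens_ccp2 = summary_with_ci(msens + se2, var_sens_ccp2)
spec_ccp1 = summary_with_ci(mspec, var_mspec)
spec_ccp2 = summary_with_ci(mspec + sp2, var_spec_ccp2)

# Odds ratio for sensitivity, CCP2 relative to CCP1
odds_ratio_sens = math.exp(se2)

for label, (est, lo, hi) in [("sensitivity CCP1", sens_ccp1),
                             ("sensitivity CCP2", sens_ccp2),
                             ("specificity CCP1", spec_ccp1),
                             ("specificity CCP2", spec_ccp2)]:
    print(f"{label}: {est:.2f} (95% CI {lo:.2f}, {hi:.2f})")
print(f"odds ratio for sensitivity: {odds_ratio_sens:.2f}")
```

The large negative covariance between msens and se2 is what makes the CCP2 estimate more precise than the CCP1 estimate, reflecting the larger number of CCP2 studies.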
Comparing the output from this model with that of the model with no covariates (see 10.5.2.2), it is clear that the variances of the random effects have reduced, particularly for sensitivity. Also, checks of the distributions of the random effects (not shown here) show that adjusting for generation of anti-CCP results in distributions that more closely follow a normal distribution.
10.5.3.3 Heterogeneity and Regression Analysis using the Rutter and Gatsonis HSROC model
The HSROC model allows covariates to be added to explore heterogeneity in test positivity (threshold), position of the curve (accuracy) and shape of the curve. A covariate may be associated with some, but not all, of the three model parameters.
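Assuming that we have a binary study-level covariate (Z) coded as 0 or 1 to represent the two groups of studies, then the HSROC model can be extended to estimate the log-odds of a positive test for study i and disease group j as follows:
$$
\text{logit}(\pi_{ij}) = \left(\theta_i + \gamma Z_i + (\alpha_i + \lambda Z_i)\, dis_{ij}\right) \exp\left(-(\beta + \delta Z_i)\, dis_{ij}\right)
$$
where γ, λ and δ (the effects of the covariate on threshold, accuracy and shape respectively) are all assumed to be fixed effects. Hence, the study-level threshold and accuracy, $\theta_i + \gamma Z_i$ and $\alpha_i + \lambda Z_i$, are normally distributed with means $\Theta + \gamma Z_i$ and $\Lambda + \lambda Z_i$ and variances $\sigma_\theta^2$ and $\sigma_\alpha^2$ respectively. The shape parameter for the summary curves for the two groups is estimated as β for the referent group of studies (Z=0) and β + δ for the other group (Z=1). If the covariate does explain some of the heterogeneity in threshold and/or accuracy then we would expect the estimated variance for one or both random effects to be reduced.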
The first step would be to investigate the shape of the summary curve. If δ ≠ 0 (where δ is the effect of the covariate on shape), then the shape of the summary curve differs for the two groups of studies, which means that the relative accuracy of the test will vary with threshold (Figure 10.5(a)). This represents the most complex scenario, and the model would not generally be simplified any further. In practice, it is difficult to detect a statistically significant difference in the shape of the curve across groups because the number of studies in each group is usually limited. Also, it is important when investigating shape to consider the effect of outlying and potentially influential studies. When there is good evidence that the curves differ in shape, a plot of the estimated curves for the two groups will aid in interpretation. Focusing on the region of the plot that covers the observed data, it is then possible to compare the estimated curves. Where one curve consistently lies above another, there is evidence of superior accuracy even though the differential between the curves will vary across thresholds. If the curves cross, then the interpretation of which curve shows superior accuracy will depend on threshold.
If, based on statistical evidence, similarity of curve shapes and investigation of potentially influential studies, it can be assumed that δ = 0 (no effect of the covariate on shape), then the covariate can be removed for shape. The estimated SROC curves for the two groups will then have the same shape, even though they are not symmetric (Figure 10.5(b)), and the relative diagnostic accuracy of the two curves can be summarized using the relative diagnostic odds ratio (RDOR = exp(λ), where λ is the effect of the covariate on accuracy). The RDOR will be constant across all possible values of θ. If the model can be simplified further and both curves can be assumed to be symmetric, i.e. β = 0, the RDOR again provides a measure of relative accuracy as described above.
Figure10.5SummaryROCcurveswithandwithoutadifferenceinshape
(a)relativeaccuracydependson
particularspecificityvalues,the
curvescrossing
0.0 0.2 0.4 0.6 0.8 1.0
1 - specificity
0.0
0.2
0.4
0.6
0.8
1.0
sensitivity
group 0
group 1
(b)group1dominatesacrossall
specificityvalues,thecurvesdonot
cross
0.0 0.2 0.4 0.6 0.8 1.0
1 - specificity
0.0
0.2
0.4
0.6
0.8
1.0
sensitivity
group 0
group 1
If the curves can be assumed to have the same shape (either both asymmetric or both symmetric), then the question is whether the covariate is associated with accuracy. If there is evidence that λ ≠ 0 (where λ is the effect of the covariate on accuracy), then the RDOR gives an estimate of the overall relative diagnostic accuracy. This would correspond to a clear separation between the SROC curves for the two groups. Alternatively, λ = 0 implies that there is no separation between the curves and no association between the covariate and accuracy.
If λ (the effect of the covariate on accuracy) can be assumed to be 0, then the model can be further simplified by removing the covariate for accuracy, which will result in a single summary curve (assuming that the shape of the curve is the same for the two groups of studies). An association between the covariate and the threshold parameter (i.e. γ ≠ 0, where γ is the effect of the covariate on threshold) would indicate that the underlying test positivity rate for the two groups of studies differs. Such an association is often difficult to interpret unless the curves can be assumed to have the same shape and accuracy.
When the actual cut-point to define a positive test is available for each study, this can be fitted as a covariate to the threshold parameter to allow estimation of the expected sensitivity and specificity on the summary curve at a selected cut-point. However, this presumes a particular functional relationship between threshold and sensitivity and specificity.
10.5.3.4 Criteria for model selection
Irrespective of which model is used, review authors must specify what modelling strategy will be used for adding or removing covariates and what criterion will be used to decide whether or not a covariate should be included in a model.
The decision as to whether a covariate should be retained in the model may be based in part on statistical tests. Commonly used software for fitting these models, such as SAS, will provide Wald statistics and corresponding p-values for each variable in the model. A p-value based on the likelihood ratio chi-squared statistic is generally more reliable. The chi-squared statistic is computed as the change in the -2 Log likelihood when a covariate is added to (or removed from) a model. The degrees of freedom are equal to the difference in the number of parameters fitted in these models. The effect of adding (or removing) covariates on measures of model fit such as Akaike's information criterion (AIC) or the Bayesian information criterion (BIC) can also be used. The deviance information criterion (DIC) is commonly used for models fitted by Markov chain Monte Carlo (MCMC) simulation.
Likelihood ratio tests can also be used to assess the significance of the variance terms for the two random effects in either model, or whether allowing for variance to relate to test accuracy provides a better fitting model.
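The information criteria reported in the example output can be related to the -2 Log likelihood with the standard definitions. The sketch below applies them to the two anti-CCP Bivariate models (5 and 7 parameters, 37 studies). Treating the number of studies as the sample size in the BIC is an assumption made here; it agrees with the reported BIC values to within rounding of the -2 Log likelihood.

```python
import math


def aic(minus_two_log_lik, n_parameters):
    """Akaike's information criterion."""
    return minus_two_log_lik + 2 * n_parameters


def bic(minus_two_log_lik, n_parameters, n_studies):
    """Bayesian information criterion, using the number of studies as n."""
    return minus_two_log_lik + n_parameters * math.log(n_studies)


# anti-CCP example: model without covariates and model with se2 and sp2
models = {
    "no covariates": (545.6, 5),
    "generation of CCP": (533.4, 7),
}
N_STUDIES = 37  # 8 CCP1 studies plus 29 CCP2 studies

for name, (m2ll, k) in models.items():
    print(f"{name}: AIC = {aic(m2ll, k):.1f}, "
          f"BIC = {bic(m2ll, k, N_STUDIES):.1f}")
```

Smaller values indicate a better trade-off between fit and complexity; here both criteria favour the model that includes the covariate.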
10.5.3.5 Example 2 (cont.): Investigating heterogeneity in diagnostic accuracy of Rheumatoid Factor
We will now investigate whether the laboratory technique used to measure RF is associated with diagnostic performance. Of the 50 studies, 15 used nephelometry (N), 16 latex agglutination (LA), 16 ELISA, one study used RA hemagglutination, and 2 did not report the method used. The analysis is restricted to studies that used N, LA or ELISA. The HSROC model was again used because of the variation in threshold used for test positivity across studies. Covariates (indicator variables for technique, using LA as the referent category) were included in the model to assess whether accuracy, threshold, or the shape of the SROC curve varied with technique.
The -2 Log likelihood for the most complex model that included covariates for shape, accuracy and threshold parameters was 752.9. The increase in the -2 Log likelihood was negligible (an increase to 753.1) when the covariate for shape was removed from the model (chi-square = 753.1 - 752.9 = 0.2, 2 df, P = 0.90). Parameter estimates for the model that assumes a common shape are given below, and the corresponding HSROC curves are shown in the Figure. The estimates of alpha, theta and beta can be input to RevMan to obtain the summary curve for the referent group (LA). The variances of the random effects for threshold and accuracy are common to all three techniques, as is the shape parameter beta. However, the threshold and accuracy parameter estimates for ELISA are given by theta+t1 and alpha+a1 respectively, and for N are given by theta+t2 and alpha+a2.
Fit Statistics
-2 Log Likelihood           753.1
AIC (smaller is better)     771.1
AICC (smaller is better)    773.2
BIC (smaller is better)     787.7

Parameter Estimates
Parameter  Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha     Lower    Upper   Gradient
alpha        2.4552          0.3245  45     7.57    <.0001   0.05    1.8017   3.1087    -0.0004
theta       -0.5490          0.2137  45    -2.57    0.0136   0.05   -0.9794  -0.1186   0.000139
beta         0.1995          0.1702  45     1.17    0.2472   0.05   -0.1432   0.5423   -0.00018
s2ua         1.2865          0.3109  45     4.14    0.0002   0.05    0.6603   1.9128   -0.00038
s2ut         0.4786          0.1139  45     4.20    0.0001   0.05    0.2492   0.7080    0.00062
a1           0.2483          0.4408  45     0.56    0.5760   0.05   -0.6395   1.1361   -0.00038
a2           0.3328          0.4439  45     0.75    0.4573   0.05   -0.5612   1.2269   0.000093
t1          -0.1962          0.2614  45    -0.75    0.4568   0.05   -0.7227   0.3303   -0.00017
t2           0.4960          0.2627  45     1.89    0.0654   0.05  -0.03301   1.0250   0.000366
From the figure, it appears that LA may be less accurate than the other 2 methods; however, removal of the covariate for accuracy (coefficients a1 and a2) from the model has negligible effect on the fit of the model (χ² = 753.7 - 753.1 = 0.6 on 2 d.f., p = 0.74), indicating no statistical evidence of a difference in diagnostic accuracy of RF according to technique. This indicates that it is reasonable to fit a single summary ROC curve for RF.
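The accuracy coefficients can be expressed as relative diagnostic odds ratios by exponentiation, as described in 10.5.3.3. The sketch below uses the estimates and confidence limits for a1 (ELISA relative to LA) and a2 (N relative to LA) from the output above; the dictionary layout is an assumption made for illustration.

```python
import math

# Coefficient, lower and upper 95% limits copied from the output above
accuracy_coefficients = {
    "ELISA vs LA (a1)": (0.2483, -0.6395, 1.1361),
    "N vs LA (a2)": (0.3328, -0.5612, 1.2269),
}

rdor = {}
for label, (estimate, lower, upper) in accuracy_coefficients.items():
    # Exponentiate the coefficient and its limits to obtain the RDOR scale
    rdor[label] = (math.exp(estimate), math.exp(lower), math.exp(upper))
    est, lo, hi = rdor[label]
    print(f"{label}: RDOR = {est:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

Both intervals include 1, which is consistent with the likelihood ratio test finding no statistical evidence of a difference in accuracy between techniques.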
10.5.4 Comparing Index Tests
For many diagnostic reviews, a key objective is to compare the diagnostic accuracy of two alternative index tests that may be used to diagnose the same condition. In this section, the focus will be on the comparison of two index tests, but the approach can be extended to allow for more than two tests.
Two approaches are generally adopted for test comparisons. The first approach utilises test accuracy data from all eligible studies that have evaluated one or both tests. The second approach restricts the analysis to studies that have evaluated both tests either in the same individuals, or have randomized individuals to undergo one or other of the two tests. The second approach has advantages because the comparison is less likely to be biased due to confounding and hence these results should be relied upon where possible. However, the number of studies that report such direct comparisons is often very limited, which means that such an analysis may not be feasible or might only be considered as a sensitivity analysis (see 10.6.1).
10.5.4.1 Test comparisons based on all available studies
Often, many of the available studies evaluate only one of the tests of interest. By using all studies that have evaluated at least one of the tests, we maximize the number of studies in the analysis. However, the studies are likely to be heterogeneous in terms of design and patient characteristics that are associated with test accuracy and hence confounding may be an issue. In preliminary exploratory analyses in RevMan this can be dealt with by comparing the tests within subgroups of studies that are homogeneous with respect to important potential confounders such as study design or spectrum of disease. The value and feasibility of such exploratory analyses will be affected by the number of available studies, and missing or inconsistent reporting across studies of information on potential confounders.
The statistical methods described in this section follow directly from the earlier description of hierarchical models and how they can be used to investigate heterogeneity in test accuracy. For the comparison of two index tests, the type of test is represented by a binary covariate that is used to identify the test that gave rise to each 2×2 table included in the analysis. Confounders can potentially be adjusted for; however, this is often difficult to do in practice because the number of studies is small and/or data on important confounders may be poorly recorded or incomplete.
Both the Bivariate model and the Rutter and Gatsonis HSROC model can be used to investigate the relative accuracy of two index tests. However, as noted previously, the choice of approach will be influenced by the nature of the available data, and the interpretation of the results will also depend on which approach is used.
10.5.4.2 Test comparisons using the Bivariate model
If, for each index test, the available studies have used a consistent cut-point on a continuous or ordinal scale to define test positivity, then the Bivariate model provides an appropriate framework for test comparisons. It may also be reasonable to assume a consistent cut-point when a test comprises a 'test kit' that produces positive and negative results (such as a coloured line appearing on a device). By adopting the same strategy described earlier (10.5.3.1), a binary covariate for test type can be included in the model to investigate whether the expected sensitivity and/or specificity differs between the tests.
Care must be taken with the interpretation of the results of such a model, particularly if the
common cutpoint for test positivity for either test is applied to a continuous or ordinal scale. Any
inferences made about the relative diagnostic accuracy of the two tests are only valid at the chosen
cutpoint for each of the two tests and cannot be extrapolated to other possible cutpoints. Where
other cutpoints are reported, the analysis can be repeated using the available data to investigate
the relative diagnostic accuracy of the tests at those alternative cutpoints.
Because we are analysing test accuracy data for two alternative index tests, it may not be
reasonable to assume that the variances of the random effects for logit(sensitivity) and
logit(specificity) are the same for the two tests. The Bivariate model can be extended to allow the
variance of the random effects for both to depend on the covariate for test type. This will also
affect the estimated correlation between them. Statistically, estimation of the variances of the
random effects for logit(sensitivity) and logit(specificity), and of the correlation between them,
is subject to a higher level of uncertainty than for the main parameters of interest. However, if
preliminary plots of the study-level estimates of sensitivity and specificity in ROC space show
marked differences in heterogeneity between studies for the two tests, it is advisable to assess
whether the assumption of equal variances of random effects for the two tests is reasonable. This is
usually done by comparing the fit of the alternative models (variances do or do not depend on the
covariate for test type) using a likelihood ratio test. A comparison of the main estimates of
interest between the alternative models is also useful to assess whether conclusions about the
relative sensitivity and/or specificity of the tests are robust to assumptions about the variances
of the random effects. Again, such an investigation will not be feasible if the number of studies is
small.
It is usual for most of the studies in this approach to the analysis of test comparisons to have
evaluated only one of the tests, but some studies will have evaluated both. If the proportion of
studies that have evaluated both is very small, then treating the results of the two tests in a
study as if they were obtained from different studies is unlikely to affect the results. Although
this is often done in practice, such an approach is not recommended if the proportion of studies
evaluating both tests is not small, because it is likely to result in inappropriate standard errors
for the test comparison parameters for sensitivity and specificity. In that case the paired
sensitivity/specificity data for both tests from each study should be modelled within the study at
level one of the analysis, and a binary covariate for test type included to identify which 2×2
table corresponds to each test.
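The data layout implied by this approach can be sketched as follows. This is a minimal illustration in Python rather than one of the Handbook's SAS programs; the study names, the counts and the helper `to_long_format` are all hypothetical. Each 2×2 table becomes one record that keeps its study identifier, so that two tables from the same study can be modelled within that study, and carries a binary covariate for test type.

```python
from typing import Dict, List, Tuple

Table = Tuple[int, int, int, int]  # (tp, fp, fn, tn)


def to_long_format(studies: Dict[str, Dict[str, Table]],
                   referent: str) -> List[dict]:
    """Return one record per 2x2 table, retaining the study identifier."""
    rows = []
    for study_id, tables in studies.items():
        for test, (tp, fp, fn, tn) in tables.items():
            rows.append({
                "study_id": study_id,
                "test": test,
                # Binary covariate: 0 = referent test, 1 = comparator test.
                "test_type": 0 if test == referent else 1,
                "tp": tp, "fp": fp, "fn": fn, "tn": tn,
            })
    return rows


# Hypothetical counts, for illustration only.
studies = {
    "StudyA": {"CT": (45, 5, 3, 47)},
    "StudyB": {"MRI": (30, 12, 6, 52)},
    "StudyC": {"CT": (60, 8, 4, 78), "MRI": (55, 15, 9, 71)},  # evaluated both
}
rows = to_long_format(studies, referent="MRI")
```

A study that evaluated both tests contributes two records sharing one `study_id`, which is what allows the model to treat them as arising from the same study.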
10.5.4.3 Example 3: CT versus MRI for the diagnosis of coronary artery disease
Schuetz et al (Schuetz 2010) evaluated the diagnostic performance of multislice computed
tomography (CT) and magnetic resonance imaging (MRI) for the diagnosis of coronary artery disease
(CAD). Prospective studies that evaluated either CT or MRI (or both), used conventional coronary
angiography (CAG) as the reference standard, and used the same threshold for clinically significant
coronary artery stenosis (a diameter reduction of 50% or greater) were included in the review. A
total of 103 studies provided a 2×2 table for one or both tests and were included in the
meta-analysis: 84 studies evaluated only CT, 14 evaluated only MRI, and 5 studies evaluated both CT
and MRI. (See Appendix for data and SAS programs.)
Because the studies were selected based on a common threshold for clinically significant coronary
artery stenosis, the Bivariate model was used for data synthesis and test comparison. In the first
stage of the analysis, we base our test comparison on all studies that evaluated at least one test.
The approach follows closely the method illustrated in Example 1 for exploring heterogeneity using
the Bivariate model.
A binary covariate (test type) is added to the model, coded as 0 if the 2×2 table is for MRI (the
referent category) and coded as 1 if the 2×2 table is for CT. The five studies that evaluated both
tests contribute a 2×2 table for each test, hence there are 19 studies included for MRI and 89
studies included for CT. Allowing test performance to vary by type of test resulted in a -2 Log
likelihood of 953.0, a reduction of 42.5 compared with the model that contained no covariates.
Hence, there is statistical evidence (chi-square = 42.5, 2 df, P < 0.001) that sensitivity and/or
specificity are associated with test type. Removing the covariate for sensitivity from the model
(chi-square = 976.7 - 953.0 = 23.7, 1 df, P < 0.001) shows strong statistical evidence of a
difference in sensitivity between the two tests. Similarly, removing the covariate for specificity
from the model (chi-square = 976.2 - 953.0 = 23.2, 1 df, P < 0.001) shows strong statistical
evidence of a difference in specificity between the two tests.
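The likelihood ratio tests reported above can be reproduced with a short calculation. The sketch below is in Python rather than SAS and is only an illustration: the function names are ours, and the -2 Log likelihood of the model without covariates (995.5) is not printed in the text but is derived from the reported reduction of 42.5. The upper-tail chi-square probability is coded in closed form for 1 and 2 degrees of freedom only, which is all that these comparisons need.

```python
import math


def chi2_upper_tail(x: float, df: int) -> float:
    """P(chi-square with df degrees of freedom > x), for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def lr_test(m2ll_reduced: float, m2ll_full: float, df: int):
    """Likelihood ratio statistic (difference in -2 Log likelihood) and P value."""
    stat = m2ll_reduced - m2ll_full
    return stat, chi2_upper_tail(stat, df)


full = 953.0                    # covariate on both sensitivity and specificity
no_covariates = full + 42.5     # derived from the reported reduction of 42.5

both = lr_test(no_covariates, full, df=2)   # test type on sens and/or spec
sens = lr_test(976.7, full, df=1)           # covariate for sensitivity removed
spec = lr_test(976.2, full, df=1)           # covariate for specificity removed

for label, (stat, p) in (("both", both), ("sens", sens), ("spec", spec)):
    print(f"{label}: chi-square = {stat:.1f}, P = {p:.2g}")
```

All three P values fall well below 0.001, in line with the values quoted in the text.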
The SAS output for the model that allows both sensitivity and specificity to vary by test is:
Fit Statistics
  -2 Log Likelihood           953.0
  AIC (smaller is better)     967.0
  AICC (smaller is better)    967.5
  BIC (smaller is better)     985.5
Parameter Estimates
  Parameter  Estimate  Standard Error   DF  t Value  Pr > |t|  Alpha     Lower   Upper   Gradient
  msens        2.1771          0.2457  101     8.86    <.0001   0.05    1.6896  2.6645   0.000046
  mspec        0.8754          0.2111  101     4.15    <.0001   0.05    0.4566  1.2942   -0.00008
  s2usens      0.8749          0.2293  101     3.82    0.0002   0.05    0.4201  1.3297   0.000033
  s2uspec      0.8447          0.1696  101     4.98    <.0001   0.05    0.5084  1.1810   -4.31E-6
  covsesp      0.1803          0.1384  101     1.30    0.1956   0.05  -0.09424  0.4548   -0.00002
  se_CT        1.3033          0.2625  101     4.97    <.0001   0.05    0.7827  1.8240   0.000053
  sp_CT        1.0415          0.2154  101     4.84    <.0001   0.05    0.6143  1.4687   -0.00005
Covariance Matrix of Parameter Estimates
  Row  Parameter     msens     mspec   s2usens   s2uspec   covsesp     se_CT     sp_CT
  1    msens       0.06038  0.005241   0.01100  0.000342  0.002242  -0.05262  -0.00404
  2    mspec      0.005241   0.04457  0.000518  0.003360  0.001651  -0.00376  -0.03861
  3    s2usens     0.01100  0.000518   0.05257  0.000694  0.007608  0.003839  -0.00060
  4    s2uspec    0.000342  0.003360  0.000694   0.02875  0.005537  -0.00023  0.000438
  5    covsesp    0.002242  0.001651  0.007608  0.005537   0.01915  0.001155  -0.00110
  6    se_CT      -0.05262  -0.00376  0.003839  -0.00023  0.001155   0.06889  0.004479
  7    sp_CT      -0.00404  -0.03861  -0.00060  0.000438  -0.00110  0.004479   0.04638
Additional Estimates
  Label        Estimate  Standard Error   DF  t Value  Pr > |t|  Alpha   Lower   Upper
  logitsensCT    3.4804          0.1550  101    22.45    <.0001   0.05  3.1729  3.7879
  logitspecCT    1.9169          0.1172  101    16.36    <.0001   0.05  1.6844  2.1494
Covariance Matrix of Additional Estimates
  Row  Label            Cov1      Cov2
  1    logitsensCT   0.02403  0.001916
  2    logitspecCT  0.001916   0.01373
The estimated logit(sensitivity) and logit(specificity) for the referent category (MRI) are given by
msens and mspec respectively. These estimates, their standard errors and their covariance are shown
in the red boxes. The ESTIMATE command in SAS has been used to obtain the corresponding estimates
for CT (shown in the blue boxes). The variances of the random effects and their covariance are shown
in the green box.
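The CT estimates under Additional Estimates are linear combinations of the model parameters, so they can be checked by hand from the Parameter Estimates and their covariance matrix. The following Python sketch (our own illustration, not the SAS ESTIMATE command itself; the function name is hypothetical) adds the covariate effect to the referent estimate and obtains the standard error from the variances and covariance of the two parameters.

```python
import math


def linear_combination(est_a, est_b, var_a, var_b, cov_ab):
    """Estimate and standard error of (a + b) for two correlated estimates."""
    estimate = est_a + est_b
    variance = var_a + var_b + 2.0 * cov_ab
    return estimate, math.sqrt(variance)


# logit(sensitivity) for CT = msens + se_CT
logit_sens_ct = linear_combination(2.1771, 1.3033,
                                   var_a=0.06038, var_b=0.06889,
                                   cov_ab=-0.05262)

# logit(specificity) for CT = mspec + sp_CT
logit_spec_ct = linear_combination(0.8754, 1.0415,
                                   var_a=0.04457, var_b=0.04638,
                                   cov_ab=-0.03861)

print(logit_sens_ct)  # compare with logitsensCT: 3.4804, SE 0.1550
print(logit_spec_ct)  # compare with logitspecCT: 1.9169, SE 0.1172
```

The negative covariance between the referent estimate and the covariate effect is the reason the standard errors for CT are smaller than those of either parameter alone.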
The above estimates can be input to RevMan to produce a ROC scatter plot with summary operating
points for MRI and CT and their confidence regions superimposed, as shown in the figure, where the
black symbols represent CT and the red symbols represent MRI. Because of the large number of studies
for CT, the summary estimate and region are difficult to see. The figure could be redrawn with just
the summary points and regions and shown separately from the ROC scatter plot.
From the output above, the 95% confidence limits for the summary estimates for logit(sensitivity)
and logit(specificity) can be found in the columns headed “lower” and “upper”. Using inverse
transformation, the summary estimates for sensitivity are 0.90 (95% CI 0.84, 0.93) for MRI and 0.97
(95% CI 0.96, 0.98) for CT. The summary estimates for specificity are 0.71 (95% CI 0.61, 0.78) for
MRI and 0.87 (95% CI 0.84, 0.90) for CT. Hence, based on this analysis, there is strong evidence
that CT has higher sensitivity and specificity than MRI for detecting clinically significant
coronary artery stenosis defined as a diameter reduction of 50% or more.
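The inverse transformation used here is the inverse logit, exp(x) / (1 + exp(x)), applied to the estimate and to each confidence limit. A minimal Python sketch of the calculation, using the values from the SAS output above (the function names are ours):

```python
import math


def inv_logit(x: float) -> float:
    """Inverse logit: maps a logit back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def back_transform(estimate, lower, upper):
    """Proportion and 95% confidence limits, rounded to two decimal places."""
    return tuple(round(inv_logit(v), 2) for v in (estimate, lower, upper))


mri_sens = back_transform(2.1771, 1.6896, 2.6645)  # msens
ct_sens = back_transform(3.4804, 3.1729, 3.7879)   # logitsensCT
mri_spec = back_transform(0.8754, 0.4566, 1.2942)  # mspec
ct_spec = back_transform(1.9169, 1.6844, 2.1494)   # logitspecCT
```

Because the transformation is applied to the limits rather than to the standard error, the resulting intervals are not symmetric about the estimate on the probability scale.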
10.5.4.4 Test comparisons using the Rutter and Gatsonis HSROC model
Simple separate comparisons of summary estimates of sensitivity (or specificity) of alternative
tests can be misleading if the included studies have used different cutpoints to define test
positivity. In this situation, comparisons based on SROC curves provide a more informative approach.
The hierarchical modelling strategy used to investigate heterogeneity described earlier for the
Rutter and Gatsonis HSROC methods (10.5.3.3) can be used for comparisons of test accuracy when there
is variability in threshold between studies. The type of test is represented by a binary covariate
that is used to identify the test that gave rise to each 2×2 table included in the analysis. This
covariate then allows the reviewer to investigate whether test type is associated with the shape and
position of the summary ROC curve. Interpretation of the results follows directly from the
discussion of the interpretation of investigations of heterogeneity in 10.5.3.
Statistically, estimation of the variances of the random effects for threshold and accuracy is
subject to a higher level of uncertainty than for the main model parameters of interest. If
preliminary plots of the study-level estimates of sensitivity and specificity in ROC space show
marked differences in heterogeneity between studies for the two tests, it is advisable to assess
whether the assumption of equal variances of the random effects for the two tests is reasonable.
This is usually done by comparing the fit of the alternative models (i.e. where variances do, or do
not, depend on the covariate for test type). A comparison of the main estimates of interest between
the alternative models is also useful to assess whether conclusions about the relative shape and
accuracy of the summary curves for the two tests are robust to assumptions about the variances of
the random effects. Again, such an investigation will not be feasible if the number of studies is
small.
As noted for the Bivariate model, it is usual for most of the included studies to have evaluated
only one of the tests, but some studies will have evaluated both. If the proportion of studies that
have evaluated both is very small, then treating the results of the two tests in a study as if they
were obtained from different studies is unlikely to affect the results. However, more accurate
standard errors will be obtained for the test comparison parameters if the data for both tests are
modelled within the study at level one of the analysis. A binary covariate for test type must be
included to identify which 2×2 table corresponds to each test.
10.5.4.5 Test comparison based on studies that directly compare tests
As noted earlier, heterogeneity in the estimated accuracy of a diagnostic test across studies is
likely to occur. This could confound the comparison of two tests if different studies are used to
estimate the diagnostic accuracy of each test. Ideally, the comparison should be based on studies
that have made a direct comparison of the tests of interest by either applying both tests to each
individual, or by randomizing each individual to receive one of the tests. A common reference
standard should be applied to both tests. If there are sufficient studies of this type on which to
base a test comparison, the results are less prone to bias than an analysis based on all available
studies that have evaluated one or both tests.
A preliminary graphical analysis can be conducted in RevMan by plotting the estimated sensitivity
and specificity for both tests, for each study, in ROC space. The two points contributed by each
study (one for each test) are joined by a line to highlight the relative test accuracy within each
study (see 10.3.2). This figure illustrates the pairing of test accuracy estimates at the study
level.
The rationale described above for choosing between the Bivariate model and the HSROC model when
making test comparisons is also applicable here, and the same points relating to interpretation
apply. The only major difference is that the analysis does not include any studies that have
evaluated only one of the tests.
Because each study contributes a 2×2 table for each of the two tests to be compared, the data for
the two tests must be analysed within study at level one of the analysis, and a binary covariate for
test type included to identify which 2×2 table corresponds to each test. Entering a separate 2×2
table for each test (within each study) for analysis in a hierarchical model effectively assumes
that the data arise from a randomized design. This represents a conservative approach that is often
necessitated by the lack of information on paired results at the individual level for truly ‘paired’
studies that have applied both tests to the same individual.
Meta-analytical models that account for pairing of test results within an individual in studies
which have used a paired study design are not commonly used, and require further development and
testing before they are implemented in a Cochrane review. Such an extension would also require that
researchers publish a cross-classification of test results within both the diseased and
non-diseased groups. This is not common practice at present.
10.5.4.6 Example 3 (cont.): CT versus MRI for the diagnosis of coronary artery disease
The meta-analysis by Schuetz also included 5 studies that made a direct comparison of CT and MRI.
Basing the analysis on these 5 studies (ten 2×2 tables) has the advantage that the results should be
less prone to bias. However, the number of studies in the analysis is dramatically reduced, which
reduces the precision of the summary estimates. As we will see in this example, simplifying
assumptions may also be required to fit complex hierarchical models to these data. We will again
apply the Bivariate model for these data.
The ROC scatter plot shows the data for the 5 paired studies, with black used to denote CT and red
used to denote MRI. A line is used to join the results for CT and MRI within each study. Examining
this plot, we can see that sensitivity for CT is lower than for MRI in one study, equivalent in one
study, and higher in the other three studies. Specificity is higher for CT than for MRI in 3 studies
and lower in the other 2. Fitting a model to these data is difficult, particularly for the Bivariate
model where convergence is more problematic than for the Rutter and Gatsonis model (see 10.5.6).
A preliminary series of models was fitted to assess whether random effects should be included for
both sensitivity and specificity (these models did not include the covariate for test type). The
model that included random effects only for specificity gave a -2 Log likelihood of 106.4, a better
fit than the model that assumed a fixed effect for both sensitivity and specificity (-2 Log
likelihood 114.8). The model that assumed random effects only for sensitivity provided no
improvement to the fit compared with the fixed effect model. Hence, the covariate for test type was
added to the model with random effects for specificity only. The SAS output for this model is shown
below.
Fit Statistics
  -2 Log Likelihood            89.6
  AIC (smaller is better)      99.6
  AICC (smaller is better)    103.9
  BIC (smaller is better)      97.6
Parameter Estimates
  Parameter  Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha    Lower   Upper   Gradient
  msens        1.8083          0.2412   4     7.50    0.0017   0.05   1.1385  2.4781   0.000011
  mspec        0.8910          0.3606   4     2.47    0.0689   0.05  -0.1101  1.8921   1.197E-6
  s2uspec      0.4239          0.3744   4     1.13    0.3208   0.05  -0.6156  1.4634   -1.35E-7
  se_CT        1.0051          0.4195   4     2.40    0.0747   0.05  -0.1596  2.1698   -6.32E-6
  sp_CT        0.9378          0.2955   4     3.17    0.0337   0.05   0.1175  1.7581   3.672E-6
Covariance Matrix of Parameter Estimates
  Row  Parameter      msens     mspec   s2uspec     se_CT     sp_CT
  1    msens        0.05820    -7E-12  -131E-13  -0.05820  -147E-14
  2    mspec         -7E-12    0.1300  -0.01214  1.13E-11  -0.03387
  3    s2uspec     -131E-13  -0.01214    0.1402  2.43E-11  0.003983
  4    se_CT       -0.05820  1.13E-11  2.43E-11    0.1760  9.95E-12
  5    sp_CT       -147E-14  -0.03387  0.003983  9.95E-12   0.08729
Additional Estimates
  Label        Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha   Lower   Upper
  logitsensCT    2.8134          0.3432   4     8.20    0.0012   0.05  1.8606  3.7663
  logitspecCT    1.8287          0.3867   4     4.73    0.0091   0.05  0.7550  2.9025
Covariance Matrix of Additional Estimates
  Row  Label            Cov1      Cov2
  1    logitsensCT    0.1178  1.28E-11
  2    logitspecCT  1.28E-11    0.1496
The layout and interpretation of the output follow that for the indirect test comparison example
discussed earlier, with red indicating MRI results and blue indicating CT. Note that the estimated
covariance between msens and mspec is effectively zero. The green box shows the estimated variance
of the random effects for specificity.
These estimates were input to RevMan to superimpose summary estimates and their confidence regions
on the ROC scatter plot. A zero value was entered for the variance of the random effects for
sensitivity.
Inverse logit transformation of the estimates and their lower and upper 95% confidence limits gives
estimated sensitivities of 0.86 (95% CI 0.76, 0.92) for MRI and 0.94 (95% CI 0.87, 0.98) for CT; and
estimated specificities of 0.71 (95% CI 0.47, 0.87) for MRI and 0.86 (95% CI 0.68, 0.95) for CT.
These estimates are consistent with the previous analysis, which showed that CT had higher
sensitivity and specificity than MRI.
The confidence regions shown on the figure are wider than would be indicated by the confidence
intervals given above. The confidence regions computed by RevMan appear to be overly conservative
when the number of studies is small. This issue will be investigated and modifications made to later
versions of the software if required.
The t statistics in the output above provide only weak evidence of a difference in sensitivity
(P=0.075 for parameter se_CT) and evidence of a difference in specificity (P=0.034 for parameter
sp_CT). The P values based on changes in the -2 Log likelihood are lower (0.012 and 0.0011
respectively), indicating stronger evidence for these effects. Given the small number of studies in
this analysis and the resulting difficulty in checking model assumptions regarding the distributions
of the random effects, it may be advisable to take a conservative approach. The key inference here
is that the results of this analysis are consistent with the conclusions of the earlier indirect
comparison, which was based on all available studies that evaluated at least one of the index tests.
10.5.5 Computer software
Both of the hierarchical models we have focused on can be fitted using a range of statistical
packages. WinBUGS (or its recent open source version OpenBUGS) provides a flexible Bayesian
framework for model fitting. It can be used to fit both the Bivariate and HSROC models. Obtaining
the parameters needed to create confidence and prediction regions for the RevMan SROC plots requires
estimates of the standard errors, which need to be generated from the posterior distributions.
Many analysts find fitting the models using standard commonly used software packages such as SAS,
Stata and MLwiN more straightforward.
The Bivariate model can be fitted using software that can fit a generalized linear mixed model.
Commonly used routines are Proc NLMIXED (or Proc GLIMMIX) in SAS, and xtmelogit (or the user-written
package gllamm) in Stata. All of these programs assume that the random effects are normally
distributed. WinBUGS can allow alternative distributions for the random effects if that is deemed
necessary. A user-written command metandi is available in Stata to fit a model without covariates
(Harbord 2009), and a macro METADAS written in SAS is available which fits models with and without
covariates (Takwoingi 2008). Both neatly tabulate the output required for populating the plotting
functions in RevMan.
The parameterization for the Rutter and Gatsonis model represents a generalized non-linear mixed
model. If covariates are to be included and tested in the model, then the range of available
software is more limited because of the non-linear form of the model if the shape parameter is
included in the model. This model is usually fitted using Proc NLMIXED in SAS (Macaskill 2004). The
SAS METADAS macro also fits HSROC models with and without covariates. The Stata metandi command can
be used to fit the Rutter and Gatsonis HSROC model without covariates. However, it should be noted
that it does this by exploiting the mathematical equivalence between the Bivariate and HSROC models
when there are no covariates in the model. Hence, it is applicable when an overall HSROC curve is to
be fitted to a group of studies but does not allow inclusion of covariates.
RevMan can use the parameter estimates from either model to estimate a summary curve, a summary
operating point, and a confidence region and prediction region for the summary point. However, the
review author needs to be clear which of these summary measures are appropriate for their analysis.
10.5.6 Approaches to analysis with small numbers of studies
When the number of studies is small it may be difficult to decide on which terms should be included
in a model, and which is the ‘best’ model. For instance, when fitting a summary ROC curve, the
uncertainty associated with the estimation of the shape parameter could be very high, and the
estimate may also be strongly influenced by the inclusion/exclusion of individual studies. For both
the Bivariate and HSROC models, estimates of the variances of the random effects will be subject to
a high level of uncertainty.
It is important to keep in mind that estimation of a single summary point using the Bivariate model,
or estimation of a single summary curve using the HSROC model, requires five parameters to be
estimated in the full model specification. There is little information on which to base these
estimates when the number of studies is small, so analysts must take this into account when
interpreting the results. In some situations models may fail to converge.
It is not possible to give hard and fast rules about how to proceed when dealing with small numbers
of studies. However, some strategies are outlined here which may help in some situations.
Ultimately, judgement must be exercised regarding whether a model is sufficiently reliable to
report.
Failure of a model to converge may be symptomatic of several problems:
• In some cases, it may be due to poor choices of starting values for the parameter estimates. If
so, it may help to fit the model first assuming a fixed effect for the model parameters, and then
use these as the starting values for the random effects model.
• For small datasets, convergence may also be affected by the inclusion/removal of individual
studies. The effect of such influential studies should be investigated.
• Convergence problems can also arise when the variance of one of the random effects is close to
zero. This is particularly an issue for the Bivariate model parameterisation, where an examination
of the scatter plot may help to identify strong heterogeneity in sensitivity but homogeneity in
specificity, or vice versa. Restricting the model to have random effects for one parameter, and a
fixed effect for the other, may then be warranted. This particular problem can also occur when the
number of studies is relatively large.
The standard error for the shape parameter in the HSROC model may be large. It would be advisable to
check how much the shape is influenced by the removal of individual studies. When the shape is
uncertain and also very dependent on individual studies, then some analysts may choose to assume
symmetry for the summary curve to acknowledge that the shape cannot be estimated reliably. Again,
this needs to be reported and discussed in the report of the analyses.
10.6 Special topics
10.6.1 Sensitivity analysis
The process of undertaking a systematic review involves a sequence of decisions. Whilst many of
these decisions are clearly objective and non-contentious, some will be somewhat arbitrary or
unclear. For instance, if inclusion criteria involve a numerical value, the choice of value is
usually arbitrary: for example, defining groups of older people may reasonably have lower limits of
60, 65, 70 or 75 years, or any value in between. Other decisions may be unclear because a study
fails to include the required information. Further decisions are unclear because there is no
consensus on the best method to use for a particular problem, such as defining a reference standard
or analysing missing data or intermediate test results.
It is desirable to demonstrate that the findings from a systematic review are not dependent on such
arbitrary or unclear decisions. A sensitivity analysis is a repeat of the primary analysis or
meta-analysis, substituting alternative decisions or ranges of values for decisions that were
arbitrary or unclear. For example, if the eligibility of some studies in the meta-analysis is
dubious because they do not contain full details, sensitivity analysis may involve undertaking the
meta-analysis twice: first, including all studies, and second, only including those that are
definitely known to be eligible. A sensitivity analysis asks the question “Are the findings robust
to the decisions made in the process of obtaining and analysing them?”. A sensitivity analysis is
not the same as a subgroup analysis, where the purpose is to investigate how study design and
patient characteristics are associated with test accuracy. The aim in the subgroup analysis is to
explore and explain heterogeneity in test accuracy.
There are many decision nodes within the systematic review process which can generate a need for
sensitivity analysis. Examples include:
• Searching for studies:
o Should abstracts whose results cannot be confirmed in subsequent publications be included in the
review?
• Eligibility criteria:
o Characteristics of participants: where a majority but not all people in a study meet the required
presentation or demographic, should the study be included?
o Characteristics of tests: what versions of a test technology should be included? What threshold
definition constitutes a common threshold?
o Characteristics of the reference standard: where there are variations on information used in a
clinical opinion based reference standard, should they all be included? Where the reference
standard involves follow-up, what lengths of follow-up are considered adequate?
o Study methods: should only fully uniformly verified studies be included? Should unblinded studies
be included? Should case-control studies be included? Or should inclusion be restricted by any other
methodological criteria?
• What data should be analysed?
o How should uninterpretable test results be handled in the analysis? Should they be classified as
test negatives or excluded?
o How should missing data be handled in the analysis?
• Analysis methods:
o Should a common or symmetric shape for an ROC curve be presumed across subgroups or tests?
o Can equal variances be presumed for all tests in a comparison?
Reporting of sensitivity analyses in a systematic way may best be done by producing a summary table.
Some sensitivity analyses can be pre-specified in the study protocol, but many issues suitable for
sensitivity analysis are only identified during the review process, where the individual
peculiarities of the studies under investigation are identified. Where sensitivity analyses show
that the overall result and conclusions are not affected by the different decisions made during the
review process, the results of the review can be regarded with a higher degree of certainty. Where
sensitivity analyses identify particular decisions or missing information that greatly influence the
findings of the review, greater resources can be deployed to try to resolve uncertainties and obtain
extra information, possibly through contacting study authors. If this cannot be achieved, the
results must be interpreted with an appropriate degree of caution. Such findings may generate
proposals for further investigations and future research.
Sensitivity analysis may sometimes be confused with subgroup analysis. Although some sensitivity
analyses may involve restricting the analysis to a subset of the totality of the studies, the two
methods differ in two ways. First, sensitivity analyses do not attempt to estimate the effect of the
covariate in the group of studies removed from the analysis, whereas in subgroup analysis estimates
are produced for all groups. Second, in sensitivity analysis informal comparisons are made between
different ways of estimating the same thing, whereas in subgroup analysis formal statistical
comparisons are made across the subgroups.
10.6.2 Investigating and handling verification bias
An examination of the potential for verification bias in a systematic review would ordinarily be
part of the assessment of study quality. If verification bias is present, corrections would need to
be implemented within individual studies before proceeding to the meta-analysis. The literature on
methods for correcting verification bias in individual studies is by now extensive. For example, the
analyst may want to consult Chapter 10 in Zhou et al (Zhou 2002) and Chapter 7 in Pepe (Pepe 2003).
It may also be useful to investigate the presence of verification bias as a source of heterogeneity
among studies. Issues arise in the extraction of study data when adjustments have been made for
verification bias, as described in Chapter 8.
10.6.3 Investigating and handling publication bias
Systematic reviewers must undertake comprehensive searches to attempt to locate all relevant
studies. If the studies included in the review have results that differ systematically from relevant
studies that are missed, estimates derived from the meta-analysis will be affected by publication
bias (Begg 1994a).
Although there is substantial literature relating to publication bias in systematic reviews of
randomized controlled trials, little research has been done in the context of systematic reviews of
diagnostic studies. However, it is clear that the determinants of publication bias for reviews of
RCTs (Dickersin 1990), (Ioannidis 1998) are unlikely to be generalizable to reviews of diagnostic
studies. For instance, when considering diagnostic test accuracy, statistical significance is not
particularly relevant as few studies formulate and test hypotheses. Another difference is the likely
relationship between study size and methodological quality. Whereas large RCTs require large-scale
funding and are on average conducted and analysed with greater methodological rigour than small
RCTs, large diagnostic studies may be no more than an analysis of a large laboratory database of
routinely collected data.
Statistical tests detect funnel plot asymmetry in general rather than publication bias specifically
(see section 10.4 of the Cochrane Handbook for Systematic Reviews of Interventions). Tests for
funnel plot asymmetry designed primarily for use in randomized trials, including the Egger (Egger
1997), Begg (Begg 1994b), Harbord (Harbord 2006) and Peters (Peters 2006) tests, should not be used
with diagnostic studies. It is well established that the accuracy of such tests for funnel plot
asymmetry is reasonable if the odds ratio is close to 1 (as occurs in many randomized trials), but
deteriorates as the odds ratio moves away from 1 (Macaskill 2001), (Schwarzer 2002). For diagnostic
studies, the odds ratio is expected to be large. Applying such tests for funnel plot asymmetry in
systematic reviews of diagnostic test accuracy is likely to result in publication bias being
incorrectly indicated by the test far too often (a Type I error rate that is too high) (Deeks 2005).
A more appropriate method for detecting funnel plot asymmetry in reviews of diagnostic studies has
been developed (Deeks 2005). It tests for association between the lnDOR and the ‘effective sample
size’, a simple function of the number of diseased and non-diseased individuals. A simulation study
has shown that the test has modest power for detecting funnel plot asymmetry. However, when there is
heterogeneity in the DOR, even this test has low power, as do all tests for funnel plot asymmetry.
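The quantities involved in this test can be sketched in Python as follows. The text above does not give the formulae, so the details here are assumptions based on our reading of Deeks 2005: the effective sample size is taken as 4·n1·n2/(n1 + n2), where n1 and n2 are the numbers of diseased and non-diseased individuals, and the lnDOR is regressed on 1/√ESS with weights equal to ESS. Adding 0.5 to every cell when a table contains a zero is also an assumption. The five 2×2 tables are the first five rows of the Anti-CCP data listed in the Appendix and are used only to show the mechanics; the sketch computes the regression slope but not its significance test.

```python
import math


def ln_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio; 0.5 is added to every cell if any is zero
    (assumed continuity correction)."""
    cells = [tp, fp, fn, tn]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    tp, fp, fn, tn = cells
    return math.log((tp * tn) / (fp * fn))


def effective_sample_size(tp, fp, fn, tn):
    """ESS = 4 * n1 * n2 / (n1 + n2), n1 = diseased, n2 = non-diseased (assumed form)."""
    n_dis, n_non = tp + fn, fp + tn
    return 4.0 * n_dis * n_non / (n_dis + n_non)


def weighted_slope(x, y, w):
    """Slope of the weighted least squares regression of y on x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


# (tp, fp, fn, tn) for the first five studies in the Appendix table.
tables = [(115, 17, 16, 73), (110, 24, 86, 215), (40, 5, 58, 227),
          (23, 0, 7, 39), (236, 20, 88, 231)]

ess = [effective_sample_size(*t) for t in tables]
x = [1.0 / math.sqrt(e) for e in ess]
y = [ln_dor(*t) for t in tables]
slope = weighted_slope(x, y, ess)  # a slope far from zero suggests asymmetry
```

In practice the slope would be tested against zero using its standard error from a statistical package, with all eligible studies included rather than the handful shown here.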
Since heterogeneity in test accuracy is to be expected in many diagnostic reviews, review authors
are warned against interpreting statistical evidence of funnel plot asymmetry as necessarily
implying publication bias. Study size may be related to test accuracy for reasons other than
publication bias. Exploration of heterogeneity in test accuracy should be undertaken, as patient and
study characteristics may be associated with study size as well as test accuracy (Deeks 2005).
Further research is required to improve our understanding of the determinants and extent of
publication bias for diagnostic studies.
10.6.4 Developments in meta-analysis for DTA reviews
This chapter reflects the currently established methods for meta-analysis of diagnostic test
accuracy. Methodological developments occur often in this field, and the methods used in Cochrane
DTA reviews are sure to develop over time to extend the scope of the models and data structures
which can be included. As new methods are shown to be robust and of importance, and software made
available for their implementation, they will be included in updates of this chapter.
Of particular interest are analytical methods being developed to include data from multiple
thresholds for each study, which allow both more accurate estimation of summary ROC curves and
estimates of average sensitivity and specificity values at stated thresholds, but these require
further evaluation before they will be incorporated in Cochrane reviews (Dukic 2003), (Hamza 2009).
Appendix
The programs listed below are in SAS, but the results can be reproduced using other software. The
export facility in RevMan 5 was used to create a .csv file which contained the 2x2 tables for each
study included in each review. The .csv file can be read by Excel and can also be imported into
statistical programs such as SAS for further analysis. (Note: additional columns of data that are not
relevant to our analyses are not shown.) Files are available from the Cochrane DTA website
(srdta.cochrane.org).
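For readers working outside SAS, the data preparation used by the bivariate programs in this Appendix (two binomial records per study, one for the diseased group and one for the non-diseased group) can be sketched as follows. This is a minimal illustration, not the authors' code: the comma-separated layout and column names of the sample text are assumptions about the export, and the helper names are hypothetical. The three rows are taken from the Example 1 data.

```python
import csv
import io

# Assumed layout of the exported file; only the columns needed here are shown.
SAMPLE = """study_id,tp,fp,fn,tn
Aotsuka2005,115,17,16,73
Bas2003,110,24,86,215
Bizzaro2001,40,5,58,227
"""

def read_tables(handle):
    """Read one 2x2 table per row, converting the four cell counts to integers."""
    rows = []
    for rec in csv.DictReader(handle):
        row = {"study_id": rec["study_id"]}
        for cell in ("tp", "fp", "fn", "tn"):
            row[cell] = int(rec[cell])
        rows.append(row)
    return rows

def to_long(rows):
    """Create two records per study, mirroring the SAS data step.

    sens=1 marks the diseased group (true = tp, n = tp + fn);
    spec=1 marks the non-diseased group (true = tn, n = tn + fp).
    """
    long_rows = []
    for r in rows:
        long_rows.append({"study_id": r["study_id"], "sens": 1, "spec": 0,
                          "true": r["tp"], "n": r["tp"] + r["fn"]})
        long_rows.append({"study_id": r["study_id"], "sens": 0, "spec": 1,
                          "true": r["tn"], "n": r["tn"] + r["fp"]})
    return long_rows

tables = read_tables(io.StringIO(SAMPLE))
long_rows = to_long(tables)
for rec in long_rows:
    label = "sensitivity" if rec["sens"] else "specificity"
    print(rec["study_id"], label, round(rec["true"] / rec["n"], 3))
```

Each printed proportion is the observed sensitivity or specificity for one study; the hierarchical models then treat `true` as binomial with denominator `n`.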
Data and SAS file for Example 1: Anti-CCP for the diagnosis of rheumatoid arthritis
Data (nishimuraCCP.csv)
Test study_id CCP_generation tp fp fn tn method_of_measurement
AntiCCP Aotsuka2005 CCP2 115 17 16 73
AntiCCP Bas2003 CCP1 110 24 86 215
AntiCCP Bizzaro2001 CCP1 40 5 58 227
AntiCCP Bombardieri2004 CCP2 23 0 7 39
AntiCCP Choi2005 CCP2 236 20 88 231
AntiCCP Correa2004 CCP2 74 11 8 130
AntiCCP DeRycke2004 CCP2 89 4 29 142
AntiCCP Dubucquoi2004 CCP2 90 2 50 129
AntiCCP FernandezSuarez2005 CCP2 31 0 22 75
AntiCCP GarciaBerrocal2005 CCP2 69 8 18 38
AntiCCP Girelli2004 CCP2 25 2 10 40
AntiCCP GoldbachMansky2000 CCP1 43 1 63 120
AntiCCP Greiner2005 CCP2 70 5 17 228
AntiCCP GrootenboerMignot2004 CCP2 167 8 98 88
AntiCCP Hitchon2004 CCP2 26 8 15 15
AntiCCP Jansen2003 CCP1 110 3 148 118
AntiCCP Kamali2005 CCP2 26 1 20 56
AntiCCP Kumagai2004 CCP2 64 14 15 293
AntiCCP Kwok2005 CCP2 71 2 58 66
AntiCCP LeeandSchur2003 CCP2 68 14 35 132
AntiCCP LopezHoyos2004 CCP2 38 3 0 73
AntiCCP Nell2005 CCP2 42 2 60 96
AntiCCP Nielen2005 CCP2 149 7 109 114
AntiCCP Quinn2006 CCP2 147 10 35 106
AntiCCP RantapaaDahlqvist2003 CCP2 47 7 20 375
AntiCCP Raza2005 CCP2 24 3 18 79
AntiCCP Saraux2003 CCP1 40 11 46 146
AntiCCP Sauerland2005 CCP2 171 26 60 443
AntiCCP Schellekens1998 CCP1 72 14 77 298
AntiCCP Soderlin2004 CCP2 7 2 9 51
AntiCCP Suzuki2003 CCP2 481 23 68 185
AntiCCP Vallbracht2004 CCP2 190 12 105 408
AntiCCP vanGaalen2005 CCP2 82 13 71 301
AntiCCP vanVenrooij2004 CCP2 865 79 252 2218
AntiCCP Vincent2002 CCP1 139 7 101 464
AntiCCP Vittecoq2004 CCP2 69 5 107 133
AntiCCP Zeng2003 CCP1 90 7 101 313

SAS Program (nishimuraCCP.sas):
/* Import data */
proc import out=nishimura
  datafile='C:\chapter10\nishimuraCCP.csv'
  dbms=csv replace;
  getnames=yes;
run;
data nishimura_accp;
  set nishimura;
  where test='AntiCCP';
run;
/* Create two separate records for the true results in each study,
the first for the diseased group, and the second for the non-diseased group.
The variable sens is an indicator which takes the value 1 if true = true positives and 0 otherwise,
the variable spec is also an indicator that takes the value 1 if true = true negatives and 0 otherwise */
data nishimura_accp;
  set nishimura_accp;
  sens=1; spec=0; true=tp; n=tp+fn; output;
  sens=0; spec=1; true=tn; n=tn+fp; output;
run;
/* Ensure that both records for a study are clustered together */
proc sort data=nishimura_accp;
  by study_id;
run;
/* Run the bivariate model with no covariates.
The "cov" option requests that a covariance matrix is printed for
all model parameter estimates. The "ecov" option requests a covariance matrix
for all additional estimates that are computed. */
proc nlmixed data=nishimura_accp cov ecov;
  /* Specify starting values for all parameters to be estimated
  and ensure that the variances of the random effects cannot be negative */
  parms msens=1 to 2 by 0.5 mspec=2 to 4 by 0.5 s2usens=0.2 s2uspec=0.6 covsesp=0;
  bounds s2usens>=0;
  bounds s2uspec>=0;
  logitp=(msens+usens)*sens+(mspec+uspec)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  /* usens and uspec represent the random effects. They are both assumed to be
  normally distributed with mean zero. Their variance estimates are s2usens and s2uspec,
  and their covariance estimate is covsesp */
  random usens uspec~normal([0,0],[s2usens,covsesp,s2uspec])
    subject=study_id out=randeffs;
  /* Additional estimates that are functions of the model parameters can be estimated here:
  e.g. the positive and negative likelihood ratios */
  estimate 'logLR+' log((exp(msens)/(1+exp(msens)))/(1-(exp(mspec)/(1+exp(mspec)))));
  estimate 'logLR-' log((1-(exp(msens)/(1+exp(msens))))/(exp(mspec)/(1+exp(mspec))));
run;
/* Check assumption of normality for the random effects */
proc univariate data=randeffs plot normal;
  class effect;
  var estimate;
run;
/* Create a dummy variable for CCP generation, coded as 0 for 'CCP1' (the referent generation)
and coded as 1 for 'CCP2'. This new variable is added to the dataset created above. */
data nishimura_accp;
  set nishimura_accp;
  ccpg=0;
  if ccp_generation="CCP2" then ccpg=1;
run;
/* Add the covariate CCPG to the model to allow both sensitivity and specificity to be
associated with generation of the test */
proc nlmixed data=nishimura_accp cov ecov;
  parms msens=1 mspec=2 s2usens=0.2 s2uspec=0.6 covsesp=0 se2=0 sp2=0;
  bounds s2usens>=0;
  bounds s2uspec>=0;
  logitp=(msens+usens+se2*ccpg)*sens+(mspec+uspec+sp2*ccpg)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random usens uspec~normal([0,0],[s2usens,covsesp,s2uspec])
    subject=study_id out=randeffs;
  /* Estimate logit(sensitivity) and logit(specificity) for CCP2
  (their correlation will be output because of the "ecov" option for nlmixed),
  and also log likelihood ratios for CCP1 and CCP2 */
  estimate 'logit sens CCP2' msens+se2;
  estimate 'logit spec CCP2' mspec+sp2;
  estimate 'logLR+ CCP1' log((exp(msens)/(1+exp(msens)))/(1-(exp(mspec)/(1+exp(mspec)))));
  estimate 'logLR- CCP1' log((1-(exp(msens)/(1+exp(msens))))/(exp(mspec)/(1+exp(mspec))));
  estimate 'logLR+ CCP2' log((exp(msens+se2)/(1+exp(msens+se2)))/(1-(exp(mspec+sp2)/(1+exp(mspec+sp2)))));
  estimate 'logLR- CCP2' log((1-(exp(msens+se2)/(1+exp(msens+se2))))/(exp(mspec+sp2)/(1+exp(mspec+sp2))));
run;
/* Check assumption of normality for the random effects */
proc univariate data=randeffs plot normal;
  class effect;
  var estimate;
run;
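The likelihood ratio expressions in the estimate statements above are easier to follow when written out step by step. The sketch below back-transforms the mean logit sensitivity and mean logit specificity and then forms LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The input values are hypothetical and chosen only to make the arithmetic easy to check; they are not estimates from the Anti-CCP data.

```python
import math

def expit(x):
    """Inverse logit: convert a log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def likelihood_ratios(logit_sens, logit_spec):
    """Return (LR+, LR-) from mean logit sensitivity and mean logit specificity."""
    sens = expit(logit_sens)
    spec = expit(logit_spec)
    lr_pos = sens / (1.0 - spec)
    lr_neg = (1.0 - sens) / spec
    return lr_pos, lr_neg

# Hypothetical summary estimates: sensitivity 0.75 and specificity 0.90.
msens = math.log(0.75 / 0.25)
mspec = math.log(0.90 / 0.10)
lr_pos, lr_neg = likelihood_ratios(msens, mspec)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"logLR+ = {math.log(lr_pos):.3f}, logLR- = {math.log(lr_neg):.3f}")
```

For the covariate model, the same function would be applied to msens+se2 and mspec+sp2 to obtain the CCP2 values. Confidence intervals should be computed on the log scale from the standard errors that SAS reports for the estimate statements and then exponentiated.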
Data and SAS file for Example 2: Rheumatoid Factor as a marker for Rheumatoid Arthritis
Data (nishimuraRF.csv)
test study_id CCP_generation tp fp fn tn method_of_measurement
RF Young1991 25 1 14 20 Rheumatoidarthritishemagglutination
RF Nell2005 56 11 46 87 Notreported
RF Quinn2006 115 53 67 63 Notreported
RF Bizzaro2001 61 36 37 196 Nephelometry
RF Bombardieri2004 27
6 3 33 Nephelometry
RF Das2004 42 46 14 127 Nephelometry
RF FernandezSuarez2005 30 2 23 73 Nephelometry
RF Girelli2004 32 29 3 13 Nephelometry
RF GoldbachMansky2000 70 39 36 93 Nephelometry
RF Greiner2005 75 42 12 191 Nephelometry
RF GrootenboerMignot2004
64 18 29 73 Nephelometry
RF Hitchon2004 32 10 9 13 Nephelometry
RF Jansen2003 130 8 128 113 Nephelometry
RF Kwok2005 77 16 52 52 Nephelometry
RF LopezHoyos2004 36 3 5 70 Nephelometry
RF Sauerland2005 161 89 7 360 Nephelometry
RF Spiritus2004 57 9
33 93 Nephelometry
RF Suzuki2003 383 38 166 170 Nephelometry
RF Swedler1997 89 3 9 39 Nephelometry
RF Aho1999 64 16 27 153 LA
RF AnuradhaandChopra2005 482 2 82 153 LA
RF Berthelot1995 80 50 39 45 LA
RF Choi2005 261 54
63 197 LA
RF Cordonnier1996 20 2 29 18 LA
RF DeRycke2004 93 28 25 118 LA
RF Despres1994 143 39 63 130 LA
RF Kamali2005 20 32 26 25 LA
RF LeeandSchur2003 73 22 29 90 LA
RF Raza2005 22 2
20 80 LA
RF Saraux1995 8 8 31 91 LA
RF Soderlin2004 5 4 11 49 LA
RF Thammanichanond2005 57 25 6 111 LA
RF Vittecoq2001 26 1 32 29 LA
RF Winkles1989 113 19 29 481 LA
RF Banchuin1992 36 6 41 313
ELISA
RF Bas2003 143 43 53 196 ELISA
RF CarpenterandBartkowiak1989 60 8 20 119 ELISA
RF DavisandStein1989 18 3 31 25 ELISA
RF deBois1996 8 8 0 31 ELISA
RF Dubucquoi2004 84 41 56 90 ELISA
RF GomesDaudrix1994 48
1 40 99 ELISA
RF Jonsson1998 50 14 20 191 ELISA
RF RantapaaDahlqvist2003 49 23 28 359 ELISA
RF Saraux2003 35 8 51 149 ELISA
RF Schellekens2000 80 28 69 284 ELISA
RF Vallbracht2004 196 75 99 345 ELISA
RF vanLeeuwen1988 163
10 28 140 ELISA
RF Vasiliauskiene2001 75 21 21 106 ELISA
RF Visser1996 157 287 78 1466 ELISA
RF Vittecoq2004 62 11 114 127 ELISA

SAS Program (nishimuraRF.sas):
/* Import data */
proc import out=nishimura
  datafile='C:\chapter10\nishimuraRF.csv'
  dbms=csv
  replace;
  getnames=yes;
run;
/* Select only studies that have evaluated RF */
data nishimura_RF;
  set nishimura;
  where test='RF';
run;
proc print;
run;
data nishimura_RF;
  set nishimura_RF;
  /* Create separate records for the diseased and non-diseased groups in each study.
  The variable dis is the disease indicator which takes the value 0.5 if diseased
  and -0.5 if not diseased. */
  dis=0.5; pos=tp; n=tp+fn; output;
  dis=-0.5; pos=fp; n=tn+fp; output;
run;
/* Ensure that both records for a study are clustered together */
proc sort data=nishimura_RF;
  by study_id dis;
run;
/* Run the Rutter and Gatsonis HSROC model with no covariates.
Request covariance matrices for model parameters ("cov") and
also for additional estimates that are computed ("ecov") */
proc nlmixed data=nishimura_RF ecov cov;
  /* Set starting values for all model parameters to be estimated */
  parms alpha=2 theta=0 beta=0 s2ua=0 s2ut=0;
  logitp=(theta+ut+(alpha+ua)*dis)*exp(-(beta)*dis);
  p=exp(logitp)/(1+exp(logitp));
  model pos~binomial(n,p);
  /* The random effects for accuracy (ua) and threshold (ut) are assumed to be
  approximately normally distributed, both with mean zero and with variances
  s2ua and s2ut respectively. The covariance of the random effects is set to 0. */
  random ut ua~normal([0,0],[s2ut,0,s2ua]) subject=study_id out=randeffs;
run;
/* Create two dummy variables to allow for the three RF measurement methods.
LA is the referent method.
Delete the 2 studies that did not report the method, and the study that used
a different method. */
data nishimura_RF;
  set nishimura_RF;
  if method_of_measurement ne "ELISA" and method_of_measurement ne "Nephelometry" and
    method_of_measurement ne "LA" then delete;
  rfm1=0; rfm2=0;
  if method_of_measurement="ELISA" then rfm1=1;
  if method_of_measurement="Nephelometry" then rfm2=1;
run;
/* Ensure that both records for a study are clustered together */
proc sort data=nishimura_RF;
  by study_id dis;
run;
/* Include covariates to allow accuracy, threshold and shape to vary by method */
proc nlmixed data=nishimura_RF ecov cov;
  parms alpha=2 theta=0 beta=0 s2ua=1 s2ut=1 a1=0 a2=0 t1=0 t2=0 b1=0 b2=0;
  logitp=(theta+ut+t1*rfm1+t2*rfm2+(alpha+ua+a1*rfm1+a2*rfm2)*dis)*
    exp(-(beta+b1*rfm1+b2*rfm2)*dis);
  p=exp(logitp)/(1+exp(logitp));
  model pos~binomial(n,p);
  random ut ua~normal([0,0],[s2ut,0,s2ua]) subject=study_id out=randeffs;
  /* Parameter estimates for the methods of RF measurement */
  estimate 'alpha ELISA' alpha+a1;
  estimate 'theta ELISA' theta+t1;
  estimate 'beta ELISA' beta+b1;
  estimate 'alpha Nephelometry' alpha+a2;
  estimate 'theta Nephelometry' theta+t2;
  estimate 'beta Nephelometry' beta+b2;
run;
/* Simplify the model to assume that all three curves have the same shape */
proc nlmixed data=nishimura_RF ecov cov;
  parms alpha=2 theta=0 beta=0 s2ua=1 s2ut=1 a1=0 a2=0 t1=0 t2=0;
  logitp=(theta+ut+t1*rfm1+t2*rfm2+(alpha+ua+a1*rfm1+a2*rfm2)*dis)*
    exp(-(beta)*dis);
  p=exp(logitp)/(1+exp(logitp));
  model pos~binomial(n,p);
  random ut ua~normal([0,0],[s2ut,0,s2ua]) subject=study_id out=randeffs;
  /* Parameter estimates for the methods of RF measurement */
  estimate 'alpha ELISA' alpha+a1;
  estimate 'theta ELISA' theta+t1;
  estimate 'alpha Nephelometry' alpha+a2;
  estimate 'theta Nephelometry' theta+t2;
run;
/* Check assumption of normality for random effects */
proc univariate data=randeffs plot normal;
  class effect;
  var estimate;
run;
/* This model assumes that all three curves have the same shape and position.
The position is the same because there are no covariates included for accuracy.
Comparison with the previous model allows us to test whether accuracy varies by method. */
proc nlmixed data=nishimura_RF ecov cov;
  parms alpha=2 theta=0 beta=0 s2ua=1 s2ut=1 t1=0 t2=0;
  logitp=(theta+ut+t1*rfm1+t2*rfm2+(alpha+ua)*dis)*
    exp(-(beta)*dis);
  p=exp(logitp)/(1+exp(logitp));
  model pos~binomial(n,p);
  random ut ua~normal([0,0],[s2ut,0,s2ua]) subject=study_id out=randeffs;
run;
/* Check assumption of normality for random effects */
proc univariate data=randeffs plot normal;
  class effect;
  var estimate;
run;
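Once alpha, theta and beta have been estimated, the summary ROC curve implied by the HSROC model can be traced directly. With dis coded 0.5 for diseased and -0.5 for non-diseased, the model gives logit(sensitivity) = (theta + alpha/2) * exp(-beta/2) and logit(1 - specificity) = (theta - alpha/2) * exp(beta/2). Eliminating theta yields logit(sensitivity) = logit(1 - specificity) * exp(-beta) + alpha * exp(-beta/2). The sketch below implements these two relations; the parameter values are hypothetical and are not estimates from the RF data, and the function names are illustrative.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def sroc_sensitivity(fpr, alpha, beta):
    """Sensitivity on the summary ROC curve at a given false positive rate."""
    return expit(logit(fpr) * math.exp(-beta) + alpha * math.exp(-beta / 2.0))

def operating_point(alpha, theta, beta):
    """Sensitivity and specificity at the mean threshold theta."""
    sens = expit((theta + alpha / 2.0) * math.exp(-beta / 2.0))
    fpr = expit((theta - alpha / 2.0) * math.exp(beta / 2.0))
    return sens, 1.0 - fpr

# Hypothetical parameter values, for illustration only.
alpha, theta, beta = 2.5, -0.3, 0.2
for fpr in (0.05, 0.10, 0.20, 0.40):
    print(f"FPR={fpr:.2f} sensitivity={sroc_sensitivity(fpr, alpha, beta):.3f}")
sens, spec = operating_point(alpha, theta, beta)
print(f"operating point: sensitivity={sens:.3f} specificity={spec:.3f}")
```

When beta = 0 the curve is symmetric and alpha is the constant log diagnostic odds ratio. For a model with covariates, the method-specific curve is obtained by passing alpha+a1 (or alpha+a2) and, where shape is allowed to vary, beta+b1 (or beta+b2).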
Data and SAS file for Example 3: CT versus MRI for the diagnosis of coronary artery disease
Data (schuetz.csv)
Test Study_ID tp fp fn tn Indirect
CT Achenbach2005 25 4 0 19 1
CT Alkadhi2008 57 12 2 79 1
CT Andreini2007 17 0 0 44 1
CT Bayrak2008 64 4 0 32 1
MRI Bedaux2002 7 1 0 1 1
MRI Bogaert2003 12 3 3 1 1
CT Bonmassari2006 12 2 0 8 1
CT Brodoefel2008 73 5 0 22 1
CT Budoff2008 52 30 3 142 1
CT Cademartiri2007 20 1 0 51 1
CT Carrascosa2007 13 1 1 5 1
MRI Cheng2006 21 0 4 3 1
CT Chow2007 18 0 1 7 1
CT Coles2007 77 13 7 16 1
CT Cornily2007 9 1 0 23 1
CT Davin2007 42 4 12 30 1
CT Deetjen2007 31 3 2 26 1
CT Dewey2006 62 5 4 46 0
MRI Dewey2006 42 2 7 39 0
CT Dewey2009 11 1 0 17 1
CT Ehara2006 59 1 1 6 1
CT Erdogan2006 33 2 3 5 1
CT Garcia2006 58 58 1 70 1
CT Gaudio2008 16 2 1 48 1
MRI Gerber2005 17 1 2 6 1
CT Ghersin2006 29 11 6 13 1
CT Ghostine2006 28 2 1 35 1
CT Gilard2006 11 9 0 35 1
CT Grosse2007 29 0 1 10 1
MRI Hackenbroch2004 18 5 4 13 1
CT Hacker2007 19 1 1 9 1
CT Halon2007 72 10 13 16 1
CT Hausleiter2007 101 35 1 106 1
CT Henneman2006 12 1 1 6 1
CT Henneman2008 28 0 0 12 1
CT Herzog2007a 19 6 0 30 1
CT Herzog2007b 16 1 0 23 1
CT Herzog2008 18 2 0 10 1
CT Hoffmann2004 19 3 2 9 1
CT Hoffmann2005 43 2 2 28 1
MRI Ichikawa2007 11 8 6 33 1
MRI Ikonen2003 42 15 5 7 1
CT Johnson2007 17 2 0 16 1
CT Kaiser2005 97 18 16 18 1
CT Kefer2005 32 6 2 12 0
MRI Kefer2005 30 9 4 9 0
MRI Kim2001 56 25 4 18 1
MRI Klein2008 20 11 2 13 1
CT Kolnes2006 33 8 1 8 1
CT Laissy2007 11 2 2 25 1
CT Langer2009 25 2 1 40 0
MRI Langer2009 18 15 8 27 0
CT Leber2007 20 7 1 60 1
CT Leschka2005 47 0 0 20 1
CT Leschka2008a 69 8 2 35 1
CT Leschka2008b 35 5 1 33 1
CT Maintz2007 15 2 1 2 0
MRI Maintz2007 15 1 1 3 0
CT Manghat2007 3 0 0 12 1
CT Marano2008 179 17 12 119 1
CT Martuscelli2004 43 9 0 9 1
CT Maruyama2008 75 5 2 65 1
MRI McCarthy2007 13 6 2 8 1
CT Meijboom2006 18 4 0 48 1
CT Meijboom2007 88 4 0 12 1
CT Meijboom2008 244 41 2 73 1
CT Miller2007 139 13 24 115 1
CT MirAkbari2009 41 11 10 20 1
CT Mollet2004 106 3 0 18 1
CT Mollet2005b 31 3 0 17 1
CT Mollet2005a 38 1 0 12 1
CT Moon2005 30 2 5 21 1
CT MorganHughes2005 32 1 0 24 1
CT Nikolaou2006 4 3 1 52 1
CT Nikolaou2006b 38 6 1 23 1
CT Olivetti2006 15 0 3 13 1
CT Oncel2007a 62 0 0 18 1
CT Oncel2007b 8 1 1 5 1
CT Pontone2007a 66 7 0 43 1
CT Pontone2007b 56 5 4 31 1
CT Postel2007 42 5 5 34 1
CT Pouleur2008 16 7 1 53 0
MRI Pouleur2008 17 17 0 43 0
CT Pugliese2006 25 1 0 9 1
CT Pundziute2008 53 4 1 42 1
CT Raff2005 38 3 2 27 1
CT Reant2006 12 6 1 21 1
MRI Regenfus2000 34 6 2 8 1
CT Rixe2009 40 6 0 30 1
CT Rodevand2006 49 37 0 15 1
CT Romeo2007 43 2 0 123 1
CT Ropers2003 35 8 6 28 1
CT Ropers2006 25 5 1 50 1
CT Ropers2007 41 11 1 47 1
MRI Sakuma2006 42 6 9 56 1
MRI Sandstede1999 10 1 1 7 1
CT Scheffel2006 14 0 1 15 1
CT Scheffel2007 13 2 0 35 1
CT Scheffel2008 66 4 0 50 1
CT Schuijf2006 29 1 2 28 1
CT Shabestari2007 104 10 4 20 1
CT Stolzmann2008 55 2 0 43 1
CT Tsai2007 50 5 1 22 1
CT Turkvatan2008 116 2 2 33 1
CT Ulimoen2008 32 6 4 6 1
CT Watkins2007 44 3 1 37 1
CT Weustink2007 76 3 1 20 1
MRI Yang2009 32 5 2 23 1

SAS Program (scheutz.sas)
proc import out=schuetz
  datafile='C:\chapter10\schuetz.csv'
  dbms=csv
  replace;
  getnames=yes;
run;
/* Create two separate records for the true results in each study,
the first for the diseased group, and the second for the non-diseased group.
The variable sens is an indicator which takes the value 1 if true = true positives and 0 otherwise,
the variable spec is also an indicator that takes the value 1 if true = true negatives and 0 otherwise.
The variable testtype is an indicator coded 1 for CT and 0 for MRI. */
data schuetz;
  set schuetz;
  testtype=0;
  if test="CT" then testtype=1;
  sens=1; spec=0; true=tp; n=tp+fn; output;
  sens=0; spec=1; true=tn; n=tn+fp; output;
run;
/* Ensure that both records for a study are clustered together */
proc sort data=schuetz;
  by study_id test;
run;
/* Run the bivariate model with no covariates.
The "cov" option requests that a covariance matrix is printed for
all model parameter estimates. The "ecov" option requests a covariance matrix
for all additional estimates that are computed. */
proc nlmixed data=schuetz cov ecov;
  parms msens=2 mspec=1 s2usens=0 s2uspec=0 covsesp=0;
  logitp=(msens+usens)*sens+(mspec+uspec)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random usens uspec~normal([0,0],[s2usens,covsesp,s2uspec]) subject=study_id out=randeffs;
run;
/* Bivariate model with test as a covariate using the indicator variable testtype.
MRI is the reference category.
Variances of the random effects are assumed not to vary by test type. */
proc nlmixed data=schuetz cov ecov;
  parms msens=2 mspec=1 s2usens=0 s2uspec=0 covsesp=0 se_CT=1 sp_CT=0;
  logitp=(msens+usens+se_CT*testtype)*sens+(mspec+uspec+sp_CT*testtype)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random usens uspec~normal([0,0],[s2usens,covsesp,s2uspec]) subject=study_id out=randeffs;
  /* Estimate logit(sensitivity) and logit(specificity) */
  estimate 'logit sens CT' msens+se_CT;
  estimate 'logit spec CT' mspec+sp_CT;
run;
/* Check assumption of normality for the random effects */
proc univariate data=randeffs plot normal;
  class effect;
  var estimate;
run;
/* Bivariate model with effect of test type on only sensitivity */
proc nlmixed data=schuetz cov ecov;
  parms msens=2 mspec=1 s2usens=0 s2uspec=0 covsesp=0 se_CT=1;
  logitp=(msens+usens+se_CT*testtype)*sens+(mspec+uspec)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random usens uspec~normal([0,0],[s2usens,covsesp,s2uspec]) subject=study_id out=randeffs;
run;
/* Bivariate model with effect of test type on only specificity */
proc nlmixed data=schuetz cov ecov;
  parms msens=2 mspec=1 s2usens=0 s2uspec=0 covsesp=0 sp_CT=0;
  logitp=(msens+usens)*sens+(mspec+uspec+sp_CT*testtype)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random usens uspec~normal([0,0],[s2usens,covsesp,s2uspec]) subject=study_id out=randeffs;
run;
/* DIRECT COMPARISONS */
/* Create new dataset of studies with within-study comparison of CT and MRI.
"indirect" is a binary variable in the dataset coded 1 if the study evaluated
only one test (CT or MRI) and 0 if both tests were evaluated in a study */
data schuetz_direct;
  set schuetz;
  where indirect=0;
run;
/* Fit bivariate model without covariate */
proc nlmixed data=schuetz_direct cov ecov qpoints=10;
  parms msens=2 mspec=1 s2usens=0 s2uspec=0 covsesp=0;
  bounds s2usens>=0;
  bounds s2uspec>=0;
  logitp=(msens+usens)*sens+(mspec+uspec)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random usens uspec~normal([0,0],[s2usens,covsesp,s2uspec]) subject=study_id out=randeffs;
run;
/* Fit bivariate model with fixed effects */
proc nlmixed data=schuetz_direct cov ecov qpoints=10;
  parms msens=2 mspec=1;
  logitp=(msens)*sens+(mspec)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
run;
/* Fit bivariate model with random effect for specificity only */
proc nlmixed data=schuetz_direct cov ecov qpoints=10;
  parms msens=2 mspec=1 s2uspec=0;
  logitp=(msens)*sens+(mspec+uspec)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random uspec~normal([0],[s2uspec]) subject=study_id out=randeffs;
run;
/* Fit bivariate model with random effect for sensitivity only */
proc nlmixed data=schuetz_direct cov ecov qpoints=10;
  parms msens=2 mspec=1 s2usens=0;
  logitp=(msens+usens)*sens+(mspec)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random usens~normal([0],[s2usens]) subject=study_id out=randeffs;
run;
/* Fit bivariate model with covariate for test type on both sens and spec.
Random effects only for specificity */
proc nlmixed data=schuetz_direct cov ecov qpoints=10;
  parms msens=2 mspec=1 s2uspec=0 se_CT=0 sp_CT=0;
  bounds s2uspec>=0;
  logitp=(msens+se_CT*testtype)*sens+(mspec+uspec+sp_CT*testtype)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random uspec~normal([0],[s2uspec]) subject=study_id out=randeffs;
  /* Estimate logit(sensitivity) and logit(specificity) */
  estimate 'logit sens CT' msens+se_CT;
  estimate 'logit spec CT' mspec+sp_CT;
run;
/* Fit bivariate model with covariate for test type on specificity.
Random effects only for specificity */
proc nlmixed data=schuetz_direct cov ecov qpoints=10;
  parms msens=2 mspec=1 s2uspec=0 sp_CT=0;
  bounds s2uspec>=0;
  logitp=(msens)*sens+(mspec+uspec+sp_CT*testtype)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random uspec~normal([0],[s2uspec]) subject=study_id out=randeffs;
run;
/* Fit bivariate model with covariate for test type on sensitivity.
Random effects only for specificity */
proc nlmixed data=schuetz_direct cov ecov qpoints=10;
  parms msens=2 mspec=1 s2uspec=0 se_CT=0;
  bounds s2uspec>=0;
  logitp=(msens+se_CT*testtype)*sens+(mspec+uspec)*spec;
  p=exp(logitp)/(1+exp(logitp));
  model true~binomial(n,p);
  random uspec~normal([0],[s2uspec]) subject=study_id out=randeffs;
run;
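The nested models fitted above (with and without the covariate for test type) are typically compared with a likelihood ratio test, using the -2 log likelihood value that proc nlmixed reports for each fit. A minimal sketch follows, assuming the two models differ only in fixed-effect covariate terms so that the statistic can be referred to a chi-square distribution with degrees of freedom equal to the number of covariate parameters removed. Only 1 and 2 degrees of freedom are supported, since closed forms exist for those cases. The -2 log likelihood values in the example are hypothetical, and the function name is illustrative.

```python
import math

def likelihood_ratio_test(neg2ll_reduced, neg2ll_full, df):
    """Compare nested models using their -2 log likelihood values.

    Returns (statistic, p_value). The p-value uses closed forms of the
    chi-square survival function, available here for df = 1 or df = 2 only.
    """
    stat = neg2ll_reduced - neg2ll_full
    if stat < 0:
        raise ValueError("the reduced model cannot fit better than the full model")
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p_value = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch supports df = 1 or df = 2 only")
    return stat, p_value

# Hypothetical -2 log likelihood values for a model without the test-type
# covariate (reduced) and a model with se_CT and sp_CT (full, 2 extra parameters).
stat, p_value = likelihood_ratio_test(250.0, 241.5, df=2)
print(f"chi-square = {stat:.2f} on 2 df, p = {p_value:.4f}")
```

The same comparison applies to the RF example, where dropping a1 and a2 (accuracy) or b1 and b2 (shape) each removes two parameters. Comparisons that remove a variance parameter are not covered by this sketch, because the null value lies on the boundary of the parameter space.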
References
Arends 2008
Arends LR, Hamza TH, van Houwelingen JC, Heijenbrok-Kal MH, Hunink MG, Stijnen T. Bivariate
random effects meta-analysis of ROC curves. Med Decis Making 2008;28:621-638.
Begg 1994a
Begg CB. Publication bias. In: Cooper H, Hedges LV (editors). The Handbook of Research Synthesis.
New York: Russell Sage Foundation, 1994.
Begg 1994b
Begg CB, Mazumdar M. Operating characteristics of a rank correlation test for publication bias.
Biometrics 1994;50:1088-1101.
Chappell 2009
Chappell FM, Raab GM, Wardlaw JM. When are summary ROC curves appropriate for diagnostic
meta-analyses? Stat Med 2009;28:2653-2668.
Chu 2006
Chu H, Cole SR. Bivariate meta-analysis of sensitivity and specificity with sparse data: a generalized
linear mixed model approach. J Clin Epidemiol 2006;59:1331-1332.
Deeks 2001
Deeks JJ. Systematic reviews in health care: Systematic reviews of evaluations of diagnostic and
screening tests. BMJ 2001;323:157-162.
Deeks 2008
Deeks JJ, Higgins JPT, Altman DG. Chapter 9: Analysing data and undertaking meta-analyses. In:
Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions.
Chichester (UK): John Wiley & Sons, 2008.
Deeks 2005
Deeks JJ, Macaskill P, Irwig L. The performance of tests of publication bias and other sample size
effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol
2005;58:882-893.
Dickersin 1990
Dickersin K. The existence of publication bias and risk factors for its occurrence. JAMA
1990;263:1385-1389.
Dukic 2003
Dukic V, Gatsonis C. Meta-analysis of diagnostic test accuracy assessment studies with varying
number of thresholds. Biometrics 2003;59:936-946.
Egger 1997
Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical
test. BMJ 1997;315:629-634.
Hamza 2009
Hamza TH, Arends LR, van Houwelingen HC, Stijnen T. Multivariate random effects meta-analysis of
diagnostic tests with multiple thresholds. BMC Med Res Methodol 2009;9:73.
Harbord 2007
Harbord RM, Deeks JJ, Egger M, Whiting P, Sterne JA. A unification of models for meta-analysis of
diagnostic accuracy studies. Biostatistics 2007;8:239-251.
Harbord 2006
Harbord RM, Egger M, Sterne JA. A modified test for small-study effects in meta-analyses of
controlled trials with binary endpoints. Stat Med 2006;25:3443-3457.
Harbord 2009
Harbord RM, Whiting P. metandi: Meta-analysis of diagnostic accuracy using hierarchical logistic
regression. Stata Journal 2009;9:211-229.
Higgins 2003
Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ
2003;327:557-560.
Ioannidis 1998
Ioannidis JP. Effect of the statistical significance of results on the time to completion and publication
of randomized efficacy trials. JAMA 1998;279:281-286.
Irwig 1995
Irwig L, Macaskill P, Glasziou P, Fahey M. Meta-analytic methods for diagnostic test accuracy. J Clin
Epidemiol 1995;48:119-130.
Leeflang 2008
Leeflang MM, Moons KG, Reitsma JB, Zwinderman AH. Bias in sensitivity and specificity caused by
data-driven selection of optimal cutoff values: mechanisms, magnitude, and solutions. Clin Chem
2008;54:729-737.
Littenberg 1993
Littenberg B, Moses LE. Estimating diagnostic accuracy from multiple conflicting reports: a new
meta-analytic method. Med Decis Making 1993;13:313-321.
Macaskill 2004
Macaskill P. Empirical Bayes estimates generated in a hierarchical summary ROC analysis agreed
closely with those of a full Bayesian analysis. J Clin Epidemiol 2004;57:925-932.
Macaskill 2001
Macaskill P, Walter SD, Irwig L. A comparison of methods to detect publication bias in meta-analysis.
Stat Med 2001;20:641-654.
McCullagh 1980
McCullagh P. Regression models for ordinal data. Journal of the Royal Statistical Society Series B
1980;42:109-142.
Moses 1993
Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a
summary ROC curve: data-analytic approaches and some additional considerations. Stat Med
1993;12:1293-1316.
Nishimura 2007
Nishimura K, Sugiyama D, Kogata Y, Tsuji G, Nakazawa T, Kawano S, Saigo K, Morinobu A, Koshiba M,
Kuntz KM, Kamae I, Kumagai S. Meta-analysis: diagnostic accuracy of anti-cyclic citrullinated peptide
antibody and rheumatoid factor for rheumatoid arthritis. Ann Intern Med 2007;146:797-808.
Pepe 2003
Pepe M. The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford:
Oxford University Press, 2003.
Peters 2006
Peters JL, Sutton AJ, Jones DR, Abrams KR, Rushton L. Comparison of two methods to detect
publication bias in meta-analysis. JAMA 2006;295:676-680.
Reitsma 2005
Reitsma JB, Glas AS, Rutjes AW, Scholten RJ, Bossuyt PM, Zwinderman AH. Bivariate analysis of
sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin
Epidemiol 2005;58:982-990.
Rutter 1995
Rutter CM, Gatsonis CA. Regression methods for meta-analysis of diagnostic test data. Acad Radiol
1995;2 Suppl 1:S48-S56.
Rutter 2001
Rutter CM, Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test
accuracy evaluations. Stat Med 2001;20:2865-2884.
Schuetz 2010
Schuetz GM, Zacharopoulou NM, Schlattmann P, Dewey M. Meta-analysis: noninvasive coronary
angiography using computed tomography versus magnetic resonance imaging. Ann Intern Med
2010;152:167-177.
Schwarzer 2002
Schwarzer G, Antes G, Schumacher M. Inflation of type I error rate in two statistical tests for the
detection of publication bias in meta-analyses with binary outcomes. Stat Med 2002;21:2465-2477.
Takwoingi 2008
Takwoingi Y, Deeks JJ. METADAS: A SAS macro for meta-analysis of diagnostic accuracy studies
(available at http://srdta.cochrane.org/softwaredevelopment). Cochrane Collaboration, 2008.
Tosteson 1988
Tosteson AN, Begg CB. A general regression methodology for ROC curve estimation. Med Decis
Making 1988;8:204-215.
Zhou 2002
Zhou XH, Obuchowski N, McClish D. Statistical Methods in Diagnostic Medicine. Chichester: Wiley,
2002.
Zwinderman 2008
Zwinderman AH, Bossuyt PM. We should not pool diagnostic likelihood ratios in systematic reviews.
Stat Med 2008;27:687-697.