

Letter to the Editor

Reply to Erfan Ayubi and Saeid Safiri’s Letter to the Editor re: R. Jeffrey Karnes, Voleak Choeurng, Ashley E. Ross, et al. Validation of a Genomic Risk Classifier to Predict Prostate Cancer-specific Mortality in Men with Adverse Pathologic Features. Eur Urol. In press. https://doi.org/10.1016/j.eururo.2017.03.036: Methodological Issues
We appreciate the interest in our study and the opportunity
to clarify a seemingly simple but actually complex issue.
The letter from Ayubi and Safiri raises two points. (1) When
the Decipher genomic classifier (GC) as a continuous
variable was added to a model containing the CAPRA-S
score
[1], the increase in area under the receiver operating
characteristic curve (AUC) was relatively modest, from
0.73 to 0.77. (2) The AUCs for CAPRA-S alone and GC added
to CAPRA-S should be tested to determine if the increase in
AUC is statistically significant. Both points can be restated as a single question: how do we know whether adding GC to CAPRA-S improves prediction performance? What many
investigators have observed empirically, and what Austin
and Steyerberg
[2] have demonstrated in a series of
elegant simulations, is that the higher the AUC for a base
prediction model, the smaller is the absolute increase in
AUC (and other measures of prediction performance)
when a new biomarker is added to the base model. They
showed that when the AUC for the base model is 0.70,
increases in AUC are generally <0.05 on addition of a continuous biomarker unless the odds ratio per unit increase is ≥2.0. Standard clinicopathologic factors such
as Gleason score, prostate-specific antigen, and stage are
already strongly predictive of prostate cancer–specific
mortality (PCSM), so it is not surprising that the absolute
increase in AUC was modest.
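For readers who wish to see this phenomenon concretely, the following is a minimal simulation sketch along these lines (our own illustrative code, not taken from the paper or from Austin and Steyerberg; the coefficients, sample size, and variable names are all assumptions):

```python
# Illustrative sketch only: when a base risk score already has an AUC near
# 0.70, adding a continuous biomarker raises the AUC only modestly unless the
# marker's odds ratio per unit (here, per SD) is large.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 50_000
base = rng.normal(size=n)      # stands in for a clinicopathologic score (e.g. CAPRA-S)
marker = rng.normal(size=n)    # stands in for a new continuous biomarker

for or_per_sd in (1.5, 2.0, 3.0):             # assumed odds ratios per SD of the marker
    lin_pred = -2.0 + 0.75 * base + np.log(or_per_sd) * marker
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin_pred)))

    # AUC is rank based, so the raw base score is equivalent to a base-only model
    auc_base = roc_auc_score(y, base)
    X_full = np.column_stack([base, marker])
    full_model = LogisticRegression().fit(X_full, y)
    auc_full = roc_auc_score(y, full_model.predict_proba(X_full)[:, 1])
    print(f"OR per SD {or_per_sd:.1f}: base AUC {auc_base:.3f}, "
          f"base + marker AUC {auc_full:.3f}, gain {auc_full - auc_base:.3f}")
```

In this particular setup the absolute gain in AUC only exceeds roughly 0.05 once the odds ratio per SD reaches about 2.0, mirroring the pattern described above.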
However, Pepe and colleagues
[3] have demonstrated
that if a biomarker is statistically significant when added to
a base model, this is equivalent to demonstrating a
statistically significant increase in prediction performance.
Thus, the statistical significance of the GC when added to
CAPRA-S indicated that GC addition significantly improved
the prediction performance. Pepe et al
[3] also showed that the statistical tests recommended by Ayubi and Safiri for comparing the AUCs of two predictive models are actually
invalid, and are biased in the direction of nonsignificance.
This is because those tests were developed to compare ROC
curves for variables that were directly measured, such as
computed tomography image density. However, risk scores
from prediction models are estimates derived from the
same data they are tested on; without delving into
theoretical aspects, this invalidates the statistical properties
of the tests cited by Ayubi and Safiri
[2].
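To make the distinction concrete, a hedged sketch of the valid approach is given below (logistic regression stands in for the survival model actually used in the study, and the data and variable names are simulated placeholders): the new marker is tested by a likelihood ratio or Wald test of its coefficient in the expanded model, rather than by an AUC-comparison test designed for two independently measured markers.

```python
# Hedged sketch, not the authors' analysis: testing an added marker within a
# nested model via likelihood ratio and Wald tests on simulated data.
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(1)
n = 5000
capra_like = rng.normal(size=n)   # placeholder for a CAPRA-S-type score
gc_like = rng.normal(size=n)      # placeholder for a continuous GC-type score
lin_pred = -2.0 + 0.8 * capra_like + 0.5 * gc_like
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin_pred)))

X_base = sm.add_constant(capra_like)
X_full = sm.add_constant(np.column_stack([capra_like, gc_like]))
fit_base = sm.Logit(y, X_base).fit(disp=0)
fit_full = sm.Logit(y, X_full).fit(disp=0)

# Likelihood ratio test for the added marker (1 degree of freedom)
lr_stat = 2.0 * (fit_full.llf - fit_base.llf)
print(f"LR chi-square = {lr_stat:.1f}, p = {chi2.sf(lr_stat, df=1):.2e}")
# Wald test of the marker's coefficient in the expanded model
print(f"Wald p-value for the added marker = {fit_full.pvalues[2]:.2e}")
```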
Both of the above issues, however, skirt the most
important issue: does addition of a biomarker to established
prognostic factors change clinical decisions in a beneficial way
for individual patients? There are a number of metrics of
model prediction performance. All have strengths and
limitations; no single metric is adequate in fully addressing
the impact on clinical decisions. These issues are clearly
presented in an excellent review of model performance
evaluation (accessible without a biostatistics background)
[4]. An important point demonstrated by Steyerberg et al is that
once a model has been developed, a prediction rule is usually
required to translate that model into a clinical decision,
usually in the form of a binary cutpoint. This often provides the most clinically useful information about the proportion of patients reclassified according to the biomarker
into risk categories more closely aligned with their outcome.
Regarding the GC evaluation in our paper, using the binary
classification of low-intermediate versus high GC to further
stratify within CAPRA-S risk groups revealed a sixfold range of
PCSM rates (2.8% to 18%) within the low-intermediate CAPRA-
S risk group, and a nearly sixfold range within the high-risk
CAPRA-S group (5.%% to 30%; Supplementary Table 2
[1]).
These are large gradations of risk, which, if externally
validated, suggest that the Decipher GC can provide clinically
important actionable risk information for men already
classified according to CAPRA-S.
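A small sketch of this kind of cross-stratification is shown below for illustration only; the data, event rates, and cutpoint are simulated assumptions, not the study cohort (the actual rates are those reported in Supplementary Table 2 of the paper).

```python
# Illustrative sketch only: how a binary GC cutpoint can further stratify
# outcome rates within CAPRA-S risk groups (all values simulated).
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 2000
df = pd.DataFrame({
    "capra_s_group": rng.choice(["low-intermediate", "high"], size=n, p=[0.6, 0.4]),
    "gc_score": rng.uniform(0, 1, size=n),   # continuous GC-like score
})
# Assumed binary cutpoint for illustration, not the study's definition
df["gc_group"] = np.where(df["gc_score"] >= 0.6, "high", "low-intermediate")

# Simulated 10-yr PCSM indicator whose risk rises with both classifiers
base_risk = np.where(df["capra_s_group"] == "high", 0.10, 0.04)
risk = base_risk + np.where(df["gc_group"] == "high", 0.15, 0.0)
df["pcsm_10yr"] = rng.binomial(1, risk)

# Event rate within each CAPRA-S x GC cell
summary = (df.groupby(["capra_s_group", "gc_group"])["pcsm_10yr"]
             .agg(["size", "mean"])
             .rename(columns={"size": "n", "mean": "pcsm_rate"}))
print(summary.round(3))
```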
Conflicts of interest:
Bruce J. Trock has been a consultant to GenomeDx
Biosciences, and has received research grant support from Myriad
Genetics, Inc. R. Jeffrey Karnes has nothing to disclose.
References
[1] Karnes RJ, Choeurng V, Ross AE, et al. Validation of a genomic risk classifier to predict prostate cancer-specific mortality in men with adverse pathologic features. Eur Urol. In press. https://doi.org/10.1016/j.eururo.2017.03.036.