

Letter to the Editor

Reply to Erfan Ayubi and Saeid Safiri’s Letter to the Editor re: R. Jeffrey Karnes, Voleak Choeurng, Ashley E. Ross, et al. Validation of a Genomic Risk Classifier to Predict Prostate Cancer-specific Mortality in Men with Adverse Pathologic Features. Eur Urol. In press. https://doi.org/10.1016/j.eururo.2017.03.036: Methodological Issues
We appreciate the interest in our study and the opportunity
to clarify a seemingly simple but actually complex issue.
The letter from Ayubi and Safiri raises two points. (1) When
the Decipher genomic classifier (GC) as a continuous
variable was added to a model containing the CAPRA-S
score
[1], the increase in area under the receiver operating
characteristic curve (AUC) was relatively modest, from
0.73 to 0.77. (2) The AUCs for CAPRA-S alone and GC added
to CAPRA-S should be tested to determine if the increase in
AUC is statistically significant. Both points can be restated as a single question: how do we know whether adding GC to CAPRA-S improves prediction performance? What many
investigators have observed empirically, and what Austin
and Steyerberg
[2] have demonstrated in a series of
elegant simulations, is that the higher the AUC for a base
prediction model, the smaller is the absolute increase in
AUC (and other measures of prediction performance)
when a new biomarker is added to the base model. They
showed that when the AUC for the base model is 0.70,
increases in AUC are generally <0.05 on addition of a continuous biomarker unless the odds ratio per unit increase is ≥2.0. Standard clinicopathologic factors such
as Gleason score, prostate-specific antigen, and stage are
already strongly predictive of prostate cancer–specific
mortality (PCSM), so it is not surprising that the absolute
increase in AUC was modest.
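For readers who wish to see this phenomenon concretely, the following is a minimal simulation sketch along these lines (our own illustrative code, not taken from the paper or from Austin and Steyerberg; the coefficients, sample size, and variable names are all assumptions):

```python
# Illustrative sketch only: when a base risk score already has an AUC near
# 0.70, adding a continuous biomarker raises the AUC only modestly unless the
# marker's odds ratio per unit (here, per SD) is large.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 50_000
base = rng.normal(size=n)      # stands in for a clinicopathologic score (e.g. CAPRA-S)
marker = rng.normal(size=n)    # stands in for a new continuous biomarker

for or_per_sd in (1.5, 2.0, 3.0):             # assumed odds ratios per SD of the marker
    lin_pred = -2.0 + 0.75 * base + np.log(or_per_sd) * marker
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin_pred)))

    # AUC is rank based, so the raw base score is equivalent to a base-only model
    auc_base = roc_auc_score(y, base)
    X_full = np.column_stack([base, marker])
    full_model = LogisticRegression().fit(X_full, y)
    auc_full = roc_auc_score(y, full_model.predict_proba(X_full)[:, 1])
    print(f"OR per SD {or_per_sd:.1f}: base AUC {auc_base:.3f}, "
          f"base + marker AUC {auc_full:.3f}, gain {auc_full - auc_base:.3f}")
```

In this particular setup the absolute gain in AUC only exceeds roughly 0.05 once the odds ratio per SD reaches about 2.0, mirroring the pattern described above.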
However, Pepe and colleagues
[3] have demonstrated
that if a biomarker is statistically significant when added to
a base model, this is equivalent to demonstrating a
statistically significant increase in prediction performance.
Thus, the statistical significance of the GC when added to
CAPRA-S indicated that GC addition significantly improved
the prediction performance. Pepe et al
[3] also showed that the statistical tests recommended by Ayubi and Safiri for comparing the AUCs of two predictive models are actually
invalid, and are biased in the direction of nonsignificance.
This is because those tests were developed to compare ROC
curves for variables that were directly measured, such as
computed tomography image density. However, risk scores
from prediction models are estimates derived from the
same data they are tested on; without delving into
theoretical aspects, this invalidates the statistical properties
of the tests cited by Ayubi and Safiri
[2].
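To make the distinction concrete, a hedged sketch of the valid approach is given below (logistic regression stands in for the survival model actually used in the study, and the data and variable names are simulated placeholders): the new marker is tested by a likelihood ratio or Wald test of its coefficient in the expanded model, rather than by an AUC-comparison test designed for two independently measured markers.

```python
# Hedged sketch, not the authors' analysis: testing an added marker within a
# nested model via likelihood ratio and Wald tests on simulated data.
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(1)
n = 5000
capra_like = rng.normal(size=n)   # placeholder for a CAPRA-S-type score
gc_like = rng.normal(size=n)      # placeholder for a continuous GC-type score
lin_pred = -2.0 + 0.8 * capra_like + 0.5 * gc_like
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin_pred)))

X_base = sm.add_constant(capra_like)
X_full = sm.add_constant(np.column_stack([capra_like, gc_like]))
fit_base = sm.Logit(y, X_base).fit(disp=0)
fit_full = sm.Logit(y, X_full).fit(disp=0)

# Likelihood ratio test for the added marker (1 degree of freedom)
lr_stat = 2.0 * (fit_full.llf - fit_base.llf)
print(f"LR chi-square = {lr_stat:.1f}, p = {chi2.sf(lr_stat, df=1):.2e}")
# Wald test of the marker's coefficient in the expanded model
print(f"Wald p-value for the added marker = {fit_full.pvalues[2]:.2e}")
```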
Both of the above issues, however, skirt the most
important issue: does addition of a biomarker to established
prognostic factors change clinical decisions in a beneficial way
for individual patients? There are a number of metrics of
model prediction performance. All have strengths and
limitations; no single metric is adequate in fully addressing
the impact on clinical decisions. These issues are clearly
presented in an excellent review of model performance
evaluation (accessible without a biostatistics background)
[4]. An important point demonstrated by Steyerberg et al is that
once a model has been developed, a prediction rule is usually
required to translate that model into a clinical decision,
usually in the form of a binary cutpoint. This often provides the most clinically useful information about the proportion of patients reclassified according to the biomarker
into risk categories more closely aligned with their outcome.
Regarding the GC evaluation in our paper, using the binary
classification of low-intermediate versus high GC to further
stratify within CAPRA-S risk groups revealed a sixfold range of
PCSM rates (2.8% to 18%) within the low-intermediate CAPRA-
S risk group, and a nearly sixfold range within the high-risk
CAPRA-S group (5.%% to 30%; Supplementary Table 2
[1]).
These are large gradations of risk, which, if externally
validated, suggest that the Decipher GC can provide clinically
important actionable risk information for men already
classified according to CAPRA-S.
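A small sketch of this kind of cross-stratification is shown below for illustration only; the data, event rates, and cutpoint are simulated assumptions, not the study cohort (the actual rates are those reported in Supplementary Table 2 of the paper).

```python
# Illustrative sketch only: how a binary GC cutpoint can further stratify
# outcome rates within CAPRA-S risk groups (all values simulated).
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 2000
df = pd.DataFrame({
    "capra_s_group": rng.choice(["low-intermediate", "high"], size=n, p=[0.6, 0.4]),
    "gc_score": rng.uniform(0, 1, size=n),   # continuous GC-like score
})
# Assumed binary cutpoint for illustration, not the study's definition
df["gc_group"] = np.where(df["gc_score"] >= 0.6, "high", "low-intermediate")

# Simulated 10-yr PCSM indicator whose risk rises with both classifiers
base_risk = np.where(df["capra_s_group"] == "high", 0.10, 0.04)
risk = base_risk + np.where(df["gc_group"] == "high", 0.15, 0.0)
df["pcsm_10yr"] = rng.binomial(1, risk)

# Event rate within each CAPRA-S x GC cell
summary = (df.groupby(["capra_s_group", "gc_group"])["pcsm_10yr"]
             .agg(["size", "mean"])
             .rename(columns={"size": "n", "mean": "pcsm_rate"}))
print(summary.round(3))
```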
Conflicts of interest:
Bruce J. Trock has been a consultant to GenomeDx
Biosciences, and has received research grant support from Myriad
Genetics, Inc. R. Jeffrey Karnes has nothing to disclose.
References
[1] Karnes RJ, Choeurng V, Ross AE, et al. Validation of a genomic risk classifier to predict prostate cancer-specific mortality in men with adverse pathologic features. Eur Urol. In press. https://doi.org/10.1016/j.eururo.2017.03.036.