Saturday, 2 April 2016

Pitfalls in prediction modelling for normal tissue toxicity in radiation therapy: an illustration with the ISE and MARIE cohorts


Publication date: Available online 1 April 2016
Source: International Journal of Radiation Oncology*Biology*Physics
Author(s): Chamberlain Mbah, Hubert Thierens, Olivier Thas, Jan De Neve, Jenny Chang-Claude, Petra Seibold, Akke Botma, Catharine West, Kim De Ruyck
Purpose

To identify the main causes underlying the failure of prediction models for radiotherapy toxicity to replicate.

Methods

Data were used from two German cohorts ISE (n=418) and MARIE (n=409) of breast cancer patients with similar characteristics and radiotherapy treatments. The toxicity endpoint chosen was telangiectasia. The LASSO (Least Absolute Shrinkage and Selection Operator) logistic regression method was used to build a predictive model for a dichotomised endpoint (RTOG/EORTC score 0,1 or ≥2). Internal AUCs (inAUCs) were calculated by a naive approach where the training data (ISE) was also used for calculating the AUC. Cross validation was also applied to calculate the AUC within the same cohort, a second type of inAUC. InAUCs from cross validation were calculated within ISE and MARIE separately. Models trained on one dataset (ISE) were applied to a test dataset (MARIE) and AUCs calculated (exAUCs).

Results

InAUCs from the naive approach were generally larger than inAUCs from cross validation due to overfitting the training data. InAUCs from cross validation were also generally larger than the exAUCs, reflecting heterogeneity in the predictors between cohorts. The best models with largest inAUCs from cross validation within both cohorts had a number of common predictors: hypertension, normalised total boost and presence of estrogen receptors. Surprisingly the effect (coefficient in the prediction model) of hypertension on telangiectasia incidence was positive in ISE and negative in MARIE. Other predictors were also not common between the two cohorts illustrating that, overcoming overfitting does not solve the problem of replication failure of prediction models completely.

Conclusions

Overfitting and cohort heterogeneity are the two main causes of replication failure of prediction models across cohorts. Cross validation and similar techniques (e.g. bootstrapping) cope with overfitting but the development of validated predictive models for radiotherapy toxicity requires strategies that deal with cohort heterogeneity.
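
The three AUC estimates described in the abstract (naive inAUC, cross-validated inAUC, and exAUC) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and none of it comes from the paper: the data are synthetic, the cohort names `cohort_a` and `cohort_b`, the effect sizes, the penalty `lam` and the step size are hypothetical, and the L1-penalised logistic regression is fitted with a simple proximal-gradient loop as a stand-in for whatever LASSO software the authors used. The sign of the first predictor is deliberately flipped between the two synthetic cohorts to mimic the kind of heterogeneity reported for hypertension.

```python
import math
import random


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def predict(model, x):
    """Predicted probability of the event for one patient."""
    b, w = model
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_lasso_logistic(X, y, lam=0.05, step=0.5, iters=150):
    """L1-penalised logistic regression via proximal gradient descent.

    lam, step and iters are illustrative defaults, not tuned values.
    The intercept b is left unpenalised.
    """
    n, p = len(X), len(X[0])
    b, w = 0.0, [0.0] * p
    for _ in range(iters):
        resid = [predict((b, w), xi) - yi for xi, yi in zip(X, y)]
        grad = [sum(r * xi[j] for r, xi in zip(resid, X)) / n for j in range(p)]
        b -= step * sum(resid) / n
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad)]
    return b, w


def cv_auc(X, y, k=5):
    """AUC of pooled out-of-fold predictions from k-fold cross validation."""
    n = len(X)
    scores = [0.0] * n
    for fold in range(k):
        test_idx = list(range(fold, n, k))
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        model = fit_lasso_logistic([X[i] for i in train_idx],
                                   [y[i] for i in train_idx])
        for i in test_idx:
            scores[i] = predict(model, X[i])
    return auc(y, scores)


def simulate_cohort(n, true_w, rng):
    """Synthetic cohort: standard-normal predictors, logistic outcome."""
    X = [[rng.gauss(0, 1) for _ in true_w] for _ in range(n)]
    y = [int(rng.random() < predict((-1.0, true_w), x)) for x in X]
    return X, y


rng = random.Random(0)
p = 10
# Hypothetical effects; the first coefficient changes sign between cohorts.
w_a = [1.0, 0.8, -0.6] + [0.0] * (p - 3)
w_b = [-1.0, 0.8, -0.6] + [0.0] * (p - 3)
X_a, y_a = simulate_cohort(120, w_a, rng)  # plays the role of the training cohort
X_b, y_b = simulate_cohort(120, w_b, rng)  # plays the role of the external cohort

model_a = fit_lasso_logistic(X_a, y_a)
naive_in_auc = auc(y_a, [predict(model_a, x) for x in X_a])  # same data for fit and AUC
cv_in_auc = cv_auc(X_a, y_a)                                 # held-out folds within cohort_a
ex_auc = auc(y_b, [predict(model_a, x) for x in X_b])        # model from cohort_a on cohort_b

print(f"naive inAUC: {naive_in_auc:.3f}")
print(f"CV inAUC:    {cv_in_auc:.3f}")
print(f"exAUC:       {ex_auc:.3f}")
```

The naive estimate reuses the training data, the cross-validated estimate only scores patients the model has not seen within the same cohort, and the external estimate scores a different cohort altogether. The exact numbers depend on the random seed and on the made-up effect sizes, so they should not be read as a reproduction of the study's results.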

Teaser

Overfitting and cohort heterogeneity are identified as the two main causes of replication failure of prediction models across cohorts. Cross validation and other resampling techniques (e.g. bootstrapping) can reduce the effect of overfitting, but the development of validated predictive models for radiotherapy toxicity requires strategies that deal with cohort heterogeneity.


from Cancer via ola Kala on Inoreader http://ift.tt/1RUebOu
