Due to the established role of the human papillomavirus (HPV), the optimal treatment for oropharyngeal carcinoma is currently under debate. We evaluated the most important determinants of treatment outcome to develop a multifactorial predictive model that could provide individualized predictions of treatment outcome in oropharyngeal carcinoma patients.
We analyzed the association between clinico-pathological factors and overall and progression-free survival in 168 patients with oropharyngeal squamous cell carcinoma (OPSCC) treated with curative radiotherapy or concurrent chemo-radiation. A multivariate model was validated in an external dataset of 189 patients and compared to the TNM staging system. The resulting nomogram will be made publicly available at www.predictcancer.org.
Predictors of unfavorable outcomes were negative HPV status, moderate to severe comorbidity, T3–T4 classification, N2b–N3 stage, male gender, lower hemoglobin levels and smoking history of more than 30 pack years. Prediction of overall survival using the multi-parameter model yielded a C-index of 0.82 (95% CI, 0.76–0.88). Validation in an independent dataset yielded a C-index of 0.73 (95% CI, 0.66–0.79). For progression-free survival, the model’s C-index was 0.80 (95% CI, 0.76–0.88), with a validation C-index of 0.67 (95% CI, 0.59–0.74). Stratification of model-estimated probabilities showed statistically different prognosis groups in both datasets (p < 0.001).
This nomogram was superior to TNM classification or HPV status alone in an independent validation dataset for prediction of overall and progression-free survival in OPSCC patients, assigning patients to distinct prognosis groups. These individualized predictions could be used to stratify patients for treatment de-escalation trials.
Several studies have demonstrated that HPV-associated oropharyngeal cancer has distinctly better survival after treatment than HPV-negative tumors. However, the prognosis of HPV-positive oropharyngeal cancer seems to be significantly worse if there is a history of smoking. This accumulated evidence suggests that tailored OPSCC therapies, in which specific information about HPV status and other patient characteristics is taken into account, need to be designed as a step forward from current population-based therapies.
A tool that combines these factors to accurately anticipate a patient’s outcome is needed. An analysis of the RTOG 0129 study proposed a stratification algorithm, combining HPV status, T-stage, N-stage and smoking history, to assign patients to different prognostic groups. This single-cohort algorithm, although able to discriminate patients according to their risk of failure, was based on patients treated within a randomized trial with strict inclusion criteria, including mainly patients with T3–T4 tumors and limited comorbidity.
A recent approach called rapid learning, which aims to drive knowledge discovery by routinely and iteratively learning from data generated through patient care, offers an alternative to evidence-based clinical trials for knowledge extraction.
Clinical and outcome information from unselected patients treated with different modalities, and with greater heterogeneity in stage, demographics and comorbidity, can be analyzed to generate evidence representative of consecutive patients in daily clinical practice, particularly elderly patients and those with severe comorbidity, who are frequently excluded from clinical trials.
In this study we evaluated the most important prognostic factors in OPSCC patients, treated with (chemo) radiation, such as HPV and smoking history, in combination with other patient and tumor characteristics to develop a robust nomogram that could provide individualized predictions of treatment outcome. The proposed predictive nomogram was externally validated in an independent cohort of consecutive OPSCC patients. As this knowledge was extracted from patients in routine clinical care, it can subsequently be implemented in clinical practice: it will improve the information given to patients regarding their prognosis, and could allow eligibility for treatment de-escalation trials.
Materials and methods
All consecutive patients with OPSCC (stages I–IVb) treated at Maastro Clinic between January 2000 and October 2011 were considered. A total of 168 patients treated with curative intent (definitive radiotherapy or concurrent chemo-radiation) were included. This analysis was approved by the Institutional Review Board (No. 11-29-14/09-intern-6430; NCT01985984).
Treatment options were either definitive radiotherapy alone or concurrent chemo-radiation with high-dose cisplatin every 3 weeks. Patients treated with definitive radiotherapy received a continuous course of radiotherapy delivered by a 4–6 MV linear accelerator. Two fractionation schedules were used. Patients with early oropharyngeal cancers (stage I–II) were treated with Accelerated Fractionated RadioTherapy (AFRT) to 68 Gy in 34 fractions over 37–38 days: the first 23 fractions were given as 2 Gy daily and the last 11 fractions as 2 Gy twice daily. Patients in moderate general condition who were deemed unfit for AFRT received standard fractionated radiotherapy to 70 Gy in 35 fractions over 7 weeks.
To determine HPV status, formalin-fixed, paraffin-embedded (FFPE) biopsy material of histopathologically confirmed OPSCC was retrieved from the archives of the Department of Pathology, University Hospital Maastricht, The Netherlands. FFPE material had been classified by histopathology and analyzed by p16INK4A immunostaining and, for the presence of oncogenic HPV16 DNA, by PCR in 168 available specimens. A tumor was considered HPV positive if the HPV16 DNA PCR result was positive.
The factors evaluated for their prognostic potential were HPV status, smoking and alcohol history, patient comorbidity, pre-treatment hemoglobin levels, gender, age, tumor location and TNM classification. All patient and treatment characteristics were collected from medical records. Patient comorbidity was scored using the Adult Comorbidity Evaluation 27 (ACE-27).
N-stage was subdivided into two categories, N0–N2a versus N2b–N3, since these categories have different clinical implications. Missing values were imputed using the predictive mean matching algorithm.
Study endpoints were progression-free survival and overall survival, calculated from the start of radiotherapy. An event for progression-free survival was defined as death or the first documented recurrence (either local–regional disease or distant metastases) after treatment. For overall survival, data were considered right-censored if patients were still alive at the time of last follow-up. For progression-free survival, data were considered right-censored if patients had not developed a local–regional recurrence or distant metastases and were alive at the time of last follow-up.
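As an illustration of the endpoint definitions above, the following minimal Python sketch derives the time-to-event and the censoring indicator for both endpoints. The record layout (`PatientRecord` and its field names) and the month conversion are assumptions made for this example; they are not taken from the study database.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

DAYS_PER_MONTH = 365.25 / 12  # assumption: average month length


@dataclass
class PatientRecord:
    """Hypothetical record layout; field names are illustrative only."""
    rt_start: date                            # start of radiotherapy
    last_follow_up: date                      # last date known alive
    death: Optional[date] = None              # date of death, if any
    first_recurrence: Optional[date] = None   # local-regional or distant


def _months(start: date, end: date) -> float:
    return (end - start).days / DAYS_PER_MONTH


def overall_survival(p: PatientRecord) -> Tuple[float, bool]:
    """Return (months, event_observed); patients still alive are censored."""
    if p.death is not None:
        return _months(p.rt_start, p.death), True
    return _months(p.rt_start, p.last_follow_up), False


def progression_free_survival(p: PatientRecord) -> Tuple[float, bool]:
    """Event is death or first recurrence, whichever occurs first."""
    event_dates = [d for d in (p.death, p.first_recurrence) if d is not None]
    if event_dates:
        return _months(p.rt_start, min(event_dates)), True
    # No recurrence and alive: right-censored at last follow-up.
    return _months(p.rt_start, p.last_follow_up), False
```

A patient who recurs but is alive at last follow-up therefore contributes an event to progression-free survival and a censored observation to overall survival.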
The χ²-test was used for comparisons of categorical variables. For univariate survival analysis, the Kaplan–Meier method was used. Groups were compared using the log-rank test.
A multivariate Cox proportional hazards regression analysis was performed to establish factors independently contributing to treatment outcome. Two-sided p-values of <0.05 were considered statistically significant. A multivariate model combining the most important predictors was converted into a visual nomogram and validated in an external cohort of patients from the VU University Medical Center, Amsterdam, The Netherlands. Model performance was evaluated using the C-index. The maximum value of the C-index is 1.0, indicating a perfect prediction model; a value of 0.5 indicates predictions no better than chance. Bootstrapping was used to obtain model prediction confidence intervals. Using this model, the Maastro and external validation cohorts were split into three subgroups according to the 33rd and 66th percentiles of the risk score. The nomogram will be publicly available on the website www.predictcancer.org after publication. Raw data of the training dataset are available at https://www.cancerdata.org/10.1016/j.radonc.2014.09.005. Analyses were performed using SPSS 19.0 (SPSS Inc., Chicago) and Matlab 7.11.0 (The MathWorks Inc., Natick, MA).
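The C-index and a bootstrap confidence interval for it can be sketched as follows. This is a minimal illustration in Python, assuming Harrell's definition of concordance and a percentile bootstrap; the analyses reported here were run in SPSS and Matlab, and the function names `c_index` and `bootstrap_ci` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def c_index(times: Sequence[float], events: Sequence[bool],
            risks: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter time than patient j. The pair is concordant when the
    model gave patient i the higher risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_ci(times: Sequence[float], events: Sequence[bool],
                 risks: Sequence[float], n_boot: int = 100,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for the C-index (assumed method)."""
    rng = random.Random(seed)
    n = len(times)
    stats: List[float] = []
    for _ in range(n_boot):
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(c_index([times[k] for k in idx],
                                 [events[k] for k in idx],
                                 [risks[k] for k in idx]))
        except ValueError:
            continue  # skip resamples without comparable pairs
    if not stats:
        raise ValueError("no usable bootstrap resamples")
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper
```

With this definition, a model that orders every comparable pair correctly scores 1.0, and a model that assigns the same risk to all patients scores 0.5.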
Patient characteristics of the validation cohort are shown in Supplementary Table 1. It consisted of a consecutive series of 189 OPSCC patients curatively treated at the VU University Medical Center, Amsterdam, The Netherlands, between January 2000 and December 2006. Treatment options included definitive radiotherapy alone and chemo-radiation. The definitive radiotherapy regimen consisted of standard fractionated radiotherapy to 70 Gy in fractions of 2 Gy over 7 weeks. The concomitant chemo-radiation scheme included daily fractionation of 2 Gy up to 70 Gy with concomitant intravenous administration of cisplatin at a dose of 100 mg/m² at three-week intervals.
Patient, tumor and treatment characteristics are shown in Table 1. The majority of the patients were male (74.4%) and the median age at the start of therapy was 59 years (range: 43–83 years). The median follow-up of all patients was 26 months (range: 2.5–127.2) and it was 37.5 months (range: 6.4–127.2) for patients alive at last follow-up. At the time of last follow-up, 60.1% of patients were alive and 39.9% had died. Progression-free survival was 47% at 5 years with a total of 76 (45%) events. A total of 29 (17.3%) local–regional recurrences were observed.
|Characteristic|Frequency (%) or median (range)|Log-rank p, overall survival|Log-rank p, progression-free survival|
|Age (years)|59.5 (43–83)|0.498|0.282|
|Primary tumor sub-location||0.052|0.140|
|Base of tongue|29.8|||
|Smoking pack years|30 (0–100)|||
|Split by median (>30)||0.025|0.026|
|Split by percentiles||0.078|0.083|
|Alcohol unit years|134 (0–660)|||
|Split by median (>134)||0.042|0.004|
|Split by percentiles||0.047|0.005|
|Comorbidity score (ACE-27)||<0.001|<0.001|
|N0–N2a vs N2b–N3||0.021|0.053|
|RT dose (Gy)|68 (60–70)|0.888|0.865|
|Pre-RT hemoglobin levels (mmol/L)|8.5 (5.1–11.3)|0.006|0.004|
Immunostaining for p16 was positive in 58 cases (34.5%) and missing in 1.2% of the cases. After HPV DNA testing, a total of 51 cases (30.4%) were considered HPV positive. Given the importance of HPV in OPSCC patients, we evaluated the association between HPV status and other patient and tumor characteristics. Overall survival was significantly better for patients with an HPV-positive OPSCC (CI, 83.66–120.21 months) than for patients with an HPV-negative OPSCC (CI, 48.6–68.2 months; p < 0.0001). The 5-year overall survival rates were 82% in the HPV-positive group and 39% in the HPV-negative group. For progression-free survival, the rates were 83% and 35% for the HPV-positive and HPV-negative groups, respectively (p < 0.001).
HPV status was positive in 32.5% and 29.6% of female and male patients, respectively. Patients with HPV-positive status were more likely to have none to moderate comorbidity (72.5% of HPV-positive cases, p = NS; ACE-27 score 0–1); these patients also showed a clear tendency toward moderate smoking and alcohol consumption compared to HPV-negative patients (p < 0.001). No significant differences in HPV status were observed according to nodal status, tumor stage or age. There was a higher incidence of HPV-positive tumors in the tonsils and base of tongue than in the other oropharyngeal sub-locations (p = 0.001). Poorly differentiated tumors had a significantly higher incidence of HPV positivity than well-differentiated tumors (p < 0.006).
Univariate analysis was performed to evaluate the prognostic significance of the tumor and patient characteristics shown in Table 1. The variables that were associated with shorter overall survival were male gender (p = 0.004), pack years of smoking higher than the median value (median = 30 pack years; p = 0.025), unit years of alcohol consumption higher than the median value (median = 134 unit years; p = 0.042), higher ACE-27 comorbidity index (p < 0.0001), higher T-stage (p < 0.0001), N2b–N3 stage (p = 0.021), negative HPV status (p < 0.0001) and pre-radiotherapy hemoglobin levels lower than the median value (median = 8.5 mmol/L; p < 0.006). Differentiation grade did not show significant differences in overall survival (p = 0.654). Tumors located in the posterior oropharyngeal wall showed a trend toward worse survival compared to other tumor sub-locations (p = 0.052).
Treatment parameters such as radiotherapy delivered dose and overall treatment time did not show a correlation with overall survival (p > 0.05). Likewise, no significant differences in overall survival were observed based on treatment type (radiation only vs. chemoradiation, p = 0.29).
Gender, pack years of smoking, unit years of alcohol consumption, comorbidity, T-stage, HPV status and pre-radiotherapy hemoglobin levels were individually associated with progression-free survival (Table 1).
Some prognostic factors in the univariate analysis (Table 1) were no longer significant in the multivariate Cox regression analysis. For overall survival, the factors that remained as independent contributors to unfavorable treatment outcome were male gender, low pre-treatment hemoglobin levels (<median), higher T-stage, N2b–N3 stage, negative HPV status and high comorbidity (moderate to severe). Multivariate hazard ratios, confidence intervals and significance levels are shown in Table 2.
|Variable|Overall survival: hazard ratio|Overall survival: confidence interval|Overall survival: p-value|Progression-free survival: hazard ratio|Progression-free survival: confidence interval|Progression-free survival: p-value|
|Pre-RT hemoglobin levels|0.693|0.531–0.903|0.007|0.802|0.633–1.016|0.067|
|Pack years of smoking|1.006|0.992–1.021|0.792|1.006|0.993–1.020|0.368|
|Unit years of alcohol consumption|1.001|0.999–1.002|0.554|1.002|1.000–1.003|0.057|
For progression-free survival, male gender, high comorbidity, higher T-stage, N2b–N3 stage and negative HPV status remained significant independent prognostic factors. No other parameter showed a significant correlation with progression-free survival (Table 2).
Prediction of overall survival yielded a C-index of 0.82 (95% CI, 0.76–0.88) based on the Maastro Clinic dataset. In the independent external validation dataset, the C-index was 0.73 (95% CI, 0.66–0.79). The resulting nomogram, shown in Fig. 1, estimates outcome probabilities by assigning a score to each predictor value. The sum of these scores corresponds to an outcome event probability. The most important factor in the nomogram to estimate overall survival is HPV status. Kaplan–Meier curves of the model estimates for the development and validation cohorts are shown in Fig. 2a. This stratification showed significant differences in outcomes for the three proposed risk groups, in both datasets (p < 0.001). For progression-free survival, the model’s C-index was 0.80 (95% CI, 0.76–0.88), with a validation C-index in the external dataset of 0.67 (95% CI, 0.59–0.74). Again, the predictive nomogram was able to estimate individual progression-free survival rates and assign patients to clearly distinct risk groups in the validation cohort (p < 0.001; Fig. 2b). A comparison of the multivariate model performance with TNM staging, HPV alone and Ang’s model is shown in Table 3. Median survival rates for the distinct risk groups are summarized in Supplementary Table 2.
C-index confidence intervals were obtained in a bootstrap procedure (n = 100).
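The way a nomogram turns predictor values into points, and the way a risk score is split into three groups, can be sketched in Python as follows. This is an illustration only: the hazard ratios in `HAZARD_RATIOS` are hypothetical placeholders and not the fitted coefficients of the model, only binary predictors are shown, the 100-point scaling is one common convention, and the percentile rule in `risk_groups` is a simple assumed choice.

```python
import math
from typing import Dict, List

# HYPOTHETICAL hazard ratios for illustration; NOT the fitted model.
HAZARD_RATIOS: Dict[str, float] = {
    "hpv_negative": 3.0,
    "t3_t4": 2.0,
    "n2b_n3": 1.5,
}


def linear_predictor(patient: Dict[str, float]) -> float:
    """Cox linear predictor: sum of log hazard ratio times predictor value."""
    return sum(math.log(hr) * patient.get(name, 0.0)
               for name, hr in HAZARD_RATIOS.items())


def nomogram_points(patient: Dict[str, float]) -> Dict[str, float]:
    """Points per predictor, scaled so the strongest predictor scores 100."""
    max_beta = max(abs(math.log(hr)) for hr in HAZARD_RATIOS.values())
    return {name: 100.0 * math.log(hr) * patient.get(name, 0.0) / max_beta
            for name, hr in HAZARD_RATIOS.items()}


def survival_probability(lp: float, baseline_survival: float) -> float:
    """Cox model relation: S(t | x) = S0(t) ** exp(lp)."""
    return baseline_survival ** math.exp(lp)


def risk_groups(scores: List[float]) -> List[str]:
    """Label each score using 33rd and 66th percentile cut-offs."""
    ordered = sorted(scores)
    n = len(ordered)
    cut_low = ordered[int(0.33 * (n - 1))]
    cut_high = ordered[int(0.66 * (n - 1))]
    labels = []
    for s in scores:
        if s <= cut_low:
            labels.append("low")
        elif s <= cut_high:
            labels.append("intermediate")
        else:
            labels.append("high")
    return labels
```

Summing the points of a patient gives a total that is proportional to the linear predictor, which is why a single total-points axis on a nomogram can be mapped to an outcome probability. A continuous predictor such as hemoglobin would additionally need a reference value, which is omitted here.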
We evaluated the prognostic significance of HPV and other factors of clinical interest, in a large cohort of consecutive OPSCC patients, to develop a multifactorial predictive model that can provide individual estimations of treatment outcome in this patient population.
Combining the most important prognostic factors in a multivariate model, including HPV status, comorbidity score, T-stage, N-stage, pack years of smoking, gender and pre-treatment hemoglobin levels, yielded high predictive performance, as shown by the C-index of 0.82 (95% CI, 0.76–0.88) for overall survival and of 0.80 (95% CI, 0.76–0.88) for progression-free survival. This model was validated in an external unselected cohort of OPSCC patients (n = 189), showing reliable validation performance for overall survival (0.73; 95% CI, 0.66–0.79) and progression-free survival (0.67; 95% CI, 0.59–0.74). Model predictions were significantly better than using TNM or HPV alone. This multivariate model also showed higher C-indices than the published model based on the RTOG 0129 study, which includes HPV, T-classification, N-classification and smoking history.
This prognostic model for OPSCC patients has been validated in an independent dataset by directly applying the model weights to the raw validation data. Previously published models have been evaluated in a single development cohort, although Ang’s model has recently been evaluated by two groups. Our model was able to stratify patients according to their estimated risk of failure into distinct risk groups, in both cohorts, for overall and progression-free survival. However, performance was lower for the prediction of progression-free survival in the validation cohort.
We followed the so-called rapid-learning approach, in which knowledge is derived from unselected patient databases rather than from clinical trials. This approach has the obvious advantage of including a more heterogeneous group of patients in terms of clinical stage, comorbidity and treatment. In this way, the knowledge derived can be used for decisions concerning new patients, including elderly patients and patients with severe comorbidity, who would not be included in a clinical trial. The clinical and patient characteristics included in this study were selected based on medical expertise, known prognostic importance from the literature and availability.
In our study, overall survival rates and progression-free survival rates were comparable with other studies. The frequency of HPV-associated OPSCC in our cohort is comparable to other recent European series. Similarly, we found that patients with HPV-positive tumors had lower smoking and alcohol consumption and were less likely to have severe comorbidity (p < 0.0001). Furthermore, the presence of HPV correlated positively with poor differentiation grade (p < 0.006) and HPV was more often present in tumors of the tonsils and the base of the tongue (p = 0.001). These findings are in line with previously observed correlations between HPV incidence and patient demographics and tumor characteristics. HPV-positive cancers have been associated with smaller primary tumors and with greater regional disease; in our study, no significant differences in HPV prevalence were observed among different T-stages or N-stages.
Tobacco smoking has been established as a major independent prognostic factor for patients with OPSCC. Previous studies showed that cancer progression and risk of death increase with tobacco exposure, independently of tumor HPV status and treatment. In our study, pack years of smoking was a significant prognostic factor for overall and progression-free survival in univariate analysis; however, it did not remain an independent prognostic factor in the multivariate analysis.
A limitation of our study, inherent to its retrospective nature, is the lack of standardization in how data were collected over the years. Furthermore, smoking behavior during therapy, which has recently been reported as an important prognostic factor, was not available in our study.
This further highlights the increasing need for systematic collection and warehousing of routine patient care data, and for semantically interoperable data retrieval systems, to ensure improved and standardized data retrieval and allow external applicability.
Moderate to severe comorbidity, higher T-stage and advanced N-stage were independent unfavorable prognostic factors for overall survival and progression-free survival. We used the ACE-27 comorbidity score, a validated comorbidity scoring system, which has previously been associated with patient prognosis in head and neck cancers. Advanced clinical T-classification has been reported as a significant risk factor for progressive disease and death in oropharyngeal carcinoma patients. Indeed, T3–T4 tumors showed poorer survival than T1–T2 tumors. Similarly, we observed that higher N-stage was associated with worse survival; however, this association was less significant for progression-free survival. Comparing N0–N2a nodal stages against N2b–N3 stages showed marked differences in survival, with the latter being an unfavorable prognostic factor. This re-grouping of N-stage has shown prognostic value previously. Male gender was a strong negative prognostic factor for overall survival and progression-free survival; however, this effect remained significant in the multivariate setting only for overall survival. Other studies have shown male gender to be an unfavorable prognostic factor in OPSCC as well as in other head and neck cancer sites; however, in OPSCC this association can be confounded by the fact that men have a higher incidence of HPV-positive OPSCC than women.
We showed that combining tumor HPV status with other important prognostic factors increased the accuracy of the predictions compared with the traditional TNM staging system or HPV status alone. The 95% CIs of the model predictions were significantly better than those obtained with TNM alone or HPV status alone, which underlines the importance of multifactorial prediction models.
This model performance is acceptable for clinical support, particularly given the clear distinction between risk groups in both cohorts; however, it is still far from optimal. Combining clinical parameters with HPV status is a first step toward developing validated decision support systems in head and neck cancer; however, we anticipate that adding other features, such as diagnostic and molecular imaging and other important biomarkers such as EGFR or CA-IX, will increase model accuracy. Standardization and systematic collection of routine patient care data will likewise increase model reliability and allow further validation.
In conclusion, we showed that combining HPV status with a set of important clinical parameters allows the development of multifactorial models to predict overall and progression-free survival. Applying this model to individual patients can support their stratification according to their estimated risk and their eligibility for different treatment approaches; for instance, ongoing trials are evaluating treatment de-intensification for OPSCC with estimated good prognosis (NCT01663259). Thus, population-based learning can improve the information given to patients regarding their prognosis and, in the long term, allow stratification in prospective clinical trials and treatment individualization.
E.R.V., F.H. and P.L. conceived of the project, analyzed the data, and wrote the paper. H.J.W.L., M.M.R., R.H.B., C.R.L., E.J.S., J.S. and B.K. provided expert guidance and data, and reviewed the manuscript.
Conflicts of interest
Authors declare no conflicts of interest.
Authors acknowledge financial support from the CTMM framework (AIRFORCE project, Grant 030-103), EU 6th and 7th framework program (METOXIA, EURECA, ARTFORCE), euroCAT (IVA Interreg – www.eurocat.info) and the Dutch Cancer Society (KWF UM 2011-5020, KWF UM 2009-4454).