Press Release: New Predictors of Post-Menopausal Breast Cancer Unveiled Using Machine Learning

Posted on June 12, 2023 by Admin

Breast cancer is one of the most common cancers affecting women worldwide. Multiple predictors of this disease have been identified, including inherited genetic factors, reproductive factors, and lifestyle factors.

Previous studies have emphasized the etiological difference between pre- and post-menopausal breast cancers. Recently, scientists have combined various approaches to accurately predict breast cancer in women.

Study

A recent Scientific Reports study utilized machine learning (ML) methods for feature selection, followed by Cox models for risk prediction. The main aim of this study was to demonstrate the effective application of ML methods for feature selection to assist classical statistical methods.

SHapley Additive exPlanation (SHAP) feature dependence plots were used to explore potential interactions between phenotypic features and polygenic risk scores (PRS). The study used data from the UK Biobank (UKB), which contains over half a million participants from England, Wales, and Scotland. Baseline data were collected through verbal interviews with a trained nurse, questionnaires, biological samples, and physical examination.
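To make the idea behind SHAP values concrete, the following minimal sketch computes exact Shapley values for a toy risk function with two features and an interaction term. The function `toy_risk`, its coefficients, the feature names, and the single all-zero baseline are hypothetical choices made for illustration; they are not the study's model. Real analyses typically rely on a dedicated library and average over background data rather than one reference point.

```python
from itertools import permutations
from math import factorial


def shapley_values(value_fn, features, x, baseline):
    """Exact Shapley values of value_fn at x, relative to a single baseline point.

    Each feature's value is its average marginal contribution over all
    orderings in which features are switched from baseline to observed.
    """
    contrib = {f: 0.0 for f in features}
    for order in permutations(features):
        current = dict(baseline)
        prev = value_fn(current)
        for f in order:
            current[f] = x[f]          # switch this feature to its observed value
            new = value_fn(current)
            contrib[f] += new - prev   # marginal contribution in this ordering
            prev = new
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in contrib.items()}


def toy_risk(v):
    # Hypothetical score: two main effects plus an interaction term.
    return 0.5 * v["prs"] + 0.2 * v["testosterone"] + 0.1 * v["prs"] * v["testosterone"]


features = ["prs", "testosterone"]
baseline = {"prs": 0.0, "testosterone": 0.0}
x = {"prs": 2.0, "testosterone": 1.0}

phi = shapley_values(toy_risk, features, x, baseline)
print(phi)
```

In this toy case the interaction term is split equally between the two features, and the attributions sum to the difference between the score at `x` and the score at the baseline.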

Post-menopausal women aged 40 to 69 at baseline were included because of the aforementioned etiological heterogeneity by menopausal status. Incident breast cancer was identified using International Classification of Diseases codes. Two polygenic risk scores, PRS313 and PRS120k, were considered as potential genetic features.
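A polygenic risk score is generally computed as a weighted sum of risk-allele counts across genetic variants, with each weight taken from a per-variant effect estimate. The sketch below illustrates that arithmetic only; the variant names, weights, and allele counts are made up and do not correspond to PRS313 or PRS120k.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2) over the scored variants."""
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"no genotype for variants: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)


# Hypothetical per-variant weights (for example, log odds ratios).
weights = {"variant_a": 0.10, "variant_b": -0.05, "variant_c": 0.20}
# Hypothetical risk-allele counts for one participant.
dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(dosages, weights)
print(round(score, 4))
```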

Results

A total of 104,313 participants were included in this study, 4,010 of whom developed breast cancer over the follow-up period of 11.9 years. By combining ML with traditional statistical approaches from cancer epidemiology, the study identified several known and previously unreported risk factors for the incidence of post-menopausal breast cancer.

The identified known risk factors included age at menopause, testosterone, and age. Five novel predictors, including blood biochemistry, blood counts, and urine biomarkers, were also identified.

The newly identified predictors were significantly associated with the incidence of post-menopausal breast cancer. Further research is needed to determine whether these are potentially modifiable risk factors for breast cancer.

The XGBoost model selected a detailed body composition measure rather than body mass index (BMI), implying that a precise measure of body composition is an important predictor of breast cancer. Basal metabolic rate was also found to be a significant predictor of breast cancer, which contradicts a previous study that found no association between basal metabolic rate and breast cancer.

Plasma urea, a blood biomarker related to kidney function, was also associated with breast cancer. This is the first time that an association of plasma phosphate, sodium, or creatinine in urine with breast cancer has been reported.

The two polygenic risk scores were ranked as the strongest risk factors by agnostic ML models. Cox regressions confirmed that the PRS are significant predictors of post-menopausal breast cancer.

Conclusion

The current study identified five statistically significant novel associations with post-menopausal breast cancer, including urine biomarkers, blood counts, and blood biochemistry. When these five novel features were added to the baseline Cox model, discrimination performance was maintained. Furthermore, the two pre-specified PRSs were found to be the most important features according to SHAP values.
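Discrimination in survival models is commonly summarised with a concordance index: the share of comparable participant pairs in which the one who developed the disease earlier also received the higher predicted risk. The sketch below is a simple version of Harrell's concordance index; it assumes this kind of metric is what "discrimination performance" refers to, and the follow-up times, event indicators, and risk scores are invented for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Simple Harrell-style concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event and
    a strictly shorter follow-up time than subject j. Tied risk scores
    count as half-concordant; pairs with tied times are ignored.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data: years observed, event flag, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.5, 0.1]

c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so "maintained" discrimination would mean the index stays about the same after the new features are added.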

These findings motivate further research on the use of more precise anthropometric measures to improve breast cancer prediction. External validation of the results is the next important step before implementation in clinical practice.

Source:

https://www.news-medical.net/news/20230611/Scientists-use-machine-learning-to-unveil-new-predictors-of-post-menopausal-breast-cancer.aspx