Prostate cancer prediction using machine learning techniques


  • Kevin Hernández



Prostate cancer, Machine learning, Feature selection


Prostate cancer (PCa) is the most frequently diagnosed cancer in men in industrialized nations and the second leading cause of male cancer-related deaths globally, which makes early detection crucial. Originating in the walnut-shaped prostate gland beneath the bladder, PCa poses a significant risk when it is not identified in its early stages. The diagnostic process requires expertise from radiologists, pathologists, and physicians; it is time-consuming and introduces variability, which can lead to delayed or incorrect diagnoses. This underscores the need for efficient and reliable diagnostic tools as the burden of PCa diagnosis grows.

This study addresses that need by combining feature selection methods with model performance evaluation. Using a PCa dataset from Kaggle, consisting of 100 patient observations with eight independent features and a binary diagnosis result, the study examines how feature relevance affects PCa classification. A comparison of Principal Component Analysis (PCA) and ReliefF as feature selection methods shows a limitation of PCA, which emphasizes a single dominant feature, whereas ReliefF draws on a distributed set of features and yields improved model accuracy and stability. The Random Forest (RF) model, selected through parameter tuning, achieves 95% accuracy using a large number of estimators, limited tree depth, and balanced sample splitting. The findings underscore the interplay between feature selection methods and model parameters in optimizing the accuracy and reliability of PCa classification models. Given the anticipated rise in PCa incidence, this research offers insights for improving diagnostic efficiency and addressing the challenges posed by traditional diagnostic procedures.
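To make the ReliefF idea mentioned above more concrete, the following is a minimal sketch of ReliefF-style feature weighting for a binary target with numeric features. It is not the study's implementation. The function names (`relieff_weights`, `make_toy_data`), the choice of Manhattan distance, the use of every observation as a reference instance, and the synthetic three-feature data set are all assumptions made for illustration, and the Kaggle data is not used here. A feature gains weight when it differs between an observation and its nearest neighbors of the other class (misses) and loses weight when it differs from its nearest neighbors of the same class (hits).

```python
import random


def relieff_weights(X, y, k=5):
    """ReliefF-style weights for numeric features and a binary target (sketch)."""
    n, d = len(X), len(X[0])
    # Scale differences by each feature's range so weights are comparable.
    lo = [min(row[f] for row in X) for f in range(d)]
    hi = [max(row[f] for row in X) for f in range(d)]
    span = [(hi[f] - lo[f]) or 1.0 for f in range(d)]

    def diff(a, b, f):
        return abs(a[f] - b[f]) / span[f]

    def dist(a, b):
        # Manhattan distance on range-scaled features (an assumption).
        return sum(diff(a, b, f) for f in range(d))

    weights = [0.0] * d
    for i in range(n):
        # Rank all other observations by distance to observation i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(X[i], X[j]))
        hits = [j for j in others if y[j] == y[i]][:k]      # same class
        misses = [j for j in others if y[j] != y[i]][:k]    # other class
        for f in range(d):
            if hits:
                weights[f] -= sum(diff(X[i], X[j], f) for j in hits) / (n * len(hits))
            if misses:
                weights[f] += sum(diff(X[i], X[j], f) for j in misses) / (n * len(misses))
    return weights


def make_toy_data(n=100, seed=0):
    """Synthetic data: feature 0 is informative, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(4.0 * label, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y


X, y = make_toy_data()
weights = relieff_weights(X, y, k=5)
# Feature indices ordered from most to least relevant.
ranking = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
print("ranking:", ranking)
print("weights:", [round(w, 3) for w in weights])
```

Because the weights come from local neighborhoods rather than from directions of maximal variance, relevance can end up spread across several features, which is consistent with the distributed feature set the abstract attributes to ReliefF.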