Cancer is not a single disease but a set of diseases with a great impact on public health. Consolidating analysis methods that use large datasets to help predict it is therefore an area of special interest for scientists and data analysts. The objective of this paper is to compare the performance of eight prediction methods on a cancer database: i) Logistic Regression, ii) K-Nearest Neighbors, iii) K-means, iv) Random Forest, v) Support Vector Machine, vi) Linear Discriminant Analysis, vii) Gaussian Naive Bayes, and viii) Multilayer Perceptron. For the unsupervised K-means algorithm, the choice of centroids is clearly relevant, as are the learning rate and parameter settings for the Multilayer Perceptron. Among the supervised learning models, the Support Vector Machine (SVM) performs best.
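The remark about centroids can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) run from two different starting centroids. The four toy points, the starting centroids, and the function names `squared_distance` and `kmeans` are assumptions made for illustration only; they are not taken from the paper or its cancer database.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Run Lloyd's algorithm from the given starting centroids.

    Returns (centroids, labels, inertia), where inertia is the sum of
    squared distances from each point to its assigned centroid.
    """
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    inertia = sum(squared_distance(p, centroids[label])
                  for p, label in zip(points, labels))
    return centroids, labels, inertia


# Hypothetical toy data: two tight pairs of points, far apart along x.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Starting centroids that separate left from right.
_, good_labels, good_inertia = kmeans(points, [(0, 0.5), (4, 0.5)])

# Starting centroids that separate bottom from top.
_, bad_labels, bad_inertia = kmeans(points, [(2, 0), (2, 1)])

print(good_labels, good_inertia)
print(bad_labels, bad_inertia)
```

On this toy data both runs converge immediately, but the second start settles on a top/bottom split with a much larger inertia than the left/right split. This is one way the choice of centroids can affect a K-means result.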
|Translated title of the contribution||Comparative analysis of prediction within cancer databases: A machine learning application|
|Number of pages||10|
|Journal||RISTI - Revista Iberica de Sistemas e Tecnologias de Informacao|
|Status||Published - 1 Jan 2019|
- Big data
- Cancer prediction
- Machine learning