Mathematics and Statistics Vol. 8(2), pp. 100 - 105
DOI: 10.13189/ms.2020.080205


Comparison Analysis: Large Data Classification Using PLS-DA and Decision Trees


Nurazlina Abdul Rashid *, Norashikin Nasaruddin, Kartini Kassim, Amirah Hazwani Abdul Rahim
Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Kedah, Malaysia

ABSTRACT

Classification is widely applied across many areas of research. In this study, we explore approaches to the classification problem for data with a large number of measures, using partial least squares discriminant analysis (PLS-DA) and decision trees (DT). The performance of the two methods was compared on a sample dataset of breast tissues from the University of Wisconsin Hospital. Both PLS-DA and DT were used to predict the diagnosis of breast tissue (M = malignant, B = benign). A total of 699 patient diagnoses (458 benign and 241 malignant) were used in this study. The performance of PLS-DA and DT was evaluated based on misclassification error and accuracy rate. The results show that PLS-DA can be considered a good and reliable technique for classification tasks on large datasets, with good prediction accuracy.
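
The two evaluation measures named in the abstract, accuracy rate and misclassification error, can be illustrated with a minimal sketch. It uses only the class counts reported above (458 benign, 241 malignant). The function name `evaluate` and the majority-class baseline are assumptions introduced for illustration; they are not the authors' procedure and do not reproduce any PLS-DA or DT result from the paper.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return (accuracy, misclassification_error) for two label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    # Count positions where the predicted label matches the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    # Misclassification error is the complement of accuracy.
    return accuracy, 1.0 - accuracy


# Class counts reported in the abstract: 458 benign (B), 241 malignant (M).
y_true = ["B"] * 458 + ["M"] * 241

# Hypothetical baseline (not from the paper): always predict the majority class.
majority_label = Counter(y_true).most_common(1)[0][0]
y_baseline = [majority_label] * len(y_true)

baseline_accuracy, baseline_error = evaluate(y_true, y_baseline)
print(f"baseline accuracy: {baseline_accuracy:.4f}")
print(f"baseline misclassification error: {baseline_error:.4f}")
```

A baseline of this kind gives a reference point: with imbalanced classes, a classifier is only informative if its accuracy exceeds what always predicting the larger class would achieve.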

KEYWORDS
Classification, Large Data, PLS-DA, Decision Tree (DT)

Cite This Paper in IEEE or APA Citation Styles
(a). IEEE Format:
[1] Nurazlina Abdul Rashid, Norashikin Nasaruddin, Kartini Kassim, Amirah Hazwani Abdul Rahim, "Comparison Analysis: Large Data Classification Using PLS-DA and Decision Trees," Mathematics and Statistics, Vol. 8, No. 2, pp. 100 - 105, 2020. DOI: 10.13189/ms.2020.080205.

(b). APA Format:
Nurazlina Abdul Rashid, Norashikin Nasaruddin, Kartini Kassim, Amirah Hazwani Abdul Rahim (2020). Comparison Analysis: Large Data Classification Using PLS-DA and Decision Trees. Mathematics and Statistics, 8(2), 100 - 105. DOI: 10.13189/ms.2020.080205.