Assessment of Imbalanced Dataset in Alzheimer's disease Prediction using Data Mining Techniques
Date
2016
Author
Bonab, FR
Dezaje, M
Nourazarian, AR
Kkhatoni, MA
Asl, MRK
Abstract
One of the most important goals of medical systems is to minimize human error in the diagnosis of diseases. These systems often try to reduce human error by using pattern recognition and data mining techniques. Alzheimer's disease is a degenerative and progressive disease of the central nervous system that leads to deterioration of intellectual capabilities. The disease causes many changes in the patient's brain that reduce brain size and destroy brain cells. The number of people who have died from the various forms of Alzheimer's disease has quadrupled in the past 25 years. The main factor that can decrease the mortality rate of this disease is early and correct diagnosis. Today, the growing use of imbalanced data processing in many real-world applications has drawn the attention of many researchers to this subject. Imbalanced data can have a significantly negative effect on the performance of machine learning-based software. In most cases, standard classification algorithms lean toward the majority classes, because the rules that correctly predict these samples are positively weighted, whereas the rules that predict minority-class samples are generally ignored or treated as noise; as a result, the minority class is often classified incorrectly. In this study, the PSC and PSRND methods are proposed for addressing the problem of imbalanced data in classification. The proposed methods are also compared with existing data balancing methods such as RS and SRAND. The evaluation results show that data balancing methods can have a positive effect on classifier accuracy. Of all the assessed methods, PSRND showed the highest classification accuracy.
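As an illustration of why imbalance matters, the sketch below shows a classifier that always predicts the majority class scoring high accuracy while missing every minority-class case, followed by a generic random oversampling step. It is a minimal sketch under stated assumptions: the toy data, the class names, and the `random_oversample` helper are hypothetical, and the code does not implement the PSC, PSRND, RS, or SRAND methods, which the abstract does not define.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen samples of each
    smaller class until every class matches the largest class in size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Hypothetical toy data: 95 "healthy" records and 5 "alzheimers" records.
labels = ["healthy"] * 95 + ["alzheimers"] * 5
samples = list(range(100))

# A trivial classifier that always predicts the majority class.
predictions = ["healthy"] * len(labels)

# Overall accuracy looks high...
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# ...but recall on the minority class is zero.
minority_hits = sum(
    p == y for p, y in zip(predictions, labels) if y == "alzheimers"
)
minority_recall = minority_hits / labels.count("alzheimers")

# Balance the classes before training a real classifier.
balanced_samples, balanced_labels = random_oversample(samples, labels)
class_counts = Counter(balanced_labels)
```

In practice, resampling of this kind should be applied only to the training split, so that duplicated records do not leak into the evaluation data.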