Statistical models for determining the optimal combination of biomarkers and their application in classification of medical data
Abstract
Introduction
The area under the ROC curve (AUC) is a common criterion for assessing the overall
classification performance of markers. In practice, because a single marker has limited
classification ability, we are interested in combining markers linearly or nonlinearly to
improve classification performance. Ramp AUC (RAUC) is a new statistical AUC-based method
that can find such optimal combinations of markers. In this study, RAUC was used to find the
optimal combination of care indicators related to functional limitation as a complication of
diabetes and to accurately discriminate this outcome based on its underlying markers.
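For reference, the quantity that AUC-based combination methods target can be written as the empirical AUC of a combined score. The notation is assumed here for illustration and is not taken from the study: f is the linear or nonlinear combination of markers, x_i^+ and x_j^- are the marker vectors of patients with and without the outcome, and n_1 and n_0 are the two group sizes.

```latex
\[
\widehat{\mathrm{AUC}}(f)
  = \frac{1}{n_1 n_0} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0}
    \mathbf{1}\left\{ f(x_i^{+}) > f(x_j^{-}) \right\}
\]
```

The indicator function makes this criterion discontinuous in f, which is why methods in this family generally maximize a smoother approximation of it. As the name suggests, RAUC is understood here to use a ramp-shaped approximation; the exact form is not stated in this abstract.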
Methods
This cross-sectional study was conducted on 378 patients with diabetes referred to the diabetes
centers of Ardebil and Tabriz during 2014–15. To classify patients accurately according to
their functional limitation status, the RAUC method with a radial basis function (RBF) kernel
was used to search for the optimal combination of care indicators. The classification
performance of the model was evaluated by AUC and compared with logistic regression, support
vector machine (SVM) and generalized additive model (GAM) using training and test set
validation.
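The evaluation scheme described above (fit on a training set, compare models by AUC on a held-out test set) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: the two-marker data are synthetic, and the two scoring rules (a mean-difference linear score and an RBF kernel-difference score) are simple hypothetical stand-ins rather than RAUC, logistic regression, SVM or GAM.

```python
import math
import random


def empirical_auc(scores, labels):
    """Fraction of (case, control) pairs ranked correctly; ties count as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sample_patients(n, rng):
    """Synthetic (hypothetical) data: controls near the origin, cases in a ring."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate controls (0) and cases (1)
        radius = rng.uniform(2.0, 3.0) if label else rng.uniform(0.0, 1.0)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        X.append((radius * math.cos(angle), radius * math.sin(angle)))
        y.append(label)
    return X, y


def split_by_label(X, y):
    cases = [x for x, lab in zip(X, y) if lab == 1]
    controls = [x for x, lab in zip(X, y) if lab == 0]
    return cases, controls


def fit_linear_score(X, y):
    """Linear stand-in: project onto the difference of the class means."""
    cases, controls = split_by_label(X, y)
    w = tuple(
        sum(p[k] for p in cases) / len(cases)
        - sum(p[k] for p in controls) / len(controls)
        for k in range(2)
    )
    return lambda x: w[0] * x[0] + w[1] * x[1]


def fit_rbf_score(X, y, gamma=1.0):
    """Nonlinear stand-in: mean RBF similarity to cases minus that to controls."""
    cases, controls = split_by_label(X, y)

    def rbf_mean(x, refs):
        total = sum(
            math.exp(-gamma * ((x[0] - r[0]) ** 2 + (x[1] - r[1]) ** 2))
            for r in refs
        )
        return total / len(refs)

    return lambda x: rbf_mean(x, cases) - rbf_mean(x, controls)


rng = random.Random(0)
X_train, y_train = sample_patients(200, rng)  # used only for fitting
X_test, y_test = sample_patients(200, rng)    # used only for evaluation

linear = fit_linear_score(X_train, y_train)
rbf = fit_rbf_score(X_train, y_train)

auc_linear = empirical_auc([linear(x) for x in X_test], y_test)
auc_rbf = empirical_auc([rbf(x) for x in X_test], y_test)
print(f"test AUC, linear score: {auc_linear:.2f}")
print(f"test AUC, RBF score:    {auc_rbf:.2f}")
```

On data with this kind of radial structure, no linear combination of the two markers separates the groups, so the linear score should stay near chance while the kernel-based score ranks cases well above controls. The sketch is only meant to show why a test-set AUC comparison can expose nonlinearity; it says nothing about the values reported in this study.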
Results
Of the 378 patients, 67.46% had functional limitation. RAUC had a test dataset AUC equal to 1
and outperformed logistic regression (AUC = 0.79), GAM (AUC = 0.82) and SVM with a linear
kernel (AUC = 0.67), and was slightly better than SVM with an RBF kernel (AUC = 0.98).
Conclusion
There was strong nonlinearity in the data, and RAUC with an RBF kernel, which forms a
nonlinear combination of markers, was able to detect this pattern.