Profiling waitlisted incoming students for future delinquency with an ensemble of statistical machine learning algorithms

Institute of Computer Science, University of the Philippines Los Baños, College, Laguna, Philippines
DOI
10.7287/peerj.preprints.3312v1
Subject Areas
Data Mining and Machine Learning
Keywords
Profiling, Delinquency, Student, Waitlisted
Copyright
© 2017 Lauron et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Lauron MLC, Pabico JP. 2017. Profiling waitlisted incoming students for future delinquency with an ensemble of statistical machine learning algorithms. PeerJ Preprints 5:e3312v1

Abstract

We are given a dataset \(\mathcal{R}=\{R_1, R_2, \dots, R_r\}\) of \(r\)~records of waitlisted incoming freshman students (WIFS), where for any \(i=1, 2, \dots, r\), \(R_i\) is an \((m+1)\)-tuple \((O_i, P_i^{(1)}, P_i^{(2)}, \dots, P_i^{(m)})\), \(O_i\) is one of the \(o\)~classes in a set \(\mathcal{O}=\{O_1, O_2, \dots, O_o\}\), and \(P_i^{(1)}, P_i^{(2)}, \dots, P_i^{(m)}\) are \(m\)~potential predictors for~\(O_i\). Our purpose is to find a statistical machine learning algorithm (SMLA) \(\mathbb{A}\) such that \(V_i=\mathbb{A}(P_i^{(1)}, P_i^{(2)}, \dots, P_i^{(m)})\), where \(V_i\) is the class predicted by~\(\mathbb{A}\), \(\mathbb{A}\)~is developed using the correct number \(n\le m\) of predictors for \(O\in\mathcal{O}\), and \(\mathbb{A}\)~is the best algorithm in the sense that the metric \(v^{-1}\sum_{i=1}^v |O_i - V_i|\) is minimum across the \(v<r\)~records in the validation set \(\mathcal{V}\subset\mathcal{R}\). Our problem is to find the subset \(\{P_i^{(1)}, P_i^{(2)}, \dots, P_i^{(n)}\}\) and to train \(\mathbb{A}\)~using the \(t<r\) records in the training set \(\mathcal{T}\subset\mathcal{R}\), where \(\mathcal{T}\cap\mathcal{V}=\emptyset\), so that \(\mathbb{A}\)~can predict whether a WIFS trying to enter an undergraduate program at UPLB will incur at least one ``delinquency'' once the student is accepted into the program. Such an \(\mathbb{A}\)~can be a useful decision-support tool for UPLB deans and college secretaries in deciding whether to accept a WIFS into the program.
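The selection criterion in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ensemble: the two candidate learners (a majority-class baseline and a one-predictor threshold rule), the 0/1 coding of the classes, the 70/30 split, and the synthetic records are all assumptions made for illustration. With classes coded 0/1, the metric \(v^{-1}\sum_{i=1}^v |O_i - V_i|\) reduces to the misclassification rate on \(\mathcal{V}\).

```python
import random

def split_records(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into disjoint training and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    t = round(len(shuffled) * train_fraction)
    return shuffled[:t], shuffled[t:]

def validation_error(algorithm, validation):
    """Mean of |O_i - V_i| over the validation records (classes coded 0/1)."""
    return sum(abs(o - algorithm(p)) for o, p in validation) / len(validation)

def train_majority(training):
    """Hypothetical baseline: always predict the most frequent training class."""
    ones = sum(o for o, _ in training)
    majority = 1 if 2 * ones >= len(training) else 0
    return lambda predictors: majority

def train_stump(training, index):
    """Hypothetical one-predictor threshold rule with the lowest training error."""
    best_err, best_rule = None, None
    for cutoff in sorted({p[index] for _, p in training}):
        for low_class in (0, 1):
            def rule(p, c=cutoff, k=low_class):
                # Predict k at or below the cutoff, the other class above it.
                return k if p[index] <= c else 1 - k
            err = sum(abs(o - rule(p)) for o, p in training)
            if best_err is None or err < best_err:
                best_err, best_rule = err, rule
    return best_rule

# Synthetic toy records (O_i, (P_i^(1), P_i^(2))); NOT data from the study.
records = [(1 if 50 + 2 * i < 70 else 0, (50 + 2 * i, i % 3)) for i in range(20)]

training, validation = split_records(records)  # T and V share no record
candidates = {
    "majority": train_majority(training),
    "stump_on_P1": train_stump(training, 0),
    "stump_on_P2": train_stump(training, 1),
}
errors = {name: validation_error(a, validation) for name, a in candidates.items()}
best = min(errors, key=errors.get)  # candidate with the minimum validation metric
print(best, errors[best])
```

Each stump uses a single predictor, so comparing `stump_on_P1` with `stump_on_P2` is a crude stand-in for the predictor-subset search; a fuller search would also evaluate larger subsets against the same validation metric.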

Author Comment

Submitted and accepted as a contributed paper to the 18th National Student-Faculty Conference on the Statistical Sciences (SFCon-Stat 2017), SEARCA, Los Baños, Laguna, Philippines, 16 October 2017.