Predictive analysis of dropout risk among first-year students at the Universidad Tecnológica Centroamericana, 2023 to 2025
Date
2026-01-01
Publisher
Universidad Tecnológica Centroamericana UNITEC
Abstract
This study presents a predictive analysis of dropout risk among first-year students at the Universidad Tecnológica Centroamericana (UNITEC) for the years 2023 to 2025. Five models were trained: Gradient Boosted Trees, Random Forest, Decision Tree, K-Nearest Neighbors, and Logistic Regression. Stratified K-Fold Cross Validation was used so that every model was evaluated fairly, and each model was measured with common metrics (accuracy, recall, precision, F1-score, and Cohen's kappa) and with confusion matrices created in KNIME. The aim is to identify students at academic risk early and to act in time, since predictive analytics tools support more accurate and timely decisions and more effective student support. This is a key element of competitive advantage, because it enables more efficient management of student databases and records. The results show that the Gradient Boosted Trees model performs robustly in predicting student dropout risk, with high discriminative capacity: it correctly classified 13,711 students (88.48% of the total).
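The evaluation steps named in the abstract can be illustrated with a minimal Python sketch. The study itself used KNIME, so this is not the authors' workflow: the function names (`stratified_folds`, `binary_metrics`), the class `BinaryConfusion`, and all counts are hypothetical and chosen only to show how stratified folds preserve class proportions and how the five reported metrics follow from a binary confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryConfusion:
    """2x2 confusion matrix counts; 'positive' is assumed to mean 'at risk of dropout'."""
    tp: int
    fp: int
    fn: int
    tn: int


def stratified_folds(labels: list[int], k: int) -> list[list[int]]:
    """Assign sample indices to k folds, spreading each class evenly.

    Deterministic round-robin per class; library implementations
    usually shuffle within each class first.
    """
    folds: list[list[int]] = [[] for _ in range(k)]
    by_class: dict[int, list[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def binary_metrics(cm: BinaryConfusion) -> dict[str, float]:
    """Accuracy, precision, recall, F1 and Cohen's kappa from a confusion matrix."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / total
    predicted_pos = cm.tp + cm.fp
    actual_pos = cm.tp + cm.fn
    precision = cm.tp / predicted_pos if predicted_pos else 0.0
    recall = cm.tp / actual_pos if actual_pos else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Expected agreement by chance, from the row and column marginals.
    p_yes = (predicted_pos / total) * (actual_pos / total)
    p_no = ((cm.fn + cm.tn) / total) * ((cm.fp + cm.tn) / total)
    p_expected = p_yes + p_no
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Illustrative data only; these numbers are NOT taken from the study.
example_labels = [1] * 4 + [0] * 8
example_folds = stratified_folds(example_labels, k=4)
example_scores = binary_metrics(BinaryConfusion(tp=40, fp=10, fn=20, tn=30))
```

With the illustrative labels, each of the four folds receives one positive and two negative samples, so every fold keeps the original 1:2 class ratio. Cohen's kappa is included alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when one class dominates.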
Keywords
Student support, Anticipation, Machine learning, Data mining, Predictive models
