Application of the k-medoid algorithm for the segmentation of entering students at a university.

Authors

  • Ledvir Chavez, Universidad Nacional Agraria La Molina, Facultad de Economía y Planificación, Departamento de Estadística e Informática, Lima, Perú.
  • Jesús Salinas, Universidad Nacional Agraria La Molina, Facultad de Economía y Planificación, Departamento de Estadística e Informática, Lima, Perú.

DOI:

https://doi.org/10.47187/perf.v1i25.118

Keywords:

Admitted student profile, clustering algorithms, segmentation, K-medoid

Abstract

In higher education, data management has become essential for academic decision making and for improving educational processes. Analytics and statistics have moved into the technological field, where process automation and the management of large databases through machine learning algorithms are widely used. Among these are clustering algorithms, whose purpose is to group data by similarity. The objective of this study was to identify types of university students according to their sociodemographic, economic, and academic performance variables by applying the K-medoid algorithm to data on students entering the Universidad Nacional Agraria La Molina in Lima, Peru. The students under study can be segmented into three groups, each with its own characteristics. This segmentation makes it possible to promote changes that favor educational quality and to renew teaching spaces in a way tailored to the types of students the university serves.
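
To illustrate the kind of grouping the abstract describes, the following is a minimal sketch of k-medoids clustering in Python. It is not the authors' implementation: the abstract does not state which variant or software was used, so the sketch assumes the simple alternating scheme (assign each point to its nearest medoid, then re-pick each medoid) rather than the swap-based PAM procedure. The `STUDENTS` data, the function names, the farthest-first seeding, and the Manhattan distance are all hypothetical choices made for illustration. Real sociodemographic data would mix numeric and categorical variables and would likely call for a dissimilarity suited to mixed types.

```python
# Hypothetical toy data: each tuple is one student described by two
# already-scaled numeric features. These values are invented for illustration.
STUDENTS = [
    (1, 1), (2, 1), (1, 3),
    (10, 10), (11, 10), (10, 12),
    (20, 2), (22, 2), (21, 2),
]


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length numeric tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def farthest_first(points, k, dist):
    """Deterministic seeding (an assumption of this sketch): start at index 0,
    then repeatedly add the point farthest from the medoids chosen so far."""
    medoids = [0]
    while len(medoids) < k:
        candidates = [i for i in range(len(points)) if i not in medoids]
        farthest = max(
            candidates,
            key=lambda i: min(dist(points[i], points[m]) for m in medoids),
        )
        medoids.append(farthest)
    return medoids


def k_medoids(points, k, dist=manhattan, max_iter=100):
    """Alternating k-medoids. Returns (sorted medoid indices, label per point)."""
    medoids = farthest_first(points, k, dist)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to the other members of that cluster.
        new_medoids = [
            min(members, key=lambda c: sum(dist(points[c], points[j]) for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if set(new_medoids) == set(medoids):
            break  # medoids stopped changing, so the assignment is stable
        medoids = new_medoids
    medoids = sorted(medoids)
    labels = [
        min(range(k), key=lambda c: dist(p, points[medoids[c]]))
        for p in points
    ]
    return medoids, labels


medoids, labels = k_medoids(STUDENTS, k=3)
print(medoids)  # [0, 3, 8]
print(labels)   # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Unlike a k-means centroid, each medoid is an actual observation (here, rows 0, 3, and 8 of the toy data), so every segment can be summarized by a real, representative student record. This is also why the method tends to be less sensitive to outliers.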

Published

2021-05-31

How to Cite

Chavez, L., & Salinas, J. (2021). Application of the k-medoid algorithm for the segmentation of entering students at a university. Perfiles, 1(25), 24-29. https://doi.org/10.47187/perf.v1i25.118