Online career counseling text classification using BERT and FastText
Journal of the Korean Data & Information Science Society 2022;33:991-1006
Published online November 30, 2022;  https://doi.org/10.7465/jkdi.2022.33.6.991
© 2022 Korean Data and Information Science Society.

Soon Bo Kwon1 · Jin Eun Yoo2

1,2 Curriculum and Educational Evaluation, Korea National University of Education
Correspondence to: This paper, based on the first author’s dissertation, was revised for journal publication.
1 Instructor, Curriculum and Educational Evaluation, Korea National University of Education, Cheongju 28173, Korea.
2 Professor, Curriculum and Educational Evaluation, Korea National University of Education, Cheongju 28173, Korea. E-mail: jeyoo@knue.ac.kr
Received August 10, 2022; Revised September 20, 2022; Accepted September 23, 2022.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Natural language processing (NLP) techniques have been applied in a wide range of fields and continue to evolve. Online career counseling is one field where NLP remains underdeveloped but is well worth investigating. Online counseling lets counseling experts help clients without the constraints of time and place. It typically takes place on websites, and the exchanges are stored as text. Categorizing this text data with NLP can help diagnose clients’ career counseling needs and predict new cases with higher accuracy. This study compared the performance of BERT and FastText in classifying career counseling text data. A total of 4,412 online texts were collected via web crawling and cleaned into four categories. BERT outperformed FastText on all measures, including test-data accuracy, precision, recall, and F1-score. In particular, BERT was superior to FastText in classifying a category with few samples. The results of this study can contribute to reducing counselors’ workload and to developing chatbots for career counseling. Future research directions, including the problem of misclassified labels, are discussed.
Keywords: BERT, FastText, online career counseling, text data analysis.