Divide and conquer algorithm based support vector machine for massive data analysis
Journal of the Korean Data & Information Science Society 2021;32:463-73
Published online May 31, 2021;  https://doi.org/10.7465/jkdi.2021.32.3.463
© 2021 Korean Data and Information Science Society.

Sungwan Bang1 · Seokwon Han2 · Jaeoh Kim3

1,2 Department of Mathematics, Korea Military Academy
3 Center for Army Analysis and Simulation, HQs ROKA
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1F1A1A01065107).
1 Professor, Department of Mathematics, Korea Military Academy, 574, Hwarang-ro, Nowon-gu, Seoul, Korea.
2 Assistant professor, Department of Mathematics, Korea Military Academy, 574, Hwarang-ro, Nowon-gu, Seoul, Korea.
3 Corresponding author: LTC, Center for Army Analysis and Simulation, HQs ROKA, 663, Gyeryongdae-ro, Sindoan-myeon, Gyeryong-si, Chungcheongnam-do 32800, Korea. E-mail: c14180@gmail.com
Received January 31, 2021; Revised March 10, 2021; Accepted March 31, 2021.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
The support vector machine (SVM) has been applied successfully to a wide range of classification problems, offering great flexibility and high classification accuracy. However, using the SVM to analyze massive data is infeasible because of significant computational problems, such as the limited capacity of a computer's primary memory. To overcome this problem, we propose a divide and conquer based SVM (DC-SVM) method. The proposed DC-SVM divides the entire training data into a few subsets and applies the SVM to each subset to estimate its classifier. DC-SVM then obtains the final classifier by aggregating the classifiers from all subsets. Simulation studies are presented to demonstrate the satisfactory performance of the proposed method.
Keywords: Divide and conquer, massive data, quadratic programming, support vector machine.
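
The divide, fit, and aggregate steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the classifier is linear, the subsets are formed by a random partition, and aggregation is a simple average of the per-subset coefficients. For self-containment, each subset is fitted by stochastic subgradient descent on the hinge loss rather than by a quadratic programming solver. The names `fit_linear_svm`, `dc_svm`, `predict`, and `make_data` are hypothetical helpers introduced only for this illustration.

```python
import random


def fit_linear_svm(X, y, lam=0.01, lr=0.01, epochs=20, seed=0):
    """Fit a linear SVM (hinge loss + L2 penalty) on one subset.

    Assumption: plain stochastic subgradient descent stands in for a
    quadratic programming solver. Labels must be +1 or -1.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Subgradient of the L2 penalty: shrink the weights.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Subgradient of the hinge loss: active only inside the margin.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def dc_svm(X, y, n_subsets=4, seed=0):
    """Divide the data, fit an SVM per subset, and average the fits.

    Assumption: the aggregation step is an unweighted average of the
    per-subset coefficient vectors and intercepts.
    """
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    # Random partition into n_subsets roughly equal parts.
    parts = [order[k::n_subsets] for k in range(n_subsets)]
    fits = [
        fit_linear_svm([X[i] for i in part], [y[i] for i in part])
        for part in parts
    ]
    dim = len(X[0])
    w = [sum(f[0][j] for f in fits) / n_subsets for j in range(dim)]
    b = sum(f[1] for f in fits) / n_subsets
    return w, b


def predict(w, b, x):
    """Classify one point with the aggregated linear classifier."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def make_data(n, seed=0):
    """Toy data (an assumption): two Gaussian clusters in the plane."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_data(400, seed=1)
    w, b = dc_svm(X, y, n_subsets=4)
    acc = sum(predict(w, b, x) == t for x, t in zip(X, y)) / len(y)
    print("training accuracy:", acc)
```

In this sketch only one subset needs to be held by the fitting routine at a time, which is the memory motivation given in the abstract. Other aggregation rules, such as majority voting over the per-subset classifiers, would fit the same outline.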