In the age of data collection, machine learning algorithms have to cope with ever-growing data sets. This requires algorithms that can scale well on modern accelerators like GPUs as well as efficient implementations that can support different hardware platforms. For supervised learning, Support Vector Machines (SVMs) are widely used. However, even modern and optimized implementations do not scale well for large non-trivial data sets on cutting-edge hardware.
Most of these implementations are based on Sequential Minimal Optimization (SMO), an optimized but inherently sequential algorithm. They are therefore poorly suited to highly parallel hardware such as GPUs.
We address both of these issues with our Parallel Least Squares Support Vector Machine (PLSSVM) library. Our main goal is to develop an SVM for the HPC context, that is, an SVM that scales across many nodes and makes effective use of accelerator cards. To this end, we develop a Least Squares Support Vector Machine (LS-SVM) and investigate different methods to accelerate it on GPUs and to parallelize it across multiple CPU cores and compute nodes. Our second goal is to use the PLSSVM library to investigate different aspects of performance portability in the HPC context. For example, a part of PLSSVM is currently implemented in several programming languages and frameworks such as SYCL, OpenCL, CUDA, HIP, and OpenMP.
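To illustrate why a least squares formulation is attractive for parallel hardware, the following minimal sketch trains an LS-SVM on a toy data set. It assumes the standard textbook LS-SVM formulation, in which training reduces to solving a single linear system [[K + I/C, 1], [1^T, 0]] [alpha; b] = [y; 0] instead of running Sequential Minimal Optimization. This is not PLSSVM's actual code or API: the function names, the linear kernel, the value of C, and the data are all hypothetical, and plain Gaussian elimination stands in for whatever parallel solver a real implementation would use.

```python
def linear_kernel(u, v):
    # Assumed kernel for this sketch: plain dot product.
    return sum(a * b for a, b in zip(u, v))


def solve_linear_system(A, rhs):
    # Dense Gaussian elimination with partial pivoting.
    # Stand-in solver only; chosen for brevity, not for scalability.
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def train_lssvm(X, y, C=10.0, kernel=linear_kernel):
    # Build the (m+1) x (m+1) system: kernel matrix plus I/C,
    # bordered by a column and a row of ones for the bias term.
    m = len(X)
    A = [
        [kernel(X[i], X[j]) + (1.0 / C if i == j else 0.0) for j in range(m)] + [1.0]
        for i in range(m)
    ]
    A.append([1.0] * m + [0.0])
    solution = solve_linear_system(A, [float(v) for v in y] + [0.0])
    return solution[:m], solution[m]  # (alpha, b)


def predict(X, alpha, b, x, kernel=linear_kernel):
    # Decision function: sign of sum_i alpha_i * k(x_i, x) + b.
    score = sum(a * kernel(xi, x) for a, xi in zip(alpha, X)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: two well-separated classes with labels -1 and +1.
X = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]]
y = [-1, -1, 1, 1]
alpha, b = train_lssvm(X, y)
```

The expensive parts of this sketch are the kernel matrix evaluation and the linear solve, both of which consist of regular, data-parallel arithmetic. That regularity is what makes the least squares variant a plausible fit for GPUs and multi-node systems, in contrast to the step-by-step updates of Sequential Minimal Optimization.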