Performance Enhancement of Malware Classifiers Using Generative Adversarial Networks

Category

Paper

Date

2022/12/17

Year

2022

Venue

IEEE BigData 2022

Authors

Donghwa Shin

Daehee Han

Sunghyon Kyeong

Original paper

https://ieeexplore.ieee.org/document/10020505

Abstract

This study presents comprehensive experimental results for the IEEE BigData 2022 Cup, in which we used generative adversarial networks (GANs) to generate malign samples that improve a malware classifier's performance. We employed conditional tabular GAN (CTGAN), conditional table GAN (CTAB-GAN), and Complementary GAN architectures to address the data imbalance problem commonly encountered in classification tasks. CTAB-GAN outperformed the other GANs in producing synthetic data that are statistically comparable to the given training data. Training with these data also improved the classifier's performance on the validation dataset, which suggests that higher-quality synthetic data yield better machine learning efficacy. Although CTAB-GAN performed better than CTGAN and Complementary GAN in terms of both statistical similarity and machine learning efficacy, it could overfit the training data. Therefore, for the final solution we used both CTGAN and CTAB-GAN to produce a balanced dataset for training the classifier. The classifier achieved a root mean square error of 0.103, an improvement of 0.066 over the baseline of 0.169.
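The balancing step and the evaluation metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `balance` helper, the even split of synthetic samples between two generators, the binary labels (1 = malign), and the `stub_generator` stand-ins for trained CTGAN and CTAB-GAN models are all assumptions made for illustration. Only the RMSE figures in the last lines come from the abstract.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Row = List[float]
# Hypothetical interface: a trained generator returns n synthetic malign rows.
Generator = Callable[[int], List[Row]]


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between labels and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def balance(
    rows: Sequence[Row],
    labels: Sequence[int],
    generators: Sequence[Generator],
    minority: int = 1,
) -> Tuple[List[Row], List[int]]:
    """Append synthetic minority rows until both classes have equal counts."""
    n_minority = sum(1 for y in labels if y == minority)
    deficit = (len(labels) - n_minority) - n_minority
    out_rows, out_labels = list(rows), list(labels)
    if deficit <= 0:
        return out_rows, out_labels
    # Assumption: split the deficit evenly across the generators;
    # any remainder goes to the first generators in the list.
    share, extra = divmod(deficit, len(generators))
    for i, generate in enumerate(generators):
        synthetic = generate(share + (1 if i < extra else 0))
        out_rows.extend(synthetic)
        out_labels.extend([minority] * len(synthetic))
    return out_rows, out_labels


def stub_generator(seed: int) -> Generator:
    """Stand-in for a trained GAN; emits random two-feature rows."""
    rng = random.Random(seed)

    def sample(n: int) -> List[Row]:
        return [[rng.random(), rng.random()] for _ in range(n)]

    return sample


# Toy data: 9 benign rows (label 0) and 1 malign row (label 1).
rows = [[0.1, 0.2] for _ in range(9)] + [[0.9, 0.8]]
labels = [0] * 9 + [1]
balanced_rows, balanced_labels = balance(
    rows, labels, [stub_generator(0), stub_generator(1)]
)

# Figures reported in the abstract: baseline 0.169, final 0.103.
improvement = round(0.169 - 0.103, 3)
```

In practice the two stubs would be replaced by samplers from fitted CTGAN and CTAB-GAN models, and the balanced rows would be passed to the classifier's training routine. Drawing from two generators rather than one reflects the abstract's concern that a single generator may overfit the training data.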

KakaoBank Financial Tech Lab

15F, Pangyo Tech One Tower 2, 131 Bundangnaegok-ro, Seongnam-si, Gyeonggi-do (13529)
