The Impact of Active Learning on Availability Data Poisoning for Android Malware Classifiers

Research output: Chapter in Book/Report/Conference proceeding > Conference paper > peer-review


Abstract

Can a poisoned machine learning (ML) model passively recover from its adversarial manipulation by retraining with new samples, and regain non-poisoned performance? If passive recovery is possible, how can it be quantified? From an adversarial perspective, is a small amount of poisoning sufficient to force the defender to retrain more over time? This paper proposes an evaluation of passive recovery from "availability data poisoning" using active learning in the context of Android malware detection. To quantify passive recovery, we propose two metrics: intercept, which assesses the speed of recovery, and recovery rate, which quantifies the stability of recovery. To investigate passive recovery, we conduct our experiments at different rates of active learning, in conjunction with varying strengths of availability data poisoning. We perform our evaluation on 259,230 applications from AndroZoo, using the Drebin feature representation, with linear SVM, DNN, and Random Forest as classifiers. Our findings show that the poisoned models converge to their respective hypothetical non-poisoned models, demonstrating that passive recovery is feasible across the three classifiers evaluated when active learning is used as a concept drift mitigation strategy.
Original language: English
Title of host publication: Proceedings of the Annual Computer Security Applications Conference Workshops (ACSAC Workshops)
Place of Publication: Honolulu, Hawaii, USA
Publisher: IEEE
Number of pages: 12
Edition: 2024
Publication status: Accepted/In press - 20 Oct 2024

Keywords

  • Supervised learning
  • Malware Classification
  • Poisoning
  • Active Learning
  • Passive Recovery
