EC2: Ensemble Clustering and Classification for Predicting Android Malware Families

Tanmoy Chakraborty, Fabio Pierazzi, V. S. Subrahmanian

Research output: Contribution to journal › Article › peer-review


Abstract

As the most widely used mobile platform, Android is also the biggest target for mobile malware. Given the increasing number of Android malware variants, detecting malware families is crucial so that security analysts can identify situations where signatures of a known malware family can be adapted, rather than manually inspecting the behavior of every sample. We present EC2 (Ensemble Clustering and Classification), a novel algorithm for discovering Android malware families of varying sizes, ranging from very large to very small families (even if previously unseen). We present a performance comparison of several traditional classification and clustering algorithms for Android malware family identification on DREBIN, the largest public Android malware dataset with labeled families. We use the output of both supervised classifiers and unsupervised clustering to design EC2. Experimental results on both the DREBIN and the more recent Koodous malware datasets show that EC2 accurately detects both small and large families, outperforming several comparative baselines. Furthermore, we show how to automatically characterize and explain unique behaviors of specific malware families, such as FakeInstaller, MobileTx, and Geinimi. In short, EC2 provides an early warning system for emerging malware families, a robust predictor of the family to which a new malware sample belongs (when that family is not new), and novel strategies for data-driven understanding of malware behaviors.
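
The general idea of pairing a supervised classifier with unsupervised clustering can be illustrated with a small sketch. This is not the EC2 algorithm from the paper; it is a toy illustration under stated assumptions: samples are sets of feature names, a 1-nearest-neighbour rule with Jaccard similarity stands in for the classifier, and a greedy single-pass grouping stands in for the clustering step. All function names, feature names, family labels, and thresholds are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(sample, labeled):
    """1-nearest-neighbour stand-in for a supervised classifier.

    Returns (family, similarity) of the most similar labeled sample.
    """
    best_family, best_sim = None, -1.0
    for features, family in labeled:
        sim = jaccard(sample, features)
        if sim > best_sim:
            best_family, best_sim = family, sim
    return best_family, best_sim


def cluster(samples, threshold):
    """Greedy single-pass clustering (stand-in for a real clustering step).

    A sample joins the first cluster whose seed (first member) is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for s in samples:
        for c in clusters:
            if jaccard(s, c[0]) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def assign_families(unlabeled, labeled, confidence=0.5, cluster_threshold=0.5):
    """Route confident samples to known families; cluster the rest.

    Samples the classifier is unsure about are grouped into candidate
    new families instead of being forced into a known one.
    """
    known, uncertain = [], []
    for s in unlabeled:
        family, sim = classify(s, labeled)
        if sim >= confidence:
            known.append((s, family))
        else:
            uncertain.append(s)
    return known, cluster(uncertain, cluster_threshold)


# Hypothetical toy data: feature sets and placeholder family labels.
labeled = [
    (frozenset({"SEND_SMS", "READ_CONTACTS"}), "FamilyA"),
    (frozenset({"INTERNET", "READ_PHONE_STATE"}), "FamilyB"),
]
unlabeled = [
    frozenset({"SEND_SMS", "READ_CONTACTS", "INTERNET"}),
    frozenset({"RECORD_AUDIO", "CAMERA"}),
    frozenset({"RECORD_AUDIO", "CAMERA", "INTERNET"}),
]

known, candidate_new = assign_families(unlabeled, labeled)
print([family for _, family in known])   # families assigned by the classifier
print([len(c) for c in candidate_new])   # sizes of candidate new families
```

In this toy run, the first sample is close enough to a labeled one to inherit its family, while the other two fall below the confidence threshold and are grouped together as a candidate new family. A real system would replace both stand-ins with trained models and tuned thresholds.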

Original language: English
Article number: 8013726
Pages (from-to): 262-277
Number of pages: 16
Journal: IEEE Transactions on Dependable and Secure Computing
Volume: 17
Issue number: 2
Early online date: 21 Aug 2017
Publication status: Published - 1 Mar 2020

Keywords

  • Android
  • Androids
  • classification
  • clustering
  • Clustering algorithms
  • ensemble
  • Feature extraction
  • Humanoid robots
  • Malware
  • malware
  • Mobile communication
  • Smart phones
