A Measurement Study of Google Play

Nicolas Viennot, Edward Garcia, Jason Nieh

Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2014), pp. 221–233, Austin, TX, USA, June 16–20, 2014

Abstract

Although millions of users download and use third-party Android applications from the Google Play store, little information is known on an aggregated level about these applications. We have built PlayDrone, the first scalable Google Play store crawler, and used it to index and analyze over 1,100,000 applications in the Google Play store on a daily basis, the largest such index of Android applications. PlayDrone leverages various hacking techniques to circumvent Google’s roadblocks for indexing Google Play store content, and makes proprietary application sources available, including source code for over 880,000 free applications. We demonstrate the usefulness of PlayDrone in decompiling and analyzing application content by exploring four previously unaddressed issues: the characterization of Google Play application content at large scale and its evolution over time, library usage in applications and its impact on application portability, duplicative application content in Google Play, and the ineffectiveness of OAuth and related service authentication mechanisms resulting in malicious users being able to easily gain unauthorized access to user data and resources on Amazon Web Services and Facebook.


Columbia University Department of Computer Science