PlayDrone: A Measurement Study of Google Play

Although millions of users download and use third-party Android applications from the Google Play store, little is known about these applications in aggregate. We have built PlayDrone, the first scalable Google Play store crawler, and used it to index and analyze over 1,100,000 applications in the Google Play store on a daily basis, the largest such index of Android applications. PlayDrone leverages various hacking techniques to circumvent Google’s roadblocks for indexing Google Play store content, and makes proprietary application sources available, including source code for over 880,000 free applications. We demonstrate the usefulness of PlayDrone in decompiling and analyzing application content by exploring four previously unaddressed issues: the characterization of Google Play application content at large scale and its evolution over time; library usage in applications and its impact on application portability; duplicative application content in Google Play; and the ineffectiveness of OAuth and related service authentication mechanisms, which lets malicious users easily gain unauthorized access to user data and resources on Amazon Web Services and Facebook.

PlayDrone in the News

CNET | Business Insider | CBS News | Network World | Ars Technica | The Hacker News
Techworld | Tech Times | MacDailyNews | SD Times | The Register | SlashGear
India Times | BGR | NDTV

Other Information

PlayDrone sources are available on GitHub

A collection of Android apps and metadata generated using PlayDrone is maintained by Archive.org

Related Publications

A Measurement Study of Google Play

Nicolas Viennot, Edward Garcia, Jason Nieh
Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2014), June 2014

Abstract

PDF

Columbia University Department of Computer Science