Kinesis: A new approach to replica placement in distributed storage systems

John MacCormick, Nicholas Murphy, Venugopalan Ramasubramanian, Udi Wieder, Junfeng Yang, Lidong Zhou

ACM Transactions on Storage, Volume 4, Issue 4, January 2009, pp. 1–28

Abstract

Kinesis is a novel data placement model for distributed storage systems. It exemplifies three design principles: structure (division of servers into a few failure-isolated segments), freedom of choice (freedom to allocate the best servers to store and retrieve data based on current resource availability), and scattered distribution (independent, pseudo-random spread of replicas in the system). These design principles enable storage systems to achieve balanced utilization of storage and network resources in the presence of incremental system expansions, failures of single and shared components, and skewed distributions of data size and popularity. In turn, this ability leads to significantly reduced resource provisioning costs, good user-perceived response times, and fast, parallelized recovery from independent and correlated failures. This article validates Kinesis through theoretical analysis, simulations, and experiments on a prototype implementation. Evaluations driven by real-world traces show that Kinesis can significantly outperform the widely used Chain replica-placement strategy in terms of resource requirements, end-to-end delay, and failure recovery.
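
The three principles can be illustrated with a short sketch. It is not the authors' implementation: the abstract does not specify how candidates are chosen, so the hash-per-segment scheme, the function names (candidate_servers, place_replicas), the used load table, and the "least stored bytes" selection rule are all assumptions made for illustration.

```python
import hashlib


def candidate_servers(key, segments):
    """Return one pseudo-random candidate server per segment.

    `segments` is a list of server-name lists, one list per
    failure-isolated segment (structure). Hashing the key together with
    the segment index spreads candidates independently across segments
    (scattered distribution). The hashing scheme is an assumption.
    """
    candidates = []
    for seg_id, servers in enumerate(segments):
        digest = hashlib.sha256(f"{seg_id}:{key}".encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(servers)
        candidates.append(servers[index])
    return candidates


def place_replicas(key, size, segments, used, r):
    """Store r replicas on the least-loaded candidates (freedom of choice).

    `used` maps server name -> bytes stored and is updated in place.
    Each replica lands in a different segment, so a failure confined to
    one segment removes at most one replica of the object.
    """
    if r > len(segments):
        raise ValueError("cannot place more replicas than there are segments")
    candidates = candidate_servers(key, segments)
    # sorted() is stable, so ties are broken by segment order.
    chosen = sorted(candidates, key=lambda server: used[server])[:r]
    for server in chosen:
        used[server] += size
    return chosen
```

With, say, four segments and r = 2, every object has four candidate locations but occupies only two of them, which is what leaves room to steer writes away from servers that are already full.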

Columbia University Department of Computer Science