- About: This should describe the systems research collaboration and present the overall research goals of the new group.
- People: Here are the different labs in the SRC…
- Publications: A page where you will find categorized publications!
- Projects: A page where you will find our projects.
- Resources: Various resources for prospective students, current students, and alumni. Maybe put something here about life in NYC and at Columbia…
Proceedings of the 23rd Large Installation System Administration Conference (LISA 2009), November 2009
Modern computing systems are complex and difficult to administer, making them more prone to system administration faults. Faults can occur simply due to mistakes in the process of administering a complex system. These mistakes can make the system insecure or unavailable. Faults can also occur due to a malicious act of the system administrator. Systems provide little protection against system administrators who install a backdoor or otherwise hide their actions. To prevent these types of system administration faults, we created ISE-T (I See Everything Twice), a system that applies the two-person control model to system administration. ISE-T requires two separate system administrators to perform each administration task. ISE-T then compares the results of the two administrators' actions for equivalence. ISE-T only applies the results of the actions to the real system if they are equivalent. This provides a higher level of assurance that administration tasks are completed in a manner that will not introduce faults into the system. While the two-person control model is expensive, it is a natural fit for many financial, government, and military systems that require higher levels of assurance. We implemented a prototype ISE-T system for Linux using virtual machines and a unioning file system. Using this system, we conducted a real user study to test its ability to capture changes performed by separate system administrators and compare them for equivalence. Our results show that ISE-T is effective at determining equivalence for many common administration tasks, even when administrators perform those tasks in different ways.
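To make the equivalence check concrete, here is a minimal Python sketch, assuming each administrator's session can be reduced to a map from changed paths to content digests (the prototype derives changes from unioning file system layers); the interfaces and names are illustrative, not ISE-T's actual ones.

```python
# Minimal sketch of ISE-T-style equivalence checking (assumed interfaces):
# each administrator's session yields a change set mapping changed paths to
# new content hashes; the real ISE-T derives these from unioning file
# system layers, which we approximate with plain dicts here.
import hashlib

def change_set(files: dict[str, bytes]) -> dict[str, str]:
    """Map each modified path to a digest of its new content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}

def equivalent(admin_a: dict[str, bytes], admin_b: dict[str, bytes]) -> bool:
    """Apply changes to the real system only if both sessions agree."""
    return change_set(admin_a) == change_set(admin_b)

# Two admins editing the same config in different orders still converge
# to identical final content, so the sessions compare as equivalent.
a = {"/etc/ssh/sshd_config": b"PermitRootLogin no\n"}
b = {"/etc/ssh/sshd_config": b"PermitRootLogin no\n"}
assert equivalent(a, b)
```

Comparing final content rather than command histories is what lets the check accept different sequences of actions that produce equivalent results.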
Proceedings of the 3rd International Conference on Mobile Ubiquitous Computing, Systems, Services, and Technologies (UBICOMM 2009), October 2009
We present GamePod, a portable system that enables mobile users to use the same persistent gaming environment on any available computer. No matter what computer is being used, GamePod provides a consistent gaming environment, maintaining all of a user's games, including active game state. This is achieved by leveraging rapid improvements in the capacity, cost, and size of portable storage devices. GamePod provides a middleware layer that enables virtualization and checkpoint/restart functionality that decouples the gaming environment from a host machine. This enables gaming sessions to be suspended to portable storage, carried around, and resumed from the storage device on another computer. GamePod's middleware layer also isolates gaming sessions from the host, preventing malicious executable content from damaging it. We have implemented a Linux GamePod prototype and demonstrate its ability to quickly suspend and resume gaming sessions, enabling a seamless gaming experience for mobile users as they move among computers.
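As a rough illustration of the suspend/resume flow (not GamePod's actual OS-level checkpoint/restart, which captures process state), the Python sketch below serializes a toy session to a portable device path and restores it elsewhere; the paths and session fields are invented for the example.

```python
# Illustrative sketch only: GamePod's real checkpoint/restart operates on
# OS-level process state; here a toy "session" is pickled to a portable
# device path to show the suspend-to-storage / resume-elsewhere flow.
import pickle
from pathlib import Path

def suspend(session: dict, device: Path) -> None:
    """Checkpoint the gaming session onto the portable device."""
    device.joinpath("session.ckpt").write_bytes(pickle.dumps(session))

def resume(device: Path) -> dict:
    """Restore the session on a different host from the same device."""
    return pickle.loads(device.joinpath("session.ckpt").read_bytes())

usb = Path("/tmp/gamepod")          # stand-in for a USB mount point
usb.mkdir(exist_ok=True)
suspend({"game": "nethack", "level": 12, "hp": 43}, usb)
print(resume(usb))                  # same state, potentially another machine
```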
Proceedings of the 12th Information Security Conference (ISC 2009), September 2009
While peer-to-peer (P2P) file sharing is a powerful and cost-effective content distribution model, most paid-for digital-content providers (CPs) use direct download to deliver their content. CPs are hesitant to rely on a P2P distribution model because it introduces a number of security concerns, including content pollution by malicious peers and lack of enforcement of authorized downloads. Furthermore, because users communicate directly with one another, they can easily form illegal file-sharing clusters to exchange copyrighted content. Such exchange could hurt the content providers' profits. We present TP2P, a P2P system that introduces a notion of trusted auditors (TAs). TAs are P2P peers that police the system by covertly monitoring and taking measures against misbehaving peers. This policing allows TP2P to enable a stronger security model, making P2P a viable alternative for the distribution of paid digital content. Through analysis and simulation, we show the effectiveness of even a small number of TAs at policing the system. In a system with as many as 60% misbehaving users, even a small number of TAs can detect 99% of illegal cluster formation. We develop a simple economic model to show that even with such a large presence of malicious nodes, TP2P can improve CPs' profits (which could translate to user savings) by 62% to 122%, even under conservative estimates of content and bandwidth costs. We implemented TP2P as a layer on top of BitTorrent and demonstrated experimentally using PlanetLab that our system provides trusted P2P file sharing with negligible performance overhead.
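The claim that a small number of TAs goes a long way can be sanity-checked with a toy simulation. The sketch below is our simplification, not the paper's model: it assumes clusters recruit members uniformly at random and that recruiting any TA means detection; all parameters are assumptions.

```python
# Back-of-the-envelope simulation, not the paper's model: if trusted
# auditors (TAs) pose as ordinary peers, a cluster that recruits peers at
# random is detected as soon as it recruits one TA. We estimate detection
# probability versus TA fraction; all parameters below are assumptions.
import random

def detection_rate(ta_fraction: float, cluster_size: int, trials: int = 10_000) -> float:
    detected = 0
    for _ in range(trials):
        # A cluster is caught if any recruited member turns out to be a TA.
        if any(random.random() < ta_fraction for _ in range(cluster_size)):
            detected += 1
    return detected / trials

for frac in (0.01, 0.05, 0.10):
    print(f"TA fraction {frac:.0%}: ~{detection_rate(frac, 20):.1%} of clusters detected")
```

Even a 5% TA fraction catches most 20-peer clusters under these toy assumptions, which is the intuition behind the paper's stronger analytical result.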
Proceedings of the 12th Information Security Conference (ISC 2009), September 2009
Continued improvements in network bandwidth, cost, and ubiquitous access are enabling service providers to host desktop computing environments to address the complexity, cost, and mobility limitations of today's personal computing infrastructure. However, distributed denial of service attacks can deny use of such services to users. We present A2M, a secure and attack-resilient desktop computing hosting infrastructure. A2M combines a stateless and secure communication protocol, a single-hop indirection-based network (IBN), and a remote display architecture to provide mobile users with continuous access to their desktop computing sessions. Our architecture protects both the hosting infrastructure and the client's connections against a wide range of service disruption attacks. Unlike other DoS protection systems, A2M takes advantage of its low-latency remote display mechanisms and asymmetric traffic characteristics by using multipath routing to send a small number of replicas of each packet transmitted from client to server. This packet replication through different paths diversifies the client-server communication, boosting system resiliency and reducing end-to-end latency. Our analysis and experimental results on PlanetLab demonstrate that A2M significantly increases the hosting infrastructure's attack resilience, even in wireless scenarios. Using conservative ISP bandwidth data, we show that we can protect against attacks involving many thousands of attackers (150,000), while providing good performance for multimedia and web applications and basic GUI interactions even when up to 30% and 50%, respectively, of indirection nodes become unresponsive.
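A small Python sketch of the replication idea follows; the node names, liveness fraction, and replication factor are assumptions for illustration, not values from the paper.

```python
# Sketch of the packet-replication idea under stated assumptions: the
# client sends k copies of each (small) client-to-server packet through k
# randomly chosen indirection nodes; the session survives as long as at
# least one chosen node is responsive.
import random

def send_replicated(packet: bytes, nodes: list[str], alive: set[str], k: int = 3) -> bool:
    """Return True if at least one replica traverses a responsive node."""
    paths = random.sample(nodes, k)
    return any(n in alive for n in paths)

nodes = [f"ibn-{i}" for i in range(100)]
alive = set(nodes[:50])                      # 50% of indirection nodes unresponsive
delivered = sum(send_replicated(b"input-event", nodes, alive) for _ in range(10_000))
print(f"delivered {delivered / 10_000:.1%} of packets despite 50% node loss")
```

Because client-to-server traffic in remote display is tiny relative to the server-to-client stream, replicating only that direction buys resilience at little bandwidth cost, which is the asymmetry the abstract refers to.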
Proceedings of the 18th USENIX Security Symposium, August 2009
Today's technical and legal landscape presents formidable challenges to personal data privacy. First, our increasing reliance on Web services causes personal data to be cached, copied, and archived by third parties, often without our knowledge or control. Second, the disclosure of private data has become commonplace due to carelessness, theft, or legal actions.
Our research seeks to protect the privacy of past, archived data, such as copies of emails maintained by an email provider, against accidental, malicious, and legal attacks. Specifically, we wish to ensure that all copies of certain data become unreadable after a user-specified time, without any specific action on the part of a user, and even if an attacker obtains both a cached copy of that data and the user's cryptographic keys and passwords.
This paper presents Vanish, a system that meets this challenge through a novel integration of cryptographic techniques with global-scale, P2P distributed hash tables (DHTs). We implemented a proof-of-concept Vanish prototype to use both the million-plus-node Vuze BitTorrent DHT and the restricted-membership OpenDHT. We evaluate experimentally and analytically the functionality, security, and performance properties of Vanish, demonstrating that it is practical to use and meets the privacy-preserving goals described above. We also describe two applications that we prototyped on Vanish: a Firefox plugin for Gmail and other Web sites, and a Vanishing File application.
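The core mechanism can be sketched compactly: encrypt the data under a random key, split the key into shares, and scatter the shares across a DHT whose nodes expire entries over time. The sketch below substitutes an n-of-n XOR split for Vanish's threshold (Shamir) sharing and a plain dict for the DHT, so it only illustrates the central effect: once shares expire, the key, and hence every cached copy of the data, becomes unrecoverable.

```python
# Toy sketch of the Vanish idea: split a random data key into shares and
# scatter them across a churning store. Vanish uses threshold (Shamir)
# sharing over the Vuze DHT; this sketch uses an n-of-n XOR split and a
# dict as the "DHT", so losing any one share already destroys the key.
import os, secrets

def xor_split(key: bytes, n: int) -> list[bytes]:
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    last = key
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    return shares + [last]

def xor_join(shares: list[bytes]) -> bytes:
    out = bytes(len(shares[0]))
    for s in shares:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

key = os.urandom(16)
dht = {f"index-{i}": share for i, share in enumerate(xor_split(key, 10))}
assert xor_join(list(dht.values())) == key   # all shares present: readable
dht.pop("index-3")                           # DHT churn expires one share...
# ...and the key (hence every cached copy of the data) is unrecoverable.
```

Threshold sharing, which the real system uses, tolerates some shares being lost early while still guaranteeing eventual expiration once too few remain.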
Proceedings of the Workshop on Hot Topics in Cloud Computing (HotCloud), June 2009
Web services are undergoing an exciting transition from in-house data centers to public clouds. Attracted by automatic scalability and extremely low compute, storage, and management costs, Web services are increasingly opting for public cloud deployment over traditional in-house data centers. For example, Amazon's S3 provides storage and backup services for numerous applications, a number of mature services have recently migrated to Amazon EC2, and many startups are adopting the cloud as their sole viable solution to achieve scale. While predictions regarding cloud computing vary, most of the community agrees that public clouds will continue to grow in the number and importance of their tenants.
This paper focuses on a new opportunity introduced by the cloud environment: specifically, rich data sharing among independent Web services that are co-located within the same cloud. In the future, we expect that a small number of giant-scale shared clouds, such as Amazon AWS, Google AppEngine, or Microsoft Azure, will result in an unprecedented environment where thousands of independent and mutually distrustful Web services share the same runtime environment, storage system, and cloud infrastructure. One could even imagine that most of the Web will someday be served from a handful of giant-scale clouds. What will that new shared-cloud environment look like? What are the opportunities and challenges created by this integration and consolidation? While challenges raised by the multi-tenant environment, such as isolation, security, and privacy, have received significant recent attention, we believe that identifying untapped opportunities is equally important, as it enables innovation and advancement in the new shared-cloud world.
Proceedings of the Sixth Symposium on Networked Systems Design and Implementation (NSDI '09), April 2009
MoDist is the first model checker designed for transparently checking unmodified distributed systems running on unmodified operating systems. It achieves this transparency via a novel architecture: a thin interposition layer exposes all actions in a distributed system and a centralized, OS-independent model checking engine explores these actions systematically. We made MoDist practical through three techniques: an execution engine to simulate consistent, deterministic executions and failures; a virtual clock mechanism to avoid false positives and false negatives; and a state exploration framework to incorporate heuristics for efficient error detection.
We implemented MoDist on Windows and applied it to three well-tested distributed systems: Berkeley DB, a widely used open source database; MPS, a deployed Paxos implementation; and PacificA, a primary-backup replication protocol implementation. MoDist found 35 bugs in total. Most importantly, it found protocol-level bugs (i.e., flaws in the core distributed protocols) in every system checked: 10 in total, including 2 in Berkeley DB, 2 in MPS, and 6 in PacificA.
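To give a flavor of systematic action exploration, here is a deliberately tiny Python sketch: it enumerates interleavings of exposed actions and flags schedules that violate an invariant. It omits everything that makes MoDist practical (deterministic replay, failure injection, virtual clocks), and all names and the toy protocol are ours.

```python
# Tiny model-checking flavor: enumerate every interleaving of the enabled
# actions from a fresh state and report schedules violating an invariant.
from itertools import permutations

def explore(action_names, apply_action, invariant, init):
    bad = []
    for order in permutations(action_names):
        state = init()
        for a in order:
            apply_action(state, a)
        if not invariant(state):
            bad.append(order)
    return bad

# Toy protocol: a backup must see "prepare" before it applies "commit".
def init():
    return {"prepared": False, "committed_early": False}

def apply_action(state, action):
    if action == "deliver_prepare":
        state["prepared"] = True
    elif action == "deliver_commit" and not state["prepared"]:
        state["committed_early"] = True    # protocol-level bug surfaces here

violations = explore(("deliver_prepare", "deliver_commit"), apply_action,
                     lambda s: not s["committed_early"], init)
print("buggy schedules:", violations)      # the commit-before-prepare order
```

Real systems have far too many interleavings for brute-force permutation, which is why MoDist's exploration framework incorporates heuristics to find errors efficiently.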
Proceedings of the 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2009), April 2009
Developing CPU scheduling algorithms and understanding their impact in practice can be difficult and time consuming due to the need to modify and test operating system kernel code and measure the resulting performance on a consistent workload of real applications. To address this problem, we have developed WARP, a trace-driven virtualized scheduler execution environment that can dramatically simplify and speed the development of CPU schedulers. WARP is easy to use, as it can run unmodified kernel scheduling code and can be used with standard user-space debugging and performance monitoring tools. It accomplishes this by virtualizing operating system and hardware events to decouple kernel scheduling code from its native operating system and hardware environment. A simple kernel tracing toolkit can be used with WARP to capture traces of all CPU scheduling related events from a real system. WARP can then replay these traces in its virtualized environment with the same timing characteristics as in the real system. Traces can be used with different schedulers to provide accurate comparisons of scheduling performance for a given application workload. We have implemented a WARP Linux prototype. Our results show that WARP can use application traces captured from its toolkit to accurately reflect the scheduling behavior of the real Linux operating system. Furthermore, testing scheduler behavior using WARP with application traces can be two orders of magnitude faster than running the same application workload on a real system.
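Below is a minimal sketch of trace-driven replay, under an assumed trace format of (timestamp, event, task) records; the real WARP replays actual kernel scheduling events against unmodified scheduler code, whereas this toy "scheduler" is a plain FIFO run queue.

```python
# Trace-driven replay sketch: feed captured scheduling events to a
# candidate scheduler in user space and observe its dispatch decisions.
from collections import deque

trace = [                       # hypothetical capture from a real system
    (0.000, "wakeup", "A"),
    (0.001, "wakeup", "B"),
    (0.005, "block",  "A"),
    (0.009, "block",  "B"),
]

runqueue, running = deque(), None
for ts, event, task in trace:
    if event == "wakeup":
        runqueue.append(task)
    elif event == "block" and task == running:
        running = None
    if running is None and runqueue:        # dispatch the next runnable task
        running = runqueue.popleft()
        print(f"{ts:.3f}s: dispatch {running}")
```

Because the trace fixes the workload, two schedulers can be compared on exactly the same sequence of events, which is what makes the comparisons consistent.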
Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2009), March 2009
Software failures in server applications are a significant problem for preserving system availability. We present ASSURE, a system that introduces rescue points that recover software from unknown faults while maintaining both system integrity and availability, by mimicking system behavior under known error conditions. Rescue points are locations in existing application code for handling a given set of programmer-anticipated failures, which are automatically repurposed and tested for safely enabling fault recovery from a larger class of (unanticipated) faults. When a fault occurs at an arbitrary location in the program, ASSURE restores execution to an appropriate rescue point and induces the program to recover execution by virtualizing the program's existing error-handling facilities. Rescue points are identified using fuzzing, implemented using a fast coordinated checkpoint-restart mechanism that handles multi-process and multi-threaded applications, and, after testing, are injected into production code using binary patching. We have implemented an ASSURE Linux prototype that operates without application source code and without base operating system kernel changes. Our experimental results on a set of real-world server applications and bugs show that ASSURE enabled recovery for all of the bugs tested with fast recovery times, has modest performance overhead, and provides automatic self-healing orders of magnitude faster than current human-driven patch deployment methods.
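ASSURE operates on binaries via checkpoint-restart and binary patching, but the control flow of a rescue point can be sketched in a few lines of Python: checkpoint on entry to an existing error-handling site, and on an unanticipated fault, roll back and return the already-handled error code. Everything below is an illustrative analogue, not the actual mechanism.

```python
# Conceptual analogue of a rescue point: snapshot state on entry, and map
# any unanticipated fault onto the function's known, already-handled
# error return so callers exercise their existing error paths.
import copy, functools

def rescue_point(error_return):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(state, *args):
            checkpoint = copy.deepcopy(state)        # cheap stand-in for CR
            try:
                return fn(state, *args)
            except Exception:                        # unanticipated fault
                state.clear(); state.update(checkpoint)   # roll back
                return error_return                  # reuse known error path
        return inner
    return wrap

@rescue_point(error_return=-1)        # -1: the anticipated failure code
def handle_request(state, req):
    state["count"] += 1
    return 1 // len(req)              # crashes on empty input

state = {"count": 0}
assert handle_request(state, "") == -1 and state["count"] == 0   # rolled back
```

The rollback is what preserves integrity: the caller sees a failure it was already written to handle, rather than corrupted intermediate state.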
ACM Transactions on Storage, Volume 4, Issue 4, January 2009
Kinesis is a novel data placement model for distributed storage systems. It exemplifies three design principles: structure (division of servers into a few failure-isolated segments), freedom of choice (freedom to allocate the best servers to store and retrieve data based on current resource availability), and scattered distribution (independent, pseudo-random spread of replicas in the system). These design principles enable storage systems to achieve balanced utilization of storage and network resources in the presence of incremental system expansions, failures of single and shared components, and skewed distributions of data size and popularity. In turn, this ability leads to significantly reduced resource provisioning costs, good user-perceived response times, and fast, parallelized recovery from independent and correlated failures. This article validates Kinesis through theoretical analysis, simulations, and experiments on a prototype implementation. Evaluations driven by real-world traces show that Kinesis can significantly outperform the widely used Chain replica-placement strategy in terms of resource requirements, end-to-end delay, and failure recovery.
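The three design principles compose naturally into a placement function. Below is a hedged Python sketch with invented parameters (three segments, eight servers each, two hash-derived candidates per segment): structure gives one replica per failure-isolated segment, scattered distribution comes from pseudo-random hashing, and freedom of choice picks the lighter-loaded candidate.

```python
# Sketch of Kinesis-style placement under assumed parameters: one replica
# per failure-isolated segment, chosen as the less-loaded of a few
# hash-derived candidates within that segment.
import hashlib

SEGMENTS = [[f"s{seg}-{i}" for i in range(8)] for seg in range(3)]
load = {srv: 0 for seg in SEGMENTS for srv in seg}

def candidates(obj: str, segment: list[str], k: int = 2) -> list[str]:
    def h(i):
        return int(hashlib.sha1(f"{obj}:{i}".encode()).hexdigest(), 16)
    return [segment[h(i) % len(segment)] for i in range(k)]

def place(obj: str) -> list[str]:
    replicas = []
    for segment in SEGMENTS:
        best = min(candidates(obj, segment), key=load.get)  # freedom of choice
        load[best] += 1
        replicas.append(best)
    return replicas

print(place("photo-123"))   # one replica per failure-isolated segment
```

Reads get the same freedom: any live replica can serve, so load balances across segments even under skewed popularity, which is the effect the evaluation measures against chain placement.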