Proceedings of the 12th Annual Linux Symposium, July 2010
As ARM CPUs grow in performance and ubiquity across phones, netbooks, and embedded computers, providing virtualization support for ARM-based devices is increasingly important. We present KVM/ARM, a KVM-based virtualization solution for ARM-based devices that can run virtual machines with nearly unmodified operating systems. Because ARM is not virtualizable, KVM/ARM uses lightweight paravirtualization, a script-based method to automatically modify the source code of an operating system kernel to allow it to run in a virtual machine. Lightweight paravirtualization is architecture-specific but operating-system-independent. It is minimally intrusive, completely automated, and requires no knowledge or understanding of the guest operating system kernel code. By leveraging KVM, which is an intrinsic part of the Linux kernel, KVM/ARM's code base can always be kept in line with new kernel releases without additional maintenance costs, and can be easily included in most Linux distributions. We have implemented a KVM/ARM prototype based on the Linux kernel used in Google Android, and demonstrated its ability to successfully run nearly unmodified Linux guest operating systems.
Ph.D. Thesis, Department of Computer Science, Columbia University, July 2010
Over the past decade, Peer-to-Peer (P2P) file-sharing and streaming systems have evolved as a cheap and effective technology in distributing content to users. Guaranteeing a level of performance in P2P systems is, therefore, of utmost importance. However, P2P file-sharing and streaming applications suffer from a fundamental problem of unfairness, where many users have a tendency to free-ride by contributing little or no upload bandwidth while consuming much download bandwidth. By taking away an unfair share of resources, free-riders deteriorate the quality of service experienced by other users, by causing slower download times in P2P file-sharing networks and higher stream updates' miss rates in P2P streaming networks. Previous attempts at addressing fair bandwidth allocation in P2P, such as BitTorrent-like systems, suffer from slow peer discovery, inaccurate predictions of neighboring peers' bandwidth allocations, under-utilization of bandwidth, and complex parameter tuning. We present FairTorrent, a new deficit-based distributed algorithm that accurately rewards peers in accordance with their contribution in a file-sharing P2P system. In a nutshell, a FairTorrent peer uploads the next data block to a peer to whom it owes the most data. FairTorrent is resilient to exploitation by free-riders and strategic peers, is simple to implement, requires no bandwidth over-allocation, no prediction of peers' rates, no centralized control, and no parameter tuning. We implemented FairTorrent in a BitTorrent client without modifications to the BitTorrent protocol, and evaluated its performance against other widely-used BitTorrent clients using various scenarios including live BitTorrent swarms. Our results show that FairTorrent provides up to two orders of magnitude better fairness, up to five times better download performance for contributing peers, and 60-100% better performance on average in live BitTorrent swarms.
We show analytically that for a number of upload capacity distributions, in an n-node FairTorrent network, no peer is ever owed more than O(log n) data blocks with high probability. Achieving fair bandwidth allocation in a P2P streaming scenario is even more difficult, as it comes with an additional constraint: each stream update must be received before its playback deadline. P2P live streaming systems require global resource over-provisioning to deliver adequate streaming performance. When there is not enough bandwidth to accommodate all users for a particular stream, such as due to free-riders or low-contributing peers, all users, including high-contributing peers, observe poor performance. We present FairStream, a new P2P streaming system that delivers a good quality stream to peers that upload data at a rate above the stream rate, even in the presence of free-riders or malicious users. FairStream achieves this with three mechanisms. First, it provides a new peer reply policy framework that enables file sharing incentive mechanisms to be adapted for streaming. Second, it uses this framework to incorporate a deficit-based peer reply policy that enables each peer to reply first to the neighbor to whom it owes the most data as measured by a deficit counter. Third, it introduces a collusion-resistant mechanism to ensure effective data distribution of a stream despite a large fraction of free-riders who do not forward received data. We prove that FairStream is resilient to free-riders and rewards peers with streaming performance correlated with their contributions. We have also implemented FairStream as a BitTorrent client and evaluated its performance against other popular streaming systems.
Our results on PlanetLab show that FairStream, similar to other systems, provides good quality streaming performance when resources are over-provisioned, but it also provides orders of magnitude better streaming performance for peers uploading above the stream rate when resources are constrained, in the presence of free-riders and low-contributing peers.
Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2010), June 2010
We present SCRIBE, the first system to provide transparent, low-overhead application record-replay and the ability to go live from replayed execution. SCRIBE introduces new lightweight operating system mechanisms, rendezvous and sync points, to efficiently record nondeterministic interactions such as related system calls, signals, and shared memory accesses. Rendezvous points make a partial ordering of execution based on system call dependencies sufficient for replay, avoiding the recording overhead of maintaining an exact execution ordering. Sync points convert asynchronous interactions that can occur at arbitrary times into synchronous events that are much easier to record and replay. We have implemented SCRIBE without changing, relinking, or recompiling applications, libraries, or operating system kernels, and without any specialized hardware support such as hardware performance counters. It works on commodity Linux operating systems, and commodity multi-core and multiprocessor hardware. Our results show for the first time that an operating system mechanism can correctly and transparently record and replay multi-process and multi-threaded applications on commodity multiprocessors. SCRIBE recording overhead is less than 2.5% for server applications including Apache and MySQL, and less than 15% for desktop applications including Firefox, Acrobat, OpenOffice, parallel kernel compilation, and movie playback.
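The idea of replaying from a partial ordering rather than an exact one can be illustrated with a small sketch. This is not SCRIBE's rendezvous-point implementation, which the abstract does not detail. It assumes a simplified model in which each recorded event names one shared resource and receives a per-resource sequence number; `PartialOrderLog`, `respects_partial_order`, and the resource names are hypothetical.

```python
from collections import defaultdict


class PartialOrderLog:
    """Hypothetical recorder: orders events only per shared resource."""

    def __init__(self):
        self._next_seq = defaultdict(int)  # resource -> next sequence number
        self.events = []                   # (thread, resource, seq) as recorded

    def record(self, thread, resource):
        """Log one event and return it with its per-resource sequence number."""
        seq = self._next_seq[resource]
        self._next_seq[resource] += 1
        event = (thread, resource, seq)
        self.events.append(event)
        return event


def respects_partial_order(schedule):
    """True if events on each resource appear in their recorded order.

    Events on different resources may interleave freely, which is what
    makes the ordering partial rather than total.
    """
    expected = defaultdict(int)
    for _thread, resource, seq in schedule:
        if seq != expected[resource]:
            return False
        expected[resource] += 1
    return True


log = PartialOrderLog()
a = log.record("T1", "pipe")  # T1 touches the pipe first
b = log.record("T2", "pipe")  # T2 touches the same pipe afterwards
c = log.record("T3", "file")  # unrelated to the pipe

reordered_ok = respects_partial_order([c, a, b])   # independent event moved
reordered_bad = respects_partial_order([b, a, c])  # dependent events swapped
```

In this model a replay may schedule the `file` event anywhere, but must keep the two `pipe` events in their recorded order, so less ordering information needs to be logged than for an exact global sequence.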
Proceedings of the 2010 USENIX Annual Technical Conference (USENIX 2010), June 2010
Desktop computers are often compromised by the interaction of untrusted data and buggy software. To address this problem, we present Apiary, a system that transparently contains application faults while retaining the usage metaphors of a traditional desktop environment. Apiary accomplishes this with three key mechanisms. It isolates applications in containers that integrate in a controlled manner at the display and file system. It introduces ephemeral containers that are quickly instantiated for single application execution, to prevent any exploit that occurs from persisting and to protect user privacy. It introduces the Virtual Layered File System to make instantiating containers fast and space efficient, and to make managing many containers no more complex than a single traditional desktop. We have implemented Apiary on Linux without any application or operating system kernel changes. Our results with real applications, known exploits, and a 24-person user study show that Apiary has modest performance overhead, is effective in limiting the damage from real vulnerabilities, and is as easy for users to use as a traditional desktop.
Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2010), June 2010
We present RSIO, a processor scheduling framework for improving the response time of latency-sensitive applications by monitoring accesses to I/O channels and inferring when user interactions occur. RSIO automatically identifies processes involved in a user interaction and boosts their priorities at the time the interaction occurs to improve system response time. RSIO also detects processes indirectly involved in processing an interaction, automatically accounting for dependencies and boosting their priorities accordingly. RSIO works with existing schedulers and requires no application modifications to identify periods of latency-sensitive application activity. We have implemented RSIO in Linux and measured its effectiveness on microbenchmarks and real applications. Our results show that RSIO is easy to use and can provide substantial improvements in system performance for latency-sensitive applications.
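Boosting indirectly involved processes can be pictured as a graph traversal. The sketch below is an illustration only and makes assumptions the abstract does not state: that dependencies are available as a mapping from each process to the processes it waits on, and that every process reachable from the one that touched the I/O channel should be boosted. The function name `processes_to_boost` and the process names are hypothetical.

```python
from collections import deque


def processes_to_boost(io_process, depends_on):
    """Return the process seen on the I/O channel plus its transitive dependencies.

    depends_on maps a process name to the processes it waits on
    (an assumed representation, not RSIO's actual data structure).
    """
    boosted = {io_process}
    queue = deque([io_process])
    while queue:
        current = queue.popleft()
        for dep in depends_on.get(current, ()):
            if dep not in boosted:
                boosted.add(dep)
                queue.append(dep)
    return boosted


# Hypothetical dependency graph: the editor waits on the display server,
# which waits on a font daemon; the backup job is unrelated.
depends_on = {
    "editor": ["display_server"],
    "display_server": ["font_daemon"],
    "backup": ["disk_daemon"],
}
boost_set = processes_to_boost("editor", depends_on)
```

Here a keystroke delivered to `editor` would boost `editor`, `display_server`, and `font_daemon`, while `backup` and `disk_daemon` keep their normal priority.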
Proceedings of the 3rd Annual Haifa Experimental Systems Conference (SYSTOR 2010), May 2010
Operating system (OS) virtualization can provide a number of important benefits, including transparent migration of applications, server consolidation, online OS maintenance, and enhanced system security. However, the construction of such a system presents a myriad of challenges, even for the most cautious developer, that if overlooked may result in a weak, incomplete virtualization. We present a detailed discussion of key implementation issues in providing OS virtualization in a commodity OS, including system call interposition, virtualization state management, and race conditions. We discuss our experiences in implementing such functionality across two major versions of Linux entirely in a loadable kernel module without any kernel modification. We present experimental results on both uniprocessor and multiprocessor systems that demonstrate the ability of our approach to provide fine-grain virtualization with very low overhead.
Ph.D. Thesis, Department of Computer Science, Columbia University, March 2010
This dissertation demonstrates that operating system virtualization is an effective method for solving many different types of computing problems. We have designed novel systems that make use of commodity software while solving problems that were not conceived when the software was originally written. We show that by leveraging and extending existing virtualization techniques, and introducing new ones, we can build these novel systems without requiring the applications or operating systems to be rewritten. We introduce six architectures that leverage operating system virtualization. *Pod creates fully secure virtual environments and improves user mobility. AutoPod reduces the downtime needed to apply kernel patches and perform system maintenance. PeaPod creates least-privilege systems by introducing the pea abstraction. Strata improves the ability of administrators to manage large numbers of machines by introducing the Virtual Layered File System. Apiary builds upon Strata to create a new form of desktop security by using isolated persistent and ephemeral application containers. Finally, ISE-T applies the two-person control model to system administration. By leveraging operating system virtualization, we have built these architectures on Linux without requiring any changes to the underlying kernel or user-space applications. Our results, with real applications, demonstrate that operating system virtualization has minimal overhead. These architectures solve problems with minimal impact on end-users while providing functionality that would previously have required modifications to the underlying system.
Proceedings of the 41st ACM Technical Symposium on Computer Science Education (SIGCSE 2010), March 2010
Students learn more through hands-on project experience for computer science courses such as operating systems, but providing the infrastructure support for a large class to learn by doing can be hard. To address this issue, we introduce a new approach to managing and grading operating system homework assignments based on virtual appliances, a distributed version control system, and live demonstrations. Our solution is easy to deploy and use with students' personal computers, and obviates the need to provide a computer laboratory for teaching purposes. It supports the most demanding course projects, such as those that involve operating system kernel development, and can be used by both on-campus and remote distance learning students even with intermittent network connectivity. Our experiences deploying and using this solution to teach operating systems at Columbia University show that it is easier to use, more flexible, and more pedagogically effective than other approaches.
Proceedings of the 11th IEEE International Symposium on Multimedia (ISM 2009), December 2009
We present MediaPod, a portable system that allows mobile users to maintain the same persistent, personalized multimedia desktop environment on any available computer. Regardless of which computer is being used, MediaPod provides a consistent multimedia desktop session, maintaining all of a user's applications, documents and configuration settings. This is achieved by leveraging rapid improvements in capacity, cost, and size of portable storage devices. MediaPod provides a virtualization and checkpoint-restart mechanism that decouples a desktop environment and its applications from the host, enabling multimedia desktop sessions to be suspended to portable storage, carried around, and resumed from the storage device on another computer. MediaPod virtualization also isolates desktop sessions from the host, protecting the privacy of the user and preventing malicious applications from damaging the host. We have implemented a Linux MediaPod prototype and demonstrate its ability to quickly suspend and resume multimedia desktop sessions, enabling a seamless computing experience for mobile users as they move among computers.
Proceedings of the 5th ACM Conference on emerging Networking EXperiments and Technologies (CoNEXT 2009), December 2009
Peer-to-Peer file-sharing applications suffer from a fundamental problem of unfairness. Free-riders cause slower download times for others by contributing little or no upload bandwidth while consuming much download bandwidth. Previous attempts to address this fair bandwidth allocation problem suffer from slow peer discovery, inaccurate predictions of neighboring peers' bandwidth allocations, underutilization of bandwidth, and complex parameter tuning. We present FairTorrent, a new deficit-based distributed algorithm that accurately rewards peers in accordance with their contribution. A FairTorrent peer simply uploads the next data block to a peer to whom it owes the most data as measured by a deficit counter. FairTorrent is resilient to exploitation by free-riders and strategic peers, is simple to implement, requires no bandwidth over-allocation, no prediction of peers' rates, no centralized control, and no parameter tuning. We implemented FairTorrent in a BitTorrent client without modifications to the BitTorrent protocol, and evaluated its performance against other widely-used BitTorrent clients. Our results show that FairTorrent provides up to two orders of magnitude better fairness, up to five times better download times for contributing peers, and 60% to 100% better performance on average in live BitTorrent swarms.
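The deficit rule described above (upload the next block to the peer that is owed the most) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the FairTorrent client code: deficits are counted in whole blocks, a deficit is defined as blocks sent minus blocks received, and ties are broken by peer name for determinism. The class `DeficitPeer` and its method names are hypothetical.

```python
from collections import defaultdict


class DeficitPeer:
    """Tracks a per-neighbor deficit: blocks sent minus blocks received."""

    def __init__(self):
        self.deficit = defaultdict(int)  # neighbor id -> sent - received

    def on_block_received(self, neighbor):
        # The neighbor gave us data, so we now owe it one more block.
        self.deficit[neighbor] -= 1

    def on_block_sent(self, neighbor):
        # We repaid (or advanced) one block to the neighbor.
        self.deficit[neighbor] += 1

    def next_upload_target(self, interested):
        """Pick the interested neighbor with the lowest deficit (owed the most).

        Tie-breaking by name is an assumption made here for reproducibility.
        """
        candidates = list(interested)
        if not candidates:
            return None
        return min(candidates, key=lambda n: (self.deficit[n], str(n)))


peer = DeficitPeer()
for _ in range(3):
    peer.on_block_received("alice")  # alice has sent us 3 blocks
peer.on_block_received("bob")        # bob has sent us 1 block
peer.on_block_sent("alice")          # we have sent alice 1 block back

# Deficits: alice = -2, bob = -1, carol = 0 (a free-rider so far).
target = peer.next_upload_target(["alice", "bob", "carol"])
```

With these counters, `alice` is served first because she is owed the most, and a neighbor that never uploads (`carol`) is only served once every contributing neighbor has been repaid down to the same deficit.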