- AboutThis should describe the systems research collaboration, and present the overall research goals of the new group.
- PeopleHere are the different labs in the SRC…
- PublicationsA page where you will find categorized publications!
- ProjectsA page where you will find our projects
- ResourcesVarious resources for prospective students, current students, alumni. Maybe put something here about life in NYC and at Columbia…
Proceedings of the 2002 USENIX Annual Technical Conference, June 2002
The growing popularity of thin-client systems makes it important to determine the factors that govern the performance of these thin-client architectures. To assess the viability of the thin-client computing model, we measured the performance of six popular thin-client platformsâ€”Citrix MetaFrame, Microsoft Terminal Services, Sun Ray, Tarantella, VNC, and Xâ€”running over a wide range of network access bandwidths. We find that thin- client systems can perform well on web and multimedia applications in LAN environments, but the efficiency of the thin-client protocols varies widely. We analyze the differences in the various approaches and explain the impact of the underlying remote display protocols on overall performance. Our results quantify the impact of different approaches in display encoding primitives, display update policies, and display caching and compression techniques across a broad range of thin-client systems.
PC Magazine, May 2002
After having led the thin- client movement in the Windows world, Citrix takes a step forward with Meta- Frame XP. MetaFrame runs under Microsoft Windows 2000 Ser- ver, and the latest release demon- strated better performance in the lab than its predecessor (Meta- Frame 1.8). It also delivers richer management features.
Proceedings of the Data Compression Conference (DCC) 2002, April 2002
In recent years, the growing total cost of ownership has resulted in a shift away from the distributed model of desktop computing toward a more centralized server-based computing (SBC) model. In SBC, all application logic is executed on the server while clients simply process the resulting screen updates sent from the server. To provide good performance, SBC systems employ various techniques to encode the screen updates to minimize the bandwidth and processing requirements of sending the screen updates. However, existing SBC encoding techniques are not able to effectively support multimedia applications. To address this problem, we have developed a family of linear interpolation algorithms for encoding SBC screen updates. We first present an overview of an optimal linear interpolation (OLI) algorithm. Given a rectangular region of pixels to be encoded, OLI represents the region as a one-dimensional function, mapping from the cardinal number of each pixel to the color value of the pixel. OLI converts this function into a piecewise linear representation consisting of points. OLI can provide a precise lossless piecewise linear representation. Alternatively, when network bandwidth is limited, OLI can adapt to the available bandwidth by limiting the value of and recursively iterating through all combinations of pixels in search of an optimal set of points to provide an approximate but lossy representation. The points are sent to the client, which then interpolates through those pixel values to recover the original rectangular region of pixels. While the decoding time on the client is negligible, the encoding time on the server has exponential computational complexity, which limits its utility for SBC. To reduce encoding complexity, we developed two linear interpolation algorithms with linear encoding and decoding computational complexity. The algorithms near optimal linear interpolation (NOLI) and 2-D lossless linear interpolation (2DLI). NOLI is a greedy algorithm that heuristically minimizes the number of pivot points of the piecewise function while maintaining that all pixels within the same segment have pixel values within a bound defined by that segment. Those pivot points are sent to the clients to recover the original rectangular region through linear interpolation. If the bounds for all segments are , NOLI results in a lossless representation, otherwise, it results in a lossy representation. 2DLI is also a greedy algorithm. It tries to minimize the number of pixels that can not be interpolated by gradients with a constant through its neighboring pixels in the rectangular region. Those pixel values that can not be interpolated are delivered to the clients for recovery of the rectangular region through 2-dimensional linear interpolation. We have implemented our linear interpolation algorithms and compared their performance with other approaches on discrete-toned and smoothed-toned images. Our initial results show that 2DLI achieves a good combination of high compression ratios with low coding time. In particular, our results show that 2DLI provides much better compression than JPEG, gzip or VNC on web pages, screen dumps, and instructional videos, and performs second only to JPEG on smooth-toned images, but with much lower coding time.
Proceedings of the IEEE International Conference on Communications (ICC) 2002, April 2002
Server-based computing (SBC) is becoming a popular ap- proach to deliver computational services across the network due to its re- duced administrative costs and better resource utilization. In SBC, all application processing is done on servers while only screen updates are sent to clients. While many SBC encoding techniques have been explored for transmitting these screen updates efficiently, existing screen update ap- proaches do not effectively support multimedia applications. To address this problem, we propose optimal linear interpolation (OLI), a new pixel- based SBC screen update coding algorithm. With OLI, the server selects and transmits only a small sample of pixels to represent a screen update. The client recovers the complete screen update from these samples using piecewise linear interpolation to achieve the best visual quality. OLI can be used to provide lossless or lossy compression to adaptively trade-off between network bandwidth and processing time requirements. We fur- ther propose and evaluate 2-D lossless linear interpolation (2DLI), which is based on OLI but additionally provides lower encoding complexity for lossless compression. Our experimental results show that when compared with other compression methods, 2DLI provides good data compression ratio with modest computational overhead, for both servers and clients.
Department of Computer Science, Columbia University Technical Report , CUCS-003-02, February 2002
Virtual Network Address Translation (VNAT) is a novel architecture that allows transparent migration of end-to-end live network connections associated with various computation units. Such computation units can be either a single process, or a group of processes, or an entire host. VNAT virtualizes network connections perceived by transport protocols so that identification of network connections is decoupled from stationary hosts. Such virtual connections are then remapped into physical connections to be carried on the physical network using network address translation. VNAT requires no modifi- cation to existing applications, operating systems, or protocol stacks. Furthermore, it is fully compatible with the existing communication infrastructure; virtual and normal connections can coexist without interfering each other. VNAT func- tions entirely within end systems and requires no third party services. We have implemented a VNAT prototype with the Linux 2.4 kernel and demonstrated its functionality on a wide range of popular real-world network applications. Our per- formance results show that VNAT has essentially no network performance overhead except when connections are migrated, in which case the overhead of our Linux prototype is less than 7 percent over a stock RedHat Linux system.
Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP '01), November 2001
We present a study of operating system errors found by automatic, static, compiler analysis applied to the Linux and OpenBSD kernels. Our approach differs from previous studies that consider errors found by manual inspection of logs, testing, and surveys because static analysis is applied uniformly to the entire kernel source, though our approach necessarily considers a less comprehensive variety of errors than previous studies. In addition, automation allows us to track errors over multiple versions of the kernel source to estimate how long errors remain in the system before they are fixed.
We found that device drivers have error rates up to three to seven times higher than the rest of the kernel. We found that the largest quartile of functions have error rates two to six times higher than the smallest quartile. We found that the newest quartile of files have error rates up to twice that of the oldest quartile, which provides evidence that code ``hardens'' over time. Finally, we found that bugs remain in the Linux kernel an average of 1.8 years before being fixed.
Department of Computer Science, Columbia University Technical Report , CUCS-014-01, November 2001
Process checkpoint/restart is a very useful technology for process migration, load balancing, crash recovery, rollback transaction, job controlling and many other purposes. Although process migration has not yet been widely used and is not widely available commercial systems, the growing shift of computing facilities from supercomputers to networked workstations and distributed systems is increasing the importance and demand for migration technologies. In this paper, we describe the design and implementation of CRAK, an innovative transparent checkpoint/restart package for Linux. CRAK provides transparent migration of Linux networked applications and computing environments without modifying, recompiling, or relinking applications or the operating system. CRAK is the first system for Unix/Linux that provides transparent checkpoint/restart with the following properties: (1) it does not require any modifications of existing operating system or application code and (2) it supports migrating network sockets. Prototype implementations are available for Linux 2.2 and Linux 2.4 kernels.
Proceedings of the 2001 USENIX Annual Technical Conference, June 2001
Modern thin-client systems are designed to provide the same graphical interfaces and applications available on traditional desktop computers while centralizing administration and allowing more efficient use of computing resources. Despite the rapidly increasing popularity of these client-server systems, there are few reliable analyses of their performance. Industry standard benchmark techniques commonly used for measuring desktop system performance are ill-suited for measuring the performance of thin-client systems because these benchmarks only measure application performance on the server, not the actual user- perceived performance on the client. To address this problem, we have developed slow-motion benchmarking, a new measurement technique for evaluating thin-client systems. In slow-motion benchmarking, performance is measured by capturing network packet traces between a thin client and its respective server during the execution of a slow-motion version of a standard application benchmark. These results can then be used either independently or in conjunction with standard benchmark results to yield an accurate and objective measure of the performance of thin-client systems. We have demonstrated the effectiveness of slow-motion benchmarking by using this technique to measure the performance of several popular thin-client systems in various network environments on web and multimedia workloads. Our results show that slow-motion benchmarking resolves the problems with using standard benchmarks on thin-client systems and is an accurate tool for analyzing the performance of these systems.
Proceedings of the 2001 USENIX Annual Technical Conference, June 2001
Proportional share resource management provides a flexible and useful abstraction for multiplexing time- shared resources. However, previous proportional share mechanisms have either weak proportional sharing ac- curacy or high scheduling overhead. We present Virtual- Time Round-Robin (VTRR), a proportional share sched- uler that can provide good proportional sharing accuracy with O(1) scheduling overhead. VTRR achieves this by combining the benefits of fair queueing algorithms with a round-robin scheduling mechanism. Unlike many other schedulers, VTRR is simple to implement. We have implemented a VTRR CPU scheduler in Linux in less than 100 lines of code. Our performance re- sults demonstrate that VTRR provides accurate propor- tional share allocation with constant, sub-microsecond scheduling overhead. The scheduling overhead using VTRR is two orders of magnitude less than the standard Linux scheduler for large numbers of clients.