- AboutThis should describe the systems research collaboration, and present the overall research goals of the new group.
- PeopleHere are the different labs in the SRC…
- PublicationsA page where you will find categorized publications!
- ProjectsA page where you will find our projects
- ResourcesVarious resources for prospective students, current students, alumni. Maybe put something here about life in NYC and at Columbia…
Publications from 2006
Proceedings of the 2006 IEEE Symposium on Security and Privacy (SP '06), May 2006
Many current systems allow data produced by potentially malicious sources to be mounted as a file system. File system code must check this data for dangerous values or invariant violations before using it. Because file system code typically runs inside the operating system kernel, even a single unchecked value can crash the machine or lead to an exploit. Unfortunately, validating file system images is complex: they form DAGs with complex dependency relationships across massive amounts of data bound together with intricate, undocumented assumptions. This paper shows how to automatically find bugs in such code using symbolic execution. Rather than running the code on manually-constructed concrete input, we instead run it on symbolic input that is initially allowed to be ``anything.'' As the code runs, it observes (tests) this input and thus constrains its possible values.
We generate test cases by solving these constraints for concrete values. The approach works well in practice: we checked the disk mounting code of three widely-used Linux file systems: ext2, ext3, and JFS and found bugs in all of them where malicious data could either cause a kernel panic or form the basis of a buffer overflow attack.
ACM Transactions on Computer Systems (TOCS), Volume 24, Issue 2, May 2006
As businesses continue to grow their World Wide Web presence, it is becoming increasingly vital for them to have quantitative measures of the mean client perceived response times of their web services. We present Certes (CliEnt Response Time Estimated by the Server), an online server-based mechanism that allows web servers to estimate mean client perceived re- sponse time, as if measured at the client. Certes is based on a model of TCP that quanti- fies the effect that connection drops have on mean client perceived response time by using three simple server-side measurements: connection drop rate, connection accept rate and con- nection completion rate. The mechanism does not require modifications to HTTP servers or web pages, does not rely on probing or third party sampling, and does not require client-side modifications or scripting. Certes can be used to estimate response times for any web content, not just HTML. We have implemented Certes and compared its response time estimates with those obtained with detailed client instrumentation. Our results demonstrate that Certes pro- vides accurate server-based estimates of mean client response times in HTTP 1.0/1.1 environ- ments, even with rapidly changing workloads. Certes runs online in constant time with very low overhead. It can be used at websites and server farms to verify compliance with service level objectives.
Operating Systems Review (OSR), April 2006
Operating system courses teach students much more when they provide hands-on kernel-level project experience with a real operating system. However, enabling a large class of students to do kernel development can be difficult. To ad- dress this problem, we created a virtual kernel development environment in which operating systems can be developed, debugged, and rebooted in a shared computer facility with- out affecting other users. Using virtual machines and remote display technology, our virtual kernel development labora- tory enables even distance learning students at remote loca- tions to participate in kernel development projects with on- campus students. We have successfully deployed and used our virtual kernel development environment together with the open-source Linux kernel to provide kernel-level project experiences for over nine hundred students in the introduc- tory operating system course at Columbia University.
Department of Computer Science, Columbia University Technical Report , CUCS-004-06, February 2006
We present Grouped Distributed Queues (GDQ ), the first proportional share scheduler for multiprocessor systems that, by using a distributed queue architecture, scales well with a large number of processors and processes. GDQ achieves accurate proportional fairness scheduling with only O(1) scheduling overhead. GDQ takes a novel approach to distributed queuing: instead of creating per-processor queues that need to be constantly balanced to achieve any measure of proportional sharing fairness, GDQ uses a simple group- ing strategy to organize processes into groups based on similar processor time allocation rights, and then assigns processors to groups based on aggregate group shares. Group membership of processes is static, and fairness is achieved by dynamically migrating processors among groups. The set of processors work- ing on a group use simple, low-overhead round-robin queues, while processor reallocation among groups is achieved using a new multiprocessor adaptation of the well-known Weighted Fair Queuing (WFQ ) algorithm. By commoditizing processors and decoupling their allocation from process scheduling, GDQ provides, with only constant scheduling cost, fairness within a constant of the ideal generalized processor sharing model for process weights with a fixed upper bound. We have implemented GDQ in Linux and measured its performance. Our experimental results show that GDQ has low overhead and scales well with the number of processors.
Ph.D. Thesis, Department of Computer Science, Columbia University, January 2006
Remote display provides a number of benefits, including ubiquitous access, thin-client computing, remote assistance and secure central management. In practice, it is crucial that remote display systems effectively support existing unmodified applications on commodity operating systems. However, commodity operating system environments have very differ- ent display architectures which impact how remote display can be implemented. This dissertation discusses the details of deploying remote display on commodity operating systems, including Linux(with X Window System) and Microsoft Windows. We focus on the implementation of a remote display system on Microsoft Windows and introduce several tech- niques, including kernel-user mode communication, a high throughput command channel, and efficient synchronization. With these techniques, we successfully implement an asynchronous remote display system on top of a synchronous local display system. Our work extends THINC, a virtual display architecture for remote display, from an X/Linux-only archi- tecture to one across X/Linux and Windows. We present the detailed algorithm we used for supporting a portable command processing pipeline across multiple commodity operating systems. Our experiment results demonstrate that our approach can achieve similar remote display performance on both X/Linux and Windows.