CPU Scheduling with Automatic Interactivity and Dependency Detection

Haoqiang Zheng

Ph.D. Thesis, Department of Computer Science, Columbia University, August 2010

Abstract

Recent trends in virtualization and server consolidation have expanded the number of appli- cations with different resource requirements and quality-of-service demands being run on the same system. Users expect computers not only to run these different applications, but to be able to run them all at the same time. A key challenge is how to ensure that the sys- tem provides acceptable interactive responsiveness to users while multiplexing resources among a diverse collection of applications. However, identifying interactive processes and scheduling them promptly are not easy because the latency sensitivity of a process in mod- ern computer systems may vary dynamically based on the user request it is processing and the user requests that depend on this process directly and indirectly. Most commod- ity operating systems either allow users to specify the latency sensitivity of a process or try to detect interactive latency-sensitive processes based on processor resource usage and sleeping behavior. These are generally ineffective. This dissertation introduces RSIO. It detects latency-sensitive processes by monitor- ing accesses to I/O channels and inferring when user interactions occur. RSIO provides a general mechanism for all user interactions, including direct interactions via local HCI devices such as mouse and keyboard, indirect interactions through middleware, and remote interactions through networks. It automatically and dynamically identifies processes in- volved in a user interaction and boosts their priorities at the time the interaction occurs to improve system response time. RSIO detects processes that directly handle a user in- teraction as well as those indirectly involved in processing the interaction, automatically accounting for dependencies and boosting their priorities accordingly. RSIO works with existing schedulers, processes that may mix interactive and batch activities, and requires no application modifications to identify periods of latency-sensitive application activity. Even when a process is detected as latency-sensitive and its priority is boosted, the process may still not be scheduled promptly because of a problem known as priority in- version, which happens when a high priority process blocks waiting for the response from a low priority process. Without knowing the dependency among the processes, the CPU scheduler may schedule a medium priority process to run, and thus effectively delay the execution of the high priority process. We have developed SWAP to address the prior- ity inversion problems caused by inter-process dependencies. SWAP can automatically determine possible resource dependencies among processes based on process system call history. Because some dependencies cannot be precisely determined, SWAP associates confidence levels with dependency information that are dynamically adjusted using feed- back from process blocking behavior. SWAP can schedule processes using this imprecise dependency information in a manner that is compatible with existing scheduling mecha- nisms and ensures that actual scheduling behavior corresponds to the desired scheduling policy in the presence of process dependencies. Our results show that SWAP can provide substantial improvements in system performance in scheduling processes with dependen- cies. As CPU schedulers are complicated to develop and increasingly important with the introduction of multi-core systems, we also introduce WARP, which is a new scheduler de- velopment and evaluation platform which facilitated our solutions. WARP is a trace-driven virtualized CPU scheduler execution environment that can dramatically simplify and speed the development and evaluation of CPU schedulers, including SWAP and RSIO. It is easy to use as it can run unmodified kernel scheduling code in user-space and can be used with standard user-space debugging and performance monitoring tools. We have implemented a WARP Linux prototype. Our results show that it can use application traces captured from its toolkit to accurately reflect the scheduling behavior of the real Linux operating system. Executing an application trace using WARP can be two orders of magnitude faster than running real applications.

PDF

zhengthesis

Columbia University Department of Computer Science