ZAP: Transparent Checkpoint-Restart and Migration Using Operating System Virtualization

We have created Zap, a novel system for transparent migration of legacy and networked applications. Zap provides a thin virtualization layer on top of the operating system that introduces pods, which are groups of processes that are given a consistent, virtualized view of the system. This decouples processes in pods from dependencies on the host operating system and on other processes on the system. By integrating Zap virtualization with a checkpoint-restart mechanism, Zap can migrate a pod of processes as a unit among machines running independent operating systems without leaving behind any residual state after migration.
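
As an illustration of the decoupling described above, the following is a minimal sketch, not Zap's actual implementation. It assumes that one of the virtualized resources is the process identifier, so that processes in a pod see only pod-local virtual PIDs that can be rebound to different host PIDs after a restart. The names `Pod`, `attach`, `resolve`, `checkpoint`, and `restart` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Pod:
    """Hypothetical pod: a private namespace mapping virtual PIDs to host PIDs."""
    name: str
    vpid_to_host: Dict[int, int] = field(default_factory=dict)
    next_vpid: int = 1

    def attach(self, host_pid: int) -> int:
        """Give a host process a pod-local virtual PID and return it."""
        vpid = self.next_vpid
        self.next_vpid += 1
        self.vpid_to_host[vpid] = host_pid
        return vpid

    def resolve(self, vpid: int) -> int:
        """Translate a virtual PID to the host PID it is currently bound to."""
        return self.vpid_to_host[vpid]

    def checkpoint(self) -> dict:
        """Record only host-independent state: the virtual PIDs in use."""
        return {
            "name": self.name,
            "vpids": sorted(self.vpid_to_host),
            "next_vpid": self.next_vpid,
        }


def restart(image: dict, new_host_pids: Dict[int, int]) -> Pod:
    """Rebuild a pod on another host, rebinding each virtual PID to a new host PID."""
    pod = Pod(name=image["name"], next_vpid=image["next_vpid"])
    for vpid in image["vpids"]:
        pod.vpid_to_host[vpid] = new_host_pids[vpid]
    return pod


# Usage: the virtual PIDs survive the move; only the host bindings change.
pod = Pod("desktop")
shell = pod.attach(host_pid=4312)    # host PIDs here are made-up values
editor = pod.attach(host_pid=4377)
image = pod.checkpoint()
moved = restart(image, {shell: 910, editor: 911})
```

In this sketch the checkpoint image contains no host PIDs at all, which mirrors the idea that a pod carries no dependency on the machine it left. A real system would have to virtualize many more resources than process identifiers.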

We have implemented a Zap prototype in Linux that supports transparent migration of unmodified applications without any kernel modifications. Our Linux Zap system extends a novel checkpoint-restart mechanism from our earlier work on CRAK, a system that provided process Checkpoint and Restart As a Kernel Module for Linux. We demonstrate that our Linux Zap prototype can provide general-purpose process migration functionality with low overhead. Our experimental results for migrating pods that run a standard user's X Windows desktop computing environment and pods that run an Apache web server show that these kinds of pods can be migrated with subsecond checkpoint and restart latencies.

Columbia University Department of Computer Science