We have developed a variety of software hardening techniques against most known classes of software vulnerabilities and malicious attacks, such as race conditions, buffer underflows/overflows, null pointer exceptions, unhandled exceptions, code injection, and code-reuse attacks. However, our prior work focused on binary-only environments (i.e., post-compilation processing). While the binary-only approach has the advantage of broad applicability, it sacrifices efficiency and completeness because semantic information present in the source code is lost during compilation. Meanwhile, the source code of many software systems, including those developed by the open-source community or used by the military, is often available. Utilizing the rich semantics of the source code can improve the efficiency and completeness of many software protection techniques. Our objective in this project is to develop new software protection techniques and to integrate existing ones into compiler frameworks such as LLVM and GCC. We will integrate several of our existing binary-only techniques, as well as new techniques we will develop, into the compiler to improve their efficiency and security.
Participants
PI: Prof. Junfeng Yang, Columbia University
PI: Prof. Angelos Keromytis, Columbia University
This work is supported by the Office of Naval Research (ONR) through Contract N00014-12-1-0166. Opinions, findings, conclusions and recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the US Government or ONR.
Related Publications
Proceedings of the 20th ACM Conference on Computer and Communications Security (CCS 2013), November 2013
Dynamic data flow tracking (DFT) is a technique broadly used in a variety of security applications that, unfortunately, exhibits poor performance, preventing its adoption in production systems. We present ShadowReplica, a new and efficient approach for accelerating DFT and other shadow memory-based analyses, by decoupling analysis from execution and utilizing spare CPU cores to run them in parallel. Our approach enables us to run a heavyweight technique, like dynamic taint analysis (DTA), twice as fast, while concurrently consuming fewer CPU cycles than when applying it in-line. DFT is run in parallel by a second shadow thread that is spawned for each application thread, and the two communicate using a shared data structure. We avoid the problems suffered by previous approaches by introducing an off-line application analysis phase that utilizes both static and dynamic analysis methodologies to generate optimized code for decoupling execution and implementing DFT, while also minimizing the amount of information that needs to be communicated between the two threads. Furthermore, we use a lock-free ring buffer structure and an N-way buffering scheme to efficiently exchange data between threads and maintain high cache-hit rates on multi-core CPUs. Our evaluation shows that ShadowReplica is on average ~2.3x faster than in-line DFT (~2.75x slowdown over native execution) when running the SPEC CPU2006 benchmark, while similar speedups were observed with command-line utilities and popular server software. Astoundingly, ShadowReplica also reduces the CPU cycles used by up to 30%.
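The decoupling described in the abstract can be illustrated with a small sketch. This is not ShadowReplica's implementation: the event format, the names `SpscRing`, `shadow_worker`, and `track`, and the use of a Python set as "shadow memory" are all assumptions made for illustration. The application thread only logs compact data-flow events into a fixed-size single-producer/single-consumer ring buffer, and a separate shadow thread consumes them and propagates taint. In CPython this shows only the structure of such a queue; a native lock-free version would also need atomic loads and stores with appropriate memory ordering.

```python
import threading
import time


class SpscRing:
    """Fixed-size single-producer/single-consumer ring buffer (illustrative).

    Only the producer writes `tail` and only the consumer writes `head`,
    so no lock is taken. One slot is always left empty to tell "full"
    apart from "empty".
    """

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.head = 0  # next slot to read (owned by the consumer)
        self.tail = 0  # next slot to write (owned by the producer)

    def push(self, item):
        nxt = (self.tail + 1) % self.capacity
        while nxt == self.head:  # buffer full: yield until the consumer drains
            time.sleep(0)
        self.slots[self.tail] = item
        self.tail = nxt  # publish only after the slot has been written

    def pop(self):
        while self.head == self.tail:  # buffer empty: yield until data arrives
            time.sleep(0)
        item = self.slots[self.head]
        self.head = (self.head + 1) % self.capacity
        return item


STOP = ("stop",)  # hypothetical sentinel marking the end of the event stream


def shadow_worker(ring, tainted):
    """Shadow thread: replay logged events against the shadow state."""
    while True:
        event = ring.pop()
        if event is STOP:
            return
        op, dst, srcs = event
        if op == "source":  # dst receives untrusted input
            tainted.add(dst)
        elif op == "copy":  # dst is computed from srcs
            if any(s in tainted for s in srcs):
                tainted.add(dst)
            else:
                tainted.discard(dst)


def track(events, capacity=4):
    """Play the application thread: log events while the shadow thread analyzes."""
    ring = SpscRing(capacity)
    tainted = set()
    shadow = threading.Thread(target=shadow_worker, args=(ring, tainted))
    shadow.start()
    for event in events:
        ring.push(event)
    ring.push(STOP)
    shadow.join()
    return tainted


# Hypothetical trace of: a = input(); b = a + 1; c = 5; d = b + c; b = c
EVENTS = [
    ("source", "a", ()),
    ("copy", "b", ("a",)),
    ("copy", "c", ()),
    ("copy", "d", ("b", "c")),
    ("copy", "b", ("c",)),
]
```

Calling `track(EVENTS)` returns the set of locations still tainted once the shadow thread has drained the buffer. The small default capacity is deliberate, so that the producer has to wait for the consumer at least once on this trace, which is the back-pressure case a larger, cache-friendly buffer is meant to make rare.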