The Road Not Taken: Exploring Alias Analysis Based Optimizations Missed by the Compiler
Context-sensitive inter-procedural alias analyses are more precise than intra-procedural alias analyses, but they do not scale. As a consequence, most production compilers sacrifice precision for scalability and implement intra-procedural alias analysis. Alias analysis is used by many compiler optimizations, including loop transformations. When the analysis is imprecise, the program's performance may suffer, especially in the presence of loops.
Previous work proposed a general approach based on code versioning with dynamic checks to disambiguate pointers at runtime. However, the overhead of the dynamic checks in this approach is O(log n), which is too high to enable interesting optimizations. Other suggested approaches, e.g., polyhedral and symbolic range analysis, have O(1) overheads, but they only work for loops with certain constraints. Production compilers, such as LLVM and GCC, use scalar evolution analysis to compute an O(1) range check for loops to resolve memory dependencies at runtime. However, this approach can also be applied only to loops with certain constraints.
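To make the idea of code versioning behind an O(1) range check concrete, here is a minimal hand-written sketch in C. It is not taken from the paper or from LLVM or GCC; the function names (ranges_disjoint, scale_versioned, and the two loop bodies) are hypothetical, and a compiler would derive the checked ranges itself rather than rely on the programmer.

```c
#include <stddef.h>
#include <stdint.h>

/* O(1) check: returns 1 when the byte ranges [a, a+n) and [b, b+n)
   of float elements do not overlap. */
static int ranges_disjoint(const float *a, const float *b, size_t n) {
    uintptr_t a_lo = (uintptr_t)a, a_hi = a_lo + n * sizeof(float);
    uintptr_t b_lo = (uintptr_t)b, b_hi = b_lo + n * sizeof(float);
    return a_hi <= b_lo || b_hi <= a_lo;
}

/* Version 1: restrict promises no overlap, so the compiler is free to
   vectorize or reorder the loads and stores. */
static void scale_noalias(float *restrict dst, const float *restrict src,
                          size_t n, float k) {
    for (size_t i = 0; i < n; i++) dst[i] = src[i] * k;
}

/* Version 2: conservative loop that stays correct when dst and src overlap. */
static void scale_may_alias(float *dst, const float *src, size_t n, float k) {
    for (size_t i = 0; i < n; i++) dst[i] = src[i] * k;
}

/* Dispatch on the runtime check. Returns 1 if the optimized version ran,
   0 if the conservative fallback ran. */
int scale_versioned(float *dst, const float *src, size_t n, float k) {
    if (ranges_disjoint(dst, src, n)) {
        scale_noalias(dst, src, n, k);
        return 1;
    }
    scale_may_alias(dst, src, n, k);
    return 0;
}
```

The check works here only because both accesses cover a contiguous range known before the loop starts, which is the kind of constraint the abstract refers to.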
In this work, we present our tool, Scout, which can disambiguate two pointers at runtime using a single memory access. Scout is based on the key idea of constraining the allocation size and alignment during memory allocation. Scout can also disambiguate array accesses within a loop to which the existing O(1) range-check techniques cannot be applied. In addition, Scout uses feedback from static optimizations to reduce the number of dynamic checks needed for optimizations.
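The abstract does not describe Scout's allocator in detail, so the following C sketch is only one plausible reading of "constraining allocation size and alignment", not Scout's actual design. It assumes a toy arena in which every object is rounded up to a power-of-two size and placed at a multiple of that size, with one metadata byte per 4 KiB region recording the size class. All names (toy_alloc, same_object, region_shift) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { REGION_SHIFT = 12, REGION_SIZE = 1 << REGION_SHIFT, NUM_REGIONS = 4 };

static unsigned char arena[NUM_REGIONS * REGION_SIZE];
/* log2 of the object size served by each region; 0 means unused. */
static unsigned char region_shift[NUM_REGIONS];
static size_t region_used[NUM_REGIONS];

/* Round the request up to a power of two (minimum 16 bytes), return log2. */
static unsigned size_class_shift(size_t size) {
    unsigned shift = 4;
    while (((size_t)1 << shift) < size) shift++;
    return shift;
}

/* Bump-allocate from a region dedicated to the request's size class, so
   every object sits at an arena offset that is a multiple of its size. */
void *toy_alloc(size_t size) {
    unsigned shift = size_class_shift(size);
    if (shift > REGION_SHIFT) return NULL;
    for (size_t r = 0; r < NUM_REGIONS; r++) {
        if (region_shift[r] == 0) region_shift[r] = (unsigned char)shift;
        if (region_shift[r] == shift &&
            region_used[r] + ((size_t)1 << shift) <= REGION_SIZE) {
            void *p = arena + r * REGION_SIZE + region_used[r];
            region_used[r] += (size_t)1 << shift;
            return p;
        }
    }
    return NULL;
}

/* Assumes both pointers come from toy_alloc. One metadata load (the size
   class of p's region), then a shift and a compare decide whether q lies
   in the same object as p. */
int same_object(const void *p, const void *q) {
    size_t off_p = (size_t)((const unsigned char *)p - arena);
    size_t off_q = (size_t)((const unsigned char *)q - arena);
    unsigned shift = region_shift[off_p >> REGION_SHIFT];
    return (off_p >> shift) == (off_q >> shift);
}
```

Because each object is aligned to its own size, two addresses fall in the same object exactly when they agree on all bits above the size-class shift, so no bounds need to be loaded for the second pointer.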
Our technique enabled new opportunities for loop-invariant code motion, dead store elimination, loop vectorization, and load elimination in already optimized code. Our performance improvements are up to 51.11% for the Polybench suite and up to 0.89% for the CPU SPEC 2017 suite. The geometric means of our allocator's CPU and memory overheads for the CPU SPEC 2017 benchmarks are 1.05% and 7.47%, respectively. For the Polybench benchmarks, the geometric means of the CPU and memory overheads are 0.21% and 0.13%, respectively.
Thu 8 Dec (displayed time zone: Auckland, Wellington)
10:30 - 12:00
10:30 | 30m | Talk | A Fast In-Place Interpreter for WebAssembly | OOPSLA | Ben L. Titzer (Carnegie Mellon University)
11:00 | 30m | Talk | Optimal Heap Limits for Reducing Browser Memory Use | OOPSLA | Marisa Kirisame (University of Utah), Pranav Shenoy (University of Utah), Pavel Panchekha (University of Utah)
11:30 | 30m | Talk | The Road Not Taken: Exploring Alias Analysis Based Optimizations Missed by the Compiler | OOPSLA