Papers by Philippe Clauss
Combinatorics, Probability & Computing, 2001
A significant source for enhancing application performance and for reducing power consumption in embedded processor applications is to improve the usage of the memory hierarchy. In this work, a temporal and spatial locality optimization framework of nested loops is proposed, driven by parameterized cost functions. It is based on the minimization of strides occurring while accessing array elements from an affine reference function. …
Procedia Computer Science, 2013
Speculative parallelization is a classic strategy for automatically parallelizing codes that cannot be handled at compile-time due to the use of dynamic data and control structures. Another motivation for being speculative is to adapt the code to the current execution context, by selecting at run-time an efficient parallel schedule. However, since this parallelization scheme requires on-the-fly semantics verification, it is in general difficult to perform advanced transformations for optimization and parallelism extraction. We propose a framework dedicated to the speculative parallelization of scientific nested loop kernels, able to transform the code at runtime by re-scheduling the iterations to exhibit parallelism and data locality. The run-time process includes a transformation selection guided by profiling phases on short samples, using an instrumented version of the code. During this phase, the accessed memory addresses are interpolated to build a predictor of the forthcoming accesses. The collected addresses are also used to compute on-the-fly dependence distance vectors by tracking accesses to common addresses. Interpolating functions and distance vectors are then employed in dynamic dependence analysis and in selecting a parallelizing transformation that, if the prediction is correct, does not induce any rollback during execution. In order to ensure that the rollback time overhead stays low, the code is executed in successive slices of the outermost original loop of the nest. Each slice can be either a parallelized version, a sequential original version, or an instrumented version. Moreover, such slicing of the execution provides the opportunity of transforming the code differently to adapt to the observed execution phases. Parallel code generation is achieved almost at no cost by using binary code patterns that are generated at compile-time and that are simply patched at run-time to result in the transformed code.
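The interpolation step can be pictured with a small sketch. The following C fragment is a minimal illustration under assumed simplifications, not the framework's actual code: all names (affine_fn, interpolate, predict_addr) are hypothetical, and it handles a single memory reference whose profiled addresses are assumed to follow a constant stride.

    /* Minimal sketch: fit an affine access function addr(i) = base + stride*i
     * to a few profiled addresses, then use it to predict future accesses.
     * All names are illustrative, not taken from the actual framework. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        intptr_t base;    /* interpolated address at iteration 0 */
        intptr_t stride;  /* constant difference between iterations */
        int      linear;  /* 1 if the samples fit an affine function */
    } affine_fn;

    /* Interpolate from profiled per-iteration addresses (n >= 2). */
    affine_fn interpolate(const intptr_t *addrs, int n)
    {
        affine_fn f = { addrs[0], addrs[1] - addrs[0], 1 };
        for (int i = 2; i < n; i++)             /* verify constant stride */
            if (addrs[i] - addrs[i - 1] != f.stride)
                f.linear = 0;
        return f;
    }

    /* Predicted address of a future iteration; valid only if f.linear. */
    intptr_t predict_addr(affine_fn f, int iter)
    {
        return f.base + f.stride * (intptr_t)iter;
    }

    int main(void)
    {
        intptr_t trace[] = { 0x1000, 0x1008, 0x1010, 0x1018 };
        affine_fn f = interpolate(trace, 4);
        if (f.linear)
            printf("iteration 10 -> %#lx\n", (long)predict_addr(f, 10));
        return 0;
    }

In the same spirit, a dependence distance between two such references follows from the iteration gap observed between accesses to a common address; the real system computes this on-the-fly during the profiling slices.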
A significant source for enhancing application performance and for reducing power consumption in embedded processor applications is to improve the usage of the memory hierarchy. In this paper, a temporal and spatial locality optimization framework of nested loops is proposed, driven by parameterized cost functions. The considered loops can be imperfectly nested. New data layouts are propagated through the connected references and through the loop nests as constraints for optimizing the next connected reference in the same nest or in the other ones. Unlike many existing methods, special attention is paid to TLB (Translation Lookaside Buffer) effectiveness, since TLB misses can take from tens to hundreds of processor cycles. Our approach only considers active data, that is, array elements that are actually accessed by a loop, in order to prevent useless memory loads and take advantage of storage compression and temporal locality. Moreover, the same data transformation is not necessarily applied to a whole array. Depending on the referenced data subsets, the transformation can result in different data layouts for the same array. This can significantly improve performance, since a priori incompatible references can be simultaneously optimized. Finally, the process considers not only the innermost loop level but all levels. Hence, large strides when control returns to the enclosing loop are avoided in several cases, and better optimization is provided in the case of a small index range of the innermost loop.
USENIX Technical Conference, 1997
Optimizing parallel compilers need to be able to analyze nested loop programs with parametric affine loop bounds in order to derive efficient parallel programs. The iteration spaces of nested loop programs can be modeled by polyhedra and systems of linear constraints. Using this model, important program analyses such as computing the number of flops executed by a loop, computing the …
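For instance (a standard textbook example, not taken from the paper), the iteration count of a triangular loop nest with parametric bound n is given by an Ehrhart polynomial in n:

    % Worked illustration in LaTeX: counting the iterations of
    %   for (i = 0; i <= n; i++)
    %     for (j = 0; j <= i; j++)
    % over the parametric triangle {(i,j) : 0 <= j <= i <= n}.
    \[
      \#\{(i,j) \in \mathbb{Z}^2 \mid 0 \le j \le i \le n\}
      \;=\; \sum_{i=0}^{n} (i+1)
      \;=\; \frac{(n+1)(n+2)}{2}.
    \]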
An essential way to obtain high performance on today's computers consists in optimizing spatial and temporal locality of the data. Many scientific applications contain loop nests that operate on large multi-dimensional arrays whose sizes are often parameterized. In this work, temporal reuse occurring from several references in the same statement of a parameterized loop is modeled geometrically. More precisely, iterations are classified depending on the …
ACM SIGPLAN Notices, 2012
In this paper, we present a Thread-Level Speculation (TLS) framework whose main feature is to be able to speculatively parallelize a sequential loop nest in various ways, by re-scheduling its iterations. The transformation to be applied is selected at runtime with the goal of minimizing the number of rollbacks and maximizing performance. We perform code transformations by applying the polyhedral model, which we adapted for speculative and runtime code parallelization. For this purpose, we design a parallel code pattern which is patched by our runtime system according to the profiling information collected on some execution samples. Adaptability is ensured by considering chunks of code of various sizes that are launched successively, each of which is parallelized in a different manner or run sequentially, depending on the currently observed behavior for accessing memory. We show on several benchmarks that our framework yields good performance on codes which could not be handled efficiently by previously proposed TLS systems.
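The chunking scheme can be sketched as follows; this is an assumed structure with hypothetical stand-ins (select_version, run_chunk) for the real profiling, verification and code-patching machinery, not the authors' runtime itself.

    /* Sketch of chunked speculative execution; stand-in functions replace
     * the real profiling, verification and code-patching components. */
    #include <stdio.h>

    enum version { INSTRUMENTED, PARALLEL, SEQUENTIAL };

    /* Stand-in: the real system decides from the observed memory behavior. */
    enum version select_version(int chunk_id)
    {
        return chunk_id == 0 ? INSTRUMENTED : PARALLEL;
    }

    /* Stand-in: run iterations [lo, hi); returns 0 if speculation failed. */
    int run_chunk(enum version v, int lo, int hi)
    {
        printf("chunk [%d,%d) run as %s\n", lo, hi,
               v == PARALLEL ? "parallel" :
               v == SEQUENTIAL ? "sequential" : "instrumented");
        return 1; /* pretend speculation always succeeds in this sketch */
    }

    int main(void)
    {
        const int n = 100, chunk = 25;
        for (int lo = 0, id = 0; lo < n; lo += chunk, id++) {
            int hi = lo + chunk < n ? lo + chunk : n;
            if (!run_chunk(select_version(id), lo, hi)) /* misprediction:  */
                run_chunk(SEQUENTIAL, lo, hi);          /* replay it safely */
        }
        return 0;
    }

Bounding each speculative attempt to one chunk is what keeps the worst-case rollback cost proportional to the chunk size rather than to the whole loop.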
Lecture Notes in Computer Science, 2001
A significant source for enhancing application performance and for reducing power consumption in embedded processor applications is to improve the usage of the memory hierarchy. Such an objective classically translates into optimizing spatial and temporal data locality, especially for nested loops. In this paper, we focus on temporal data locality.
(IEEE ISPASS) IEEE International Symposium on Performance Analysis of Systems and Software, 2011
… Our experiments, conducted on the SPEC CPU 2006 and on the Pointer Intensive benchmark suite, with the O0 optimization level, reveal almost negligible overhead … The execution platform is a 3.4 GHz AMD Phenom II X4 965 microprocessor with 4 GB of RAM running Linux 2.6.32. …
Proceedings of International Conference on Application Specific Systems, Architectures and Processors: ASAP '96, 1996
In the area of automatic parallelization of programs, analyzing and transforming loop nests with parametric affine loop bounds requires fundamental mathematical results. The most common geometrical model of iteration spaces, called the polytope model, is based on mathematics dealing with convex and discrete geometry, linear programming, combinatorics and geometry of numbers.
ACM SIGARCH Computer Architecture News, 2000
One of the most efficient ways to improve program performance on today's computers is to optimize the way cache memories are used. In particular, many scientific applications contain loop nests that operate on large multi-dimensional arrays whose sizes are often parameterized. No special attention is paid to cache memory performance when such loops are written. In this work, we focus on spatial locality optimization such that all the data that are loaded as a block in the cache will be used successively by the program. Our method consists in providing a new array reference evaluation function to the compiler, such that the data layout corresponds exactly to the utilization order of these data. The computation of this function concerns the field of parameterized polyhedra and Ehrhart polynomials.
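As a rough illustration of the idea (plain C with made-up names, not the paper's Ehrhart-polynomial-based construction), remapping an array so that its storage order matches the loop's traversal order turns large strides into stride-1 accesses:

    /* A column-wise traversal of a row-major array jumps N elements per
     * access; copying the data into traversal order restores stride 1. */
    #include <stdlib.h>

    #define N 1024

    /* New reference evaluation function (illustrative): element (i,j) is
     * stored at its rank in the column-wise utilization order. */
    size_t remap(size_t i, size_t j) { return j * N + i; }

    void relayout(const double *a, double *b)
    {
        for (size_t j = 0; j < N; j++)          /* loop traversal order */
            for (size_t i = 0; i < N; i++)
                b[remap(i, j)] = a[i * N + j];  /* gather from row-major */
    }

    int main(void)
    {
        double *a = calloc(N * N, sizeof *a), *b = calloc(N * N, sizeof *b);
        if (a && b) { a[5 * N + 3] = 1.0; relayout(a, b); }
        free(a);
        free(b);
        return 0;
    }

After relayout, a column-wise computation scans b contiguously, so every cache line loaded is fully consumed before being evicted.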
IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06), 2006
In this paper, we propose to model memory access profile information as loop nests exhibiting useful characteristics of the memory behavior, such as periodicity, linearly linked memory access patterns and repetitions. It is shown that static analysis methods such as the polytope model approach can then be applied to the generated nested-loop representations. Moreover, the modeling loop nests can themselves be instrumented and run in order to generate further useful information that can also be modeled and analyzed.
Lecture Notes in Computer Science, 2008
Dynamic optimizers modify the binary code of programs at runtime by profiling and optimizing certain aspects of the execution. We present a completely software-based framework that dynamically optimizes programs for object-based Distributed Shared Memory (DSM) systems. In DSM systems, reducing the number of messages between nodes is crucial. Prefetching transfers data in advance from the storage node to the local node so that communication is minimized. Our framework uses a profiler and a dynamic binary rewriter that monitors the access behavior of the application and places prefetches where they are beneficial to speed up the application. In addition, we adapt the number of prefetches per request to best fit the application's behavior. Evaluation shows that the performance of our system is better than manual prefetching. The number of messages sent decreases by up to 89%. Performance gains of up to 73% can be observed on the benchmarks.
Chapman & Hall/CRC Computer & Information Science Series, 2011
2014 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XIV), 2014
Data locality optimization is a well-known goal when handling programs that must run as fast as possible or use a minimum amount of energy. However, usual techniques never address the significant impact of the numerous stalled processor cycles that may occur when consecutive load and store instructions access the same memory location. We show that two versions of the same program may exhibit similar memory performance while performing very differently regarding their execution times, because of the stalled processor cycles generated by many pipeline hazards. We propose a new programming structure called "xfor", enabling explicit control of the way data locality is optimized in a program and thus of the number of stalled processor cycles. We show the benefits of xfor regarding execution time and energy saving.
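To give a flavor of the kind of control involved (plain C below, not the xfor syntax itself), interleaving two loop bodies with a chosen offset determines how far apart in time a store and the load that consumes it occur:

    /* Two loops writing then reading a[] are fused with an offset of 1:
     * each element is consumed right after it is produced, while still in
     * cache, yet the read of a[i-1] never targets a just-issued store to
     * the same location, which limits load/store pipeline hazards. */
    void fused(double *a, const double *x, double *y, int n)
    {
        for (int i = 0; i <= n; i++) {
            if (i < n)  a[i] = 2.0 * x[i];          /* first loop body  */
            if (i >= 1) y[i - 1] = a[i - 1] + 1.0;  /* second, offset 1 */
        }
    }

An xfor statement exposes such an offset as an explicit parameter of the loop header instead of hiding it in guards.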
Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization - CGO '08, 2008
This paper describes an algorithm that takes a trace (i.e., a sequence of numbers or vectors of numbers) as input, and from that produces a sequence of loop nests that, when run, produces exactly the original sequence. The input format is suitable for any kind of program execution trace, and the output conforms to standard models of loop nests. The first, most obvious, use of such an algorithm is for program behavior modeling for any measured quantity (memory accesses, number of cache misses, etc.). Finding loops amounts to detecting periodic behavior and provides an explanatory model. The second application is trace compression, i.e., storing the loop nests instead of the original trace. Decompression consists of running the loops, which is easy and fast. A third application is value prediction. Since the algorithm forms loops while reading input, it is able to extrapolate the loop under construction to predict further incoming values. Throughout the paper, we provide examples that explain our algorithms. Moreover, we evaluate trace compression and value prediction on a subset of the SPEC2000 benchmarks.
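The simplest one-dimensional case can be sketched as follows; this is an illustrative reduction of the algorithm that only recognizes flat affine runs, whereas the paper reconstructs full nested loops.

    /* Greedily compress a trace into (start, stride, count) runs; replaying
     * each run as a tiny affine loop regenerates the trace exactly. */
    #include <stdio.h>

    typedef struct { long start, stride, count; } run;

    /* Returns the number of runs written to out. */
    int compress(const long *t, int n, run *out)
    {
        int r = 0;
        for (int i = 0; i < n; ) {
            run cur = { t[i], 0, 1 };
            if (i + 1 < n) {
                cur.stride = t[i + 1] - t[i];
                while (i + cur.count < n &&
                       t[i + cur.count] == cur.start + cur.count * cur.stride)
                    cur.count++;
            }
            out[r++] = cur;
            i += cur.count;
        }
        return r;
    }

    int main(void)
    {
        long trace[] = { 4, 8, 12, 16, 100, 90, 80 };
        run runs[8];
        int r = compress(trace, 7, runs);
        for (int k = 0; k < r; k++)   /* each run is a one-line loop */
            printf("for c in [0,%ld): value = %ld + c*%ld\n",
                   runs[k].count, runs[k].start, runs[k].stride);
        return 0;
    }

Because a run is closed as soon as the affine pattern breaks, the same scan can also extrapolate the run under construction, which is exactly the value-prediction use mentioned above.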
2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012
This paper describes a tool using one or more executions of a sequential program to detect parallel portions of the program. The tool, called Parwiz, uses dynamic binary instrumentation, targets various forms of parallelism, and suggests distinct parallelization actions, ranging from simple directive tagging to elaborate loop transformations.
(IEEE ISPASS) IEEE International Symposium on Performance Analysis of Systems and Software, 2011
Memory profiling is useful for a variety of tasks, most notably to produce traces of memory accesses for cache simulation. However, instrumenting every memory access incurs a large overhead, in the amount of code injected in the original program as well as in execution time. This paper describes how static analysis of the binary code can be used to reduce the amount of instrumentation. The analysis extracts loops and memory access functions by tracking how memory addresses are computed from a small set of base registers holding, e.g., routine parameters and loop counters. Instrumenting these base registers instead of memory operands reduces the weight of instrumentation, first statically by reducing the amount of injected code, and second dynamically by reducing the amount of instrumentation code actually executed. Also, because the static analysis extracts intermediate-level program structures (loops and branches) and access functions in symbolic form, it is easy to transform the original executable into a skeleton program that consumes base register values and produces memory addresses. The first advantage of using a skeleton is to be able to overlap the execution of the instrumented program with that of the skeleton, thereby reducing the overhead of recomputing addresses. The second advantage is that the skeleton program and its shorter input trace can be saved and rerun as many times as necessary without requiring access to the original architecture, e.g., for cache design space exploration.
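The division of labor between the short input trace and the skeleton can be pictured like this (an idealized illustration with invented names, not the tool's generated code):

    /* Instead of tracing every address of  a[i] = b[i] + c,  the
     * instrumented program records the base registers once per loop
     * entry; this skeleton replays the loop and recomputes every
     * address, e.g. to feed a cache simulator offline. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct { uintptr_t base_a, base_b; long n; } loop_entry;

    /* Skeleton: consumes one record, produces the full address stream. */
    void skeleton(const loop_entry *e)
    {
        for (long i = 0; i < e->n; i++) {
            printf("load  %#lx\n", (unsigned long)(e->base_b + 8 * i));
            printf("store %#lx\n", (unsigned long)(e->base_a + 8 * i));
        }
    }

    int main(void)
    {
        loop_entry rec = { 0x7f0000001000u, 0x7f0000002000u, 4 };
        skeleton(&rec);  /* re-runnable without the original machine */
        return 0;
    }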