Abstract:
|
Acoustic wave propagation has been the preferred engine for geophysical exploration applications for the last few years because of the large cost involved in using better approximations, especially for applications based on 3D full-wavefield modelling. Hence, simplified approaches have been used to generate images of the subsurface so that data processing can be finished in a reasonable time. The current trend in seismic imaging aims at using an improved physical model, considering that the Earth is not rigid but an elastic body. This new model takes simulations closer to the real physics of the problem, at the cost of increasing the computational resources needed.
Moreover, to take the simulation a step closer to the real physics, some kind of anisotropy in the propagation medium should be considered. However, this may again raise the computational cost of the simulation.
On the hardware front, recently developed high-performance devices, called accelerators or co-processors, have been shown to outperform their general-purpose counterparts by orders of magnitude in terms of performance per watt. These new alternatives may then provide the resources needed to represent complex wave physics in a reasonable time.
There might be, however, a penalty associated with the use of such devices, as some portions of the simulation code might need to be rewritten, or new optimization strategies explored and applied (Araya-Polo et al., 2011). In this work we show some optimization strategies evaluated and applied to an elastic propagator based on a Fully Staggered Grid, running on the Intel® Xeon Phi™ coprocessor. It is important to note that the propagator is able to reproduce elastic wave propagation, even for arbitrary anisotropy. |
Abstract:
|
We have shown a set of optimizations applied to a Finite Difference numerical method that solves the elastic wave propagation equations on the Intel Xeon Phi coprocessor. Moreover, the proposed scheme for solving the elastic equation supports arbitrary anisotropy, at a higher computational cost than more traditional ways of solving elastic propagation. The evaluated optimizations range from memory to compute optimizations. Our results show that it is possible to obtain more than an order of magnitude of improvement when comparing the fully optimized code with a naïve version (which only included OpenMP parallelization), while an improvement of up to 7x is possible with little investment in code optimization. A comparison with a system with two Intel Xeon E5-2697v3 processors shows that a single Intel Xeon Phi coprocessor is able to outperform such a computational architecture. |