In recent years, much effort has been devoted to reducing the long processing times associated with the large number of particles required to simulate practical problems with the desired refinement.
However, the efficient use of currently available computational resources, such as computer clusters, remains a challenge.
One of the critical …
Because of this, the study of new parallelization strategies, designed for a specific architecture or for a combination of architectures, has become the focus of efforts to accelerate particle-method simulations.
The GPU was initially designed to accelerate graphics processing; since the mid-2000s, however, it has become a more general-purpose computing device that promises to accelerate codes demanding high computational power at lower cost. Despite the known challenges of speeding up linear system solutions on GPUs, in 2011 Hori et al. \cite{Hori-2011} developed a GPU-accelerated version of the standard MPS code. The authors showed that, for simulations of two-dimensional models of moderate size (up to 100K particles), the GPU-accelerated code was about 10 times faster than the CPU-only code.
To make efficient use of the computational resources of computer clusters, the main focus of the studies was redirected to the distributed parallelization of particle-based methods.
To achieve this goal, Domain Decomposition (DD) strategies are the most used …
In the mid-2000s, the Finite Element (FE) code ADVENTURE, which uses a hierarchical domain decomposition method, was already able to analyze three-dimensional models of arbitrary shape, defined by meshes with hundreds of millions of Degrees Of Freedom (DOF) \cite{Ogino-2005}.
More recently, explicit particle methods parallelized by DD strategies have also been able to simulate very large models: in 2014, Murotani et al. \cite{Murotani-2014a}, using a two-level voxel-based domain decomposition to parallelize explicit MPS, performed tsunami simulations with models of up to 390 million particles.
Although implementing DD for semi-implicit methods is generally very challenging, specific problems may have features that simplify the task: Ovaysi and Piri \cite{Ovaysi-2012} developed a parallel version of the Modified Moving Particle Semi-implicit (MMPS) method to model fluid flow in porous media. The authors took advantage of the almost homogeneous distribution of particles throughout the whole simulation, which allows a simplified domain decomposition technique to be used, to integrate a single-GPU code into a multi-GPU code that runs on a distributed-memory computer.
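To illustrate why a nearly homogeneous particle distribution simplifies domain decomposition, the following minimal sketch splits a one-dimensional extent into equal-width slabs, one per process or GPU, and collects the halo particles that each slab would need from its neighbours. It is not the scheme of Ovaysi and Piri; the function name `slab_decompose`, its parameters, and the assumption that the interaction radius `r_e` is no larger than one slab width are all hypothetical choices made for illustration.

```python
import random


def slab_decompose(xs, x_min, x_max, n_slabs, r_e):
    """Assign particle indices to equal-width slabs along x.

    Returns (owned, halo):
      owned[k] lists the particles located inside slab k;
      halo[k]  lists particles owned by an adjacent slab that lie
               within r_e of slab k (assumes r_e <= slab width).
    """
    width = (x_max - x_min) / n_slabs
    owned = [[] for _ in range(n_slabs)]
    halo = [[] for _ in range(n_slabs)]
    for i, x in enumerate(xs):
        # Clamp so that x == x_min and x == x_max fall in the end slabs.
        k = max(0, min(int((x - x_min) / width), n_slabs - 1))
        owned[k].append(i)
        left_edge = x_min + k * width
        right_edge = x_min + (k + 1) * width
        # A particle near a slab boundary is also visible to the neighbour.
        if k > 0 and x - left_edge < r_e:
            halo[k - 1].append(i)
        if k < n_slabs - 1 and right_edge - x < r_e:
            halo[k + 1].append(i)
    return owned, halo


if __name__ == "__main__":
    # Hypothetical test case: uniformly distributed particles in [0, 1].
    random.seed(0)
    xs = [random.uniform(0.0, 1.0) for _ in range(10_000)]
    owned, halo = slab_decompose(xs, 0.0, 1.0, n_slabs=4, r_e=0.02)
    print("owned per slab:", [len(o) for o in owned])
    print("halo per slab: ", [len(h) for h in halo])
```

With a uniform distribution, the owned counts per slab come out roughly equal without any dynamic load balancing, which is the property that makes a fixed decomposition attractive in this setting. A strongly non-uniform distribution, as in free-surface flows, would instead call for slabs whose boundaries adapt to the particle counts.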