FMM accelerated BEM for 3D Laplace & Helmholtz equations

2006, Proceedings Int. Conf. on Boundary Element Techniques

Nail A. Gumerov and Ramani Duraiswami
Institute for Advanced Computer Studies, University of Maryland, U.S.A.
Presentation for the International Conference on Boundary Element Techniques, BETEQ-7, September 2006, Paris, France
© Gumerov & Duraiswami, 2006

Outline
• Introduction
• Problem formulation
• BEM speed up with the FMM
• Performance tests
  • Laplace equation
  • Helmholtz equation
• Conclusions

Introduction
• BEM is a great method
• Drawbacks of the standard method
  • Slow: O(N^3) for direct solvers, O(N_iter N^2) for iterative solvers
  • Memory expensive: O(N^2) for full matrix storage
  • Modern PCs are not suitable for solving large problems, N > 10^4
• FMM acceleration and memory reduction
  • Fast: O(N_iter N) (or O(N_iter N log N))
  • Memory inexpensive: O(N) (or O(N log N))
  • Modern PCs can solve large problems, N ~ 10^6

Introduction: Large problems, Example 1
Acoustic scattering from a sphere (model of human head)
[Figure: mesh, 30000 elements, 15002 vertices]
[Figure: mesh, 33082 elements, 16543 vertices]

Introduction: Large problems, Example 1
[Figure: KEMAR mesh, 405002 elements, 202503 vertices]

Introduction: Large problems, Example 2
Multiparticle system (e.g. blood flow)
Large mesh (hundreds of thousands of elements)

Problem formulation
[Figure: boundary surface S with normal n]
Laplace: [equation not recovered]
Helmholtz: [equation not recovered]
For external problems: [equation not recovered]
Boundary conditions: [equation not recovered]

Problem formulation: Layer potentials
Laplace: [equation not recovered]
Helmholtz: [equation not recovered]
For external problems: [equation not recovered]
Normal derivatives: [equation not recovered]

Problem formulation: Boundary integral equations
Green's identity (direct) formulation: [equation not recovered]
Layer potential (indirect) formulation: [equation not recovered]

BEM speed up with the FMM
• Use iterative methods (we use GMRES)
• The system matrix can be decomposed as A = A_sparse + A_dense.
  • The sparse part includes the diagonal elements and the influence of neighbor elements
    • Integrals can be computed using standard BEM techniques, with O(N) memory and operation complexity
  • The dense part is related to the influence of far elements
    • Most expensive in memory and computation
    • Can be done with the FMM
    • Low order quadratures over elements can be used

BEM speed up with the FMM: BEM and FMM data structures
[Figure: boundary elements in FMM boxes at level l_max, showing the FMM neighborhood and the BEM neighborhood (r_min) of a reference element drawn in red]
• Element close to the red one in all senses: high order quadrature, influence computed directly
• Element far from the red one in the sense of BEM, but close in the sense of the FMM: low order quadrature, influence computed directly
• Element far from the red one in all senses: low order quadrature, influence computed via expansion

BEM speed up with the FMM: Computation of normal derivatives
Differential operators in the space of expansion coefficients are represented by a sparse matrix: [equation not recovered]
Detailed theory and expressions can be found in the book: N.A. Gumerov and R. Duraiswami, Fast Multipole Methods for the Helmholtz Equation in Three Dimensions, Elsevier, Oxford, UK, 2004.

BEM speed up with the FMM: Expansion of boundary integrals
Green's function expansion: [equation not recovered]
Quadratures: [equation not recovered]
Expansion coefficients: [equation not recovered]

BEM speed up with the FMM: Particulars of the FMM employed
• Two or four BEM matrix-vector multiplications are performed in "one shot" using the sparse matrix representation of differential operators
• O(p^3) translation methods based on the RCR* decomposition are used for both the Laplace and Helmholtz equations
• Details can be found in our technical reports and FMM papers, 2003-2006
* The RCR decomposition is the Rotation - Coaxial translation - Rotation decomposition of an arbitrary translation operator in three dimensions

BEM speed up with the FMM: Basic FMM flow chart
Start
1. Level l_max: get S-expansion coefficients (directly)
2. Levels l_max-1, …, 2: get S-expansion coefficients from children (S|S translation)
3. Level 2: get R-expansion coefficients from far neighbors (S|R translation)
4. Levels 3, …, l_max: get R-expansion coefficients from far neighbors (S|R translation)
5. Levels 3, …, l_max: get R-expansion coefficients from parents (R|R translation)
6. Level l_max: evaluate R-expansions (directly)
7. Level l_max: sum sources in close neighborhood (directly)
End

Performance tests
• Flat triangular surface discretization
• Accuracy of the FMM selected to be consistent with the accuracy of the BEM (relative errors normally within the range 10^-2 to 10^-4)
• Problem sizes from 500 to 3,000,000 boundary elements
• Where possible, compared with direct BEM, iterative BEM, and low memory direct BEM
• Hardware: PC with a 3.2 GHz Intel Xeon processor and 3.5 GB RAM

Performance tests: Laplace equation
Typical test configuration for external Dirichlet and Neumann problems: 1000 randomly oriented ellipsoids, 488000 vertices and 972000 elements
[Figure: ellipsoid surfaces colored by potential]

Performance tests: Laplace equation
[Figure: BEM Dirichlet problem for ellipsoids (3D Laplace), FMM with p = 8. Total CPU time (s) versus number of vertices N, logarithmic axes. Curves: GMRES+FMM, GMRES+Low Mem Direct, GMRES+High Mem Direct, LU-decomposition. Reference slopes y = ax, y = bx^2, y = cx^3. The O(N^2) memory threshold is marked.]

Performance tests: Laplace equation
[Figure: 3D Laplace, FMM with p = 8. Single matrix-vector multiplication CPU time (s) versus number of vertices N, logarithmic axes. Curves: FMM, Direct + Matrix Entries Computation, Multiplication of Stored Matrix, Number of GMRES Iterations. Reference slopes y = ax, y = bx^2.]

Performance tests: Helmholtz equation (comparison with analytical solution)
Plane wave scattering from a sphere, ka = 30
BEM/FMM: 240002 vertices, 480000 elements
GMRES: 115 iterations, ε = 10^-5
FMM: l_max = 6, p_max = 22
[Figure: sphere with the incident wave direction indicated]

Performance tests: Helmholtz equation
[Figure: BEM Neumann problem for sphere (3D Helmholtz), kD = 34.64. Total CPU time (s) versus number of vertices N, logarithmic axes. Curves: GMRES+FMM, GMRES+Low Mem Direct, GMRES+High Mem Direct, LU-decomposition. Reference slopes y = ax, y = bx^2, y = cx^3. The O(N^2) memory threshold is marked.]

Performance tests: Helmholtz equation
[Figure: 3D Helmholtz, kD = 33.64. Single matrix-vector multiplication CPU time (s) versus number of vertices N, logarithmic axes. Curves: FMM, Low Mem Direct, Multiplication of Stored Matrix, Number of GMRES Iterations. Reference slopes y = ax, y = bx^2.]

Performance tests: Helmholtz equation (some other scattering problems were solved)
Mesh: 249856 vertices, 497664 elements
[Figure: kD = 29, Neumann problem]
[Figure: kD = 144, Robin problem (impedance, sigma = 1)]

Performance tests: Helmholtz equation (some other scattering problems were solved)
Incident plane wave exp(ikz)
Mesh: 132072 elements, 65539 vertices

Performance tests: Helmholtz equation (some other scattering problems were solved)
[Figure: sound pressure at kD = 0.96 (250 Hz), kD = 9.6 (2.5 kHz), and kD = 96 (25 kHz)]

Conclusions
• The developed method (BEM with GMRES+FMM) shows good performance and can be used to solve large boundary value problems for the Laplace and Helmholtz equations (sizes of up to about a million) on a single PC
• Future research:
  • Parallel versions of the method should bring additional speedups
  • BEM for different equations (e.g. biharmonic and Stokes equations) can also be sped up in the same way
  • Efficient iterative methods for BEM are in high demand, as the matrix-vector products can be accelerated via the FMM

Software Availability
• FMM-based software is licensed from the University of Maryland to Fantalgo, LLC
• Includes matrix-vector products for the 3D Laplace, biharmonic, and Helmholtz equations (Maxwell is coming) and for the Gaussian kernel
• BEM and function fitting via iterative solvers
• Contact [email protected] for further information

Thank you!
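
The decomposition A = A_sparse + A_dense from the "BEM speed up with the FMM" slides can be illustrated with a small sketch. It is not the authors' implementation: it uses point sources instead of triangular elements, a plain distance cutoff r_min to separate near pairs from far pairs, and it sums the far part directly, where the presentation uses the FMM. The function names (laplace_kernel, direct_matvec, split_matvec) and the test geometry are hypothetical and chosen only for illustration.

```python
import math
import random


def laplace_kernel(x, y):
    """Free-space Green's function of the 3D Laplace equation: 1 / (4*pi*|x - y|)."""
    return 1.0 / (4.0 * math.pi * math.dist(x, y))


def direct_matvec(points, q):
    """Reference O(N^2) product: out[i] = sum over j != i of G(x_i, x_j) * q[j]."""
    n = len(points)
    out = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i] += laplace_kernel(points[i], points[j]) * q[j]
    return out


def split_matvec(points, q, r_min):
    """Same product, accumulated separately for near and far pairs.

    near[i]: pairs with |x_i - x_j| < r_min (the role of A_sparse).
    far[i]:  all other pairs (the role of A_dense). They are summed directly
             here, which is only a stand-in for the FMM evaluation.
    """
    n = len(points)
    near = [0.0] * n
    far = [0.0] * n
    near_pairs = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self term is singular; a real BEM integrates it specially
            value = laplace_kernel(points[i], points[j]) * q[j]
            if math.dist(points[i], points[j]) < r_min:
                near[i] += value
                near_pairs += 1
            else:
                far[i] += value
    return near, far, near_pairs


if __name__ == "__main__":
    random.seed(0)
    n_points = 200  # assumed toy problem size
    pts = [(random.random(), random.random(), random.random()) for _ in range(n_points)]
    q = [random.uniform(-1.0, 1.0) for _ in range(n_points)]
    near, far, near_pairs = split_matvec(pts, q, r_min=0.2)
    ref = direct_matvec(pts, q)
    err = max(abs(a + b - c) for a, b, c in zip(near, far, ref))
    print(f"near pairs: {near_pairs} of {n_points * (n_points - 1)}, max split error: {err:.2e}")
```

The near and far sums add up to the direct product to rounding error, and only a small fraction of all pairs falls into the near part. In the scheme described in the slides, that near part is what would be stored as a sparse matrix and integrated with high order quadrature, while the far part would be evaluated through expansions inside each GMRES iteration rather than pair by pair.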