
Technique of the Fast Ferrofluids Simulation

Proceedings of the International Conference "Nanomaterials: Applications and Properties", Vol. 1, No 1, 04MFPN05 (3pp) (2012)

A.Yu. Polyakov^{1,2,*}, T.V. Lyutyy^{1,†}, S. Denisov^{2}, P. Hänggi^{2}

1 Sumy State University, 2, Rymsky Korsakov Str., 40007 Sumy, Ukraine
2 Institut für Physik, Universität Augsburg, Universitätsstraße 1, D-86135 Augsburg, Germany

(Received 10 July 2012; published online 22 August 2012)

We present a highly parallel implementation of the Langevin simulation method for modeling ferrofluids on Graphics Processing Units (GPUs). Our method is based on the Barnes-Hut algorithm. As a benchmark we use the straightforward 'all-to-all interaction' algorithm. The obtained results are in good agreement with the known theoretical model. With the proposed method we were able to follow the evolution of a system of one million interacting particles over long time scales, a task hitherto out of reach with standard, CPU-based numerical schemes.

Keywords: Ferrofluids, Molecular dynamics, N-body problem, Graphics processing unit, CUDA, Barnes-Hut algorithm.

PACS numbers: 47.65.Cb, 02.70.Ns

* [email protected]
† [email protected]

1. INTRODUCTION

A system of N interacting particles is one of the most common set-ups in physics. Since analytical solutions typically do not exist when the number of interacting particles is larger than two, N > 2, various numerical methods are used to gain insight into the dynamics of the corresponding system. The interaction involves all pairs of particles, and therefore the computational time and effort scale like N^2. This scaling certainly constitutes a problem, and, in order to simulate large systems, researchers have traditionally relied on large computers. In practice this means the use of big computational clusters consisting of many Central Processing Units (CPUs). Such clusters are physically large, expensive, and consume a lot of electricity. Over the last decade an alternative, Graphics Processing Units (GPUs), has become more and more popular. GPUs, initially designed to serve as data pipelines for graphical information, possess a high level of parallelism while being cheap (as compared to CPU clusters) and small. The high computational capability of GPUs is due to their architecture, which comprises hundreds of computational cores (note that the best modern CPUs typically have up to 16 cores) that can perform computations in parallel. Nowadays, scientific GPU computing is used in many areas of computational physics, thanks to the Compute Unified Device Architecture (CUDA) developed by NVIDIA Corporation, which has greatly simplified the development of programs for GPUs.

2. MODEL OF THE DISPERSED FERROFLUID

The theory of a continuous bulk ferrofluid is well developed within the concept of ferrohydrodynamics [1]. However, the dynamics of confined ferrofluids has recently attracted a lot of attention because of the promising potential of ferrofluids for biotechnologies and medicine [2]. In this context, the disperse structure of a ferrofluidic medium certainly deserves attention. The dynamics of a ferromagnetic particle suspended in a liquid can be described by a Langevin equation [3] that accounts for the long-range dipole interactions with other particles, caused by the particle magnetization, and for the short-range steric repulsions, caused by the anti-agglomeration coating.
Here we use the simplest model and consider an ensemble of identical spherical particles of radius R, submerged into a liquid of viscosity η. We also suppose that the hydrodynamic radius is equal to R, and that every particle has a homogeneous structure with respect to its inertia and magnetic properties. Therefore, the dynamics of the i-th particle is determined by the following equations:

\frac{2}{5}\frac{d^2\varphi_i}{d\tau^2} = u_{ix}h_{iy} - u_{iy}h_{ix} - \Gamma_r\frac{d\varphi_i}{d\tau} + \frac{T_{ch}^2}{VR^2D}\,\xi_\varphi,   (1)

\frac{2}{5}\frac{d^2\theta_i}{d\tau^2} = (u_{iy}h_{iz} - u_{iz}h_{iy})\sin\varphi_i + (u_{iz}h_{ix} - u_{ix}h_{iz})\cos\varphi_i - \Gamma_r\frac{d\theta_i}{d\tau} + \frac{T_{ch}^2}{VR^2D}\,\xi_\theta,   (2)

\frac{d^2\boldsymbol{\rho}_i}{d\tau^2} = \mathbf{f}_i^{dip} + \mathbf{f}_i^{sr} - \Gamma_t\frac{d\boldsymbol{\rho}_i}{d\tau} + \frac{T_{ch}^2}{VR^2D}\,\boldsymbol{\xi}_d,   (3)

where φ_i and θ_i are the azimuthal and polar angles of the i-th particle; τ = t/T_ch is the dimensionless time (T_ch = (R/M)(3D/(4πμ_0))^{1/2}; M is the particle material magnetization, μ_0 = 4π·10^{-7} H/m is the magnetic constant, D is the particle material density); u_i = m_i/m is the dimensionless magnetic moment (m = 4πR^3 M/3 = VM, V is the particle volume); Γ_r = 8πηR T_ch/(DV) is the rotational friction coefficient; ρ_i = r_i/R is the dimensionless radius-vector, which defines the position of the i-th particle; Γ_t = 6πηR T_ch/(DV) is the translational friction coefficient;

\mathbf{h}_i = \sum_{j=1,\, j\neq i}^{N}\left[\frac{3\boldsymbol{\rho}_{ij}(\mathbf{u}_j\cdot\boldsymbol{\rho}_{ij})}{\rho_{ij}^5} - \frac{\mathbf{u}_j}{\rho_{ij}^3}\right] + \mathbf{h}^{ext}   (4)

is the reduced magnetic field acting on the i-th particle, where ρ_ij = ρ_i − ρ_j; N is the number of particles in the ensemble; h^ext = 3H^ext/(4πM) is the reduced external field;

\mathbf{f}_i^{dip} = \sum_{j=1,\, j\neq i}^{N}\left\{\frac{3\left[\boldsymbol{\rho}_{ij}(\mathbf{u}_j\cdot\mathbf{u}_i) + \mathbf{u}_i(\mathbf{u}_j\cdot\boldsymbol{\rho}_{ij}) + \mathbf{u}_j(\mathbf{u}_i\cdot\boldsymbol{\rho}_{ij})\right]}{\rho_{ij}^5} - \frac{15\boldsymbol{\rho}_{ij}(\mathbf{u}_j\cdot\boldsymbol{\rho}_{ij})(\mathbf{u}_i\cdot\boldsymbol{\rho}_{ij})}{\rho_{ij}^7}\right\}   (5)

is the dipole force; and, finally,

\mathbf{f}_i^{sr} = 24\varepsilon\sum_{j=1,\, j\neq i}^{N}\left[2\left(\frac{\sigma}{\rho_{ij}}\right)^{12} - \left(\frac{\sigma}{\rho_{ij}}\right)^{6}\right]\frac{\boldsymbol{\rho}_{ij}}{\rho_{ij}^2}   (6)

is the force modeling the steric repulsion. Here ε and σ are phenomenological parameters that define, respectively, the depth of the potential well and the distance between the particle centers corresponding to the minimum of the Lennard-Jones potential. The random torques ξ_r = (ξ_φ, ξ_θ) and the random force ξ_d represent the interaction of the particles with the heat bath; they are characterized by zero means and delta-like correlation functions,

\langle\xi_r^\alpha(\tau)\,\xi_r^\beta(\tau')\rangle = \frac{3 k_B T \Gamma_r}{T_{ch}}\,\delta_{\alpha\beta}\,\delta(\tau-\tau'), \quad \alpha,\beta = \varphi,\theta,

\langle\xi_d^\alpha(\tau)\,\xi_d^\beta(\tau')\rangle = \frac{2 k_B T \Gamma_t}{T_{ch}}\,\delta_{\alpha\beta}\,\delta(\tau-\tau'), \quad \alpha,\beta = x,y,z,

where k_B is the Boltzmann constant, T is the absolute temperature, δ_{αβ} is the Kronecker delta symbol, and δ(τ) is the Dirac delta function.
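As an illustration of how Eqs. (4)-(6) map onto GPU code, the following minimal CUDA sketch evaluates the reduced field and forces in the brute-force, all-pairs manner that serves as the benchmark in the next section. It is only an illustrative sketch and not the implementation used for the results reported here: the kernel name, the particle layout, the toy lattice configuration, and the parameter values (a reduced external field along z, ε = 1, σ = 2) are assumptions made solely for this example.

// Minimal CUDA sketch (not the code used in this work): brute-force, all-pairs
// evaluation of the reduced dipole field h_i (Eq. 4), the dipole force f_i^dip
// (Eq. 5) and the steric force f_i^sr (Eq. 6) for N identical particles.
#include <cstdio>
#include <cuda_runtime.h>

struct Particle {
    float3 rho;   // reduced position rho_i = r_i / R
    float3 u;     // reduced magnetic moment u_i = m_i / m (|u_i| = 1)
};

__device__ float3 operator+(float3 a, float3 b) { return make_float3(a.x + b.x, a.y + b.y, a.z + b.z); }
__device__ float3 operator-(float3 a, float3 b) { return make_float3(a.x - b.x, a.y - b.y, a.z - b.z); }
__device__ float3 operator*(float s, float3 a)  { return make_float3(s * a.x, s * a.y, s * a.z); }
__device__ float  dot(float3 a, float3 b)       { return a.x * b.x + a.y * b.y + a.z * b.z; }

// One thread per particle i; the inner loop over j makes the O(N^2) cost explicit.
__global__ void allPairsKernel(const Particle* p, float3* h, float3* fdip, float3* fsr,
                               int N, float3 hext, float eps, float sigma)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N) return;

    float3 hi = hext;                        // field starts from the reduced external field
    float3 fd = make_float3(0, 0, 0);
    float3 fs = make_float3(0, 0, 0);
    float3 ui = p[i].u;

    for (int j = 0; j < N; ++j) {
        if (j == i) continue;
        float3 rij = p[i].rho - p[j].rho;    // rho_ij = rho_i - rho_j
        float3 uj  = p[j].u;
        float  r2  = dot(rij, rij);
        float  r   = sqrtf(r2);
        float  r3  = r2 * r, r5 = r3 * r2, r7 = r5 * r2;
        float  ujr = dot(uj, rij), uir = dot(ui, rij);

        // Eq. (4): dipole field of particle j at the position of particle i
        hi = hi + (3.0f * ujr / r5) * rij - (1.0f / r3) * uj;

        // Eq. (5): dipole-dipole force
        fd = fd + (3.0f / r5) * (dot(ui, uj) * rij + ujr * ui + uir * uj)
                - (15.0f * ujr * uir / r7) * rij;

        // Eq. (6): Lennard-Jones-type steric repulsion
        float sr6  = powf(sigma * sigma / r2, 3.0f);
        float sr12 = sr6 * sr6;
        fs = fs + (24.0f * eps * (2.0f * sr12 - sr6) / r2) * rij;
    }
    h[i] = hi;  fdip[i] = fd;  fsr[i] = fs;
}

int main()
{
    const int N = 1024;
    Particle* hp = new Particle[N];
    for (int i = 0; i < N; ++i) {            // toy configuration: cubic lattice, all moments along z
        hp[i].rho = make_float3(3.0f * (i % 16), 3.0f * ((i / 16) % 16), 3.0f * (i / 256));
        hp[i].u   = make_float3(0.0f, 0.0f, 1.0f);
    }

    Particle* dp; float3 *dh, *dfd, *dfs;
    cudaMalloc((void**)&dp,  N * sizeof(Particle));
    cudaMalloc((void**)&dh,  N * sizeof(float3));
    cudaMalloc((void**)&dfd, N * sizeof(float3));
    cudaMalloc((void**)&dfs, N * sizeof(float3));
    cudaMemcpy(dp, hp, N * sizeof(Particle), cudaMemcpyHostToDevice);

    float3 hext = make_float3(0.0f, 0.0f, 1.0f);       // illustrative reduced external field
    allPairsKernel<<<(N + 255) / 256, 256>>>(dp, dh, dfd, dfs, N, hext, 1.0f, 2.0f);

    float3* hh = new float3[N];
    cudaMemcpy(hh, dh, N * sizeof(float3), cudaMemcpyDeviceToHost);
    printf("reduced field h_z acting on particle 0: %f\n", hh[0].z);

    cudaFree(dp); cudaFree(dh); cudaFree(dfd); cudaFree(dfs);
    delete[] hp; delete[] hh;
    return 0;
}

In an actual production run the same per-particle quantities would of course have to be combined with the stochastic integration of Eqs. (1)-(3); the sketch only shows how the interaction sums themselves parallelize over one thread per particle.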
3. PARALLEL COMPUTATIONS WITH CUDA

The most straightforward way to simulate ferrofluid dynamics is to use the so-called all-pairs algorithm, which takes into account the interactions between all pairs of particles. The computation time in this case scales like O(N^2), which makes this brute-force approach unsuitable for the numerical propagation of large systems, N >> 1. The computation time can be reduced appreciably by implementing the Barnes-Hut algorithm [4], which leads to a computational-time scaling of O(N log N). The algorithm is based on replacing a group of particles by a pseudo-particle carrying the integral characteristics of the group. If the group is far enough from the reference particle, then, instead of computing the interaction between the chosen particle and each particle in the group, it is sufficient to calculate the influence of the pseudo-particle.

The partition of the set of particles into groups is performed in a hierarchical manner. The region where the ferrofluid is located is considered as a root cell, or root group [4]. In the 3d case the root cell is divided into 8 sub-cells, each sub-cell is again divided, and so on, until every sub-sub-...-cell holds one particle or none. For every sub-cell we produce a pseudo-particle whose magnetic moment u is equal to the sum of the moments u_i belonging to that sub-cell, and whose position is the average position of its members,

\boldsymbol{\rho} = \frac{1}{N_c}\sum_{i=1}^{N_c}\boldsymbol{\rho}_i,

where N_c is the number of particles in the sub-cell.
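As an illustration of this grouping step, the following minimal CUDA sketch computes the pseudo-particle characteristics (summed moment and averaged position) of every cell. It is again only a sketch and not the implementation of Ref. [5] used in this work: the kernel names and the assumption that the cell index cellOf[i] of every particle is already known from the tree build are introduced solely for this example, and the actual GPU tree construction is considerably more involved.

// Minimal CUDA sketch (illustrative, not the implementation of Ref. [5]) of the
// pseudo-particle aggregation step: for a given particle-to-cell assignment,
// each cell accumulates the total moment u = sum(u_i) and the average position.
#include <cstdio>
#include <cuda_runtime.h>

// Accumulate the moments, positions and particle counts of every cell with atomics.
__global__ void accumulateCells(const float3* rho, const float3* u, const int* cellOf,
                                float3* cellU, float3* cellRho, int* cellCount, int N)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N) return;
    int c = cellOf[i];                                    // cell index from the tree build
    atomicAdd(&cellU[c].x, u[i].x);     atomicAdd(&cellU[c].y, u[i].y);     atomicAdd(&cellU[c].z, u[i].z);
    atomicAdd(&cellRho[c].x, rho[i].x); atomicAdd(&cellRho[c].y, rho[i].y); atomicAdd(&cellRho[c].z, rho[i].z);
    atomicAdd(&cellCount[c], 1);
}

// Convert the accumulated position sum into the average position of the pseudo-particle.
__global__ void finalizeCells(float3* cellRho, const int* cellCount, int M)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= M || cellCount[c] == 0) return;
    float inv = 1.0f / cellCount[c];
    cellRho[c].x *= inv;  cellRho[c].y *= inv;  cellRho[c].z *= inv;
}

int main()
{
    const int N = 8, M = 2;                               // toy example: 8 particles, 2 cells
    float3 hRho[N], hU[N]; int hCell[N];
    for (int i = 0; i < N; ++i) {
        hRho[i]  = make_float3((float)i, 0.0f, 0.0f);
        hU[i]    = make_float3(0.0f, 0.0f, 1.0f);
        hCell[i] = (i < 4) ? 0 : 1;                       // first half in cell 0, second half in cell 1
    }

    float3 *dRho, *dU, *dCellU, *dCellRho; int *dCell, *dCount;
    cudaMalloc((void**)&dRho, N * sizeof(float3));   cudaMalloc((void**)&dU, N * sizeof(float3));
    cudaMalloc((void**)&dCell, N * sizeof(int));
    cudaMalloc((void**)&dCellU, M * sizeof(float3)); cudaMalloc((void**)&dCellRho, M * sizeof(float3));
    cudaMalloc((void**)&dCount, M * sizeof(int));
    cudaMemcpy(dRho, hRho, N * sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemcpy(dU,   hU,   N * sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemcpy(dCell, hCell, N * sizeof(int),  cudaMemcpyHostToDevice);
    cudaMemset(dCellU, 0, M * sizeof(float3)); cudaMemset(dCellRho, 0, M * sizeof(float3));
    cudaMemset(dCount, 0, M * sizeof(int));

    accumulateCells<<<1, 256>>>(dRho, dU, dCell, dCellU, dCellRho, dCount, N);
    finalizeCells<<<1, 256>>>(dCellRho, dCount, M);

    float3 cellU[M], cellRho[M];
    cudaMemcpy(cellU,   dCellU,   M * sizeof(float3), cudaMemcpyDeviceToHost);
    cudaMemcpy(cellRho, dCellRho, M * sizeof(float3), cudaMemcpyDeviceToHost);
    printf("pseudo-particle of cell 0: u_z = %.1f, <rho_x> = %.1f\n", cellU[0].z, cellRho[0].x);

    cudaFree(dRho); cudaFree(dU); cudaFree(dCell);
    cudaFree(dCellU); cudaFree(dCellRho); cudaFree(dCount);
    return 0;
}

Once the pseudo-particles are available, the contribution of a sufficiently remote cell is evaluated with the same expressions (4)-(6) as for a single particle.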
The Barnes-Hut algorithm by construction possesses a high potential for parallelization [5], but its implementation is quite challenging because of the hierarchical tree it is based on, so we followed the recommendations of Ref. [5] to overcome these difficulties.

Fig. 1 shows the efficiency of the GPU-based Barnes-Hut algorithm. For N = 10^6 we observed a speed-up of the order of 6·10^5 as compared to the performance of the standard all-pairs algorithm on a CPU, and a speed-up of the order of 200 when compared to the all-pairs algorithm implemented on a GPU. However, when the number of particles is relatively small, typically less than several thousand, the Barnes-Hut algorithm performs slower than the all-pairs algorithm. This is because the tree-building procedure then takes most of the computational time.

Fig. 1 – Performance of the different algorithms: computation time (ms) versus the number of particles N for the all-pairs CPU, all-pairs GPU, and Barnes-Hut GPU implementations.

4. TESTING OF THE DEVELOPED METHOD

We performed simulations of an ensemble of maghemite (γ-Fe2O3) particles, by using Eqs. (1)-(3), with the parameters M = 3.1·10^5 A/m and D = 5000 kg/m^3, suspended in water of viscosity η = 0.89·10^-3 Pa·s. According to Shliomis [6], if a diluted (volume fraction < 1 %) ferrofluid contains small particles for which the magnetic energy does not exceed the thermal energy, its magnetization is given by the Langevin function,

\langle u_z\rangle = \coth\xi_0 - \frac{1}{\xi_0}, \qquad \xi_0 = \frac{\mu_0 m H_z^{ext}}{k_B T}.   (7)

The results of our simulations (see Fig. 2) are in good agreement with Eq. (7), which confirms the validity of the proposed approach. The slight excess of the reduced ferrofluid magnetization over the Langevin function as ξ_0 grows is explained by the increase of the dipole-field component directed along the external field H_z^{ext}.

When the particle magnetic energy is larger than the thermal one, the ferrofluid response to the external field becomes more pronounced (see Fig. 3). At the same time, the role of the interparticle interaction becomes more important. In particular, due to the antiferromagnetic coupling between the column clusters and the formation of ring-like clusters, the magnetization of a dense ferrofluid in a given applied field can be several times smaller.

Fig. 2 – The magnetization curve for a diluted (volume fraction < 1 %) ferrofluid: simulation results compared with the Langevin function. N = 10^6, T = 298 K.

Fig. 3 – The magnetization curves for volume fractions of 1 % and 5 %, together with the Langevin function, in the regime where the particle magnetic energy exceeds the thermal one. N = 10^6, T = 298 K.

5. CONCLUSIONS

We have proposed a new, GPU-based approach to ferrofluid simulations. By using the Barnes-Hut algorithm, which takes into account both the nearest-neighbor interactions and the interaction with remote particles, we have reached an impressive calculation speed-up. One of the intriguing perspectives for further studies is to explore the dynamics of ferrofluids in thin vessels, a problem which is very important in the context of biomedical applications.

ACKNOWLEDGEMENTS

T.V.L. and A.Yu.P. acknowledge the support of the Cabinet of Ministers of Ukraine, the Program of Studying and Training for Students, PhD Students and Professors' Staff Abroad (order dated April 13, 2011, No 411), and the support of the Ministry of Education, Science, Youth, and Sport of Ukraine (Project No 0112U001383).

REFERENCES

1. R. Rosensweig, Ferrohydrodynamics (Cambridge University Press: 1985).
2. Q. Pankhurst, J. Connolly, S. Jones, J. Dobson, J. Phys. D: Appl. Phys. 36, R167 (2003).
3. Z. Wang, C. Holm, H.W. Müller, Phys. Rev. E 66, 021405 (2002).
4. J. Barnes, P. Hut, Nature 324, 446 (1986).
5. Wen-mei W. Hwu (ed.), GPU Computing Gems Emerald Edition, 1st edition (Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA: 2011).
6. M. Shliomis, Sov. Phys. Usp. 17, 153 (1974).