Performance Improvement by MPI Parallelization in 3D Device Simulation
Introduction
As design technology for power devices such as MOSFETs, GTOs, and IGBTs has matured, large-domain 3D TCAD simulation has rapidly grown in importance. Distributed computing is an attractive solution for such simulations, because the system’s performance and capacity are not limited by the number of CPUs or the total amount of memory on any single computer. This advantage is expected to become more significant as device sizes and mesh point counts continue to grow.
Silvaco’s TCAD applications support distributed computing for the solution of linear systems through the PAM solver [1, 2]. The PAM solver is a domain-decomposition solver that runs in parallel using MPI (Message Passing Interface). On a Linux operating system, the user can enable MPI parallelization by adding a few simple settings [1].
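To illustrate the general idea behind a domain-decomposition solver, the following minimal sketch splits the unknowns of a linear system into contiguous, near-equal blocks, one per MPI rank. It is not the PAM solver's actual partitioning scheme, which is not described here; the function name `partition_rows` and its parameters are hypothetical and chosen only for illustration.

```python
def partition_rows(n_rows: int, n_ranks: int) -> list[range]:
    """Split n_rows unknowns into n_ranks contiguous, near-equal blocks.

    Hypothetical helper for illustration only; a real domain-decomposition
    solver would typically partition the mesh graph rather than use a
    simple contiguous split.
    """
    if n_ranks <= 0:
        raise ValueError("n_ranks must be positive")
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")

    # Every rank gets `base` rows; the first `extra` ranks get one more.
    base, extra = divmod(n_rows, n_ranks)

    blocks = []
    start = 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


if __name__ == "__main__":
    # Example: 10 unknowns distributed over 4 ranks.
    for rank, rows in enumerate(partition_rows(10, 4)):
        print(f"rank {rank}: rows {rows.start}..{rows.stop - 1}")
```

In a distributed run, each rank would hold and solve only its own block and exchange boundary values with neighboring ranks over MPI, which is why per-machine memory stops being the limiting factor.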
In this article, we demonstrate that the PAM solver with MPI parallelization performs well when running Victory Device on a blade server with a total of 120 threads. In addition, we examine how the performance improvement from MPI parallelization depends on the device size and the number of mesh points.