Feb 23, 2010, Tsukuba-Edinburgh Computational Science Workshop, Edinburgh Large-Scale...


Feb 23, 2010, Tsukuba-Edinburgh Computational Science Workshop, Edinburgh

Large-Scale Density-Functional calculations for nano-meter size Si materials

Jun-Ichi Iwata
Center for Computational Sciences, University of Tsukuba

I'd like to talk about the development of a program code for large-scale density-functional simulation and its applications to nano-meter size Si materials.

Outline

Quantum Mechanical (First-Principles) Simulation in Solid-State Physics
Density-Functional Theory: W. Kohn (Nobel Prize in 1998)
Density-Functional simulations for large systems
Real-Space DFT program code for Parallel Computation (RSDFT)
Applications of RSDFT for Si nano materials: >10,000-atom systems

This is the outline of my talk. My research field is computational solid-state physics, or computational materials science, and I study material properties based on quantum mechanics. In our field, material simulation based on quantum mechanics is called first-principles calculation or simulation. First of all, I will briefly introduce my research field, first-principles material simulation. The basic theory of first-principles calculations is density-functional theory (DFT), so I'd then like to explain what density-functional theory is. DFT was developed by W. Kohn, who received the Nobel Prize in 1998. DFT is now widespread in many fields, and some people want to apply it to very large systems. So I have developed a Real-Space DFT program code suited to parallel computers in order to study very large systems in materials science. Finally, I'd like to introduce some applications of our code to Si nano materials.

First-Principles Calculation in Material Physics

We describe material properties from the behavior of electrons and ions (ions: classical, electrons: quantum).
We solve the Schrödinger equation for the electronic ground state.
Density-functional theory is a powerful tool for this purpose.

This slide shows what first-principles calculation is in materials science. In our approach, we describe material properties from the behavior of electrons and ions. The basic equation for such systems is already known: it is the Schrödinger equation. DFT is a reliable tool for solving the Schrödinger equation at low computational cost.

Density-Functional Theory

Energy functional of the electron density: minimizing it gives stable atomic and electronic structures.
P. Hohenberg and W. Kohn, Phys. Rev. 136 (1964) B864.
W. Kohn and L. J. Sham, Phys. Rev. 140 (1965) A1133.
Minimizing with respect to the wave functions gives the Kohn-Sham equation, which contains a density-dependent potential.
We have to solve this equation self-consistently (nonlinear eigenvalue problem).

This slide shows a brief introduction to DFT. In DFT, we start from the energy functional, which is written in terms of the electronic wave functions phi and the electron density rho. The density and the wave functions are related through this formula. We then minimize the energy functional with respect to the wave functions phi and obtain a differential equation whose solutions are the wave functions that minimize the energy functional. This variational equation is called the Kohn-Sham equation, and solving it is the main task of first-principles calculations. As you can see, the Kohn-Sham equation is an eigenvalue problem. However, the operator on the left-hand side includes a density-dependent potential. Since the electron density depends on the eigensolutions phi through this formula, the equation is a nonlinear eigenvalue equation, and we have to solve it self-consistently.
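The self-consistent solution described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a 1D grid, a harmonic external potential, a toy density-dependent potential `coupling * rho` in place of the real Hartree and exchange-correlation terms, and simple linear mixing. It is not the scheme used in RSDFT.

```python
import numpy as np

def scf_toy(n_grid=200, n_occ=4, coupling=1.0, mix=0.3, tol=1e-8, max_iter=500):
    """Self-consistent loop for a toy 1D Kohn-Sham-like problem (illustrative only)."""
    x = np.linspace(-5.0, 5.0, n_grid)
    h = x[1] - x[0]
    # Kinetic energy -1/2 d^2/dx^2 with a 3-point finite-difference stencil
    kinetic = (np.diag(np.full(n_grid, 1.0 / h**2))
               - np.diag(np.full(n_grid - 1, 0.5 / h**2), 1)
               - np.diag(np.full(n_grid - 1, 0.5 / h**2), -1))
    v_ext = 0.5 * x**2                           # hypothetical external potential
    rho = np.full(n_grid, n_occ / (n_grid * h))  # uniform starting density
    for it in range(1, max_iter + 1):
        v_eff = v_ext + coupling * rho           # toy density-dependent potential
        eps, phi = np.linalg.eigh(kinetic + np.diag(v_eff))
        phi = phi[:, :n_occ] / np.sqrt(h)        # normalize: sum(|phi|^2) * h = 1
        rho_new = np.sum(phi**2, axis=1)         # density from occupied orbitals
        err = np.max(np.abs(rho_new - rho))
        rho = (1.0 - mix) * rho + mix * rho_new  # linear mixing of old and new density
        if err < tol:
            return eps[:n_occ], rho, it, h
    raise RuntimeError("SCF did not converge")

eigenvalues, density, n_iter, spacing = scf_toy()
print(n_iter, eigenvalues)
```

The loop stops when the density that goes into the potential and the density that comes out of the eigenvectors agree, which is what "self-consistent" means here.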

Performance of DFT

Exchange functional in the Local-Density Approx.: with a simple approximation, quantitatively good results. Correctly describes various properties.

Si in diamond structure, M. T. Yin and M. L. Cohen, Phys. Rev. B 26, 5668 (1982):

                        DFT calc.   Expt.
Lattice Constant (Å)    5.37        5.41
Bulk Modulus (Mb)       0.977       0.988

This slide shows why DFT is so widely used in many fields of materials science. In the practical use of DFT, we have to apply an approximation to a part of the energy functional called the exchange-correlation functional. Nobody knows its exact functional form, so we use some approximation for it. The simplest one is the local-density approximation (LDA). The exchange part of the functional is written explicitly here. As you can see, it has a very simple form. However, the LDA is a very good approximation: it correctly describes many material properties, and the quantitative agreement with experiment is also reasonably good.
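To show how simple the LDA exchange term is, here is a short sketch of the standard spin-unpolarized form in Hartree atomic units, E_x = C_x * integral of rho^(4/3), with C_x = -(3/4)(3/pi)^(1/3). The function names and the uniform grid with volume element `dv` are assumptions for illustration. The slide's own formula was lost in extraction, so its notation may differ.

```python
import numpy as np

# Dirac exchange constant in Hartree atomic units: C_x = -(3/4) * (3/pi)^(1/3)
C_X = -0.75 * (3.0 / np.pi) ** (1.0 / 3.0)

def lda_exchange_energy(rho, dv):
    """E_x = C_x * sum(rho^(4/3)) * dv on a uniform grid with volume element dv."""
    return C_X * np.sum(rho ** (4.0 / 3.0)) * dv

def lda_exchange_potential(rho):
    """v_x = dE_x/drho = (4/3) * C_x * rho^(1/3) = -(3 * rho / pi)^(1/3)."""
    return (4.0 / 3.0) * C_X * rho ** (1.0 / 3.0)

rho = np.full(1000, 2.0)   # hypothetical uniform density on 1000 grid points
print(lda_exchange_energy(rho, dv=1e-3), lda_exchange_potential(rho)[0])
```

The potential depends only on the local value of the density, which is why the approximation is cheap to evaluate on a real-space grid.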

Everybody wants to apply the DFT for Large systems

Usually, we treat 10- to 1000-atom systems by DFT. However, we need to treat larger systems:
to study large objects (nano structures, proteins)
to make the atomic model more realistic
Proteins: cytochrome c oxidase, 30,000 atoms
Nano structures (Si pyramid): 100,000 atoms. A. Ichimiya et al., Surf. Sci. 493, 555 (2001).

Since the performance of DFT is very high, everybody wants to apply DFT to much larger systems. Usually, we study systems of 10 to 1000 atoms. However, there are many systems that contain a huge number of atoms. Examples of such large systems are proteins in the biosciences and nano structures in semiconductor physics. These systems contain 10,000 to 100,000 or more atoms. So we need a new program code to treat such large systems.

Real-Space DFT program code (RSDFT)

Solve Kohn-Sham equation (eigenvalue problem)
Computational costs O(N^3)
Developed for parallel computers

Real-Space Method

Discretize: continuous space to discrete space
Function: column vector
Laplacian: higher-order finite difference
Higher-order finite difference pseudopotential method: J. R. Chelikowsky et al., Phys. Rev. B, (1994)
Typical number of grid points: 10,000 to 1,000,000
(cf. Reciprocal-Space (Plane-Wave) Method)

Traditionally, the reciprocal-space (or plane-wave) method has been used in solid-state physics. The real-space method is a somewhat newer approach; its basics were given in 1994.
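As a rough illustration of what "higher-order finite difference" means for the Laplacian, the sketch below builds central-difference coefficients for the second derivative by Taylor-series matching and applies them on a periodic 1D grid. The 1D periodic setting and the parameter name `half_width` are assumptions for illustration. The stencil order actually used in RSDFT is not stated in the talk.

```python
import numpy as np

def second_derivative_coeffs(half_width):
    """Central-difference weights c_k, k = -half_width..half_width, for f''.

    Taylor matching requires sum_k c_k * k**m = 2 for m == 2 and 0 otherwise,
    for m = 0 .. 2 * half_width.
    """
    offsets = np.arange(-half_width, half_width + 1)
    n = offsets.size
    a = np.vander(offsets, n, increasing=True).T.astype(float)  # row m holds k**m
    b = np.zeros(n)
    b[2] = 2.0
    return np.linalg.solve(a, b)

def laplacian_1d_periodic(f, h, half_width=4):
    """Apply the finite-difference second derivative on a periodic grid of spacing h."""
    coeffs = second_derivative_coeffs(half_width)
    out = np.zeros_like(f, dtype=float)
    for k, ck in zip(range(-half_width, half_width + 1), coeffs):
        out += ck * np.roll(f, -k)   # np.roll(f, -k)[i] == f[i + k] (periodic)
    return out / h**2

x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
h = x[1] - x[0]
for hw in (1, 2, 4):
    err = np.max(np.abs(laplacian_1d_periodic(np.sin(x), h, hw) + np.sin(x)))
    print(hw, err)   # the error shrinks as the stencil gets wider
```

In three dimensions the same one-dimensional stencil is applied along each axis and the results are summed, so every grid point couples only to a few neighbors and the resulting matrix is sparse.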


RSDFT: suitable for parallel first-principles calculation

Real-Space Finite-Difference: sparse matrix, FFT free (FFT is inevitable in the conventional plane-wave code)
MPI (Message Passing Interface) library
The 3D grid is divided into several regions for parallel computation (CPU0 to CPU8).
Kohn-Sham eq. (finite-difference), higher-order finite difference: MPI_ISEND, MPI_IRECV
Integration: MPI_ALLREDUCE

The strong point of the real-space method is that it is suitable for parallel computation. In conventional plane-wave codes, we cannot avoid using the fast Fourier transform (FFT), and the FFT is not well suited to parallel computation, so we need to change the computational scheme to the real-space method.

Parallelization of the code is implemented as shown here. The exchange of boundary values for the finite-difference calculation and the global summation are implemented with the MPI library.
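The communication pattern described here can be sketched without MPI by emulating the ranks inside one process. In this sketch the list of blocks stands in for the grid regions owned by each CPU, copying edge values from neighboring blocks stands in for MPI_ISEND/MPI_IRECV, and summing the partial integrals stands in for MPI_ALLREDUCE. The 1D periodic grid, the fourth-order stencil, and all function names are assumptions for illustration, not RSDFT's implementation.

```python
import numpy as np

# Fourth-order central-difference weights for the second derivative
FD4 = np.array([-1.0 / 12.0, 4.0 / 3.0, -5.0 / 2.0, 4.0 / 3.0, -1.0 / 12.0])
HALO = 2  # number of boundary values needed from each neighbor

def stencil_apply(f_with_halo):
    """Apply the stencil to the interior points of a block padded with halos."""
    n = f_with_halo.size - 2 * HALO
    out = np.zeros(n)
    for k, ck in zip(range(-HALO, HALO + 1), FD4):
        out += ck * f_with_halo[HALO + k: HALO + k + n]
    return out

def decomposed_laplacian(f, n_domains, h):
    """Second derivative computed block by block with emulated halo exchange."""
    blocks = np.array_split(f, n_domains)        # one block per emulated rank
    results = []
    for rank, block in enumerate(blocks):
        # In real parallel code these two copies would be received via MPI
        left = blocks[(rank - 1) % n_domains][-HALO:]
        right = blocks[(rank + 1) % n_domains][:HALO]
        padded = np.concatenate([left, block, right])
        results.append(stencil_apply(padded) / h**2)
    return np.concatenate(results)

def decomposed_integral(f, n_domains, dv):
    """Each emulated rank sums its own block; the final sum plays the role of a global reduction."""
    partial = [np.sum(block) * dv for block in np.array_split(f, n_domains)]
    return sum(partial)

x = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
h = x[1] - x[0]
print(np.max(np.abs(decomposed_laplacian(np.sin(x), 4, h) + np.sin(x))))
print(decomposed_integral(np.sin(x) ** 2, 4, h))
```

Each block only needs a thin layer of values from its neighbors, so the communication volume grows with the surface of a region while the work grows with its volume.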

Massively Parallel Computing

with our recently developed code RSDFT, the Real-Space Density-Functional Theory code (Iwata et al., J. Comp. Phys. (2010))
Based on the finite-difference pseudopotential method (J. R. Chelikowsky et al., PRB 1994)
Highly tuned for massively parallel computers
Computations are done on the massively parallel cluster PACS-CS at University of Tsukuba (theoretical peak performance = 5.6 GFLOPS/node).

Convergence behavior for Si10701H1996, the largest system in the present study
Grid points = 3,402,059
Bands = 22,432
Computational time (with 1024 nodes of PACS-CS): 6781 sec × 60 iteration steps = 113 hours

Flow chart

Atomic structure optimization: input initial configuration of ions, calc. ionic potentials, electronic structure optimization, Hellmann-Feynman force, convergence check (yes: finish), move ions and repeat.
Electronic structure optimization: Conjugate-Gradient Method, Gram-Schmidt orthonormalization, Density and Potentials update, Subspace Diagonalization, convergence check (yes: finish).
Cost labels on the chart, including the total computational cost: O(N^3), O(N^3), O(N^3), O(N), O(N^2)
Electronic structure optimization must be performed in each atomic optimization step.
Algorithm: subspace iteration method (Rayleigh-Ritz method)

In order to solve the nonlinear eigenvalue equation, we perform an iterative calculation as shown here. In first-principles studies, we perform two types of calculation. One is the electronic structure calculation, in which we solve the Kohn-Sham equation at the given initial atomic configuration. The other is the atomic structure optimization. In this calculation, we first solve the KS equation and then calculate the forces acting on the ions in the system, called the Hellmann-Feynman forces. We move the ions according to the HF forces and solve the Kohn-Sham equation again at the updated ionic configuration. Finally, we obtain the stable atomic structure of the system.

Algorithm: Subspace Iteration Method (Rayleigh-Ritz Method)

Problem: M-dimensional eigenvalue problem. We need the smallest N (≪ M) eigen-pairs.

Algorithm

Initial guess
Minimize Rayleigh quotients by Conjugate-Gradient Method (wave function update)
Gram-Schmidt Orthogonalization
Subspace Diagonalization, using the orthogonalized vectors as a basis set: Calc. Matrix Elements, then diagonalize
Ritz vectors: initial guess for the next iteration
Cost labels on the diagram: O(MN^2), O(N^3), O(MN^2), O(MN^2)

Time & Performance for Gram-Schmidt

                 Time (sec)   GFLOPS/node
Old algorithm    661 (710)    0.70 (0.65)
New algorithm    111 (140)    4.30 (3.50)

Theoretical peak performance = 5.6 GFLOPS/node
O(N^3) part can be computed at 80% of the theoretical peak performance!
Active use of Level 3 BLAS in O(N^3) computation
Collaboration with computer scientists much improved the performance of RSDFT!

Algorithm of GS: part of the calculations can be performed as Matrix × Matrix operations!
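One common way to turn Gram-Schmidt into matrix-matrix work is a blocked variant: all previously finished bands are projected out of a whole block of new bands with two matrix products, and only the small block itself is orthonormalized vector by vector. The sketch below is an assumption about what such a scheme can look like, with hypothetical names `block_size` and `dv`. The algorithm actually used in RSDFT is the one described in the cited J. Comp. Phys. paper and may differ.

```python
import numpy as np

def block_gram_schmidt(psi, block_size=8, dv=1.0):
    """Orthonormalize the columns of psi (grid points x bands) block by block.

    The inner product is <a|b> = sum(a * b) * dv, as on a uniform real-space grid.
    """
    q = np.array(psi, dtype=float)
    n_bands = q.shape[1]
    for start in range(0, n_bands, block_size):
        stop = min(start + block_size, n_bands)
        if start > 0:
            done = q[:, :start]
            # Two matrix-matrix products (Level 3 BLAS style) remove all finished bands
            overlap = (done.T @ q[:, start:stop]) * dv
            q[:, start:stop] -= done @ overlap
        # Vector-by-vector (modified) Gram-Schmidt only inside the small block
        for j in range(start, stop):
            for i in range(start, j):
                q[:, j] -= (q[:, i] @ q[:, j]) * dv * q[:, i]
            q[:, j] /= np.sqrt((q[:, j] @ q[:, j]) * dv)
    return q

rng = np.random.default_rng(0)
psi = rng.standard_normal((200, 20))        # hypothetical trial wave functions
q = block_gram_schmidt(psi, block_size=8, dv=0.5)
print(np.max(np.abs(q.T @ q * 0.5 - np.eye(20))))   # deviation from orthonormality
```

Most of the floating-point work ends up in the two matrix products, which is the kind of operation that optimized BLAS libraries run close to peak speed.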

[Chart: elapsed time for 1 step of iteration, split into O(N^2) and O(N^3) parts; PACS-CS (5.6 GFLOPS/node), 256 nodes]
The times for the O(N^2) part and the O(N^3) part become comparable.

Application 1: Nano-meter size Si quantum dots

Si quantum dot is a promising material for several device applications: memory, single-electron transistor, optical device.
Clarifying the relation between the dot size and band gap is important for controlling the device properties.
First-principles calculations are useful for such studies? Yes, but the system size is very large!
A model of the Si quantum dot of 6.6 nm diameter: Si7055H1596

The Si QD is a promising material for several device applications, such as memory, single-electron transistors, and optical devices. In order to know the device characteristics, it is important to clarify the relation between the dot size and the band gap. First-principles calculation is useful for such studies, but in this case there is a problem, because the nano-meter size Si dot is very large.

Band Gaps

[Figure: band gap (eV) versus dot diameter, from 300 atoms to >10,000 atoms, with an experimental fit curve from STS measurement, B. Zaknoon et al., Nano Letters 8, 1689 (2008).]
The SCF gap seems to become closer to the KS gap in the infinitely large size limit.

This is the result for the size dependence of the band gap of Si QDs. The horizontal axis shows the diameter of the Si QDs; the smallest dot has only 300 atoms, and the largest dot consists of about 10,000 atoms. The band gap is calculated from the total energy difference as shown here. From the eigenvalues of the Kohn-Sham equation, we can also estimate an approximate band gap. Both the exact and the approximate band gaps are plotted in this figure. As we can see, the exact one is in good agreement with experiment in some regions around here, while the approximate band gap always underestimates the experimental band gaps. However, for larger QDs the exact band gap also underestimates the experimental band gap. We have shown that the exact band gap and the KS band gap become the same in the infinitely large size limit. This shows the limitation of the LDA for the band gap of Si QDs.

Application 2: Si nanowires

Samsung Si nanowire devices

                  IEDM2005        IEDM2006
Diameter of NW    10 nm           8 nm
Gate length       30 nm           15 nm
Vdd               1.0 V           1.0 V
I_on (n)          2.64 mA/μm      1.4 mA/μm
I_on (p)          1.11 mA/μm      1.94 mA/μm
I_off (n)         3.1 nA/μm       2.0 nA/μm
I_off (p)         0.0056 mA/μm    1.0 nA/μm

Si nanowire is also an important material for future device applications. This slide shows the device characteristics of the SiNW FET developed by Samsung, a Korean company. They reported the device characteristics at an international conference, IEDM, and they said that the SiNW device characteristics far exceed the ITRS roadmap requirement. So researchers of LSI devices are now very active in this area.

Several sizes of Si nanowires

4 nm diameter (425 atoms)
10 nm diameter (2341 atoms)
20 nm diameter (8941 atoms)
There may be an optimum diameter in the region of 10 nm to 20 nm.

In order to provide useful information for the development of SiNW devices, we need large-scale quantum mechanical simulation. So I think first-principles simulation with RSDFT must be useful in this field. In order to see the wire-size dependence of the electronic structure of SiNWs, we perform electronic structure calculations for several sizes of SiNW.

Band Structure and DOS of SiNW (d = 1 nm)
Si21H20 (41 atoms), Eg = 2.60 eV (LDA bulk: 0.53 eV)

Band Structure and DOS of SiNW (d = 4 nm)
Si341H84 (425 atoms), Eg = 0.81 eV (LDA bulk: 0.53 eV)

Band Structure and DOS of SiNW (d = 8 nm)
Si1361H164 (1525 atoms), Eg = 0.61 eV
Bulk Si: Eg = 0.53 eV

This is the 8 nm-diameter SiNW. The system consists of 1525 atoms. We found that the electronic structure is almost the same as that of the bulk Si crystal.

Si12822H1544 (14,366 atoms)
Diameter: 10 nm
Height: 3.3 nm (100)
Grid spacing: 0.45 Å (~14 Ry)
# of grid points: 4,718,592
# of bands: 29,024
Memory: 1,022 GB ~ 2,044 GB
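The memory figure for this model can be roughly checked by counting only the wave functions: one value per grid point per band. The sketch assumes real double-precision values (8 bytes each) and binary gigabytes. Whether the slide uses exactly this accounting is an assumption, and the function name is hypothetical.

```python
def wavefunction_memory_gib(n_grid, n_bands, bytes_per_value=8):
    """Memory needed to hold all wave functions once, in GiB (2**30 bytes)."""
    return n_grid * n_bands * bytes_per_value / 2**30

one_copy = wavefunction_memory_gib(4_718_592, 29_024)
print(one_copy, 2 * one_copy)   # storage for one and for two full sets of wave functions
```

Under these assumptions one full set of wave functions takes about 1,020 GiB, close to the lower figure on the slide, and a second set of the same size would bring the total close to the upper figure.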

Si nano wire with surface roughness

[Figure: Si12822H1544, top view and side view]

In a more realistic situation, the SiNW has some roughness at the surface or interface of the wire. In order to construct an atomic model of the SiNW with surface roughness, we need many more atoms, because periodicity along this direction cannot be assumed. The largest model with which we try to simulate the rough SiNW is shown here. It has about 14,000 atoms.

PACS-CS, 1024 nodes, peak performance 5.6 GFLOPS/node

Subspace diagonalization       4600 sec.
Gram-Schmidt                   2300 sec.
Conjugate-Gradient Method      3700 sec.
Total Energy calc.             1200 sec.
Total (1 step)                 12,000 sec.

[Figure: DOS of SiNW with roughness (d = 10 nm, Si12822H1544, 14,366 atoms, Eg = 0.57 eV) compared with DOS of bulk Si]

The computational costs are shown here. In order to get the self-consistent electronic structure for this system, we need about 3 weeks using 1024 nodes of PACS-CS at CCS Tsukuba. The result for the DOS of the SiNW is shown here. We found that the DOS of the SiNW is largely changed from that of the bulk Si crystal by introducing the surface roughness. This may change the device characteristics, so it is important to provide such information from computer simulation for the development of SiNW devices.

Application 3: Si divacancy

Structure of Si divacancy
Small yellow balls: vacancies (no atoms)
Green balls: Si atoms with dangling bonds

There are two possibilities for the structure of the Si divacancy: Resonant-Bond type and Large-pairing type.

What is the stable structure?

EPR experiment (Watkins & Corbett, 1965): Large-pairing type.
LDA calculation (Saito & Oshiyama, 1994): Resonant-Bond type is stable (Large-pairing type was not found). Model size 60 atoms.
More recent LDA calculation (Oguet et al., 1999): both Large-pairing and Resonant-Bond structures were found. Large-pairing type is the most stable (RB type is a local minimum). Model size 300 atoms.

Model size dependence?

Si divacancy

[Figure: dac, dab (Å) versus model size (# of atoms) for the Large-pairing, Resonant-Bond, and Small-pairing structures]

Structures converge at the 998-atom model.

LP structure appears at 510-atom or larger models.

RB structure is most stable, but the energy difference is very small (