This lesson was written for the current stable version. Earlier versions are fully capable of running this tutorial, but the input files may have to be adapted to earlier formats.
Basics of BigDFT: running a wavelet computation on a CH4 molecule
The purpose of this lesson is to get familiar with the basic variables needed to run a wavelet computation in isolated boundary conditions. At the end of the lesson, one can run a wavelet computation, check the amount of memory it needs and understand the important parts of the output.
Introduction: running the code
This lesson is based on this skeleton input.dft file. BigDFT reads the parameters of the electronic convergence loop from a file named input.dft. The file itself is optional, but when it is present all its lines are mandatory.
Besides this input file, BigDFT requires the atomic positions of the studied system and optionally the pseudo-potential files. For the following tutorial, a methane molecule will be used. The position file is a simple XYZ file named posinp.xyz:
    5 angstroemd0 # a methane molecule
    free
    C 0 0 0
    H -0.63169789 -0.63169789 -0.63169789
    H +0.63169789 +0.63169789 -0.63169789
    H +0.63169789 -0.63169789 +0.63169789
    H -0.63169789 +0.63169789 +0.63169789
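As a quick sanity check of the geometry, the position file can be parsed and the C-H bond lengths computed. The following is a minimal Python sketch, not part of BigDFT: the helper name `read_xyz` is hypothetical, and it assumes the file layout used in this lesson (a header line starting with the atom count, a boundary-condition line, then one atom per line).

```python
import math

POSINP = """\
5 angstroemd0 # a methane molecule
free
C 0 0 0
H -0.63169789 -0.63169789 -0.63169789
H +0.63169789 +0.63169789 -0.63169789
H +0.63169789 -0.63169789 +0.63169789
H -0.63169789 +0.63169789 +0.63169789
"""

def read_xyz(text):
    """Return a list of (symbol, (x, y, z)) tuples from a posinp-style XYZ text."""
    lines = text.splitlines()
    natoms = int(lines[0].split()[0])    # first token of the header is the atom count
    atoms = []
    for line in lines[2:2 + natoms]:     # skip the header and the boundary-condition line
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms

atoms = read_xyz(POSINP)
carbon = atoms[0][1]
bond_lengths = [math.dist(carbon, pos) for sym, pos in atoms if sym == "H"]
for d in bond_lengths:
    print(f"C-H distance: {d:.4f} angstroem")
```

All four distances should come out equal, as expected for a tetrahedral molecule.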
The pseudo-potential files follow the ABINIT structure and are of GTH or HGH type (see the pseudo-potential file page on the ABINIT website for several LDA and GGA files, and the page of M. Krack on the CP2K server for HGH pseudo-potentials for several functionals). The following files may be used for this tutorial: psppar.C and psppar.H.
BigDFT is run with the bigdft executable in the standard Unix way. By default the output goes to the standard output, so it must be redirected to a file or to a pipe, for instance with the Unix command tee:
    user@garulfo:~/CH4/$ ls
    bigdft psppar.C psppar.H input.dft posinp.xyz
    user@garulfo:~/CH4/$ ./bigdft | tee screenOutput
    ...
Warning: to run properly, the pseudo-potential files must be named psppar.XX, where XX is the symbol used in the position file. The other files can have user-defined names, as explained in this lesson.
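The naming rule can be checked before launching a run. The sketch below is an illustration only (the function names `required_psp_files` and `missing_psp_files` are hypothetical): it derives the expected psppar.XX names from the symbols found in a position file and reports which ones are absent from a directory. Since the pseudo-potential files are optional, an absent file is not necessarily an error.

```python
import os

def required_psp_files(xyz_text):
    """List the psppar.XX names matching the species found in an XYZ text."""
    lines = xyz_text.splitlines()
    natoms = int(lines[0].split()[0])
    symbols = []
    for line in lines[2:2 + natoms]:
        symbol = line.split()[0]
        if symbol not in symbols:        # keep first-seen order, no duplicates
            symbols.append(symbol)
    return ["psppar." + s for s in symbols]

def missing_psp_files(xyz_text, directory="."):
    """Return the expected pseudo-potential files that are absent from directory."""
    return [name for name in required_psp_files(xyz_text)
            if not os.path.isfile(os.path.join(directory, name))]

example = """\
5 angstroemd0 # a methane molecule
free
C 0 0 0
H -0.63169789 -0.63169789 -0.63169789
H +0.63169789 +0.63169789 -0.63169789
H +0.63169789 -0.63169789 +0.63169789
H -0.63169789 +0.63169789 +0.63169789
"""

print("expected files:", required_psp_files(example))
print("not found here:", missing_psp_files(example))
```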
If the code has been compiled with MPI capabilities (which are enabled by default), running BigDFT on several cores is as easy as running it as a serial job. There is no need to change anything in the input files. The following example shows how to run it on a Debian system with OpenMPI installed, on a 4-core machine:
    user@garulfo:~/CH4/$ ls
    bigdft psppar.C psppar.H input.dft posinp.xyz
    user@garulfo:~/CH4/$ mpirun -np 4 ./bigdft | tee screenOutput
    ...
The wavelet basis set, a convergence study
The wavelet basis is a systematic basis set (as plane waves are), which means that one can arbitrarily increase the accuracy of the results by varying some parameters.
The main grid parameters

hgrid
The first two lines of input.dft are used to set up the basis set. In free boundary conditions, the basis set is characterised by a spatial expansion and a grid step, as shown in the side figure. The first line contains three float values, the grid steps in the three directions (in bohr), called hgrid.
crmult, frmult
The second line contains two float values that are two multiplying factors. They multiply quantities that depend on the chemical species. The first factor is the most important since it describes the spatial expansion of the basis set. It is called crmult, for Coarse grid Radius MULTiplier. Increasing it means that a further spatial expansion is possible for the wavefunctions. Typical values are 5 to 7.
Exercise: run BigDFT for the following values of hgrid and crmult and plot the total energy convergence versus hgrid. The final total energy, in Hartree, can be retrieved at the end of the screen output, or with the command `grep FINAL screenOutput`. A comprehensive explanation of the screen output is given later in this tutorial.
    hgrid = 0.55bohr / crmult = 3.5
    hgrid = 0.50bohr / crmult = 4.0
    hgrid = 0.45bohr / crmult = 4.5
    hgrid = 0.40bohr / crmult = 5.0
    hgrid = 0.35bohr / crmult = 5.5
    hgrid = 0.30bohr / crmult = 6.0
    hgrid = 0.20bohr / crmult = 7.0
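Editing input.dft by hand for each pair of values is error-prone, so the variants can be generated. The sketch below is a hypothetical helper (`set_grid` is not a BigDFT tool): it rewrites the three grid steps on the first line and crmult on the second line, leaving frmult and every other line untouched. Only the first three lines of the skeleton, as quoted in this lesson, are reproduced in the example text.

```python
def set_grid(input_text, hgrid, crmult):
    """Return a copy of an input.dft text with new hgrid (line 1) and crmult (line 2)."""
    lines = input_text.splitlines()
    first = lines[0].split()
    first[0:3] = [f"{hgrid:.3f}"] * 3    # hx, hy, hz all set to the same step
    second = lines[1].split()
    second[0] = f"{crmult:.1f}"          # frmult (second value) is left as it is
    lines[0] = " ".join(first)
    lines[1] = " ".join(second)
    return "\n".join(lines) + "\n"

# Assumption: only the head of the skeleton file is shown here.
SKELETON_HEAD = "0.450 0.450 0.450\n5.0 9.0\n1\n"

SETTINGS = [(0.55, 3.5), (0.50, 4.0), (0.45, 4.5), (0.40, 5.0),
            (0.35, 5.5), (0.30, 6.0), (0.20, 7.0)]

for hgrid, crmult in SETTINGS:
    print(set_grid(SKELETON_HEAD, hgrid, crmult).splitlines()[:2])
```

Each generated text can then be written to the input.dft of a separate run directory.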

This precision plot shows the systematicity of the wavelet basis set: by improving the basis set, we improve the value of the total energy.
    hgrid = 0.55bohr / crmult = 3.5 --> -8.025214Ht
    hgrid = 0.50bohr / crmult = 4.0 --> -8.031315Ht
    hgrid = 0.45bohr / crmult = 4.5 --> -8.032501Ht
    hgrid = 0.40bohr / crmult = 5.0 --> -8.033107Ht
    hgrid = 0.35bohr / crmult = 5.5 --> -8.033239Ht
    hgrid = 0.30bohr / crmult = 6.0 --> -8.033300Ht
    hgrid = 0.20bohr / crmult = 7.0 --> -8.033319Ht
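The convergence is easier to read as an error with respect to the most accurate run. The following sketch takes the energies listed above and prints the difference to the last entry in milli-Hartree. Using the finest setting as the reference is a choice made here for illustration, not a statement about the exact energy.

```python
# (hgrid in bohr, crmult, total energy in Hartree), copied from the list above
RESULTS = [
    (0.55, 3.5, -8.025214),
    (0.50, 4.0, -8.031315),
    (0.45, 4.5, -8.032501),
    (0.40, 5.0, -8.033107),
    (0.35, 5.5, -8.033239),
    (0.30, 6.0, -8.033300),
    (0.20, 7.0, -8.033319),
]

reference = RESULTS[-1][2]               # finest grid taken as the reference
errors_mha = [(energy - reference) * 1000.0 for _, _, energy in RESULTS]

print(" hgrid  crmult  error (mHa)")
for (hgrid, crmult, _), err in zip(RESULTS, errors_mha):
    print(f"{hgrid:6.2f} {crmult:7.1f} {err:12.3f}")
```

The error shrinks at every step, which is the systematic behaviour discussed in this section.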
To go further, one can vary hgrid and crmult independently. This is shown in the previous figure with the grey line. The shape of the convergence curve shows that both parameters should be modified simultaneously in order to increase accuracy. Indeed, two kinds of error arise from the basis set. The first is due to the fact that the basis set cannot account for quickly varying wavefunctions (the value of hgrid should be decreased). The second is due to the fact that the wavefunctions are constrained to stay inside the region covered by the basis set (their values are zero outside it). In the latter case crmult should be raised.
Fine tuning of the basis set
The multi-scale property of the wavelets is used in BigDFT, and a two-level grid is used for the calculation. We have seen previously the coarse grid definition using the multiplying factor crmult. The second multiplying value on this line of the input file is used for the fine grid and is called frmult. Like crmult, it is a factor applied to the radii that define the fine grid region, where the number of degrees of freedom is eight times that of the coarse grid. It defines a region near the atoms where the wavefunctions are allowed to vary more quickly. Typical values for this factor are 8 to 10. It is worth noting that even if the value of this multiplier is greater than crmult, it defines a smaller region, because the radii the two factors multiply are significantly different.
The physical quantities multiplied by crmult and frmult can be changed in the pseudo-potential file by adding an additional line with two values in bohr. The two values that the code uses (either computed or read from the pseudo-potential files) are printed in the screen output in the following way:
    ------------------------------------------------------------------ System Properties
    Atom  N.Electr.  PSP Code  Radii: Coarse  Fine     CoarsePSP  Calculated  File
    Si    4          10        1.80603        0.43563  0.93364    X
    H     1          10        1.46342        0.20000  0.00000    X
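To see why a larger frmult still gives a smaller region, the radii printed above can be multiplied by the two factors. The sketch below assumes that the extent of each region around an atom is simply the multiplier times the corresponding radius, which is a simplified reading of the text. It uses crmult = 5.0 and frmult = 9.0, the values quoted for input.dft in this lesson.

```python
# (coarse radius, fine radius) in bohr, taken from the output above
RADII = {"Si": (1.80603, 0.43563), "H": (1.46342, 0.20000)}

def region_radii(symbol, crmult, frmult):
    """Return (coarse extent, fine extent) in bohr for one species.

    Assumption: extent = multiplier * radius.
    """
    coarse, fine = RADII[symbol]
    return crmult * coarse, frmult * fine

crmult, frmult = 5.0, 9.0
for symbol in RADII:
    coarse_extent, fine_extent = region_radii(symbol, crmult, frmult)
    print(f"{symbol:2s} coarse region: {coarse_extent:6.3f} bohr, "
          f"fine region: {fine_extent:6.3f} bohr")
```

For both species the fine region stays well inside the coarse one, even though frmult is almost twice crmult.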
Analysing the output
The output of BigDFT is divided into four parts:
- Input values are printed out, including a summary of the different input files (DFT calculation parameters, atom positions, pseudo-potential values...).
- Input wavefunction creation, usually called "input guess".
- The SCF loop itself.
- The post-SCF calculations, including the calculation of the forces and other possible treatments such as a finite-size effect estimation or a determination of virtual states.
The system parameters output
All the values read from the different input files are printed out at program startup. Some additional values are also provided there, like the memory consumption. Values are given for one process, which corresponds to one core in an MPI environment.
    Estimation performed for 1 processors.
    Memory occupation for principal arrays:
      Poisson Solver Kernel (K):            11 MB   9 KB
      Poisson Solver Density (D):           10 MB 736 KB
      Single Wavefunction for one orbital:   0 MB 412 KB
      All Wavefunctions for each processor:  3 MB 217 KB
      Wavefunctions + DIIS per proc (W):    22 MB 493 KB
      Nonlocal Pseudopotential Arrays (P):   1 MB 256 KB
      Arrays of full uncompressed grid (U): 10 MB 445 KB
    Estimation of Memory requirements for principal code sections:
      Kernel calculation | Density Construction | Poisson Solver | Hamiltonian application
           ~11*K         |    ~W+(~3)*U+P       |  ~8*D+K+W+P    |    ~W+(~3)*U+P
           121MB         |        59MB          |     120MB      |        63MB
    The overall memory requirement needed for this calculation is thus: 121 MB
In this example, the memory requirement is given for a run on one process. The memory peak (121 MB) occurs in the initialisation, during the Poisson solver kernel creation, while the SCF loop reaches 120 MB during the Poisson solver calculation. For bigger systems, with more orbitals, the memory peak is usually reached during the Hamiltonian application.
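Two of the estimates can be reproduced directly from the array sizes printed in the output. The sketch below evaluates the formulas given for the kernel calculation and the Poisson solver, assuming 1 MB = 1024 KB. The other two sections involve the approximate factor written (~3) in the output, so they are not expected to be reproduced exactly and are left out.

```python
def mb(megabytes, kilobytes):
    """Convert an 'X MB Y KB' pair to MB (assumption: 1 MB = 1024 KB)."""
    return megabytes + kilobytes / 1024.0

K = mb(11, 9)      # Poisson Solver Kernel
D = mb(10, 736)    # Poisson Solver Density
W = mb(22, 493)    # Wavefunctions + DIIS per proc
P = mb(1, 256)     # Nonlocal Pseudopotential Arrays

kernel = 11 * K                  # "~11*K" in the output
poisson = 8 * D + K + W + P      # "~8*D+K+W+P" in the output

print(f"Kernel calculation: {kernel:6.1f} MB")
print(f"Poisson Solver:     {poisson:6.1f} MB")
print(f"Peak of the two:    {max(kernel, poisson):6.1f} MB")
```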
Exercise: run a small utility program provided with BigDFT, called bigdft-tool, to estimate the memory requirement of a run before submitting it to the queue system of a super-computer. It reads the same input files as the bigdft executable, and is thus convenient to validate inputs.
The executable takes one mandatory argument, the number of cores to run BigDFT on. Try several values from 1 to 6 and discuss the memory distribution.
    user@garulfo:~/CH4/$ ls
    bigdft-tool psppar.C psppar.H input.dft posinp.xyz
    user@garulfo:~/CH4/$ ./bigdft-tool 2
    ...
BigDFT distributes the orbitals over the available processes (the value W does not decrease anymore after 4 processes since there are only 4 bands in our example). This means that running a parallel job with more processes than orbitals will result in a poor speedup. The number of cores involved in the calculation can however be increased via OMP parallelisation, as indicated in this lesson.
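The plateau after 4 processes follows from simple counting. The sketch below assumes that the orbitals are spread as evenly as possible over the processes, which is an illustration and not a description of the actual distribution scheme of BigDFT. It prints the largest number of orbitals held by one process and the number of processes left without any orbital.

```python
import math

def max_orbitals_per_process(norb, nproc):
    """Largest number of orbitals on one process for an even distribution."""
    return math.ceil(norb / nproc)

def idle_processes(norb, nproc):
    """Number of processes that receive no orbital at all."""
    return max(0, nproc - norb)

NORB = 4  # four bands for the methane example
loads = [max_orbitals_per_process(NORB, n) for n in range(1, 7)]
idle = [idle_processes(NORB, n) for n in range(1, 7)]

for nproc, (load, unused) in enumerate(zip(loads, idle), start=1):
    print(f"{nproc} process(es): at most {load} orbital(s) each, {unused} idle")
```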
The input guess
The initial wavefunctions in BigDFT are calculated using the atomic orbitals for all the electrons of the s, p, d shells, obtained from the solution of the PSP self-consistent equation for the isolated atom.
    ------------------------------------------------------- Input Wavefunctions Creation
    Generating 8 Atomic Input Orbitals
    Processes from 0 to 1 treat 4 inguess orbitals
    Calculating AIO wavefunctions:
    Generation of input wavefunction data for atom C:
      Elec. Configuration: s 2.00 , p 2/3 2/3 2/3 , ... done.
    Generation of input wavefunction data for atom H:
      Elec. Configuration: s 1.00 , ... done.
The corresponding hamiltonian is then diagonalised and the n_band (norb in the code notations) lowest eigenfunctions are used to start the SCF loop. BigDFT outputs the eigenvalues. In the following example, 8 electrons were used in the input guess and the first four resulting eigenfunctions will be used for a four band calculation.
    Input Wavefunctions Orthogonalization:
    Overlap Matrix... Direct diagonalization...
    evale(1)= -6.49353915254710E-01 <-
    evale(2)= -3.62562636487377E-01 <-
    evale(3)= -3.62467583819684E-01 <-
    evale(4)= -3.62467583819682E-01 <- Last InputGuess eval, H-L IG gap: 20.6959 eV
    evale(5)= 3.98091665658305E-01 <- First virtual eval
    evale(6)= 3.98308777292841E-01 <-
    evale(7)= 3.98308777292842E-01 <-
    evale(8)= 5.99339322351149E-01 <-
    Building orthogonal Wavefunctions... done.
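The gap reported in this output can be checked from the eigenvalues themselves. The sketch below takes the difference between the first virtual eigenvalue and the last occupied one and converts it from Hartree to eV. The conversion factor 27.211386 eV per Hartree is supplied here and does not come from the output.

```python
HARTREE_TO_EV = 27.211386   # conversion factor, supplied for this example

# Input guess eigenvalues in Hartree, copied from the output above
evale = [
    -6.49353915254710E-01,
    -3.62562636487377E-01,
    -3.62467583819684E-01,
    -3.62467583819682E-01,
    3.98091665658305E-01,
    3.98308777292841E-01,
    3.98308777292842E-01,
    5.99339322351149E-01,
]

norb = 4                             # number of occupied orbitals kept for the SCF loop
last_occupied = evale[norb - 1]      # evale(4) in the 1-based numbering of the output
first_virtual = evale[norb]          # evale(5)
gap_ev = (first_virtual - last_occupied) * HARTREE_TO_EV

print(f"H-L input guess gap: {gap_ev:.4f} eV")
```

The printed value can be compared with the "H-L IG gap" figure in the output.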
The SCF loop
The SCF loop follows a direct minimisation scheme and is made of the following steps:
- Calculate the charge density from the previous wavefunctions.
- Apply the Poisson solver to obtain the Hartree potential from the charges and calculate the exchange-correlation energy and the energy of the XC potential thanks to the chosen functional.
- Apply the resulting hamiltonian on the current wavefunctions.
- Precondition the result and apply a steepest descent or a DIIS history method (depending on the 8th line of the input.dft file, the second value being the DIIS history length, which is usually 5 or 6 and should be set to 0 for SD minimisation).
- Orthogonalise the new wavefunctions.
Then, BigDFT outputs a summary of the parts of the energy:
    ekin_sum,epot_sum,eproj_sum  6.84560740872E+00 -1.00862806135E+01  5.49669947400E-01
    ehart, eexcu, vexcu          1.57287496892E+01 -3.14375984400E+00 -4.11099591069E+00
Finally the total energy and the square norm of the residue (gnrm) are printed out. The gnrm value is the stopping criterion. It is set on the sixth line of the input.dft file. A common value is 1e-4, and for a well-converged result it can be lowered to 1e-5.
iter,total energy,gnrm 2 -8.01285441883433336E+00 7.72E-02
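The role of the residue as a stopping criterion can be illustrated on a toy problem. The sketch below is not BigDFT's algorithm: it has no density, no Poisson solver and no preconditioner. It only applies a small fixed 3x3 matrix, standing in for the hamiltonian, to a single vector, takes a steepest descent step along the residue, renormalises, and stops when the norm of the residue (playing the role of gnrm here) falls below 1e-4. The matrix, the step length and the function names are all assumptions made for the example.

```python
import math

# Toy symmetric "hamiltonian"; its lowest eigenvalue is 2 - sqrt(2)
H = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]

def apply_h(psi):
    return [sum(H[i][j] * psi[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalise(psi):
    norm = math.sqrt(dot(psi, psi))
    return [x / norm for x in psi]

def minimise(psi, step=0.3, gnrm_stop=1.0e-4, itermax=50):
    """Steepest descent on one vector; returns (energy, gnrm, iterations)."""
    psi = normalise(psi)
    for it in range(1, itermax + 1):
        hpsi = apply_h(psi)                      # apply the hamiltonian
        energy = dot(psi, hpsi)                  # expectation value
        residue = [hp - energy * p for hp, p in zip(hpsi, psi)]
        gnrm = math.sqrt(dot(residue, residue))  # norm of the residue
        print(f"iter {it:2d}  energy {energy:.10f}  gnrm {gnrm:.2e}")
        if gnrm < gnrm_stop:
            break
        # steepest descent step, then restore the normalisation
        psi = normalise([p - step * r for p, r in zip(psi, residue)])
    return energy, gnrm, it

energy, gnrm, niter = minimise([1.0, 0.0, 0.0])
```

The energy decreases at each iteration while the residue shrinks, which is the pattern to look for in the "iter,total energy,gnrm" lines of a real run.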
Exercise: run `grep "total energy" screenOutput`
and
look at the convergence rate for our methane molecule.
The minimisation scheme coupled with DIIS (and thanks to the good preconditioner) is a very efficient way to obtain convergence for systems with a gap, even a very small one. A usual run should reach the 1e-4 stop criterion within 15 to 25 iterations. Otherwise, there is an issue with the system: for instance there is no gap, the input guess is too symmetric due to the LCAO diagonalization, or the system has a specific spin polarization.
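The same information as the grep command can be collected programmatically, for instance to count iterations or to plot gnrm. The sketch below is a hypothetical helper that assumes each relevant line has the layout of the example shown in this section: the text "iter,total energy,gnrm" followed by the iteration number, the energy and gnrm.

```python
MARKER = "iter,total energy,gnrm"

def parse_iterations(output_text):
    """Collect (iteration, total energy, gnrm) tuples from a screen output text."""
    records = []
    for line in output_text.splitlines():
        if MARKER in line:
            fields = line.split(MARKER, 1)[1].split()
            records.append((int(fields[0]), float(fields[1]), float(fields[2])))
    return records

def reached(records, gnrm_stop=1.0e-4):
    """True if the last recorded gnrm is below the stopping criterion."""
    return bool(records) and records[-1][2] < gnrm_stop

# Example with the single line quoted in this lesson
sample = "iter,total energy,gnrm 2 -8.01285441883433336E+00 7.72E-02\n"
records = parse_iterations(sample)
print(records)
print("converged:", reached(records))

# For a real run, the text would be read from the redirected output, e.g.
#   with open("screenOutput") as handle:
#       records = parse_iterations(handle.read())
```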
The post-SCF treatments
At the end of the SCF loop, a diagonalisation of the current hamiltonian is done to obtain Kohn-Sham eigenfunctions. The corresponding eigenvalues are also given.
The forces are then calculated.
Other post-SCF treatments may be run depending on the input.dft file:
- One can run an estimation of finite-size effects. This is explained in the manual (which is not yet completely updated to recent BigDFT versions).
- One can run a Davidson treatment on the current hamiltonian to obtain the energies (and virtual wavefunctions) of the first unoccupied levels.
Exercise: Before going further, review the
input.dft
file to identify the meaning of the different
lines as explained previously.
1st line, "0.450 0.450 0.450" hx, hy, hz are the grid spacing in the three directions.
2nd line, "5.0 9.0" crmult, frmult define the basis set real space expansion.
3rd line, "1" defines the exchange correlation functional, following the ABINIT numbering convention.
6th line, "1.e-04" is the stop criterion.
7th line, "50 10" the first value is the maximum number of SCF iterations and the second is the maximum number of restarts after a fresh diagonalisation if convergence is not reached.
8th line, "6 6" the second value is the length of the DIIS history and should be set to 0 to use SD instead.
Exercise: run bigdft-tool
when varying the
DIIS history length and discuss the memory consumption.
Reducing the DIIS history is a good way to reduce the memory consumption when one cannot increase the number of processes. Of course this implies more iterations in the SCF loop.
Adding a charge
BigDFT can treat charged systems without the need for a compensating background charge, unlike plane-wave codes.
The additional charge to add to the system is set on the fourth line of the input.dft file. In the following example an electron has been added (-1):
-1 0.0 0.0 0.0 ncharge efield
Exercise: remove the last hydrogen atom in the previous
methane example and modify input.dft
to add an
electron. Then run BigDFT for an electronic convergence.
One can notice that the total electronic charge in the system is indeed -8 (8 electrons) thanks to the additional charge. The convergence rate is still good for this CH3- anion since it is a closed shell system.
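The electron count behind this statement is a short sum. The sketch below uses the valence electrons of the two pseudo-atoms of this lesson (4 for C and 1 for H, as in the electronic configurations printed by the input guess) and the sign convention of the ncharge field, where -1 means one added electron. The function name is hypothetical.

```python
# Valence electrons carried by each pseudo-atom in this lesson
VALENCE = {"C": 4, "H": 1}

def electron_count(symbols, ncharge=0):
    """Number of electrons for a list of atoms; ncharge = -1 adds one electron."""
    return sum(VALENCE[s] for s in symbols) - ncharge

ch4 = electron_count(["C", "H", "H", "H", "H"])
ch3_minus = electron_count(["C", "H", "H", "H"], ncharge=-1)

for name, nelec in (("CH4", ch4), ("CH3-", ch3_minus)):
    shell = "closed shell" if nelec % 2 == 0 else "open shell"
    print(f"{name}: {nelec} electrons, {nelec // 2} doubly occupied orbitals, {shell}")
```

Both systems have 8 electrons, hence four doubly occupied orbitals, which is why the convergence behaviour is similar.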
Running a geometry optimisation
In the previous charged example, the geometry of the CH3- ion is kept the same as for the methane molecule, while it is likely to change. One can thus optimise the geometry with BigDFT.
To run geometry calculations (molecular dynamics, structure optimisations...) one should add another input file called input.geopt. The first line of this file contains the method to use. Here, we look for a local minimum, so we can use the keyword LBFGS. The third line of this file contains the stopping criteria. There are two stopping criteria: the first being ... and the second being the maximum of the forces. For isolated systems, the first criterion is well adapted, while the second is good for periodic boundary conditions.
Exercise: take the posinp.xyz file of the CH3- ion, add an input.geopt file and run a geometry optimisation.
The evolution of the forces during relaxation can easily be obtained by running `grep FORCES screenOutput`. At each iteration, BigDFT outputs a file posoutXXX.xyz with the geometry of iteration XXX.
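To follow the relaxation step by step, the geometry files can be put in iteration order. The sketch below is a hypothetical helper: it assumes that XXX in posoutXXX.xyz is a decimal iteration number, and the file names in the example are made up for illustration.

```python
import re

POSOUT_PATTERN = re.compile(r"^posout(\d+)\.xyz$")

def iteration_of(filename):
    """Return the iteration number encoded in a posoutXXX.xyz name, or None."""
    match = POSOUT_PATTERN.match(filename)
    return int(match.group(1)) if match else None

def sorted_posout(filenames):
    """Keep only the posout files and sort them by iteration number."""
    found = [(iteration_of(name), name) for name in filenames
             if iteration_of(name) is not None]
    return [name for _, name in sorted(found)]

# Hypothetical directory listing
names = ["posout010.xyz", "posinp.xyz", "posout002.xyz", "posout000.xyz"]
print(sorted_posout(names))

# In a run directory the listing could come from os.listdir(".")
```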