Blocked-Davidson algorithm
Latest revision as of 12:31, 14 November 2023
The workflow of the blocked-Davidson iterative matrix diagonalization scheme implemented in VASP is as follows:[1][2]
- Take a subset (block) of [math]\displaystyle{ n_1 }[/math] orbitals out of the total set of NBANDS orbitals:
- [math]\displaystyle{ \{ \psi_n| n=1,..,N_{\rm bands}\}\Rightarrow \{ \psi^1_k| k=1,..,n_1\} }[/math].
- Extend the subspace spanned by [math]\displaystyle{ \{\psi^1\} }[/math] by adding the preconditioned residual vectors of [math]\displaystyle{ \{\psi^1\} }[/math]:
- [math]\displaystyle{ \left \{ \psi^1_k \, / \, g^1_k = \left (1- \sum_{n=1}^{N_{\rm bands}} | \psi_n \rangle \langle\psi_n | {\bf S} \right) {\bf K} \left ({\bf H} - \epsilon_{\rm app} {\bf S} \right ) \psi^1_k \, | \, k=1,..,n_1 \right \}. }[/math]
- Rayleigh-Ritz optimization ("subspace rotation") within the [math]\displaystyle{ 2n_1 }[/math]-dimensional space spanned by [math]\displaystyle{ \{\psi^1/g^1\} }[/math], to determine the [math]\displaystyle{ n_1 }[/math] lowest eigenvectors:
- [math]\displaystyle{ {\rm diag}\{\psi^1/g^1\} \Rightarrow \{ \psi^2_k| k=1,..,n_1\} }[/math]
- Extend the subspace with the residuals of [math]\displaystyle{ \{\psi^2\} }[/math]:
- [math]\displaystyle{ \left \{ \psi^2_k \,/ \, g^1_k \, / \, g^2_k = \left (1- \sum_{n=1}^{N_{\rm bands}} | \psi_n \rangle \langle\psi_n | {\bf S} \right ) {\bf K} \left ({\bf H} - \epsilon_{\rm app} {\bf S} \right) \psi^2_k \, | \, k=1,..,n_1 \right \}. }[/math]
- Rayleigh-Ritz optimization ("subspace rotation") within the [math]\displaystyle{ 3n_1 }[/math]-dimensional space spanned by [math]\displaystyle{ \{\psi^1/g^1/g^2\} }[/math]:
- [math]\displaystyle{ {\rm diag}\{\psi^1/g^1/g^2\} \Rightarrow \{ \psi^3_k| k=1,..,n_1\} }[/math]
- If need be, the subspace may be extended further by repeating this cycle of adding residual vectors and performing a Rayleigh-Ritz optimization in the resulting subspace:
- [math]\displaystyle{ {\rm diag}\{\psi^1/g^1/g^2/../g^{d-1}\}\Rightarrow \{ \psi^d_k| k=1,..,n_1\} }[/math]
- By default, VASP will not iterate deeper than [math]\displaystyle{ d=4 }[/math], though it may stop sooner once certain criteria that measure the convergence of the orbitals have been met.
- When the iteration is finished, store the optimized block of orbitals back into the set:
- [math]\displaystyle{ \{ \psi^d_k| k=1,..,n_1\} \Rightarrow \{ \psi_k| k=1,..,N_{\rm bands}\} }[/math].
- Move on to the next block [math]\displaystyle{ \{ \psi^1_k| k=n_1+1,..,2 n_1\} }[/math].
- When LDIAG=.TRUE. (default), a Rayleigh-Ritz optimization in the complete subspace [math]\displaystyle{ \{ \psi_k| k=1,..,N_{\rm bands}\} }[/math] is performed after all orbitals have been optimized.
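The workflow above can be illustrated with a minimal NumPy sketch. It is a toy model under stated assumptions, not VASP's implementation: the Hamiltonian is a small dense real symmetric matrix, the overlap operator S is taken to be the identity, the preconditioner K is a clamped diagonal (Jacobi-like) approximation, and the approximate eigenvalue ε_app is taken to be the Rayleigh quotient of each orbital. The function names `rayleigh_ritz` and `blocked_davidson_sweep`, the tolerance, and the clamping threshold are hypothetical choices made for illustration.

```python
import numpy as np


def rayleigh_ritz(H, basis, n_keep):
    """Return the n_keep lowest Ritz vectors and Ritz values of H in span(basis)."""
    Q, _ = np.linalg.qr(basis)                      # orthonormalize the subspace
    ritz_vals, coeffs = np.linalg.eigh(Q.T @ H @ Q)  # "subspace rotation"
    return Q @ coeffs[:, :n_keep], ritz_vals[:n_keep]


def blocked_davidson_sweep(H, psi, n1, max_depth=4, tol=1e-10):
    """One sweep over all blocks of n1 orbitals (assumes S = identity).

    H   : (dim, dim) real symmetric matrix
    psi : (dim, n_bands) array with orthonormal columns
    """
    psi = psi.copy()
    n_bands = psi.shape[1]
    h_diag = np.diag(H)
    for start in range(0, n_bands, n1):
        cols = slice(start, min(start + n1, n_bands))
        block = psi[:, cols]                         # psi^1 for this block
        basis = block
        for _ in range(1, max_depth):                # at most max_depth - 1 expansions
            h_block = H @ block
            eps = np.einsum("ik,ik->k", block, h_block)  # Rayleigh quotients (eps_app)
            resid = h_block - block * eps            # (H - eps_app S) psi with S = 1
            if np.linalg.norm(resid, axis=0).max() < tol:
                break
            # Assumed preconditioner K: inverse of (H_ii - eps), clamped near zero.
            denom = h_diag[:, None] - eps[None, :]
            denom[np.abs(denom) < 0.1] = 0.1
            g = resid / denom
            g -= psi @ (psi.T @ g)                   # project out all current orbitals
            norms = np.linalg.norm(g, axis=0)
            keep = norms > 1e-12                     # drop vanishing directions
            if not keep.any():
                break
            basis = np.hstack([basis, g[:, keep] / norms[keep]])
            block, _ = rayleigh_ritz(H, basis, block.shape[1])
        psi[:, cols] = block                         # store the optimized block
    # Final Rayleigh-Ritz step in the complete set (analogous to LDIAG=.TRUE.)
    return rayleigh_ritz(H, psi, n_bands)
```

Calling `blocked_davidson_sweep` repeatedly on its own output plays the role of successive diagonalization steps. With `max_depth=4` the subspace of a block grows to at most 4 × n1 vectors, mirroring the default iteration depth described above.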
The blocksize [math]\displaystyle{ n_1 }[/math] used in the blocked-Davidson algorithm can be set by means of the NSIM tag.
In principle [math]\displaystyle{ n_1= 2\times }[/math] NSIM, but for technical reasons it needs to be divisible by an integer N:
- [math]\displaystyle{ n_1 = {\rm int}\left(\frac{2*{\rm NSIM} + N - 1}{N}\right) N }[/math]
where [math]\displaystyle{ N }[/math] is the "number of band groups per k-point group":
- [math]\displaystyle{ N = \frac{{\rm \#\; of\; MPI\; ranks}}{{\rm IMAGES}*{\rm KPAR}*{\rm NCORE}} }[/math]
(see the section on parallelization basics).
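As a worked example of the two formulas above, the following short Python sketch computes the effective block size. The function name `davidson_block_size` is hypothetical, the default value of 1 for IMAGES, KPAR, and NCORE is an assumption made for illustration, and the sketch assumes that the number of MPI ranks is an exact multiple of IMAGES × KPAR × NCORE.

```python
def davidson_block_size(nsim, mpi_ranks, images=1, kpar=1, ncore=1):
    """Round 2*NSIM up to the next multiple of N (band groups per k-point group)."""
    ranks_per_band_group_unit = images * kpar * ncore
    if mpi_ranks % ranks_per_band_group_unit != 0:
        # Assumption of this sketch: the ranks divide evenly.
        raise ValueError("mpi_ranks must be a multiple of IMAGES*KPAR*NCORE")
    n_groups = mpi_ranks // ranks_per_band_group_unit      # N
    return ((2 * nsim + n_groups - 1) // n_groups) * n_groups


# NSIM = 4 on 16 ranks with NCORE = 4: N = 4, so n1 = int(11/4) * 4 = 8
print(davidson_block_size(4, 16, ncore=4))
# NSIM = 4 on 24 ranks with NCORE = 4: N = 6, so n1 = int(13/6) * 6 = 12
print(davidson_block_size(4, 24, ncore=4))
```

In the second case 2 × NSIM = 8 is not a multiple of N = 6, so the block size is rounded up to 12.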
As mentioned above, the optimization of a block of orbitals stops when either the maximum iteration depth (NRMM) or a convergence threshold has been reached. The latter may be fine-tuned by means of the EBREAK, DEPER, and WEIMIN tags. Note: we do not recommend doing so. Rely on the defaults instead.
The blocked-Davidson algorithm is approximately 1.5 to 2 times slower than RMM-DIIS, but more robust.