NBLOCK FOCK: Difference between revisions
Revision as of 08:08, 31 March 2026
NBLOCK_FOCK = [integer]
| Default: NBLOCK_FOCK | = 64 | CPU build |
| | = 32 | GPU build (OpenACC/OpenMP offload) |
| | = OMP_NUM_THREADS | CPU build that is compiled with OpenMP support and threading is active (OMP_NUM_THREADS > 1) |
Description: Sets the number of orbitals that are processed simultaneously when computing the action of the Fock potential.
Instead of computing the action of the Fock potential on one orbital at a time, up to NBLOCK_FOCK orbitals are gathered and processed at once. This enables the use of matrix-matrix operations rather than matrix-vector operations, which is beneficial for performance on modern hardware.
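The grouping described above can be pictured with a minimal Python sketch. It only illustrates how a set of orbitals is split into batches of at most NBLOCK_FOCK; it is not how VASP implements the Fock operator, and the function names `orbital_blocks` and `count_batched_calls` are hypothetical.

```python
def orbital_blocks(n_orbitals, nblock_fock):
    """Yield (start, stop) index ranges that cover n_orbitals orbitals
    in batches of at most nblock_fock (hypothetical helper)."""
    if nblock_fock < 1:
        raise ValueError("nblock_fock must be a positive integer")
    for start in range(0, n_orbitals, nblock_fock):
        # The last batch may hold fewer than nblock_fock orbitals.
        yield start, min(start + nblock_fock, n_orbitals)


def count_batched_calls(n_orbitals, nblock_fock):
    """Number of batched operator applications needed for n_orbitals."""
    return sum(1 for _ in orbital_blocks(n_orbitals, nblock_fock))
```

In this picture, 100 orbitals with a block size of 32 are handled in four batches (three of 32 orbitals and one of 4), whereas a block size of 1 corresponds to one orbital at a time. Larger batches mean fewer but bigger operations, which is also why the block size influences memory consumption.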
Tuning NBLOCK_FOCK can significantly affect both the performance and memory consumption of hybrid functional calculations. Especially on GPUs, NBLOCK_FOCK should be tuned carefully to achieve optimal performance.
| Tip: On GPU architectures, the optimal value of NBLOCK_FOCK depends strongly on the specific hardware and the number of bands. It is recommended to experiment with values in the range 16-64. |
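One way to carry out such an experiment is to prepare one run directory per candidate value and compare the timings afterwards. The following Python sketch writes an INCAR per value; the directory naming, the helper `write_scan_inputs`, the base INCAR content, and the choice of scan values are assumptions made for illustration, and a real calculation needs the remaining input files and tags of the actual setup.

```python
from pathlib import Path

# Assumed minimal base settings; a real hybrid calculation needs more tags.
BASE_INCAR = "LHFCALC = .TRUE.\n"

# Candidate values inside the 16-64 range suggested in the tip (assumed grid).
SCAN_VALUES = [16, 32, 48, 64]


def write_scan_inputs(base_incar_text, values, root):
    """Create one directory per NBLOCK_FOCK value under root and write an
    INCAR that appends the tag to the base text. Returns the directories."""
    root = Path(root)
    created = []
    for value in values:
        run_dir = root / f"nblock_fock_{value}"  # hypothetical naming scheme
        run_dir.mkdir(parents=True, exist_ok=True)
        text = base_incar_text.rstrip("\n") + f"\nNBLOCK_FOCK = {value}\n"
        (run_dir / "INCAR").write_text(text)
        created.append(run_dir)
    return created
```

Calling `write_scan_inputs(BASE_INCAR, SCAN_VALUES, "scan")` would create `scan/nblock_fock_16` through `scan/nblock_fock_64`, each with its own INCAR. Since the optimal value depends on the hardware and the number of bands, the scan is best repeated for each combination of interest.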
Related tags and articles
NSIM, LHFCALC, AEXX, HFSCREEN, NCORE, NPAR, KPAR, PRECFOCK