Hello,
When running the GPU-accelerated version of VASP 6.5.1, I noticed that my KPOINTS_OPT loop was significantly slower when using hybrid functionals. I checked the GPU utilization of these runs during the SCF loop and during the KPOINTS_OPT loop: it hovers around 90% during the SCF loop (good) but stays at 0% during the KPOINTS_OPT loop, while CPU utilization increases. I later tested a PBE run and confirmed that the GPU is not used during the KPOINTS_OPT loop there either.
Is GPU offloading not supported for KPOINTS_OPT?
A few notes:
- Changing KPOINTS_OPT_BATCH had no effect on the GPU utilization.
- I run on the Delta (https://docs.ncsa.illinois.edu/systems/ ... index.html) GPU nodes, which have up to 4 A100 GPUs and one 64-core AMD processor. Changing the number of GPUs does not change the fact that the utilization drops to 0% during the KPOINTS_OPT step.
- Compilers are the Cray-MPICH compilers; makefile.include is attached.
This is my first time running KPOINTS_OPT on GPUs, so I do not have "working" comparisons or other hardware to test on. Until now I have only run calculations with KPOINTS_OPT on CPUs, and I never noticed a performance change within the KPOINTS_OPT loop there.
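For anyone who wants to reproduce the utilization check described above, it could be scripted roughly as follows. This is only a minimal sketch: it assumes `nvidia-smi` is available on the compute node, and the function names (`parse_utilization`, `sample_utilization`, `monitor`) are hypothetical helpers, not part of VASP or of any Delta tooling.

```python
import subprocess
import time


def parse_utilization(csv_text):
    """Parse 'index, utilization' CSV lines into {gpu_index: percent}.

    Expects the output of:
      nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits
    """
    result = {}
    for line in csv_text.strip().splitlines():
        index, util = (field.strip() for field in line.split(","))
        result[int(index)] = int(util)
    return result


def sample_utilization():
    """Query nvidia-smi once (assumed to be on PATH) and return per-GPU utilization."""
    completed = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_utilization(completed.stdout)


def monitor(interval_s=5.0, samples=10):
    """Print a timestamped utilization sample every interval_s seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), sample_utilization())
        time.sleep(interval_s)
```

Running `monitor()` on the compute node alongside the VASP job and comparing the timestamps with the progress in OUTCAR should show which phase (SCF or KPOINTS_OPT) each sample belongs to.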
Thank you,
Erick

