BSE Wavefunction calculation crashes with OpenACC FATAL ERROR (PRESENT clause w1(:) missing on device) in vasp.6.6.0 GPU
VASP Version and Build Environment:
- VASP Version: 6.6.0 (Released 06Mar2026, built Apr 11 2026)
- Execution Mode: Complex, Parallel with OpenACC GPU acceleration enabled (vasp_ncl)
- Compiler/Toolchain: NVIDIA HPC SDK (NVFORTRAN compiler with OpenACC)
- Hardware Setup: 1 node, 4 MPI ranks, 24 threads/rank, 4x NVIDIA H100 GPUs (Compute Capability 9.0)
Description of the Bug:
When running a Bethe-Salpeter Equation (BSE) calculation (ALGO = BSE) to compute and plot the excitonic wavefunctions/charge densities in real space via NBSEEIG and BSEHOLE (or BSEELECTRON), the GPU run crashes immediately after the matrix diagonalization step. The Hamiltonian setup, distribution, and diagonalization via ZHEEVX all complete successfully on the GPU.
However, as soon as the code reaches the subroutine that gathers and transforms the excitonic wavefunctions in order to write the CHG.XX files, an OpenACC runtime error is triggered. The runtime's PRESENT check fails because the array w1(:) has not been mapped to device memory. The fatal error occurs in wave_high.f90, inside the subroutine w1_gather_ksel_nofft (around line 3827).
My guess is that an OpenACC directive (likely !$acc parallel loop present(w1) or a similar !$acc directive) asserts that the array w1 already resides in device memory. However, in the post-processing phase of the BSE states, where the wavefunctions are collected from the various k-points without Fast Fourier Transform (FFT) processing, the required data transfer (!$acc enter data copyin, or another data directive that creates the device copy) appears to be missing or bypassed on the host side before this device kernel is called.
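To illustrate the failure mode I suspect, here is a minimal standalone sketch. It is not taken from the VASP source: the program name, the array size, and the use of a plain complex array as a stand-in for w1 are all my own assumptions. It only shows the general OpenACC pattern in which a kernel carrying a present() clause requires the array to have been placed on the device beforehand.

```fortran
! Hypothetical standalone sketch, NOT VASP code.
! w1 here is a plain complex array standing in for the real data structure.
program present_sketch
  implicit none
  integer, parameter :: n = 8          ! arbitrary size chosen for illustration
  complex(kind=kind(1.0d0)), allocatable :: w1(:)
  integer :: i

  allocate(w1(n))
  w1 = (1.0d0, 0.0d0)

  ! Create the device copy of w1. If this directive is removed and the
  ! program is built with OpenACC enabled, the present(w1) check below
  ! fails at run time because w1 was never mapped to the device.
  !$acc enter data copyin(w1)

  ! The kernel asserts that w1 is already on the device.
  !$acc parallel loop present(w1)
  do i = 1, n
     w1(i) = 2.0d0 * w1(i)
  end do

  ! Copy the result back to the host and release the device copy.
  !$acc exit data copyout(w1)

  print *, 'w1(1) = ', w1(1)
  deallocate(w1)
end program present_sketch
```

If the real code path follows this pattern, the fix would presumably be either to add the missing data directive before the call to w1_gather_ksel_nofft, or to make the present() clause conditional on the GPU path actually being active. I cannot confirm this without access to the surrounding source.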
P.S. Standard calculations of the dielectric function, BSE fatbands, and so on run correctly on the same build and produce accurate results.
The input files for the GW@HSE06 + BSE calculation are attached.