Dear VASP community,
I have recently compiled VASP 6.5.1 and would like to report some testsuite failures and ask whether they represent genuine issues with our build or known reference incompatibilities. Ultimately, I would like to know whether this compilation can be trusted for production despite these test failures.
HARDWARE
The cluster has two node types:
- 180× Lenovo SR645 (sr nodes): 2× AMD EPYC 7H12 (Zen 2, 64c/socket, 128c/node), 512GB RAM, HDR100 InfiniBand
- 34× Lenovo SR645 v3 (bc nodes): 2× AMD EPYC 9754 (Zen 4c Bergamo, 128c/socket, 256c/node), 768GB RAM, 2× HDR200 InfiniBand
OS: openSUSE Leap 15.4
---
TOOLCHAIN
We use GCC 13.2 + OpenMPI 5.0.2 with the following key libraries:
- OpenBLAS 0.3.26 (runtime)
- MKL 2021.4 (linked as BLAS/LAPACK backend for ELPA compatibility)
- ScaLAPACK 2.2.0
- ELPA 2021.11.002 (OpenMP variant: -lelpa_openmp)
- FFTW 3.3.10 (with _threads variant)
- HDF5 1.14.4
- Wannier90 3.1.0
Key compilation flags (znver4 binary; the znver2 binary uses -march=znver2 -mtune=znver2 instead):
-march=znver4 -mtune=znver4 -O3 -ffast-math -funroll-loops -fomit-frame-pointer -fopenmp
Two binaries were produced:
- znver4: 323,248 AVX-512 zmm instructions, confirmed via objdump
- znver2: ~520 zmm instructions, all from MKL (none in the VASP code itself)
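For reference, the zmm counts above were obtained with a pipeline along these lines (the binary path is illustrative, not our actual layout):

```shell
# Count disassembled instructions that reference AVX-512 zmm registers.
# Point the path at your own build; objdump emits one instruction per line,
# so a line count is an instruction count here.
count_zmm() { objdump -d "$1" | grep -c 'zmm'; }
# count_zmm bin/vasp_std

# Same grep logic, demonstrated on synthetic disassembly text:
printf 'vmovupd %%zmm0,(%%rax)\nvaddpd %%zmm1,%%zmm2,%%zmm3\nmov %%rax,%%rbx\n' | grep -c 'zmm'   # -> 2
```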
Note: ELPA was compiled against MKL (LP64 interface). Linking OpenBLAS directly alongside ELPA caused DGETRF integer-interface errors, so MKL serves as the BLAS/LAPACK backend, while OpenBLAS is loaded only for runtime symbol resolution.
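A minimal sketch of the resulting link order, assuming the GNU-Fortran MKL interface layer; the library names and MKL path are assumptions, not our exact makefile.include line:

```shell
# ELPA first, then MKL LP64 as the BLAS/LAPACK backend it was built against;
# OpenBLAS is deliberately absent from the link line.
LLIBS="-lelpa_openmp -L${MKLROOT:-/opt/intel/mkl}/lib/intel64 \
       -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm -ldl"
echo "$LLIBS"
```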
---
RUNTIME CONFIGURATION
Hybrid MPI+OpenMP: 32 ranks × 8 OpenMP threads = 256 cores per bc node.
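The per-node launch looks roughly like this (an OpenMPI sketch; the binding options are assumptions on our part, not a verified recipe):

```shell
# One bc node: 32 MPI ranks x 8 OpenMP threads = 256 cores.
export OMP_NUM_THREADS=8
export OMP_PLACES=cores
export OMP_PROC_BIND=close
# 16 ranks per socket, 8 processing elements (cores) bound per rank:
# mpirun -np 32 --map-by ppr:16:socket:PE=8 ./vasp_std
echo $((32 * OMP_NUM_THREADS))   # -> 256
```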
ELPA warning at runtime: "MPI threading level MPI_THREAD_SERIALIZED or MPI_THREAD_MULTIPLE required but your implementation does not support this. The number of OpenMP threads within ELPA will be limited to 1." We understand this is a compile-time limitation of OpenMPI on our cluster and accept it.
---
TESTSUITE RESULTS
The testsuite was run with LSCALAPACK=.FALSE., using 4 MPI ranks and OMP_NUM_THREADS=1.
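The invocation was along these lines (a sketch; the VASP_TESTSUITE_EXE_STD variable name is from the VASP 6 testsuite documentation as we recall it, and the binary path is illustrative):

```shell
# Run the testsuite serially in OpenMP, 4 MPI ranks per test.
export OMP_NUM_THREADS=1
export VASP_TESTSUITE_EXE_STD="mpirun -np 4 /path/to/bin/vasp_std"
# make test      # with LSCALAPACK = .FALSE. applied to the test inputs
echo "$VASP_TESTSUITE_EXE_STD"
```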
Failed tests:
1. andersen_nve_constrain_fixed and andersen_nve_constrain_fixed_RPR
- andersen_nve PASSES
- andersen_nve_constrain_fixed FAILS
- Energy differences appear from step 3 onward, suggesting trajectory divergence
- RANDOM_SEED is explicitly set in INCAR (245175543 3381 0)
- Failure is identical on znver2 and znver4 binaries
- Failure persists after removing -ffast-math and recompiling
Question: Is this a known GCC-vs-Intel reference incompatibility for constrained NVE MD?
2. HEG_333_LW
- We suspect this fails because of LSCALAPACK=.FALSE. in our testsuite run: the test is categorized as RPA/GW and likely requires ScaLAPACK's distributed eigensolver. Is this correct?
3. SiC8_GW0R
- This test appears to hang indefinitely on our AMD EPYC nodes. We have seen reports that this is a known AMD-specific issue. Can you confirm?
---
All other tests pass. Production calculations (LDA+U, relaxations, static SCF) produce physically reasonable results. We would greatly appreciate confirmation on whether the three failure categories above are known issues with GCC-compiled VASP on AMD hardware, or whether they indicate a genuine problem with our build.
Thank you,
Felipe

