Problems running VASP: crashes, internal errors, "wrong" results.
#1
by hashan_peiris » Thu Jul 31, 2025 7:41 pm
Hello,
I have noticed that some calculations (relaxations, single points, AIMD) tend to hang shortly after the "entering main loop" or the "N E dE d eps ncg rms rms(c)" line in the output. These hangs occur only for some systems, and only when I use a VASP binary compiled with Intel compilers (the same calculations run fine with VASP compiled using GNU- and NVIDIA-based compilers). I have observed this on multiple supercomputers, clusters, and workstations.
I have seen this from VASP 6.4.0 through VASP 6.5.1.
These are the modules that are loaded when I compile/run these calculations:
# INTEL 651
module load cpu/0.17.3b
module load intel/19.1.3.304/6pv46so
module load intel-mpi/2019.10.317/ezrfjne
module load intel-mkl/2020.4.304
ml hdf5/1.10.7
mpiexec -n $SLURM_NTASKS /expanse/projects/qstore/sbi121/SOFTWARE/EXE/INTEL/vasp_gam
or
mpirun -n $SLURM_NTASKS /expanse/projects/qstore/sbi121/SOFTWARE/EXE/INTEL/vasp_gam
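For completeness, the module loads and launch command above can be combined into a single Slurm batch script. This is only a sketch: the job name, node count, ranks per node, and walltime are placeholders I added, while the module names and binary path are taken verbatim from the list above (they are specific to this machine).

```shell
#!/bin/bash
#SBATCH --job-name=vasp-gam
#SBATCH --nodes=1            # placeholder: adjust to your allocation
#SBATCH --ntasks-per-node=32 # placeholder: adjust to your layout
#SBATCH --time=02:00:00      # placeholder walltime

# Intel toolchain used for the hanging builds (machine-specific modules)
module load cpu/0.17.3b
module load intel/19.1.3.304/6pv46so
module load intel-mpi/2019.10.317/ezrfjne
module load intel-mkl/2020.4.304
module load hdf5/1.10.7

# Launch the gamma-only binary; mpirun -n $SLURM_NTASKS behaves equivalently
mpiexec -n $SLURM_NTASKS /expanse/projects/qstore/sbi121/SOFTWARE/EXE/INTEL/vasp_gam
```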
This was the makefile.include used in a recent compilation:
# Default precompiler options
CPP_OPTIONS = -DHOST=\"LinuxIFC\" \
-DMPI -DMPI_BLOCK=8000 -Duse_collective \
-DscaLAPACK \
-DCACHE_SIZE=4000 \
-Davoidalloc \
-Dvasp6 \
-Dtbdyn \
-Dfock_dblbuf
CPP = fpp -f_com=no -free -w0 $*$(FUFFIX) $*$(SUFFIX) $(CPP_OPTIONS)
FC = mpiifort
FCL = mpiifort
FREE = -free -names lowercase
FFLAGS = -assume byterecl -w
OFLAG = -O2
OFLAG_IN = $(OFLAG)
DEBUG = -O0
# For what used to be vasp.5.lib
CPP_LIB = $(CPP)
FC_LIB = $(FC)
CC_LIB = icc
CFLAGS_LIB = -O
FFLAGS_LIB = -O1
FREE_LIB = $(FREE)
OBJECTS_LIB = linpack_double.o
# For the parser library
CXX_PARS = icpc
LLIBS = -lstdc++
# When compiling on the target machine itself, change this to the
# relevant target when cross-compiling for another architecture
VASP_TARGET_CPU ?= -march=core-avx2
FFLAGS += $(VASP_TARGET_CPU)
# Intel MKL (FFTW, BLAS, LAPACK, and scaLAPACK)
# (Note: for Intel Parallel Studio's MKL use -mkl instead of -qmkl)
FCL += -mkl
MKLROOT ?= /cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen2/intel-19.1.3.304/intel-mkl-2020.4.304-vg6aq26t4jbq6aia4xzf3ugsqos5ayuw/compilers_and_libraries_2020.4.304/linux/mkl
LLIBS += -L$(MKLROOT)/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64
INCS =-I$(MKLROOT)/include/fftw
# HDF5-support (optional but strongly recommended)
CPP_OPTIONS+= -DVASP_HDF5
HDF5_ROOT ?= /cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen2/intel-19.1.3.304/hdf5-1.10.7-pkh2ta5hmaxizkrrww4qplxyvki2mqfi
LLIBS += -L$(HDF5_ROOT)/lib -lhdf5_fortran
INCS += -I$(HDF5_ROOT)/include
#2
by ahampel » Fri Aug 01, 2025 8:15 am
Hi,
thank you for reaching out to us on the official VASP forum.
I just compiled VASP 6.5.1 with intel oneapi/2024.0.2 (intel-oneapi-mkl/2023.2.0 and intel-oneapi-mpi/2021.10.0) using the included arch/makefile.include.oneapi makefile. The only changes were setting "-march=core-avx2" and enabling HDF5 support, so this should match your makefile.include fairly well.
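For reference, the two edits relative to the stock arch/makefile.include.oneapi template would look roughly like this. Treat it as a sketch: the exact template lines (the default is `-xHOST`, and the HDF5 block ships commented out) may differ between VASP releases, and the HDF5 path is a placeholder.

```make
# architecture flag: replace the template's default (-xHOST)
VASP_TARGET_CPU ?= -march=core-avx2
FFLAGS          += $(VASP_TARGET_CPU)

# HDF5-support: uncomment the template's HDF5 block, point it at your install
CPP_OPTIONS     += -DVASP_HDF5
HDF5_ROOT       ?= /path/to/your/hdf5/installation
LLIBS           += -L$(HDF5_ROOT)/lib -lhdf5_fortran
INCS            += -I$(HDF5_ROOT)/include
```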
With this your set of input files works perfectly fine for me:
running 32 mpi-ranks, on 1 nodes
distrk: each k-point on 32 cores, 1 groups
distr: one band on 8 cores, 4 groups
vasp.6.5.1 (build Aug 01 2025 10:02:43) gamma-only
POSCAR found type information on POSCAR C H CuN O
POSCAR found : 5 types and 302 ions
Reading from existing POTCAR
scaLAPACK will be used
Reading from existing POTCAR
LDA part: xc-table for (Slater+PW92), standard interpolation
POSCAR, INCAR and KPOINTS ok, starting setup
FFT: planning ... GRIDC
FFT: planning ... GRID_SOFT
FFT: planning ... GRID
WAVECAR not read
prediction of wavefunctions initialized - no I/O
entering main loop
N E dE d eps ncg rms rms(c)
RMM: 1 0.182700536315E+05 0.18270E+05 -0.69375E+05 1984 0.149E+03
RMM: 2 0.580962098409E+04 -0.12460E+05 -0.19100E+05 1984 0.370E+02
RMM: 3 0.105429260345E+04 -0.47553E+04 -0.56272E+04 1984 0.190E+02
...
Could you try setting NCORE=1 in your input file and running the job with only 1 MPI rank, to see if it then completes? Maybe you can also remove some of the INCAR flags, just to narrow down which flag causes the issue. Please also attach an OUTCAR file of such a hanging job. Thanks!
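Concretely, the suggested debugging run might look like the sketch below. Only the INCAR line and the launch command matter; everything else stays as in the failing job, and the binary path is just the one from the first post.

```
# INCAR: disable band parallelization for the test
NCORE = 1

# shell: launch with a single MPI rank
mpirun -n 1 /expanse/projects/qstore/sbi121/SOFTWARE/EXE/INTEL/vasp_gam
```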
Additionally, I have to note that the Intel compiler version you used is quite old. Do you happen to have access to a newer version? The oldest version I have available for testing is intel/2022.0.1. If you can reproduce the problem with a slightly newer compiler, and we settle on the number of MPI ranks you used, I can better try to reproduce it. So far, though, everything looks okay to me.
Best,
Alex
#3
by hashan_peiris » Thu Aug 14, 2025 6:44 pm
Thank you very much for looking into this!
I compiled VASP 6.5.1 with the latest Intel oneAPI toolchain, and this time the same calculation did seem to run fine. I had done the same in the past (with the then-latest Intel oneAPI) on a different computer, about 6 months ago with VASP 6.4.2, and ran into the same issue, so I didn't think it was a compiler problem back then. But it looks like it works fine now.
I will provide an update if it comes back.
Thanks again!
:)
#4
by ahampel » Fri Aug 15, 2025 12:04 pm
Hi,
okay, good to hear. If the problem occurs again, just open another topic and reference this one and me in the post. If you can reproduce the issue with a more current compiler, and I can reproduce it as well, I am happy to dig deeper. Such problems are not unheard of: we have seen them a few times with specific versions of Intel MPI and Open MPI, but of course we have also found problems in our own code that led to such behavior. It is just very hard to debug if we can't reproduce the problem directly.
Best,
Alex