vasp-edge runs on single core only

Questions regarding the compilation of VASP on various platforms: hardware, compilers and libraries, etc.


akretschmer
Newbie
Posts: 30
Joined: Wed Nov 13, 2019 8:14 am

vasp-edge runs on single core only

#1 Post by akretschmer » Tue Sep 30, 2025 1:22 pm

Hello,

I am trying to get vasp-edge to run on the VSC-5. I have managed to compile it, but when I try to run it on multiple cores, I get an error. On a single core it works.

This is the CPU: AMD EPYC 7713 (Milan), 64 cores, with 512 GB of RAM.

This is the makefile.include:

Code: Select all

# Default precompiler options
CPP_OPTIONS = -DHOST=\"LinuxIFC\" \
              -DMPI -DMPI_BLOCK=8000 -Duse_collective \
              -DscaLAPACK \
              -DCACHE_SIZE=4000 \
              -Davoidalloc \
              -Dvasp6 \
              -Dtbdyn \
              -Dfock_dblbuf \
              -D_OPENMP

CPP         = fpp -f_com=no -free -w0  $*$(FUFFIX) >$*$(SUFFIX) $(CPP_OPTIONS)

FC          = mpiifort -fc=ifx -qopenmp
FCL         = mpiifort -fc=ifx

FREE        = -free -names lowercase

FFLAGS      = -assume byterecl -w

OFLAG       = -O2
OFLAG_IN    = $(OFLAG)
DEBUG       = -O0

# For what used to be vasp.5.lib
CPP_LIB     = $(CPP)
FC_LIB      = $(FC)
CC_LIB      = icx
CFLAGS_LIB  = -O
FFLAGS_LIB  = -O1
FREE_LIB    = $(FREE)

OBJECTS_LIB = linpack_double.o

# For the parser library
CXX_PARS    = icpx
LLIBS       = -lstdc++

##
## Customize as of this point! Of course you may change the preceding
## part of this file as well if you like, but it should rarely be
## necessary ...
##

# When compiling on the target machine itself, change this to the
# relevant target when cross-compiling for another architecture
VASP_TARGET_CPU ?= -xHOST
FFLAGS     += $(VASP_TARGET_CPU)

# Intel MKL (FFTW, BLAS, LAPACK, and scaLAPACK)
# (Note: for Intel Parallel Studio's MKL use -mkl instead of -qmkl)
FCL        += -qmkl
MKLROOT    ?= /gpfs/opt/sw/zen/spack-0.19.0/opt/spack/linux-almalinux8-zen3/gcc-12.2.0/intel-oneapi-mkl-2024.0.0-tk3clqdskbecdrkh4suranan25gzomqy/mkl/2024.0
LLIBS      += -L$(MKLROOT)/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64
INCS        =-I$(MKLROOT)/include/fftw

# HDF5-support (optional but strongly recommended, and mandatory for some features)
CPP_OPTIONS+= -DVASP_HDF5
HDF5_ROOT  ?= /gpfs/opt/sw/zen/spack-0.19.0/opt/spack/linux-almalinux8-zen3/intel-2021.9.0/hdf5-1.12.2-tiadmsp7pkeq5xgsjn7e2cczsmzclxa4/
LLIBS      += -L$(HDF5_ROOT)/lib -lhdf5_fortran
INCS       += -I$(HDF5_ROOT)/include

# For the VASP-2-Wannier90 interface (optional)
#CPP_OPTIONS    += -DVASP2WANNIER90
#WANNIER90_ROOT ?= /path/to/your/wannier90/installation
#LLIBS          += -L$(WANNIER90_ROOT)/lib -lwannier

# For the fftlib library (hardly any benefit in combination with MKL's FFTs)
#FCL         = mpiifort fftlib.o -qmkl
#CXX_FFTLIB  = icpc -qopenmp -std=c++11 -DFFTLIB_USE_MKL -DFFTLIB_THREADSAFE
#INCS_FFTLIB = -I./include -I$(MKLROOT)/include/fftw
#LIBS       += fftlib

# For machine learning library vaspml (experimental)
#CPP_OPTIONS += -Dlibvaspml
#CPP_OPTIONS += -DVASPML_USE_CBLAS
#CPP_OPTIONS += -DVASPML_USE_MKL
#CPP_OPTIONS += -DVASPML_DEBUG_LEVEL=3
#CXX_ML      = mpiicpc -cxx=icpx -qopenmp
#CXXFLAGS_ML = -O3 -std=c++17 -Wall
#INCLUDE_ML  =

CPP_OPTIONS+= -DPLUGINS
LLIBS      += $(shell python3-config --ldflags --embed) -lstdc++
CXX_FLAGS   = -Wall -Wextra  $(shell python3 -m pybind11 --includes) -std=c++11

I load the following modules and export the following paths in my submission script (I loaded the same modules when compiling):

Code: Select all

#!/bin/sh
#SBATCH -J plugin-test
#SBATCH -N 1
#SBATCH --partition=zen3_0512
#SBATCH --ntasks-per-node=128
#SBATCH --qos=zen3_0512_devel
#SBATCH --time=00:10:00

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/gpfs/opt/sw/zen/spack-0.19.0/opt/spack/linux-almalinux8-zen2/gcc-9.5.0/gcc-12.2.0-ohbahzapabgcslhxaguhvihxwuw7hjri/lib64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/gpfs/opt/sw/zen/spack-0.19.0/opt/spack/linux-almalinux8-zen3/intel-2021.9.0/hdf5-1.12.2-tiadmsp7pkeq5xgsjn7e2cczsmzclxa4/lib
export OMP_NUM_THREADS=1
export I_MPI_PIN_RESPECT_CPUSET=0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/fs72669/akretschmer3/.conda/envs/vasp-python/lib


module purge
ml intel-oneapi-mkl/2023.1.0-intel-2021.9.0-7aranik
ml hdf5/1.12.2-intel-2021.9.0-tiadmsp
ml intel-mpi/2019.12
ml intel-oneapi-compilers/2024.0.0-gcc-12.2.0-ehznprr

export UCX_TLS=self,sm,dc_mlx5

ulimit -s unlimited

mpirun -np 1 --map-by ppr:64:socket --bind-to core /gpfs/data/fs72669/akretschmer3/vasp-edge/vasp-edge/bin/vasp_std

When I change the last line to

Code: Select all

mpirun -np 128 --map-by ppr:64:socket --bind-to core /gpfs/data/fs72669/akretschmer3/vasp-edge/vasp-edge/bin/vasp_std

I get an error. See this output file:

slurm-6252927.zip

What do I have to do?


michael_wolloch
Global Moderator
Posts: 209
Joined: Tue Oct 17, 2023 10:17 am

Re: vasp-edge runs on single core only

#2 Post by michael_wolloch » Tue Sep 30, 2025 2:26 pm

Hi akretschmer,

Unfortunately, I currently have no access to VSC-5, so I cannot really test anything there, but as I remember, the module system there is a bit confusing.

There is also another thread on this forum that is currently active that reports a similar problem:

VASP 6.5.0 installed but it runs only on 1 core and is slow

The issue is not quite the same: there the code runs even when multiple cores are selected, but it remains constrained to only one of them.

My first suggestion would be to try another toolchain. Can you use AOCL/AOCC or GNU/MKL? Performance should be very similar.
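For reference, the toolchain-specific lines of a GNU/MKL makefile.include could look roughly like the sketch below (based on the GNU templates shipped with VASP; the compiler wrappers, paths, and the BLACS library depend on your MPI and site, so treat the details as assumptions to adapt):

Code: Select all

# Sketch of a GNU/MKL toolchain; adapt to your site.
CPP         = gcc -E -C -w $*$(FUFFIX) >$*$(SUFFIX) $(CPP_OPTIONS)

FC          = mpif90 -fopenmp
FCL         = mpif90 -fopenmp

FREE        = -ffree-form -ffree-line-length-none
FFLAGS      = -w
OFLAG       = -O2

VASP_TARGET_CPU ?= -march=native
FFLAGS     += $(VASP_TARGET_CPU)

# MKL with GNU threading; use -lmkl_blacs_intelmpi_lp64 with Intel MPI
# or -lmkl_blacs_openmpi_lp64 with Open MPI.
LLIBS      += -L$(MKLROOT)/lib/intel64 \
              -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_gnu_thread \
              -lmkl_core -lmkl_blacs_openmpi_lp64 -lgomp -lpthread
INCS        = -I$(MKLROOT)/include/fftw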

Otherwise, I remember that I used to set

Code: Select all

export SLURM_TASKS_PER_NODE='64(x1)'

in my jobscripts, maybe that can help as well.

Due to memory bandwidth limitations, it is really not beneficial to run VASP on more than 64 MPI ranks on a VSC-5 node. But make sure that the ranks are spread over both sockets to use all available memory channels.
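For example, to place 64 ranks evenly over both sockets, something like the sketch below should work (the srun options assume a standard Slurm setup, the mpirun line assumes Open MPI, and the binary path is a placeholder):

Code: Select all

# 64 MPI ranks, 32 per socket, each bound to one physical core:
srun --ntasks=64 --ntasks-per-socket=32 --cpu-bind=cores /path/to/vasp_std

# or equivalently with an Open MPI style mpirun:
mpirun -np 64 --map-by ppr:32:socket --bind-to core /path/to/vasp_std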

Let me know if another toolchain is an option.
Cheers, Michael

