
SiC8_GW0R test hangup VASP 6.3.2

Posted: Mon Aug 01, 2022 4:12 pm
by matthew_matzelle
Hi All,

I have been having trouble with one test in particular (SiC8_GW0R) when it is run on our Zen 2 architecture nodes. It always gets stuck at the end and never finishes. This is the line it always hangs at in the OUTCAR:
" GAMMA: cpu time 0.0315: real time 0.0316"
I have attached the relevant inputs and outputs in addition to my makefile in a zip file. I have not included the WAVECAR or WAVEDER as they would make the zipfile too large. Let me know if you would like them as well.

It was compiled on the AMD EPYC node with the Intel oneAPI 2022.0.1 compilers and MKL, Intel oneAPI MPI 2021.5.0, HDF5 1.12.1, and -march=core-avx2.
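For reference, a hypothetical makefile.include fragment illustrating that toolchain (the actual attached makefile may differ; the HDF5 path is a placeholder):

FC          = mpiifort -qopenmp                       # Intel oneAPI 2022.0.1 compilers + Intel MPI 2021.5.0
FCL         = mpiifort -qopenmp
FFLAGS      = -assume byterecl -w -march=core-avx2    # AVX2 code path for EPYC Rome (Zen 2)
CPP_OPTIONS += -DVASP_HDF5                            # HDF5 1.12.1 support
HDF5_ROOT   ?= /path/to/hdf5-1.12.1
LLIBS       += -L$(HDF5_ROOT)/lib -lhdf5_fortran
INCS        += -I$(HDF5_ROOT)/include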

Additionally, I have tested this exact build on a Cascade Lake node and the issue DOESN'T occur. It seems to be specific to the Zen 2 architecture.

I have also tested it on a slightly different setup (a different number of cores per node), but still within the AMD EPYC Rome processor family, and I run into the same issue.

All other fast tests finish successfully.

I have also tested a build without OpenMP, and a build of VASP 6.2.0 (albeit with slightly older Intel compilers, MPI, and MKL, and without HDF5), and I still see exactly the same issue on the Zen 2 architecture.

Please let me know if you require any more information.
Thank you for your help.
Matt Matzelle

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Tue Aug 02, 2022 10:53 am
by alexey.tal
Hi,

There was a similar issue reported for AMD (thread).
Such hangups on AMD architectures can be caused by the fabric selection. Try setting the variable I_MPI_FABRICS=shm, or run the tests with

mpirun -np 4 -genv I_MPI_FABRICS=shm vasp_std
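
Equivalently (a minimal sketch; the core count and binary name are placeholders), the same setting can be exported in the batch script before calling mpirun:

export I_MPI_FABRICS=shm
mpirun -np 4 vasp_std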

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Tue Aug 02, 2022 10:32 pm
by matthew_matzelle
Thank you!
That alleviated the issue.
However, I am wondering if this will lead to a slowdown for internode calculations. Do you think this setting will have any impact compared with using I_MPI_FABRICS=shm:ofi for internode calculations?

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Wed Aug 03, 2022 9:50 am
by alexey.tal
If you run your calculation on multiple AMD nodes, the inter-node communication has to be selected too, so you should use I_MPI_FABRICS=shm:ofi.

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Wed Aug 03, 2022 11:54 am
by matthew_matzelle
Then I am a little confused.

I believe the default on the AMD EPYC Rome nodes is already shm:ofi, as you can see from the output file obtained by setting I_MPI_DEBUG=4 and not setting I_MPI_FABRICS at all:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.5 Build 20211102 (id: 9279b7d62)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): Load tuning file: "/shared/centos7/intel/oneapi/2022.1.0/mpi/2021.5.0/etc/tuning_generic_shm-ofi_mlx_hcoll.dat"

Notice how it uses "tuning_generic_shm-ofi_mlx_hcoll.dat".

However, after explicitly setting I_MPI_FABRICS=shm it reports the following:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.5 Build 20211102 (id: 9279b7d62)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/shared/centos7/intel/oneapi/2022.1.0/mpi/2021.5.0/etc/tuning_generic_shm.dat"

The job without I_MPI_FABRICS set fails, while the job with I_MPI_FABRICS=shm works fine. So I suspect the problem lies in this file:
"tuning_generic_shm-ofi_mlx_hcoll.dat"

Additionally, when running on the Cascade Lake nodes without setting I_MPI_FABRICS, the output file says the following:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.5 Build 20211102 (id: 9279b7d62)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): File "/shared/centos7/intel/oneapi/2022.1.0/mpi/2021.5.0/etc/tuning_skx_shm-ofi_mlx_100.dat" not found
[0] MPI startup(): Load tuning file: "/shared/centos7/intel/oneapi/2022.1.0/mpi/2021.5.0/etc/tuning_skx_shm-ofi.dat"

and this job completes successfully.

So, in summary, the GW issue is fixed by using I_MPI_FABRICS=shm. However, this fix won't transfer to multinode calculations, because the issue appears to be inherent to the shm:ofi fabric choice. Furthermore, the problem likely lies in the "tuning_generic_shm-ofi_mlx_hcoll.dat" file, because the "tuning_skx_shm-ofi.dat" file shows no such problems.

I am wondering if it is possible to use "tuning_skx_shm-ofi.dat" with the AMD nodes by setting some variables, or if there is a more general fix that can be applied.
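If there is a variable that lets one point Intel MPI at a specific tuning file, perhaps something like the following would work (this is only a guess on my part; I have not verified that I_MPI_TUNING_BIN is the right variable, nor that an SKX-tuned file is safe on EPYC):

export I_MPI_TUNING_BIN=/shared/centos7/intel/oneapi/2022.1.0/mpi/2021.5.0/etc/tuning_skx_shm-ofi.dat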

Thank you,
Matt

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Fri Aug 05, 2022 2:24 pm
by alexey.tal
Thank you for doing these tests.
Did you run the calculation with I_MPI_FABRICS=shm:ofi on a single AMD node or on multiple AMD nodes? Does it hang?
So far we have not seen shm:ofi fail on multiple nodes.

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Tue Aug 09, 2022 11:47 pm
by matthew_matzelle
I have done the test and it still hangs. Here is the top of the output (the bottom is the same as before):

[0] MPI startup(): Intel(R) MPI Library, Version 2021.5 Build 20211102 (id: 9279b7d62)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): Load tuning file: "/shared/centos7/intel/oneapi/2022.1.0/mpi/2021.5.0/etc/tuning_generic_shm-ofi_mlx_hcoll.dat"
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 48530 d3041 {0}
[0] MPI startup(): 1 48531 d3041 {8}
[0] MPI startup(): 2 48532 d3041 {16}
[0] MPI startup(): 3 37818 d3042 {0}
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: 1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: is_threaded: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: num_pools: 64
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): threading: enable_sep: 0
[0] MPI startup(): threading: direct_recv: 1
[0] MPI startup(): threading: zero_op_flags: 1
[0] MPI startup(): threading: num_am_buffers: 1
[0] MPI startup(): threading: library is built with per-vci thread granularity

Here is the job script I used:

#!/bin/bash
#SBATCH --job-name=TiSrO3
#SBATCH --output=output.out
#SBATCH --error=error.error
#SBATCH --time=6:00:00
#SBATCH --mem=0
#SBATCH -n 4
#SBATCH -p bansil
#SBATCH -N 2
#SBATCH --constraint=ib

module load intel/compilers-2022.0.1
module load intel/mpi-2021.5.0
module load hdf5/1.12.1-intel2022
source /shared/centos7/intel/oneapi/2022.1.0/setvars.sh

BIN=/work/bansil/programs/VASP632intel2022omp/vasp.6.3.2/bin/vasp_std

export OMP_NUM_THREADS=1
export OMP_PLACES=cores
export OMP_PROC_BIND=close
export OMP_STACKSIZE=512m
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN=yes
#export I_MPI_FABRICS=shm:ofi
export I_MPI_DEBUG=4

mpirun -np 4 $BIN



Interestingly, when I explicitly set "export I_MPI_FABRICS=shm:ofi", I get the following error:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.5 Build 20211102 (id: 9279b7d62)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): Load tuning file: "/shared/centos7/intel/oneapi/2022.1.0/mpi/2021.5.0/etc/tuning_generic_shm-ofi_mlx_hcoll.dat"

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 47352 RUNNING AT d3041
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 2 PID 47354 RUNNING AT d3041
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================

The job script for this run was:

#!/bin/bash
#SBATCH --job-name=TiSrO3
#SBATCH --output=output.out
#SBATCH --error=error.error
#SBATCH --time=6:00:00
#SBATCH --mem=0
#SBATCH -n 4
#SBATCH -p bansil
#SBATCH -N 2
#SBATCH --constraint=ib

module load intel/compilers-2022.0.1
module load intel/mpi-2021.5.0
module load hdf5/1.12.1-intel2022
source /shared/centos7/intel/oneapi/2022.1.0/setvars.sh

BIN=/work/bansil/programs/VASP632intel2022omp/vasp.6.3.2/bin/vasp_std

export OMP_NUM_THREADS=1
export OMP_PLACES=cores
export OMP_PROC_BIND=close
export OMP_STACKSIZE=512m
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN=yes
export I_MPI_FABRICS=shm:ofi
export I_MPI_DEBUG=4

mpirun -np 4 $BIN

This also occurs for calculations on a single node. Here is the error:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.5 Build 20211102 (id: 9279b7d62)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): Load tuning file: "/shared/centos7/intel/oneapi/2022.1.0/mpi/2021.5.0/etc/tuning_generic_shm-ofi_mlx_hcoll.dat"

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 63791 RUNNING AT d3057
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 2 PID 63792 RUNNING AT d3057
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 3 PID 63793 RUNNING AT d3057
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================

and here is the job script:

#!/bin/bash
#SBATCH --job-name=TiSrO3
#SBATCH --output=output.out
#SBATCH --error=error.error
#SBATCH --time=6:00:00
#SBATCH --mem=0
#SBATCH -n 4
#SBATCH -p bansil
#SBATCH -N 1
#SBATCH --constraint=ib

module load intel/compilers-2022.0.1
module load intel/mpi-2021.5.0
module load hdf5/1.12.1-intel2022
source /shared/centos7/intel/oneapi/2022.1.0/setvars.sh

BIN=/work/bansil/programs/VASP632intel2022omp/vasp.6.3.2/bin/vasp_std

export OMP_NUM_THREADS=1
export OMP_PLACES=cores
export OMP_PROC_BIND=close
export OMP_STACKSIZE=512m
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN=yes
export I_MPI_FABRICS=shm:ofi
export I_MPI_DEBUG=4

mpirun -np 4 $BIN

This is very confusing. I hope this can help pinpoint the problem. Thank you for your continued help with this issue.

Matt

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Fri Aug 12, 2022 1:56 pm
by alexey.tal
OK, so when you run your calculation with shm:ofi it doesn't hang but gets killed by the operating system. Can you try to find out why these processes were killed? You can look for the corresponding messages in the output of the dmesg command on the compute node.
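For example, something along these lines (a minimal sketch; signal 9 often, but not always, comes from the kernel OOM killer):

# Run on the compute node (e.g. d3041) shortly after the job is killed;
# -T prints human-readable timestamps.
dmesg -T | grep -i -E 'out of memory|killed process|segfault|oom'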

Regarding your previous post, the choice of tuning file does not necessarily reflect the default communication fabric, but it would be helpful to know what the default is. Can you set I_MPI_DEBUG=16 and run the calculation 1) without defining the fabric, 2) with I_MPI_FABRICS=shm, 3) with ofi, and 4) with shm:ofi? Comparing the outputs should make it clear what the default fabric is.
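A minimal sketch of such a sweep, assuming the same Slurm environment and $BIN as in your job scripts:

export I_MPI_DEBUG=16
for fabric in default shm ofi shm:ofi; do
    if [ "$fabric" = "default" ]; then
        unset I_MPI_FABRICS           # case 1: fabric left undefined
    else
        export I_MPI_FABRICS=$fabric  # cases 2-4: shm, ofi, shm:ofi
    fi
    mpirun -np 4 $BIN > output_${fabric//:/_}.out 2>&1
done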

Also, it would be helpful if you could provide the full output and OUTCARs, so that we could see where exactly the calculations stop.

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Sat Sep 10, 2022 9:25 pm
by matthew_matzelle
Hi Alexey,

Sorry for my late reply.

I have done the tests as requested and included a zip file with all of them. In addition to the error output, the output, the batch script, and the OUTCAR, I've also included the dmesg output for both the successful run and the run that was killed.

Thank you for your help,
Matthew Matzelle
vaspbugreport2.zip

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Thu Oct 20, 2022 3:21 pm
by matthew_matzelle
Hi All,

I'm just wondering if there has been any progress on this issue?

Thanks for your hard work,
Matt

Re: SiC8_GW0R test hangup VASP 6.3.2

Posted: Fri Oct 21, 2022 2:00 pm
by alexey.tal
Hi Matt,

Thank you for doing all these tests.

In the shmofi directory there is an OUTCAR file, and it looks like the calculation got quite far before stopping, although output.out suggests the process was killed during MPI initialization, i.e., there is no stdout from VASP. Are you sure that this is the correct OUTCAR? Also, the provided tests were only done on a single node, but I assume that shm:ofi didn't work on multiple nodes either.

But otherwise the situation is clear. The ofi fabric causes hangups when it is used for intranode communication, which probably has to do with the multiple-endpoint communication. So far we have seen that using shm for intranode and ofi for internode communication usually works. However, in your case the shm:ofi option fails too. From the provided log files, it looks like it doesn't hang but hits a segmentation fault somewhere during MPI initialization. So the issue is in Intel MPI, not VASP. I wasn't able to reproduce your issue with shm:ofi on any of our AMD machines, but we will post an update if we come up with a solution. If you manage to make it work, please let us know.