Segmentation fault in MLWF_TRAFO_RUN of mlwf.F

satoru_matsuishi2

#1 Post by satoru_matsuishi2 » Thu May 27, 2021 11:33 am

Dear VASP developers,

I’d like to report an issue with VASP 6.2.0 built with Wannier90 support.

Following “Makefile.include nv acc+omp+mkl” in the VASP Manual, the VASP executables were compiled using NVHPC-SDK 20.9, openmpi-3.1.4 and Intel MKL 21.2.0 with the -DVASP2WANNIER90 option.

To perform an HSE06 calculation for a large system, we use 8 nodes (each with 4 Tesla P100 for NVLink-Optimized Servers and 2 Xeon E5-2680 v4) with NCCL and GPUDirect RDMA support.

When LWANNIER90 is set to TRUE in INCAR, a segmentation fault occurs during the Wannier projection stage that displays “Projection [***/***] done”, and some compute nodes then go down.

I found that this issue is caused by an out-of-bounds array reference at line 1237 of mlwf.F:

PROJECTIONS(:,IK,IS,IW) = MLWF%A_matrix(:,IW,IK,IS)

As shown in lines 972 and 1229, the first dimension of MLWF%A_matrix is smaller than that of PROJECTIONS by MLWF%NEXCLB, so the whole-section assignment above does not conform.

Line 972:
ALLOCATE(MLWF%A_matrix(MLWF%NB_TOT-MLWF%NEXCLB,MLWF%NUM_WANN,MLWF%NKPTS,MLWF%ISPIN))

Line 1229:
ALLOCATE(PROJECTIONS(MLWF%NB_TOT,MLWF%NKPTS,MLWF%ISPIN,MLWF%NUM_WANN))

The following patch may be applied to solve this problem.

--- a/src/mlwf.F 2021-01-18 21:50:58.000000000 +0900
+++ b/src/mlwf.F 2021-05-26 22:21:39.000000000 +0900
@@ -1231,10 +1231,11 @@ MODULE mlwf
ALLOCATE(MLWF%lwindow(MLWF%NB_TOT,MLWF%NKPTS,MLWF%ISPIN))
MLWF%U_matrix = CMPLX(0.0_q,0.0_q)
MLWF%lwindow = .TRUE.
+ PROJECTIONS = CMPLX(0.0_q,0.0_q)
DO IS=1,MLWF%ISPIN
DO IK=1,MLWF%NKPTS
DO IW=1,MLWF%NUM_WANN
- PROJECTIONS(:,IK,IS,IW) = MLWF%A_matrix(:,IW,IK,IS)
+ PROJECTIONS(1:SIZE(MLWF%A_matrix,1),IK,IS,IW) = MLWF%A_matrix(:,IW,IK,IS)
MLWF%U_matrix(IW,IW,IK,IS) = CMPLX(1.0_q,0.0_q,KIND=q)
ENDDO
ENDDO
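
The idea behind the patch can be tried outside VASP with a small standalone program. This is only a sketch: the program name, the lowercase array names and all sizes (nb_tot, nexclb, num_wann, nkpts, ispin) are made-up stand-ins for the MLWF members, and it does not reproduce anything else that mlwf.F does. It zero-fills the larger array first and then copies into an explicitly bounded section, so that both sides of the assignment have the same number of elements.

```fortran
! Standalone sketch; names and sizes are illustrative, not taken from VASP.
program padded_copy
  implicit none
  integer, parameter :: q = selected_real_kind(10)
  integer, parameter :: nb_tot = 6, nexclb = 2      ! hypothetical band counts
  integer, parameter :: num_wann = 3, nkpts = 2, ispin = 1
  complex(q), allocatable :: a_matrix(:,:,:,:), projections(:,:,:,:)
  integer :: is, ik, iw, nb

  ! Same index order as in the report: the source has fewer bands than the target.
  allocate(a_matrix(nb_tot-nexclb, num_wann, nkpts, ispin))
  allocate(projections(nb_tot, nkpts, ispin, num_wann))
  a_matrix = cmplx(1.0_q, 0.0_q, kind=q)

  ! Zero the target first so the rows beyond nb stay defined.
  projections = cmplx(0.0_q, 0.0_q, kind=q)
  nb = size(a_matrix, 1)

  do is = 1, ispin
    do ik = 1, nkpts
      do iw = 1, num_wann
        ! Explicit bounds make both sides conform (nb elements each).
        projections(1:nb, ik, is, iw) = a_matrix(:, iw, ik, is)
      end do
    end do
  end do

  print *, 'first dimension of a_matrix:   ', size(a_matrix, 1)
  print *, 'first dimension of projections:', size(projections, 1)
  print *, 'nonzero entries copied:        ', &
           count(projections /= cmplx(0.0_q, 0.0_q, kind=q))
end program padded_copy
```

Replacing the bounded section with `projections(:, ik, is, iw)` restores the mismatch. With gfortran, building with `-fcheck=bounds` should then report the non-conforming assignment at run time instead of silently reading past the end of the smaller array.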

This issue still seems to be present in VASP 6.2.1.

henrique_miranda
Global Moderator

Re: Segmentation fault in MLWF_TRAFO_RUN of mlwf.F

#2 Post by henrique_miranda » Fri May 28, 2021 9:06 am

Thank you for this bug report.
The fix for this issue will be included in the next release.
