"internal error in: mpi.F, M_sumb_d: invalid vector size" in wannier mode

dongshen_wen
Newbie
Posts: 8
Joined: Wed Nov 02, 2022 10:30 am

"internal error in: mpi.F, M_sumb_d: invalid vector size" in wannier mode

#1 Post by dongshen_wen » Tue Jan 10, 2023 10:50 am

Dear developers and users,


I am encountering a problem in a vasp_ncl calculation while it writes out the Wannier results. I am using vasp6.3.2 with the wannier90 v3 interface.
The NSCF calculation ran fine until it entered the wannier90 library mode:

Code:

 entering main loop
       N       E                     dE             d eps       ncg     rms          rms(c)
DAV:   1    -0.157375086006E+03   -0.15738E+03   -0.69218E-06345984   0.191E-02
 Calling wannier_setup of wannier90 in library mode
 Wannier90 mode
 Computing MMN (overlap matrix elements)
 -----------------------------------------------------------------------------
|                     _     ____    _    _    _____     _                     |
|                    | |   |  _ \  | |  | |  / ____|   | |                    |
|                    | |   | |_) | | |  | | | |  __    | |                    |
|                    |_|   |  _ <  | |  | | | | |_ |   |_|                    |
|                     _    | |_) | | |__| | | |__| |    _                     |
|                    (_)   |____/   \____/   \_____|   (_)                    |
|                                                                             |
|     internal error in: mpi.F  at line: 1359                                 |
|                                                                             |
|     M_sumb_d: invalid vector size n -1953955840                             |
|                                                                             |
 -----------------------------------------------------------------------------
I found a similar discussion at forum/viewtopic.php?p=23166#p23166, but my case does not seem to be the same. I applied the patch from that thread, but the error still exists.
As mentioned in the discussion at forum/viewtopic.php?f=4&t=18069&start=15, vasp6.3.x has a better Wannier parallelization scheme.

I attach my inputs and error files. Please let me know if you need anything else. Thank you in advance for any hints.

Best,
Dongsheng

dongshen_wen
Newbie
Posts: 8
Joined: Wed Nov 02, 2022 10:30 am

Re: internal error in mpi.F

#2 Post by dongshen_wen » Thu Jan 12, 2023 9:13 am

A quick follow-up: I tried different parallelization parameters in the bash job script, and here is what I found.
When I set (number of tasks)*(number of nodes)=NBANDS in the job script, vasp_ncl completed the calculation without the error I mentioned. I have run only one test so far and will update if I find something new.
dongshen_wen wrote: Tue Jan 10, 2023 10:07 am
Dear Alexey,

I tried the patch above, but it did not seem to work: I still got the same error message. It seems to come from the M_sumb_d subroutine on line 1329 in mpi.F. I saw that the patch changes M_sum_d to M_sum_d8, so I'm not sure whether it covers this case.
I also tried commenting out the LORBIT tag, but the error persisted. So I guess it might not come from the SPHPRO_FAST subroutine you mentioned.

I will open a new thread to update the files.

Best,
Dongsheng


andreas.singraber
Global Moderator
Posts: 231
Joined: Mon Apr 26, 2021 7:40 am

Re: "internal error in: mpi.F, M_sumb_d: invalid vector size" in wannier mode

#3 Post by andreas.singraber » Thu Jan 12, 2023 9:22 am

For later reference: this thread originated from here: https://vasp.at/forum/viewtopic.php?f=3&t=18774

andreas.singraber
Global Moderator
Posts: 231
Joined: Mon Apr 26, 2021 7:40 am

Re: "internal error in: mpi.F, M_sumb_d: invalid vector size" in wannier mode

#4 Post by andreas.singraber » Fri Jan 20, 2023 8:47 pm

Dear Dongsheng,

sorry for the delay... I had a closer look and I suspect there is a similar problem with a 4-byte integer overflowing in mlwf.F in line 1234:

Code:

CALLMPI( M_sum_z( WDES%COMM_KINTER, MLWF%M_matrix, SIZE(MLWF%M_matrix)))
The SIZE function returns a 4-byte integer, which may be too small to hold the total number of matrix elements and therefore overflows. A negative number is then passed to M_sum_z, which internally calls M_sumb_d, and M_sumb_d raises the error you are observing.
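To illustrate the kind of wraparound suspected here, the following is a minimal Python sketch of signed 32-bit arithmetic (it is not VASP code). The helper `wrap_int32` is a name made up for this illustration. The sketch only shows that the reported n = -1953955840 is what a count of 2341011456 would look like after wrapping; whether that is the actual count, and at which point in the call chain the wrap occurs, is not established in this thread.

```python
def wrap_int32(n: int) -> int:
    """Interpret n modulo 2**32 as a signed 32-bit (two's complement) integer."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

INT32_MAX = 2**31 - 1           # largest value a 4-byte signed integer can hold

reported = -1953955840          # n printed in the error message above
candidate = reported + 2**32    # smallest positive count that wraps to `reported`

print(candidate)                          # 2341011456
print(candidate > INT32_MAX)              # True: does not fit in 4 bytes
print(wrap_int32(candidate) == reported)  # True
```

Any count that differs from `candidate` by a multiple of 2**32 would wrap to the same negative value, so this is a consistency check rather than a diagnosis.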

I will work on a fix, discuss it with my colleagues and present it to you... please stay tuned...

Best,
Andreas Singraber

dongshen_wen
Newbie
Posts: 8
Joined: Wed Nov 02, 2022 10:30 am

Re: "internal error in: mpi.F, M_sumb_d: invalid vector size" in wannier mode

#5 Post by dongshen_wen » Thu Jan 26, 2023 10:38 am

Dear Andreas,

Thank you for this update. It also appears to me that vasp6.3.2 returns a different overlap matrix, projections, or eigenvalues than vasp6.2.1. I ran the same BCC Fe example with both versions. With the .amn, .mmn, and .eig files from 6.2.1, disentanglement and wannierization went well (small spread, DFT bands reproduced). With the files generated by 6.3.2, neither disentanglement nor wannierization converged, and the spread was extremely large.
I'll keep posting when I find something new.

Best,
Dongsheng
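One way to narrow down which of the files differs between the two versions is to compare them directly, starting with the simplest one. The sketch below is a hedged example, assuming the usual wannier90 .eig layout of three whitespace-separated columns per line (band index, k-point index, eigenvalue in eV); the function names are made up for this illustration.

```python
from pathlib import Path

def read_eig(path):
    """Read a wannier90 .eig file into a dict {(band, kpoint): energy}."""
    eig = {}
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        band, kpt, energy = int(fields[0]), int(fields[1]), float(fields[2])
        eig[(band, kpt)] = energy
    return eig

def max_abs_diff(eig_a, eig_b):
    """Return the largest |E_a - E_b| over shared (band, kpoint) keys,
    and the sorted list of keys present in only one of the two files."""
    shared = eig_a.keys() & eig_b.keys()
    only_one = eig_a.keys() ^ eig_b.keys()
    diff = max((abs(eig_a[k] - eig_b[k]) for k in shared), default=0.0)
    return diff, sorted(only_one)
```

With hypothetical file names, `max_abs_diff(read_eig("v621/wannier90.eig"), read_eig("v632/wannier90.eig"))` returns the largest eigenvalue deviation and any mismatched (band, k-point) pairs. A near-zero deviation would point toward the .amn or .mmn files instead.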
