internal error in: mpi.F at line: 898

scanmat_centre
Newbie
Posts: 25
Joined: Mon Feb 01, 2021 1:53 pm

#1 Post by scanmat_centre » Thu Dec 07, 2023 8:56 am

We installed VASP 6.4.0 on our GPU machine using the NVIDIA HPC SDK 23.9; the build completed without any warnings or errors.
We were able to run calculations for a few days, but we are now encountering an internal error in mpi.F.

The detailed error is given below for your reference.

Local host: scanmatdgx1
--------------------------------------------------------------------------
 running    1 mpi-ranks, on    1 nodes
 distrk:  each k-point on    1 cores,    1 groups
 distr:  one band on    1 cores,    1 groups
 OpenACC runtime initialized ...    1 GPUs detected
 -----------------------------------------------------------------------------
|                     _     ____    _    _    _____     _                     |
|                    | |   |  _ \  | |  | |  / ____|   | |                    |
|                    | |   | |_) | | |  | | | |        | |                    |
|                    |_|   |  _ <  | |  | | | | |_ |   |_|                    |
|                     _    | |_) | | |__| | | |__| |    _                     |
|                    (_)   |____/   \____/   \_____|   (_)                    |
|                                                                             |
|     internal error in: mpi.F  at line: 898                                  |
|                                                                             |
|     M_init_nccl: Error in ncclCommInitRank                                  |
|                                                                             |
|     If you are not a developer, you should not encounter this problem.      |
|     Please submit a bug report.                                             |
|                                                                             |
 -----------------------------------------------------------------------------

Warning: ieee_inexact is signaling
    1
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.

Please help us to resolve the issue.
Thanks in advance.
SCANMAT.

jonathan_lahnsteiner2
Global Moderator
Posts: 146
Joined: Fri Jul 01, 2022 2:17 pm

Re: internal error in: mpi.F at line: 898

#2 Post by jonathan_lahnsteiner2 » Thu Dec 07, 2023 9:03 am

Dear scanmat_centre,

Is it possible for you to update to the latest version of VASP? You can download vasp.6.4.2 from the VASP Portal.
Please check if you still get the same issue.

All the best,
Jonathan