internal error in: mpi.F when running GW calculation

Problems running VASP: crashes, internal errors, "wrong" results.


awang95
Newbie
Posts: 41
Joined: Thu May 26, 2022 3:50 pm

internal error in: mpi.F when running GW calculation

#1 Post by awang95 » Sun Apr 23, 2023 3:04 am

Hi,

I'm getting the following error when I try to run a GW calculation - I'd appreciate it if the issue could be looked into; thanks!

I'm using VASP 6.3.1 with intelmpi

Below is my INCAR, followed by the output up to the error.

INCAR:

Code: Select all

# template_INCAR_g0w0

# start parameters
ISTART = 1 	
SYSTEM = template_g0w0

# method and accuracy/precision
PREC = accurate
ALGO = EVGW0
GGA = PE
LHFCALC = .TRUE.
HFSCREEN = 0.11
	# HFSCREEN value from: https://doi.org/10.1063/1.2404663 discussion
	# does it matter if I keep the xc tags for GW?

# electronic optimization
ENCUT = 600
	# for qe pwscf calculation: 1 Ry = 13.6056980659 eV
NELM = 1
EDIFF = 1e-5
LMAXMIX = 4 
	# 4 for d elements, 6 for f elements

# ionic relaxation
#EDIFFG = -0.01           
#NSW = 100
#IBRION = -1
#ISIF = 2

# optical calculations
#NBANDS = 1200
#LOPTICS = .TRUE.
#LPEAD = .TRUE.
#OMEGAMAX = 40

# g0w0

ENCUTGW = 400
#ENCUTGWSOFT = 400
NOMEGA = 80
	# number of frequency grid points; for efficiency, make sure it is divisible by the total number of nodes
NBANDS = 800
#NBANDSGW = 400
NELMGW = 1

# smearing and broadening 
ISMEAR = 0
SIGMA = 0.01

# spin considerations
ISPIN = 2

# performance optimization
#NCORE = 10
	# 10 cores/node
#KPAR = 4
NTAUPAR = 1

# output writing
#LORBIT = 11
LWAVE = .TRUE.

output:

Code: Select all

 running on   80 total cores
 distrk:  each k-point on   80 cores,    1 groups
 distr:  one band on    1 cores,   80 groups
 vasp.6.3.1 04May22 (build Jul 22 2022 15:58:29) complex                        
  
 POSCAR found type information on POSCAR GaNiO H 
 POSCAR found :  4 types and      86 ions
 Reading from existing POTCAR
 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     You use a magnetic or noncollinear calculation, but did not specify     |
|     the initial magnetic moment with the MAGMOM tag. Note that a            |
|     default of 1 will be used for all atoms. This ferromagnetic setup       |
|     may break the symmetry of the crystal, in particular it may rule        |
|     out finding an antiferromagnetic solution. Thence, we recommend         |
|     setting the initial magnetic moment manually or verifying carefully     |
|     that this magnetic setup is desired.                                    |
|                                                                             |
 -----------------------------------------------------------------------------

 scaLAPACK will be used
 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     For optimal performance we recommend to set                             |
|       NCORE = 2 up to number-of-cores-per-socket                            |
|     NCORE specifies how many cores store one orbital (NPAR=cpu/NCORE).      |
|     This setting can greatly improve the performance of VASP for DFT.       |
|     The default, NCORE=1 might be grossly inefficient on modern             |
|     multi-core architectures or massively parallel machines. Do your        |
|     own testing! More info at https://www.vasp.at/wiki/index.php/NCORE      |
|     Unfortunately you need to use the default for GW and RPA                |
|     calculations (for HF NCORE is supported but not extensively tested      |
|     yet).                                                                   |
|                                                                             |
 -----------------------------------------------------------------------------

 Reading from existing POTCAR
 -----------------------------------------------------------------------------
|                                                                             |
|               ----> ADVICE to this user running VASP <----                  |
|                                                                             |
|     You have a (more or less) 'large supercell' and for larger cells it     |
|     might be more efficient to use real-space projection operators.         |
|     Therefore, try LREAL= Auto in the INCAR file.                           |
|     Mind: For very accurate calculation, you might also keep the            |
|     reciprocal projection scheme (i.e. LREAL=.FALSE.).                      |
|                                                                             |
 -----------------------------------------------------------------------------

 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     The default for ENCUTGWSOFT is different in this version of VASP.       |
|     If you wish to obtain identical results as using pre vasp.6.3           |
|     please manually set ENCUTGWSOFT =400 in the INCAR file.                 |
|                                                                             |
 -----------------------------------------------------------------------------

 LDA part: xc-table for Pade appr. of Perdew
 POSCAR found type information on POSCAR GaNiO H 
 POSCAR found :  4 types and      86 ions
 found WAVECAR, reading the header
 WAVECAR: different cutoff or change in lattice found
 POSCAR, INCAR and KPOINTS ok, starting setup
 FFT: planning ... GRIDC
 FFT: planning ... GRID_SOFT
 FFT: planning ... GRID
 reading WAVECAR
 the WAVECAR file was read successfully
 initial charge from wavefunction
 available memory per node:   76.15 GB, setting MAXMEM to   77973

 The Fermi energy was updated, please check that it is located mid-gap
 values below the HOMO (VB) or above the LUMO (CB) will cause erroneous energies
 E-fermi :  -2.2847

 calculate exact exchange contribution
 the WAVEDER file was read successfully
energies w= 
    0.000    0.000    0.601    0.000    1.202    0.000    1.801    0.000
    2.397    0.000    2.990    0.000    3.579    0.000    4.164    0.000
    4.744    0.000    5.318    0.000    5.887    0.000    6.450    0.000
    7.006    0.000    7.557    0.000    8.101    0.000    8.640    0.000
    9.173    0.000    9.700    0.000   10.223    0.000   10.741    0.000
   11.254    0.000   11.764    0.000   12.270    0.000   12.773    0.000
   13.275    0.000   13.774    0.000   14.273    0.000   14.771    0.000
   15.270    0.000   15.770    0.000   16.272    0.000   16.776    0.000
   17.284    0.000   17.796    0.000   18.313    0.000   18.836    0.000
   19.367    0.000   19.906    0.000   20.455    0.000   21.014    0.000
   21.585    0.000   22.170    0.000   22.769    0.000   23.386    0.000
   24.021    0.000   24.676    0.000   25.355    0.000   26.058    0.000
   26.790    0.000   27.553    0.000   28.351    0.000   29.187    0.000
   30.065    0.000   30.992    0.000   31.971    0.000   33.010    0.000
   34.116    0.000   35.297    0.000   36.564    0.000   37.927    0.000
   39.400    0.000   40.999    0.000   42.743    0.000   44.653    0.000
   46.760    0.000   49.094    0.000   51.698    0.000   54.623    0.000
   57.935    0.000   61.716    0.000   66.076    0.000   71.158    0.000
   77.158    0.000   84.349    0.000   93.119    0.000  104.047    0.000
  118.031    0.000  136.545    0.000  162.185    0.000  200.000    0.000
 responsefunction array rank=   45120
 LDA part: xc-table for Pade appr. of Perdew

 min. memory requirement per mpi rank  95301.6 MB, per node 190603.3 MB

 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     This job will probably crash, due to insufficient memory available.     |
|     Available memory per mpi rank: 77973 MB, required memory: 95301 MB.     |
|     Reducing NTAUPAR or using more computing nodes might solve this         |
|     problem.                                                                |
|                                                                             |
 -----------------------------------------------------------------------------

 allocating   1 responsefunctions rank= 45120
 allocating   1 responsefunctions rank= 45120
 Doing            1  frequencies on each core in blocks of            1
NQ=   1    0.0000    0.0000    0.0000, 
 -----------------------------------------------------------------------------
|                     _     ____    _    _    _____     _                     |
|                    | |   |  _ \  | |  | |  / ____|   | |                    |
|                    | |   | |_) | | |  | | | |  __    | |                    |
|                    |_|   |  _ <  | |  | | | | |_ |   |_|                    |
|                     _    | |_) | | |__| | | |__| |    _                     |
|                    (_)   |____/   \____/   \_____|   (_)                    |
|                                                                             |
|     internal error in: mpi.F  at line: 1825                                 |
|                                                                             |
|     M_sumb_d: invalid vector size n -223338496                              |
|                                                                             |
 -----------------------------------------------------------------------------

I'm using the following command to run VASP:

Code: Select all

srun --ntasks-per-node=2 vasp_std
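
The stdout above reports roughly 95 GB required per MPI rank but only about 76 GB available per node with two ranks per node. Following the memory warning (NTAUPAR is already 1, so the remaining suggestion is to use more computing nodes), a jobscript along the following lines is one option; the node count, walltime, and module name are placeholders that would still need testing on the actual machine:

Code: Select all

#!/bin/bash
#SBATCH --nodes=16              # placeholder count: the memory warning suggests more nodes may bring the per-rank requirement below what is available
#SBATCH --ntasks-per-node=2     # keep two MPI ranks per node, as in the srun call above
#SBATCH --time=24:00:00         # placeholder walltime

module load vasp/6.3.1          # placeholder module name

srun --ntasks-per-node=2 vasp_std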

merzuk.kaltak
Administrator
Posts: 277
Joined: Mon Sep 24, 2018 9:39 am

Re: internal error in: mpi.F when running GW calculation

#2 Post by merzuk.kaltak » Mon Apr 24, 2023 10:10 am

Dear awang95,

Without a proper error report (input, stdout, and some basic output files), I have to guess at the cause of the problem.
Inspecting your INCAR file, it seems that you have selected the GW algorithm on the real frequency axis (ALGO=EVGW0).
Furthermore, the posted stdout shows 86 ions (of 4 types), which indicates a relatively large unit cell (see the advice about LREAL).

From this, I suspect you are running out of memory (which VASP also tells you in the memory warning in your stdout). I therefore suggest selecting the
low-scaling alternative ALGO=EVGW0R (available as of vasp.6.4.1).
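
For illustration, a minimal sketch of the corresponding INCAR change, assuming vasp.6.4.1 or newer; the NOMEGA value below is only a hypothetical starting point and has to be converged for your system:

Code: Select all

# minimal changes relative to the INCAR posted above; all other tags unchanged
ALGO    = EVGW0R   # low-scaling GW along the imaginary axis (vasp.6.4.1 or newer)
NOMEGA  = 12       # hypothetical starting value for the imaginary-frequency grid; converge this
NTAUPAR = 1        # as already set; smaller values reduce the memory footprint
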
If this doesn't help fix the problem, please upload a full error report containing the following files as a zip (or tarball):

Code: Select all

INCAR
KPOINTS
POSCAR
POTCAR
OUTCAR
stdout
jobscript
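
For example, such a report can be packed with something like the command below, where stdout and jobscript stand for whatever your output and submit-script files are actually called:

Code: Select all

tar -czvf error_report.tar.gz INCAR KPOINTS POSCAR POTCAR OUTCAR stdout jobscript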

awang95
Newbie
Posts: 41
Joined: Thu May 26, 2022 3:50 pm

Re: internal error in: mpi.F when running GW calculation

#3 Post by awang95 » Tue Apr 25, 2023 1:49 am

OK, I'll try that; thanks!
