internal error in SET_INDPW_FULL: insufficient memory

Posted: Wed Dec 06, 2017 10:31 am
by DannyVanpoucke
Dear VASPers,

I have recently run into a problem that has me quite puzzled. In my study I am comparing DFT and DFT+U optimized structures. In a second step, I use an HSE06 calculation to generate accurate electronic structures. Although the structures are visually the same (and have the same reported symmetry), one case (DFT+U optimized) gives me the error:

internal error in SET_INDPW_FULL: insufficient memory

The other case (DFT optimized), however, runs happily to the end.

Does anyone have experience with this error, and know how to resolve it or where it comes from?

The INCAR file used is this one:

general:
    SYSTEM = Cr2O3_ZnSub_NUDfree_h221
    ISTART = 0   
    ICHARG = 1 
    ISMEAR = 0  
    SIGMA = 0.05
    EDIFF = 1.0E-4
    PREC = Accurate
    ENCUT = 600
    LWAVE = .TRUE.
    LCHARG = .TRUE.
    LVTOT = .FALSE.
    LVHAR = .FALSE.
    ISPIN =  2   
    VOSKOWN = 1 
    LASPH = .TRUE. 
dynamic:
    IBRION = -1 
    NSW = 0      
parallel:
   LPLANE = .TRUE.
   NPAR = 2 ; KPAR = 2
dos-properties
      LORBIT = 11 
      EMIN = -70  
      EMAX = 35  
      NEDOS = 10500 
magnetic properties for HSE06:
    !NUPDOWN = free
    !ISYM = 0 
     MAGMOM = 2.0 3*4.0 4*-4.0 4*4.0 4*-4.0 4*4.0 4*-4.0 4*4.0 4*-4.0 4*4.0 4*-4.0 4*4.0 4*-4.0  72*0.0
         LORBIT = 11
         LMAXMIX = 4 
    AMIX = 0.2
    BMIX = 0.0001
    AMIX_MAG = 0.8
    BMIX_MAG = 0.0001
HSE06 parameters:
    ALGO = A
    LHFCALC = .TRUE.
    HFSCREEN = 0.2
    AEXX = 0.25
    AGGAX = 0.75
    AGGAC = 1.0
    ALDAC = 1.0
    PRECFOCK = N
    NKRED = 1

    NELM = 30
Thank you.
Danny

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Fri Dec 08, 2017 11:14 am
by alex
Hi Danny,

it looks like you are running into trouble with the k-point set.
- Do you have a hexagonal cell? Cr2O3 suggests so...
- Did you center the mesh at the gamma point?
Otherwise you might create a HUGE number of k-points, because you are off symmetry with the set.

Cheers,

alex

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Wed Dec 13, 2017 9:51 am
by DannyVanpoucke
Hi Alex,

Yes, indeed the cell is hexagonal, and yes the k-points are Gamma centered.
The strange thing is that the same calculation for DFT(+U) works fine. (If I had gotten the same error there, I would have been less amazed.)

Danny

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Thu Dec 14, 2017 10:52 am
by alex
Hi Danny,

maybe your HSE06 calculation simply does not fit into your machine. Please allow at least 2GB/process. Better 4GB. You'll see the requirements growing rapidly after the initial iterations are done.

Some other tags I'm puzzled about: NPAR and KPAR. I never used them, because exact exchange calculations did not allow NPAR .le. number of cores. And with KPAR I simply don't know. I'm also using a slightly older version of VASP.

Hth,

alex

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Fri Dec 15, 2017 10:11 am
by DannyVanpoucke
Hi Alex,

Indeed, memory was one of the first things I checked (and I have more than 2 GB/core available). However, note that this calculation doesn't even start a first iteration, while an almost identical one runs to the end without issues. NPAR and KPAR are set to make sure such a calculation can run at all. As for KPAR: parallelisation over k-points works ridiculously well for hybrids in my experience with VASP. And indeed, NPAR is suggested to be set equal to the number of cores, but I have found it can also be set to 1/node (which gives an enormous speedup), and that seems to work fine.

Thank you for your suggestions.

Danny
PS: I have received a solution to the problem from the vaspmasters. Apparently the symmetry was slightly broken in this system, and the issue is resolved by setting SYMPREC to a lower value.

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Sun Dec 24, 2017 7:42 am
by mwistey
SYMPREC does not fix this for me. Nor does more memory, even on a 1.5 TB machine with 20 GB per core (!). It doesn't matter whether I set SYMPREC to 1E-3 or 1E-7, or whether I change KPOINTS from 2x2x2 to 3x3x3 (auto generated, including Gamma). It doesn't matter if my system is perfectly symmetric (Ge diamond lattice) or slightly off (1 C atom, 127 Ge atoms). Is this a VASP bug, or am I just missing something obvious?

INCAR is posted below.

ISTART=0 # 1=resume from previous WAVECAR. 0=start from scratch. BandUP says use 0 in step_1.
#INIWAV=1 # always use default 1. (0 would be jellium)
ICHARG=2 # 1=read and update CHGCAR (self-consistent). 11=read CHGCAR *and don't change it* (i.e. non-self-consistent). 2=take superposition of atomic charge densities but allow it to change.
LHFCALC=.TRUE. # By default, this will be PBE0 using PBE pseudopotentials
HFSCREEN=0.2 # this turns on HSE06; HFSCREEN=0 is PBE0
PRECFOCK=Fast # can use Normal or Accurate for higher quality, but cost goes up considerably
#NKRED=2 # speeds up calc, but lower quality; ok for perfect crystals with a gap and enough kpoints, questionable for Ge, so I (CAS) turned it off here (commented out)
#SYMPREC=1E-7 # Tried values from 1E-3 to 1E-7
ISMEAR=0 # 0=Gauss for semi's. -5=tetra (small cells). >0=(metals only).
SIGMA=0.05 # width of smearing, eV.
#ALGO=All # alternative to ALGO/TIME below, which work well for this system
TIME=0.5
ALGO=Damped # rec'd for small gap systems
LDIAG=.TRUE. # might want to turn this on; see manual
PREC=Accurate
LREAL=.FALSE. # Don't change this for subsequent calcs.
ENCUT=520 # Don't change this for subsequent calcs.
EDIFF=1E-9
NGX=126 # Prevent aliasing. Based on 12/19/2017 runs.
NGY=126
NGZ=126
NSW=0 # If we don't converge in 10 steps, we probably won't
IBRION=-1 #1=update ion positions. #-1=no update but re-optimize e- degrees of freedom each NSW loop. If no ionic update is required use NSW=0 instead.
NBANDS=784
NELMIN=3 # minimum number of electronic steps
NELM=3 # maximum number of electronic steps unless EDIFF reached
NSIM=4 #
NPAR=28 # CPUs per group. 28 for LEAP, 24 for Comet. 8=sqrt(72) on LEAP himem.
# KPAR=2 # How to parallelize k points
LASYNC=.FALSE. # reduces inter-node communication
LWAVE=.TRUE. # BandUP doesn't need WAVECAR, but VASP doesn't resume a job properly without it; starts over from scratch.
LCHARG=.TRUE. # BandUP needs CHGCAR

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Wed Apr 25, 2018 9:51 pm
by mwistey
If anyone is still tracking this, this bug is reproducible and still in the wild. This week it killed 22 out of roughly 60 of my large simulation jobs. If anyone finds a workaround or code fix, please post it!

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Sat May 05, 2018 9:41 pm
by DannyVanpoucke
@mwistey: Two more options that could help (I just had a fight with an HSE system that was not resolved by the method above either):
1) Make sure KPAR & NPAR together give NPAR = # cores per k-point group. Let me explain: assume you have 5 nodes of 28 cores each, and 75 k-points. If you set KPAR=20, you end up with 7 cores per k-point group, so that means NPAR=7. [fixed my current problem]

2) Switch off symmetry: ISYM = 0 [that would have been my next try, if 1 didn't work]
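
The arithmetic in option 1 can be sketched in a few lines of Python. This is only an illustration of the bookkeeping described there: the helper name `cores_per_kpoint_group` is made up for this example, and it assumes one MPI rank per core and a KPAR that divides the total core count evenly.

```python
def cores_per_kpoint_group(nodes: int, cores_per_node: int, kpar: int) -> int:
    """Return the number of cores that share one k-point group.

    Hypothetical helper, not part of VASP. Assumes one MPI rank per core.
    """
    total_cores = nodes * cores_per_node
    if total_cores % kpar != 0:
        raise ValueError(
            f"KPAR={kpar} does not divide the {total_cores} available cores"
        )
    return total_cores // kpar


# The example from option 1: 5 nodes of 28 cores each, KPAR=20.
npar = cores_per_kpoint_group(nodes=5, cores_per_node=28, kpar=20)
print(f"NPAR = {npar}")  # 140 cores / 20 groups = 7
```

A KPAR that does not divide the core count raises an error here, which is a choice made for this sketch rather than a statement about how VASP itself reacts.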

Good luck.

Danny

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Fri Dec 21, 2018 10:01 pm
by mwistey
If anyone is still running into this problem, I found that running the job on a different number of nodes may allow the job to run. So it may be a bug in the way VASP counts the tasks that are assigned to each CPU or node.

@DannyVanpoucke: Thanks for the suggestions. I did verify NPAR = # cores, so it wasn't that. I didn't try the ISYM=0 yet. A few of my jobs don't seem to run with either 1 more or 1 fewer nodes, so I'll try that next.

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Tue Jul 06, 2021 8:15 am
by rolando_saniz1
Dear all,

We found (more precisely, Maxime van den Bossche from the support team at KU Leuven) a fix for the problem referred to in the subject.

The complete error message hints at where the problem lies ("see wave.F safeguard").

The error can be avoided by slightly increasing the safeguard for the number of plane waves in wave.F. Specifically, in line 1897 of src/wave.F (in VASP 6.1.2), Maxime changed the increment added to the NRPLWV variable from 8, as in the original version, to 16.

Maxime's explanation is that in wave.F, an estimate of the number of plane waves seems to be calculated by each MPI process, and then the maximum of these numbers is taken and broadcast. Then each process increments this number by 8. The modification simply consists of incrementing the resulting number by another 8. Note that the problem is not a lack of RAM, but that the size of an allocated array appears to be insufficient in some cases.
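
The mechanism described above can be illustrated with a toy model in Python. This is not the wave.F code: the function names and all plane-wave counts below are invented for illustration, and the only thing taken from the explanation is the scheme "largest per-process estimate plus a fixed safeguard".

```python
def padded_allocation(estimates: list[int], safeguard: int = 8) -> int:
    """Array size every process allocates: largest estimate plus a safeguard.

    Toy model of the explanation above; not the actual wave.F code.
    """
    return max(estimates) + safeguard


def fits(actual_counts: list[int], allocated: int) -> bool:
    """True if every process's real plane-wave count fits in the allocation."""
    return all(n <= allocated for n in actual_counts)


# Made-up numbers: four processes estimate their plane-wave counts...
estimates = [1000, 1004, 998, 1002]
# ...but one of them ends up needing slightly more than estimated.
actual = [1000, 1004, 1015, 1002]

print(fits(actual, padded_allocation(estimates, safeguard=8)))   # False
print(fits(actual, padded_allocation(estimates, safeguard=16)))  # True
```

In this model the failure has nothing to do with available RAM: the array is simply sized from an estimate that turns out to be a few elements too small, and a larger safeguard absorbs the difference.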

Note that one can try to avoid the problem by changing the NPAR and KPAR settings. For a given number of cores, we found that while for some settings the allocated array can be insufficient, it might be sufficient for other settings. Of course, the settings one is forced to use in that case might not be optimal. Changing the safeguard value can provide a more general solution.
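
To try alternative settings systematically, one could list the candidate KPAR/NPAR combinations for a given core count, for instance with the Python sketch below. The function `parallel_layouts` is hypothetical, and the divisibility rules it applies are assumptions of this sketch rather than rules stated in this thread. Which of the listed combinations actually avoids the error still has to be found by running the job.

```python
def parallel_layouts(total_cores: int, n_kpoints: int) -> list[tuple[int, int]]:
    """List (KPAR, NPAR) pairs to try for a given core count.

    Assumptions of this sketch (not rules taken from the thread):
    KPAR divides the core count and does not exceed the number of
    k-points, and NPAR divides the cores in each k-point group.
    """
    layouts = []
    for kpar in range(1, min(total_cores, n_kpoints) + 1):
        if total_cores % kpar:
            continue  # skip KPAR values that leave cores unassigned
        group = total_cores // kpar  # cores working on one k-point group
        for npar in range(1, group + 1):
            if group % npar == 0:
                layouts.append((kpar, npar))
    return layouts


# Example with made-up numbers: 8 cores and 4 k-points.
for kpar, npar in parallel_layouts(total_cores=8, n_kpoints=4):
    print(f"KPAR = {kpar} ; NPAR = {npar}")
```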

Hope the above helps.

Best regards,
Rolando

Re: internal error in SET_INDPW_FULL: insufficient memory

Posted: Thu Jul 08, 2021 10:21 am
by henrique_miranda
Thank you for reporting this.

Adding +8 to the first dimension of the WF is done to prevent this error from occurring.
If this is not enough in some cases, then increasing it further to +16, as you point out, can fix the issue.

But this is admittedly not the optimal solution and we will try to implement an approach that guarantees that this never occurs in a future version of the code.

It would be very helpful for us if you could provide the INCAR, POSCAR, KPOINTS, and OUTCAR files from a run where this issue occurred.
This is so that we can reproduce the issue locally (the OUTCAR contains information about the number of cores) and check if the new implementation fixes the problem.

Kind regards,
Henrique Miranda