Error "EDDDAV: Call to ZHEGV failed." with AM05

aldo_ugolotti
Newbie
Posts: 1
Joined: Thu May 04, 2023 1:06 pm

#1 Post by aldo_ugolotti » Tue May 23, 2023 8:04 am

Dear developers,

I am trying to run some slab calculations using the AM05 exchange-correlation functional on an HPC machine; however, despite all my efforts I always encounter the error "EDDDAV: Call to ZHEGV failed. Returncode = XXX".
I have already tried the solutions suggested in the following thread on the forum (forum/viewtopic.php?t=10409), but none of them worked.

I have made additional tests, and this is what I could observe:

-- the error depends strongly on the choice of xc functional: the same input files run without problems with a different functional, e.g. PBEsol; moreover, the same AM05 setup works fine for bulk calculations;
-- changing the parallelization details, both on the Slurm side (MPI/OpenMPI, number of processes, with/without GPU) and on the INCAR side (NPAR, KPAR), doesn't solve the issue, but it changes the XXX code reported in the error message;
-- different INCAR setups (diagonalization scheme, plane-wave cutoff, etc.) end up with the same error;
-- using different pseudopotentials or different VASP versions (6.3 or 6.4) doesn't solve the issue;
-- the starting geometry seems to be free of systematic errors such as too-short distances; increasing the lattice scaling factor does not avoid the issue;
-- starting from converged wavefunctions/charge densities (obtained with another xc functional) still produces the error;
-- I get this error using VASP on the Italian Cineca HPC machines, on every one of their clusters (marconi, marconi100, galileo100). I have been in touch with their support, and they ruled out that the error is related to a library issue (e.g. ScaLAPACK, as suggested in the forum);
-- the error is not always generated at the same step: sometimes it appears at the beginning of the first diagonalization and sometimes after several electronic (or even ionic) steps. This depends strongly on the machine. When the issue doesn't appear immediately, however, the total energy grows almost exponentially during the first electronic steps; correspondingly, some energies in the OUTCAR are printed as a sequence of *****.
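
A minimal sketch for locating such ***** fields in an output file, assuming they show up as runs of three or more asterisks (which is how Fortran formatted output typically marks a number that does not fit its field). The helper name find_overflow_lines and the sample text are made up for illustration and are not real VASP output; decorative lines made of asterisks would also be flagged.

```python
import re

# Assumption: an overflowed numeric field is printed as a run of >= 3 asterisks.
OVERFLOW = re.compile(r"\*{3,}")


def find_overflow_lines(text):
    """Return (line_number, line) pairs for lines containing an asterisk run."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(text.splitlines(), start=1)
        if OVERFLOW.search(line)
    ]


# Illustrative sample only (hypothetical text, not copied from an OUTCAR).
sample = "energy = -1.0\nenergy = ********\n"
for number, line in find_overflow_lines(sample):
    print(f"line {number}: {line}")
```

To use it on a real file, read the file contents into a string (for example with open("OUTCAR").read()) and pass that string to find_overflow_lines; the first reported line number indicates where the energies started to blow up.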

I am attaching a set of input/output files so that you can look into the issue. Any help in solving it would be much appreciated.
Thank you in advance,

Aldo Ugolotti

martin.schlipf
Global Moderator
Posts: 455
Joined: Fri Nov 08, 2019 7:18 am

Re: Error "EDDDAV: Call to ZHEGV failed." with AM05

#2 Post by martin.schlipf » Thu May 25, 2023 8:01 am

Unfortunately, there is no guarantee that this problem has a solution. Judging from the list of things you already tried, there is not much left to try. The only real advice I can give is to reduce the problem size (i.e. use a smaller system and set fewer INCAR tags) to figure out whether there is some way to make it work. I will still mention a few points, including some that you already tried according to your write-up, so that you can compare your approach with the suggestions:
  • In the current setup, you set an ENCUT that is smaller than the ENMAX of the POTCAR files. This is not recommended and is smaller than the default. Increasing ENCUT above ENMAX may lead to a better description of the system.
  • I also noticed that NBANDS is set to just the minimum required. Sometimes adding a few extra bands improves convergence because the basis becomes a bit more flexible.
  • Different ALGO choices should affect the convergence. In particular, you can try the direct optimization algorithms (ALGO = A or ALGO = D).
  • Your k-point mesh is very dense. I don't know whether the system is supposed to be metallic in plane, but I would not use such a dense mesh while you are still struggling to make the system converge.
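
The first point can be checked mechanically. The following is a minimal sketch, assuming the POTCAR contains one "ENMAX = <value>;" entry per species and that ENCUT is set explicitly in the INCAR; the function names are hypothetical helpers, not part of VASP, and the numbers in the example strings are placeholders rather than values from the attached files.

```python
import re

ENMAX_RE = re.compile(r"ENMAX\s*=\s*(\d+(?:\.\d*)?)")
ENCUT_RE = re.compile(r"\bENCUT\s*=\s*(\d+(?:\.\d*)?)", re.IGNORECASE)


def largest_enmax(potcar_text):
    """Largest ENMAX found in the POTCAR text, or None if there is none."""
    values = [float(v) for v in ENMAX_RE.findall(potcar_text)]
    return max(values) if values else None


def encut_from_incar(incar_text):
    """First ENCUT value in the INCAR text, ignoring '#' and '!' comments."""
    for line in incar_text.splitlines():
        code = re.split(r"[#!]", line, maxsplit=1)[0]
        match = ENCUT_RE.search(code)
        if match:
            return float(match.group(1))
    return None


def encut_is_below_enmax(incar_text, potcar_text):
    """True if ENCUT < largest ENMAX; None if either value is missing."""
    encut = encut_from_incar(incar_text)
    enmax = largest_enmax(potcar_text)
    if encut is None or enmax is None:
        return None
    return encut < enmax


# Placeholder inputs for illustration only.
potcar = "   ENMAX  =  400.000; ENMIN  =  300.000 eV\n"
incar = "ALGO = Normal\nENCUT = 300  ! cutoff in eV\n"
print(encut_is_below_enmax(incar, potcar))
```

If the check returns True, ENCUT is below the largest ENMAX and raising it is the first thing to test. A result of None only means that one of the two values could not be found by this simple parser.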