VASP 6.2 ACC GPU corrector takes much more time

Posted: Wed Feb 10, 2021 4:18 pm
by david_keller
Is there a reason why the "CORREC" and "EDDIAG" sections would take longer in a run using a GPU versus an identical run using only CPUs?
We are finding the OpenACC GPU version to be beneficial for run times, with the exception of the time spent in the "CORREC" (corrector) section and the "EDDIAG" section.

-------------------------------------- Iteration 1( 11) ---------------------------------------

POTLOK: cpu time 0.0137: real time 0.0137
SETDIJ: cpu time 0.7244: real time 0.7261
TRIAL : cpu time 1.5328: real time 1.5365
CORREC: cpu time 1.4981: real time 1.5017
EDDIAG: cpu time 0.3957: real time 0.3967
CHARGE: cpu time 0.3385: real time 0.3393

POTLOK: cpu time 0.0149: real time 0.0150
SETDIJ: cpu time 0.0562: real time 0.0563
TRIAL : cpu time 1.6162: real time 1.6201
CORREC: cpu time 0.9915: real time 0.9940
EDDIAG: cpu time 0.7746: real time 0.7817
CHARGE: cpu time 0.1655: real time 0.1660

Perhaps this section does not run on the GPU yet?
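
For anyone who wants to compare timing blocks like the two above section by section, here is a minimal Python sketch. It assumes the timing lines have exactly the "NAME: cpu time X: real time Y" layout shown in this post. The helper names parse_timings and real_time_ratios are made up for illustration and are not part of VASP. The post does not say which block is the GPU run, so the blocks are simply called "first" and "second".

```python
import re

# Matches lines such as "CORREC: cpu time 1.4981: real time 1.5017"
# (also tolerates the space before the colon in "TRIAL :").
TIMING_LINE = re.compile(
    r"^\s*(\w+)\s*:\s*cpu time\s+([\d.]+):\s*real time\s+([\d.]+)"
)


def parse_timings(text):
    """Return {section: (cpu_seconds, real_seconds)} for one timing block."""
    timings = {}
    for line in text.splitlines():
        match = TIMING_LINE.match(line)
        if match:
            name, cpu, real = match.groups()
            timings[name] = (float(cpu), float(real))
    return timings


def real_time_ratios(block_a, block_b):
    """Ratio of real times (block_a / block_b) for sections present in both."""
    return {
        name: block_a[name][1] / block_b[name][1]
        for name in block_a
        if name in block_b
    }


# The two timing blocks copied from the post, in the order they appear.
FIRST_BLOCK = """\
POTLOK: cpu time 0.0137: real time 0.0137
SETDIJ: cpu time 0.7244: real time 0.7261
TRIAL : cpu time 1.5328: real time 1.5365
CORREC: cpu time 1.4981: real time 1.5017
EDDIAG: cpu time 0.3957: real time 0.3967
CHARGE: cpu time 0.3385: real time 0.3393
"""

SECOND_BLOCK = """\
POTLOK: cpu time 0.0149: real time 0.0150
SETDIJ: cpu time 0.0562: real time 0.0563
TRIAL : cpu time 1.6162: real time 1.6201
CORREC: cpu time 0.9915: real time 0.9940
EDDIAG: cpu time 0.7746: real time 0.7817
CHARGE: cpu time 0.1655: real time 0.1660
"""

first = parse_timings(FIRST_BLOCK)
second = parse_timings(SECOND_BLOCK)
ratios = real_time_ratios(first, second)

for name, ratio in ratios.items():
    print(f"{name:7s} first/second real-time ratio: {ratio:.2f}")
```

A ratio above 1 means the section took longer in the first block, and below 1 means it took longer in the second. To compare whole runs, the same parser could be applied to the timing lines of a full OUTCAR, summing the times per section over all iterations.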

Re: VASP 6.2 ACC GPU corrector takes much more time

Posted: Wed Apr 21, 2021 4:13 pm
by mmarsman

Could you provide the inputs for this job: a tgz with the INCAR, KPOINTS, POSCAR, and an OUTCAR file?
And could you tell me what hardware you are running this on, and with how many MPI ranks? Then I'll have a look!
From the timings it seems this is a rather small workload.
For small systems it is not unexpected that (parts of) the code will actually be slower on GPU than on CPU.
But without the inputs that is speculation on my part, of course.

The parts you are looking at (TRIAL and CORREC) are both ported to GPU, and I agree it is remarkable that one is appreciably slower on GPU than on CPU, and the other is not. I will try to find out what causes this.


Re: VASP 6.2 ACC GPU corrector takes much more time

Posted: Mon May 10, 2021 5:01 pm
by david_keller
Run Directory attached.