
vasp 6.4.3 compiles successfully, but make test fails.

Posted: Mon Apr 08, 2024 7:05 am
by Yong ZUO
VASP 6.4.3 compiles successfully, but make test fails. Environment: gcc 11.4.0, aocc/aocl 4.1.0, openmpi 4.15. CPU: AMD EPYC 9754.

Re: vasp 6.4.3 compiles successfully, but make test fails.

Posted: Mon Apr 08, 2024 9:57 am
by michael_wolloch
Dear Yong Zuo,

great that you had no issue with the compilation. When I look at your testsuite.log file, I see that you are running as a root user:
mpirun has detected an attempt to run as root.

Running as root is *strongly* discouraged as any mistake (e.g., in
defining TMPDIR) or bug can result in catastrophic damage to the OS
file system, leaving your system in an unusable state.

We strongly suggest that you run mpirun as a non-root user.

You can override this protection by adding the --allow-run-as-root option
to the cmd line or by setting two environment variables in the following way:
the variable OMPI_ALLOW_RUN_AS_ROOT=1 to indicate the desire to override this
protection, and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 to confirm the choice and
add one more layer of certainty that you want to do so.
We reiterate our advice against doing so - please proceed at your own risk.
As a result, mpirun stops execution immediately, VASP produces no output, and the tests fail. I assume you did not set either of the environment variables mentioned, and I see that you did not use the command line option "--allow-run-as-root".

I would suggest that you re-run the test suite as a non-root user. If it still does not work, please get back to us with some additional info: your makefile.include and the output of printenv.
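The check and the information requested above can be sketched in a few lines of shell. This is only a minimal sketch: the helper name is_root_uid and the output file name printenv.txt are made up for illustration and are not part of VASP or Open MPI.

```shell
# Returns 0 (true) when the given numeric user ID belongs to root.
# is_root_uid is a hypothetical helper, not a VASP or Open MPI tool.
is_root_uid() {
    [ "$1" -eq 0 ]
}

if is_root_uid "$(id -u)"; then
    # Open MPI's mpirun refuses to start as root unless explicitly overridden.
    echo "Running as root: switch to a regular user before 'make test'." >&2
else
    # Save the environment so it can be attached together with makefile.include.
    # printenv.txt is an arbitrary file name chosen for this sketch.
    printenv > printenv.txt
    echo "Environment written to printenv.txt"
fi
```

Run it from the directory where you call make test, so the captured environment matches the one the test suite sees.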

All the best, Michael

Re: vasp 6.4.3 compiles successfully, but make test fails.

Posted: Tue Apr 09, 2024 2:35 am
by Yong ZUO
Dear Michael Wolloch
Thank you for your reply.
I successfully recompiled it as a non-root user, but make test still reports an error. Could you take another look? The attachment contains the information you asked for.
Best regards
Yong Zuo

Re: vasp 6.4.3 compiles successfully, but make test fails.

Posted: Tue Apr 09, 2024 12:25 pm
by michael_wolloch
Hello again,

the issue that you have now:

Code: Select all

VERY BAD NEWS! internal error in subroutine SGRGEN: Too many elements 49
has been seen before for 6.4.2, and I could reproduce it with 6.4.3 using AOCC 4.0.0, AOCL 4.0, and openMPI 4.1.3. With AOCC 3.2.0, AOCL 3.1, and openMPI 4.1.2 the code runs as expected. It seems that since AOCC 4.0 the compiler aggressively optimizes the SGRGEN routine in a way that breaks something.

There are three options to try and fix this issue:

1) Add symlib.o to the OBJECTS_O1 line in your makefile.include such that it reads:

Code: Select all

OBJECTS_O1 += fftw3d.o fftmpi.o fftmpiw.o symlib.o
This was suggested by user huangjs and fixes the problem.

2) Compile the code with OpenMP support, using arch/makefile.include.aocc_ompi_aocl_omp as your makefile.include, and uncomment the fftlib section at the end. This works for me (again using AOCC 4.0.0, AOCL 4.0, and openMPI 4.1.3) without issues. You can still run the resulting executable MPI-only without much overhead by specifying only one OpenMP thread per MPI rank (see this wiki article for more info).

3) Use the GNU compilers instead of AOCC, while still taking advantage of the AOCL libraries on your EPYC machine. The makefile.include.gnu_ompi_aocl template is appropriate.
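For options 1 and 2, a rough shell sketch of what to check and how to launch might look like the following. The helper name has_symlib_in_o1, the rank count, and the path ./bin/vasp_std are assumptions made for illustration; OMP_NUM_THREADS is the standard OpenMP environment variable. Adjust everything to your own build and machine.

```shell
# Option 1: returns 0 when an OBJECTS_O1 line in the given file lists symlib.o.
# has_symlib_in_o1 is a hypothetical helper name.
has_symlib_in_o1() {
    grep -Eq '^[[:space:]]*OBJECTS_O1[[:space:]]*\+?=.*symlib\.o([[:space:]]|$)' "$1"
}

MAKEFILE=makefile.include   # assumed: run from the VASP root directory
if [ -f "$MAKEFILE" ]; then
    if has_symlib_in_o1 "$MAKEFILE"; then
        echo "symlib.o is listed in OBJECTS_O1"
    else
        echo "symlib.o is missing from OBJECTS_O1" >&2
    fi
fi

# Option 2: run the OpenMP-enabled build with one OpenMP thread per MPI rank.
export OMP_NUM_THREADS=1
NRANKS=4                    # hypothetical rank count
VASP_BIN=./bin/vasp_std     # assumed location of the executable

if command -v mpirun >/dev/null 2>&1 && [ -x "$VASP_BIN" ]; then
    mpirun -np "$NRANKS" "$VASP_BIN"
else
    echo "Would run: OMP_NUM_THREADS=$OMP_NUM_THREADS mpirun -np $NRANKS $VASP_BIN"
fi
```

Note that changing OBJECTS_O1 only takes effect once symlib.o has been recompiled, so rebuild after editing makefile.include.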