Hello!
Thanks for uploading a larger batch of your training runs! We have finally (most likely) found the origin of the discrepancies between your DFT reference data and the ML predictions. In your training runs you use Grimme DFT-D2 van der Waals corrections (INCAR tag IVDW = 10). These corrections are computed whenever an ab initio calculation is performed during training, and you can find the corresponding energy contributions in sections like this in the OUTCAR file:
Code:
Number of pair interactions contributing to vdW energy: 2057320
Edisp (eV): -44.21481
FORVDW: cpu time 0.1362: real time 0.1367
The vdW corrections are automatically added to the energies, forces, and stress before the data is fed into the machine-learning routines. Hence, the ML force field is trained on a potential energy landscape that includes the vdW corrections, and as a consequence its predictions automatically include them as well. However, it appears that you forgot them when you created your test set. In the directories under "Error Analysis/dfterror/run_data" (from your first upload) you performed individual ab initio calculations for the 50 test structures. Unlike in all training runs, the tag
Code:
IVDW = 10
is missing from the INCAR files (the OUTCAR files summarize the INCAR contents). Therefore, there is now a mismatch between the settings for the reference and the predicted data:
- ab initio reference data without vdW-corrections
- ML predictions with vdW-corrections (implicitly added because enabled during training)
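The mismatch above amounts to a nearly constant offset of Edisp between the two data sets. A minimal Python sketch of how the shift reconciles them (all numbers below are made up for illustration; only the Edisp format comes from the OUTCAR excerpt above):

```python
# Hypothetical values for illustration only: a test-set DFT reference
# energy computed WITHOUT IVDW = 10, the corresponding ML prediction
# (which implicitly contains the vdW correction via the training data),
# and an Edisp value as reported in a training OUTCAR.
e_ref_no_vdw = -350.70   # eV, test-set DFT run missing IVDW = 10
e_ml_pred    = -394.90   # eV, ML prediction (vdW included)
e_disp       = -44.21    # eV, "Edisp (eV)" line from OUTCAR

# Naive comparison: the error looks huge, dominated by the missing Edisp.
raw_error = e_ml_pred - e_ref_no_vdw
print(f"apparent error: {raw_error:.2f} eV")

# After adding the vdW correction to the reference, the two agree
# to within the true ML error.
corrected_error = e_ml_pred - (e_ref_no_vdw + e_disp)
print(f"corrected error: {corrected_error:.2f} eV")
```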
You will need to recompute the ab initio data for your test set. Since the vdW correction shifts the energies by roughly -44 eV, you should find that the recomputed reference energies move the points in your energy plot to the correct position:
energy-shift.png
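To check which of your runs actually contain the correction, you can simply search each OUTCAR for the "Edisp (eV):" line quoted above (it is only printed when the dispersion correction is active). A small Python sketch; the directory path in the commented usage is just the one from your upload and may need adjusting:

```python
import re

# The "Edisp (eV):" line is printed whenever the DFT-D2 correction
# (IVDW = 10) is active, so its absence in an OUTCAR indicates the run
# was performed without vdW corrections.
EDISP_RE = re.compile(r"Edisp \(eV\):\s*(-?\d+\.\d+)")

def edisp_from_outcar(text):
    """Return the last Edisp value found in an OUTCAR text, or None."""
    matches = EDISP_RE.findall(text)
    return float(matches[-1]) if matches else None

# Example usage over the test-set directories (path is illustrative):
# from pathlib import Path
# for outcar in Path("Error Analysis/dfterror/run_data").glob("*/OUTCAR"):
#     print(outcar.parent.name, edisp_from_outcar(outcar.read_text()))
```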
We hope that this will fix the main issue for you!
While looking through your data we found two more things worth mentioning:
- In your POSCAR files you have defined an orthorhombic lattice. However, in your ICONST file you use a setup for fixing the shape of a cubic cell instead. This corresponds to item (3) in this list here, while you should actually be using number (4). To be honest, we are not sure whether this really causes problems (your lattice seems to change normally along the trajectory), but to avoid any issues please use the recipe for orthorhombic lattices instead. Or, alternatively, use a cubic lattice.
- Your ML-prediction-only runs (ML_MODE = run, ML_ISTART = 2) were executed on 240 to 480 cores, just like the ab initio calculations. However, since the ML force field method is computationally much less demanding, this is a huge waste of CPU hours. Please perform a separate benchmark for the ML-only runs, starting from a single core (you may want to set ML_OUTBLOCK and ML_OUTPUT_MODE = 0). You will probably see no further decrease in execution time beyond 10 to 20 cores; using even more cores will not speed up your simulations, it will only increase the communication overhead and consume extra energy and CPU hours.
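A quick way to evaluate such a benchmark is to compute the parallel efficiency at each core count and stop scaling once it drops too low. A sketch with made-up wall times (replace the dictionary with your own measured numbers; the 50% threshold is just one reasonable choice):

```python
# Hypothetical wall times (seconds) for the same ML-only run
# (ML_MODE = run) at different core counts; substitute your measurements.
timings = {1: 1200.0, 2: 620.0, 4: 330.0, 8: 190.0,
           16: 130.0, 32: 120.0, 64: 118.0}

def parallel_efficiency(timings):
    """Speedup relative to the 1-core run, divided by the core count."""
    t1 = timings[1]
    return {n: t1 / (n * t) for n, t in sorted(timings.items())}

# Pick the largest core count whose efficiency is still above 50%:
eff = parallel_efficiency(timings)
good = max(n for n, e in eff.items() if e >= 0.5)
print(f"sweet spot: {good} cores")
```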
All the best,
Andreas Singraber