Using metadynamics to train a machine-learned force field

It can be tricky to model transition states using static methods methods. Sometimes, this is insufficient and more time-consuming dynamic methods must be used. By using advance MD methods in combination with machine-learned force fields, the cost can be significantly reduced. Metadynamics is one such method, applying a biased potential on selected geometric parameters, to describe rare events. This tutorial is based on an older metadynamics how-to. The POSCAR, POTCAR, KPOINTS, ICONST, and PENALTYPOT files can be found there (or downloaded in a zip file at the end of this tutorial).

Input

INCAR.mlff

PREC=Normal
EDIFF=1e-6
LWAVE=.FALSE.
LCHARG=.FALSE.
NELECT=22
NELMIN=4
LREAL=.FALSE.
ALGO=VeryFast
ISMEAR=0
SIGMA=0.05

POMASS = 12.011 4.0 35.453 

############################# MD setting #####################################
IBRION=0                                           # MD simulation
NSW=40000                                          # number of steps
POTIM=1                                            # integration step
TEBEG=600                                          # simulation temperature
MDALGO=1                                           # metaDynamics with Andersen thermostat
ANDERSEN_PROB=0.05                                 # collision probability
HILLS_BIN=50                                       # update the time-dependent bias
                                                             # potential every 50 steps
HILLS_H=0.0005                                     # height of the Gaussian
HILLS_W=0.05                                       # width of the Gaussian
##############################################################################
RANDOM_SEED = 311137787 0 0 

# MLFF tags
ML_LMLFF         = .TRUE.
ML_MODE          =  TRAIN
ISIF=2

INCAR.refit

# MLFF tags
ML_LMLFF         = .TRUE.
ML_MODE          =  REFIT
ISIF=2

INCAR.md

POMASS = 12.011 4.0 35.453 

############################# MD setting #####################################
IBRION=0                                           # MD simulation
NSW=100000                                         # number of steps
POTIM=1                                            # integration step
TEBEG=600                                          # simulation temperature
MDALGO=1                                           # metaDynamics with Andersen thermostat
ANDERSEN_PROB=0.05                                 # collision probability
HILLS_BIN=50                                       # update the time-dependent bias
                                                             # potential every 50 steps
HILLS_H=0.005                                      # height of the Gaussian
HILLS_W=0.05                                       # width of the Gaussian
##############################################################################
RANDOM_SEED = 311137787 0 0 

# MLFF tags
ML_LMLFF         = .TRUE.
ML_MODE          =  RUN
ISIF=2

ML_ESTBLOCK = 20
ML_OUTBLOCK = 20

INCAR.interactive

PREC=Normal
EDIFF=1e-6
LWAVE=.FALSE.
LCHARG=.FALSE.
NELECT=22
NELMIN=4
LREAL=.FALSE.
ALGO=VeryFast
ISMEAR=0
SIGMA=0.05

POMASS = 12.011 4.0 35.453 

############################# MD setting #####################################
IBRION=0                                           # MD simulation
NSW=185                                            # number of steps
POTIM=1                                            # integration step
TEBEG=600                                          # simulation temperature
MDALGO=1                                           # metaDynamics with Andersen thermostat
ANDERSEN_PROB=0.05                                 # collision probability
HILLS_BIN=50                                       # update the time-dependent bias
                                                             # potential every 50 steps
HILLS_H=0.0005                                     # height of the Gaussian
HILLS_W=0.05                                       # width of the Gaussian
##############################################################################
RANDOM_SEED = 311137787 0 0 

# MLFF tags
ML_LMLFF         = .TRUE.
ML_MODE          = TRAIN
ML_ICRITERIA     = 0
ML_NMDINT        = 1
ML_MCONF         = 2000

ISIF             = 2
INTERACTIVE      = .TRUE.

Step-by-step instructions

Step 0: Training the MLFF (optional)

Required files: POSCAR, POTCAR, INCAR.mlff, KPOINTS, ICONST, PENALTYPOT

The first step is to generate the machine-learned force fields (MLFFs). Run enough steps to have explored a large portion of the configuration space of the reaction. This step will not sample many points around the transition state, which you can account for later. Use the INCAR.mlff file INCAR file for this step. We have included an ML_AB file in the zip file below, so you can skip this step.

Important: Make sure that there is the ICONST and PENALTYPOT files or the MD will not be metadynamics.

cd e00_training
vasp_std

Step 1: Refitting the MLFF

Required files: POSCAR, POTCAR, INCAR.refit, KPOINTS

Copy the ML_ABN from step 0 to ML_AB (or from the zip file) and refit. Use the INCAR.refit file for this.

cd e01_refit
cp ../e00_training/ML_ABN ML_AB
vasp_std

Step 2: Running the MD simulation

Required files: POSCAR, POTCAR, INCAR.md, KPOINTS, ICONST, PENALTYPOT

First, link to the MLFF that you have just generated: ln -s ../e01_refit/ML_FFN ML_FF. Then, run the molecular dynamics (MD) simulation using INCAR.md. Make sure that the PENALTYPOT file is present, as this is vital for containing the molecule and modeling the reaction using metadynamics. You will track the two C-Cl distances as collective variables using the ICONST file.

cd e02_MD
ln -s ../e01_refit/ML_FFN ML_FF
vasp_std

MD analysis

Once the metadynamics MD simulation is completed, plot the time evolution of the collective variables (cf. ICONST) using the timeEv.sh script and gnuplot:

bash ./timeEv.sh
gnuplot -e "set terminal jpeg; set xlabel 'timestep'; set ylabel 'Collective variable (Ang)'; set style data lines; plot 'timeEvol1.dat', 'timeEvol2.dat'" > timeEvol.jpg

The obtained time evolution of the collective variables looks like the following:

The Cl^- ion is initially more distant from the C atom. After a few thousand steps, the chloromethane molecule inverts as the chloride attacks and the opposing chlorine atom is expelled as a chloride. After a thousand more steps, the MLFF breaks down, and a non-physical structure is formed at which the MD simulation gets stuck.

Important: If the simulation does not fail, repeat the calculation, try increasing the number of ionic steps (NSW), until you run a simulation where the MLFF breaks down. Alternatively, decrease the number of NSW used for training the MLFF in the first place. This will train an MLFF with fewer structures, i.e., a worse force field that will break down sooner. We only recommend this in this exercise so that you can learn how to fix an MLFF.

Spilling factor

This is when the simulation leaves the configuration space of the MLFF training set. You can see this by checking the spilling factor in the ML_LOGFILE:

grep SFF ML_LOGFILE

# SFF ########################################################################################################
# SFF This line shows the spilling factor,
# SFF
# SFF nstep ............ MD time step or input structure counter
# SFF sfmax ............ Maximal spilling factor among atoms
# SFF sfmin ............ Minimal spilling factor among atoms
# SFF sfmean ........... Mean value of spilling factor
# SFF sfvar ............ Variance of spilling factor
# SFF threshold ........ Value of threshold criterion for learning
# SFF ########################################################################################################
# SFF               nstep            sfmax            sfmin           sfmean            sfvar        threshold
# SFF                   2                3                4
# SFF ########################################################################################################
SFF                   100   9.29675803E-06   2.13512863E-07   2.26596576E-06   1.01225022E-11   2.00000000E-03
SFF                   200   1.11874507E-06   4.26678995E-07   7.93844208E-07   5.02375989E-14   2.00000000E-03
SFF                   300   1.69448509E-06   2.46957156E-07   9.74621854E-07   2.15210308E-13   2.00000000E-03
...
SFF                 99960   9.99309893E-01   3.36556878E-03   3.40857829E-01   2.16239137E-01   2.00000000E-03
SFF                 99980   9.99464421E-01   5.99966976E-03   3.44462977E-01   2.14118191E-01   2.00000000E-03
SFF                100000   9.99581963E-01   5.29236322E-03   3.41837281E-01   2.15977938E-01   2.00000000E-03

We have set ML_OUTBLOCK = 20 so that only every 20th structure is included. Notice how sfmax goes to 1. This indicates that it is completely outside of the training set. You can plot this to visualize what happens and compare to the plot above. First, grep for the spilling factor SFF in the ML_LOGFILE:

grep '^SFF[[:space:]]*[0-9]' ML_LOGFILE | awk '{print $2 " " $3}' > sff.dat

Then plot sff.dat using Python:

import py4vasp
import numpy as np

step, spilling_factor = np.loadtxt("./e02_MD/sff.dat", unpack=True)

py4vasp.plot(step, spilling_factor, xlabel="MD step", ylabel="Spilling factor", label="Spilling factor")

Extracting new training structures

The structures after the spilling factor reaches 1 are not meaningful. However, those immediately before are very useful. These are where the MLFF begins to break down. You can extract these structures using the sff_grep.py script:

./sff_grep.py ML_LOGFILE XDATCAR 5E-4 0.8

This will extract from XDATCAR all of the structures in the configurations ML_LOGFILE that have a spilling factor between 5E-4 and 0.8 and output them to a POSCAR.interactive file. POSCAR.interactive can then be used in the next step to continue training the MLFF.

Important: POSCAR.interactive is a file containing the position of the ions for each configuration in direct coordinates. The POSCAR must still be included.

Step 4: Retrain the MLFF

Required files: POSCAR, POTCAR, INCAR.interactive, KPOINTS, ICONST, PENALTYPOT from old HILLSPOT

Take the POSCAR.interactive that you have generated in the preceding step, take the ML_AB file that you have already trained in step 0 (or refitted in step 1), along with the PENALTYPOT from step 2, and the INCAR.interactive file - making sure to update the NSW to the number of structures in POSCAR.interactive. Then run a calculation with the same POSCAR, POTCAR, and KPOINTS files from step 1, ensuring to input the POSCAR.interactive file into the VASP executable, enabled by setting INTERACTIVE = .TRUE., e.g.:

cd e03_interactive
cp ../e00_training/ML_ABN ML_AB
cp ../e00_training/HILLSPOT PENALTYPOT
vasp_std < POSCAR.interactive

Important: Ensure that you update NSW in the INCAR to be equal to the number of configurations in your POSCAR.interactive file.

Step 5: Refit the MLFF

Required files: POSCAR, POTCAR, INCAR.refit, KPOINTS

Refit the MLFF as before, copying the ML_ABN to ML_AB and refitting with INCAR.refit.

cd e04_new_refit
cp ../e03_interactive/ML_ABN ML_AB
vasp_std

Step 6: Repeat the MD simulation

Required files: POSCAR, POTCAR, INCAR.md, KPOINTS, ICONST, PENALTYPOT from old HILLSPOT

Copy your old biased potential (HILLSPOT file) from your first MD run to the PENALTYPOT in your new directory. This saves you a lot of time by enabling you to begin from where the previous calculation failed. Then, run the calculation as normal:

cd e05_new_MD
cp ../e02_MD/HILLSPOT PENALTYPOT
ln -s ../e04_new_refit/ML_FFN ML_FF
vasp_std

MD analysis

Once the calculation is finished, take a look at the time evolution of the collective variables using timeEv.sh script and gnuplot:

bash ./timeEv.sh
gnuplot -e "set terminal jpeg; set xlabel 'timestep'; set ylabel 'Collective variable (Ang)'; set style data lines; plot 'timeEvol1.dat', 'timeEvol2.dat'" > timeEvol.jpg

The obtained time evolution of the collective variables should look like the following:

File:updated inversions.jpeg

Rather than getting stuck after a few thousand steps, this new MLFF is much more stable and can keep on inverting back and forth over several thousand steps.

You can clearly see how often the green and purple lines switch, i.e., the inversion reaction occurs. This indicates that we are accurately modeling the transition state.

Spilling factor

You can confirm the stability of your MLFF by taking a look at the spilling factor:

grep '^SFF[[:space:]]*[0-9]' ML_LOGFILE | awk '{print $2 " " $3}' > sff.dat

Then plot sff.dat using Python:

import py4vasp
import numpy as np

step, spilling_factor = np.loadtxt("./e05_new_MD/sff.dat", unpack=True)

py4vasp.plot(step, spilling_factor, xlabel="MD step", ylabel="Spilling factor", label="Spilling factor")

Plotting the biased potential

As a final step, you can visualize the biased potential that you have generated. First, take the HILLSPOT file and, using the gaussians.py script, generate a file containing the sum of the gaussians used for the biased potential gaussian_sum_output.txt, removing the first 9 lines that contain the initial biased potential that restrains the reaction in place:

./gaussians.py HILLSPOT 9

Then, plot the biased potential with the contourplot.py script:

./contourplot.py

This will generate the following interactive plot:

File:gaussian sum.png

You can set a cutoff for the z-axis, e.g., 0.05, to get a better look at the transition state region:

./contourplot.py 0.05

See how much more densely populated the transition state region is than from the previous MD:

File:gaussian sum TS compare.png

We generated these scripts using an AI, so we recommend trying to do this yourself. Generating these scripts took around 5 minutes with only a few prompts.

Practical hints

Make sure to set the mass of hydrogen to above 1, e.g., POMASS = 12.011 4.0 35.453, rather than POMASS = 12.011 1.0 35.453. This enables larger time steps for the MD simulation.
We found that reducing the ANDERSEN_PROB to 0.05 from 0.1 (see previous exercise).
It is crucial to reduce the hill height (HILLS_H) by an order of magnitude or more (e.g., 0.0005 instead of 0.005). This avoids being driven out of the reliably fitted energy surface. It keeps the MLFF in the trained region for longer, stabilising the spilling factor.
For long MLFF runs, it may be worth increasing HILLS_BIN to 100 in order to stay within the trained region for longer. This is only important for longer metadynamics simulations. You should also increase ML_ESTBLOCK = 100 and ML_OUTBLOCK = 100 in this case.
Once you have a stable ML_FF, to improve the stability for longer MLFF runs, you should collect the positions for structures that are outside of the reliably fitted energy surface. These are those that have larger spilling factors. While continuing training (see step 4 above), ML_NMDINT = 1 should be used and the Bayesian threshold set to the final value from the previous training runs ML_ICRITERIA = 0.