Part 3: Substitution reaction of chloromethane by a chloride ion¶

Content¶

8 Slow-growth simulation of the free-energy profile
9 Probability distribution of the reactant state
10 Average velocity at the transition state

8 Slow-growth simulation of the free-energy profile
$\uparrow$
¶

By the end of this tutorial, you will be able to:

use thermodynamic integration to get a free-energy profile
perform a slow-growth simulation and judge its quality
interpret the content of the ICONST file
extract some information from the REPORT file

8.1 Task¶

Compute the free energy as a function of reaction coordinate for the nucleophilic substitution of chloromethane by a chloride ion using the slow-growth approach and thermodynamic integration.

By computing the free-energy profile of a chemical reaction, you can obtain the activation free energy of a reaction. That is, the work needed to transition from a stable reactant state to the transition state. One way to get the free-energy profile is the so-called thermodynamic integration, where you integrate the free-energy gradients along the reaction path. That path connects the initial and final thermodynamic states occurring in the chemical reaction. The so-called reaction coordinate is a one-dimensional representation of the reaction path.

A slow-growth simulation using ab-initio molecular dynamics (MD) captures transformations, the system might undergo during the reaction, for instance activating soft degrees of freedom like rotations. Performing the transformation slowly aims to make the process reversible. That is, the work being done on the system equals free energy, as can be shown within thermodynamics.

In contrast, in a fast-switching simulation, the free energy is computed using pure statics by performing the transformation many, many times quickly. It yields a set of values for irreversible work that is being done on the system. This set of values can be converted into free energy via the Jarzinsky identity. Yet another approach is based on the analysis of the local topology of the potential-energy surface (PES) around stationary points. This static approach only requires relaxation and vibration analysis of few stationary points on the PES and is therefore computationally inexpensive. Nevertheless, this approach cannot account for soft modes that might be activated during the transition and, hence, it merely gives a qualitative estimate of the reaction path.

No description has been provided for this image

Above you can see the chemical reaction that you will simulate in this example. It is an S$_\mathrm{N}$2 reaction. In particular, chloromethane, i.e., CH$_3$Cl, and Cl$^-$ collide. Then, chloromethane, being a nucleophile, binds Cl$^-$ and forms a discrete, detectable intermediate state. Finally, the substitution reaction is completed, when the system releases the Cl$^-$ that was bound before. In this case the S$_\mathrm{N}$2 reaction is symmetric as one Cl$^-$ attacks and another Cl$^-$ leaves, but in general another functional group, i.e., part of the molecule, could be released. Therefore the free-energy profile of this reaction is symmetric.

8.2 Input¶

The input files to run this example are prepared at $TUTORIALS/MD/e08_slow-growth-SN2.

POSCAR.init

CH3Cl and Cl anion                               
   1.00000000000000     
    12.0000000000000000    0.0000000000000000    0.0000000000000000
     0.0000000000000000   12.0000000000000000    0.0000000000000000
     0.0000000000000000    0.0000000000000000   12.0000000000000000
     C    H    Cl
     1    3    2
Direct
  0.5514939507826783  0.6445322822677826  0.6618499542507703
  0.5955920236688621  0.6360489324678542  0.7468089748516230
  0.5000590382923739  0.5725532184521409  0.6443833645770085
  0.5061562091176164  0.7207841961851618  0.6499733084904860
  0.6457080633995789  0.6469765007123230  0.5426978869294665
  0.5382007268289341  0.4048714890598017  0.8450701690068790

INCAR.refit

SYSTEM   = CH3Cl and Cl anion
KSPACING = 1000                 ! Gamma point only

! machine learning
ML_LMLFF  = T       ! use machine-learned force fields
ML_MODE   = REFIT   ! read ML_AB and create ML_FFN

INCAR.init

SYSTEM   = CH3Cl and Cl anion
ISYM     = 0                    ! no symmetry imposed
POMASS   = 12.011 3.000 35.453  ! ionic masses
KSPACING = 1000                 ! Gamma point only

! MD
IBRION = 0        ! MD (treat ionic degrees of freedom)
NSW    = 1500     ! no of ionic steps
POTIM  = 2        ! MD time step in fs

MDALGO = 1             ! Andersen thermostat
ANDERSEN_PROB = 0.05   ! collision probability

TEBEG  = 300      ! temperature at beginning
ISIF   = 2        ! update positions; cell shape and volume fixed

LBLUEOUT = T      ! write free-energy gradient to REPORT file 
INCREM   = 8e-4   ! transformation velocity (slow growth)

! machine learning
ML_LMLFF  = T     ! use machine-learned force fields
ML_MODE   = RUN   ! read ML_AB and ML_FF, no learning

POTCAR

Pseudopotentials of C, H and Cl.

ICONST

R 1 5 0
R 1 6 0
S 1. -1.  0

ML_AB

Collection of ab-initio data from previous calculations for CH3Cl and Cl anion: Lattice vectors, atomic positions, energies, forces, and stress tensors.

Check the tags set in the INCAR files! INCAR.refit contains two tags to controll the MLFF method. ML_LMLFF=T switches on its use and ML_MODE=refit allows generating a force field based on an existing ML_AB file.

The SYSTEM tag takes any string and serves just for your own reference. ISYM generally switches off the use of any symmetry and you always need to set it for MD calculations. The POMASS tag specifies the ionic mass and also gets set by the POTCAR file. To overwrite the values set by the POTCAR file, you can specify POMASS in the INCAR file in the same order. The reason to do this here, is to use the hydrogen isotope tritium in the simulation that is heavier than hydrogen. This causes the oscillation of hydrogen bonds to be slower and allows to use a larger time step defined using POTIM, and still capture the same physics. Does this change the reaction rate in an MD simulation using force fields? How about in an ab-initio MD simulation? In other words, will you see an isotope effect?

Click to see the answer!

No, the isotope effect, that changes the reaction rate in real experiments, is a quantum mechanical effect of the ionic degree of freedom. To capture the underlying mechanism, you have to treat the ions quantum mechanically. In principal, the machine-learned force fields are trained to reproduce ab-initio MD, so both methods capture the same effects. That is, the answer could not be yes for one and no for the other, unless your force fields are insufficiently trained. Recall that within ab-initio MD the ions are treated classically, while the forces acting on each ion are computed by treating the electrons quantum mechanically. Finally, in classical mechanics, the mass cancels in the equations to compute thermodynamical averages, so that you cannot observe the isotope effect in MD simulations.

Another tag in the INCAR file is the KSPACING tag, that specifies the density of $\mathbf{k}$ points. We set it to a ridiculously large number, so that only the Γ point will be initialized. Alternatively, you could also setup a KPOINTS file that specifies a single $\mathbf{k}$ point.

IBRION, NSW and POTIM control how the ionic positions are updated. MDALGO, ANDERSEN_PROB, TEBEG and ISIF together define the thermodynamic ensemble. What ensemble does the setting correspond to? How does the Andersen thermostat include the temperature in the simulation?

Click to see the answer!

The canonical ensemble, i.e., the NVT ensemble.
The Andersen thermostat employs stochastic collisions to mimic the action of a heat bath.

The reaction coordinate is defined as a constraint specified in the ICONST file. Read about the ICONST file on the VASP Wiki and answer the following question! What reaction coordinate is chosen for this calculation?

Click to see the answer!

Atom 1 is C and Atom 5 and 6 are the Cl$^-$ ions. The ICONST file defines a complex reaction coordinate, that is the difference between the two bond lengths of the two C-Cl bonds. That is, if the distance between C and Cl within chloromethane is 1.8 Å, and the distance between C and the remote Cl$^-$ ion is 3 Å, then the value of the reaction coordinate is 1.8 Å - 3 Å = -1.2 Å. At the transition state, both Cl$^-$ ions are bound at equal distance, so that the reaction coordinate is zero.

In practice, finding a suitable reaction coordinate is a major challenge when facing a new reaction process.

Additional to the constraints specified in the ICONST file, the INCREM tag constrains the MD simulation to evolve along the reaction path. It specifies the transformation velocity in units per step, i.e., POTIM, not per time.

LBLUEOUT switches on the computation of the free energy gradients and writes to the REPORT file.

The POSCAR file defines the initial ionic positions and the POTCAR file contains details about the atomic species.

8.3 Calculation¶

Open a terminal, navigate to this example's directory and refit the ab-initio energies, forces and stress stored in the ML_AB file to generate a force field by entering the following:

cd $TUTORIALS/MD/e08_*
cp INCAR.refit INCAR
cp POSCAR.init POSCAR
mpirun -np 4 vasp_gam

Copy the generated force field from the output ML_FFN file to an input ML_FF file. Then, start the actual MD simulation by executing the following:

cp ML_FFN ML_FF
bash run.sh

Click here to reveal the run.sh script!

#!/bin/bash

# ensure that this sequence of MD runs is reproducible
cp POSCAR.init POSCAR
cp INCAR.init INCAR
rseed="RANDOM_SEED =         311137787                0                0"
echo $rseed >> INCAR

backupfiles="REPORT vaspout.h5"

for i in 1 2
do

  # start vasp
  mpirun -np 3 vasp_gam

  # ensure that this sequence of MD runs is reproducible
  rseed=$(grep RANDOM_SEED REPORT |tail -1)
  cp INCAR.init INCAR
  echo $rseed >> INCAR

  # use the last configuration generated in the previous
  # run as initial configuration for the next run
  cp CONTCAR POSCAR

  # backup some important files
  mkdir run$i
  for f in $backupfiles
  do
      cp $f run$i/.
  done
done

Note how we take the final structure from the old run (CONTCAR) and take it as the initial structure for the new run (POSCAR):

cp CONTCAR POSCAR

We also use the final RANDOM_SEED from the first run to initialize the atomic velocities of the second run:

rseed=$(grep RANDOM_SEED REPORT |tail -1)
cp INCAR.init INCAR
echo $rseed >> INCAR

It is important to do both of these steps to ensure continuity between the first and second run.

It starts 3000 MD steps. This takes a moment, so let's learn about thermodynamic integration. Once you have evaluated the free-energy gradient $\frac{\partial A}{\partial \xi}$ at values of the reaction coordinates $\xi'$, you can compute the free energy difference $\Delta A_{\mathrm{A}\to\mathrm{B}}$ between any two states, A and B, by integrating along the reaction path: $$ \tag{1} \Delta A_{\mathrm{A}\to\mathrm{B}} = \int_{\xi_{\mathrm{A}}}^{\xi_\mathrm{B}} \left. \frac{\partial A}{\partial \xi} \right|_{\xi'} \mathrm{d} \xi' = \Delta A(\xi_\mathrm{B}) - \Delta A(\xi_{\mathrm{A}}). $$ The main difficulty is to get a sufficiently well sampled free-energy gradient $\frac{\partial A}{\partial \xi}$ at each $\xi'$. In the slow-growth approach, the reaction coordinate $\xi'$ is changed with each time step $\mathrm{d}\tau$, set by POTIM, with transformation velocity $\dot{\xi}'$, set by INCREM. The above integral is rewritten within the slow-growth approach using $$ \tag{2} \mathrm{d} \xi' = \dot{\xi}' \mathrm{d}\tau. $$ If the transformation velocity and time step are not sufficiently small, this approach fails. The quality of your simulation, can be checked by reversing the process. If the free-energy profile, that is the plot of $\Delta A(\xi)$, shows a hysteresis the transformation velocity and time step are not sufficiently small.

In practice, a slow-growth simulation is used only to get a quick look at the free-energy profile. To perform a high quality thermodynamic integration, other methods, such as the blue-moon method that is also implemented in VASP, are better since they allow for fine control of convergence, but they come at the expense of higher computational cost.

After the first 1500 MD steps are done, a directory called run1 appears in this example's directory. Open the run1/REPORT file!

For each MD step, the following information is written:

========================================
         MD step No.       1
========================================

  Atomic velocities initialized by STEP_tb

  >parameters for SHAKE algorithm
   SHAKEMAXITER =      1000
       SHAKETOL =   0.100000E-04
       SHAKESCA =    2.00000000

  SHAKE converged in   14 steps

  >Const_coord
   cc> S   -1.79976  -1.79977 -0.256768E-05

  >Blue_moon
        lambda        |z|^(-1/2)    GkT           |z|^(-1/2)*(lambda+GkT)
   b_m>  0.176383E+01  0.179942E+01  0.375764E-03  0.317455E+01

  >Energies
                    E_tot             E_pot             E_kin               EPS                ES
   e_b>   -0.27730066E+02   -0.27867234E+02    0.17427490E+00    0.00000000E+00
    0.00000000E+00

  >Metadynamics

  >Temperature
               T_sim     T_inst
   tmprt>    300.000    288.909

  >Thermostat, num. of collisions:                0

           RANDOM_SEED =         311137787               78                0

For the post-processing of a slow-growth simulation, you can find $\xi'$ as the first number in the line starting with cc>, and the free-energy gradient $\frac{\partial A}{\partial \xi}$ at $\xi'$ as the first number in the line starting with b_m>! This will allow you to perform the integration in Eq. (1).

You can now get a first look at the MD trajectories using py4vasp:

import py4vasp
my_calc = py4vasp.Calculation.from_path("./e08_slow-growth-SN2/run1") 

my_calc.structure[:].plot()

After additional 1250 MD steps are done, a directory called run2 appears in this example's directory and you have completed the calculation. You can extract the free-energy gradients using the provided script free-energy_gradient.sh by entering the following into the terminal:

cd $TUTORIALS/MD/e08_*
bash free-energy_gradient.sh

This creates a file called free-energy_gradient.dat by executing free-energy_gradient.sh

#!/bin/bash

rm -f free-energy_gradient.dat
touch free-energy_gradient.dat

for i in 1 2
do
  if test -f run$i/REPORT
  then
    grep cc run$i/REPORT  | awk '{print $3}' > xxx
    grep b_m run$i/REPORT | awk '{print $2}' > fff
    paste xxx fff >> free-energy_gradient.dat
  fi
done

sort -n free-energy_gradient.dat > free-energy_gradient_.dat
mv free-energy_gradient_.dat free-energy_gradient.dat
rm xxx fff

Perform the thermodynamic integration and plot the free energy as a function of reaction coordinate using the code below!

import numpy as np
from scipy import integrate
from py4vasp import plot

filename = "./e08_slow-growth-SN2/free-energy_gradient.dat"

# read the free-energy gradients and reaction coordinate from file
xi, dA = np.loadtxt(filename, unpack=True)

# integrate the free-energy gradients over the reaction coordinate as in Eq (1)
DA = integrate.cumulative_trapezoid(dA, xi, initial=0)

# plot the free energy as a function of reaction coordinate
plot(xi, DA, xlabel="Reaction coordinate (Angstrom)", ylabel="Free energy (eV)")

In the free-energy plot, that you have generated, where is the reactant state? And where is the product state? What value of the reaction coordinate corresponds to the transition state? Why is the free energy almost symmetric with respect to the reaction coordinate?

Click to see the answer!

For $\xi<0$, the reactant state is simulated. For $\xi>0$, the product state is simulated. And $\xi=0$ corresponds to the transition state.

The free-energy profile is symmetric, because the reaction is symmetric. Here, deviation from perfect symmetry point towards too large values of the time step set by POTIM, or the transformation velocity set by INCREM.

Find representative configurations for each state in the MD trajectory! What are the characteristic features for each state? Hint: You can get the distance between two atoms by right-clicking the first once, and the second twice.

import py4vasp
my_calc = py4vasp.Calculation.from_path("./e08_slow-growth-SN2") 

my_calc.structure[:].plot()

Click to see the answer!

Check step 0 for the reactant state, approximately step 750 for the transition state and step 1499 for the product state. You can enter the step count by clicking on the number and editing the text.

8.4 Questions¶

Which file specifies the constraints for constrained molecular dynamics?
Along which path does INCREM act?
Are free-energy profiles always symmetric?
What is the slow growth simulation approach typically used for? When does it fail?

9 Probability distribution of the reactant state
$\uparrow$
¶

By the end of this tutorial, you will be able to:

compute the probability density of the reaction coordinate
monitor a geometric parameter

9.1 Task¶

Perform a free MD simulation in the reactant state of chloromethane and a chloride ion, and compute the probability density of the reaction coordinate.

If the reaction between chloromethane and a chloride ion is not activated, the transition does not occur spontaneously. This can be confirmed by computing the probability density of the reaction coordinate: $$\tag{1} P(\xi') = \left\langle \delta( \xi' - \xi(\mathbf{R})) \right\rangle_{\mathrm{R}}, \qquad \xi'<0 $$ starting within the ensemble of the reactant.

9.2 Input¶

The input files to run this example are prepared at $TUTORIALS/MD/e09_free-MD-SN2, except for the ML_FF file. Create a link to the force field generated in the previous exercise, instead of copying it, to save disc memory by entering the following into the terminal:

cd $TUTORIALS/MD/e09_*
ln -s $TUTORIALS/MD/e08_*/ML_FF .

POSCAR.init

CH3Cl and Cl anion
   1.00000000000000     
    12.0000000000000000    0.0000000000000000    0.0000000000000000
     0.0000000000000000   12.0000000000000000    0.0000000000000000
     0.0000000000000000    0.0000000000000000   12.0000000000000000
   C    H    Cl
     1     3     2
Direct
  0.5514939507826783  0.6445322822677826  0.6618499542507703
  0.5955920236688621  0.6360489324678542  0.7468089748516230
  0.5000590382923739  0.5725532184521409  0.6443833645770085
  0.5061562091176164  0.7207841961851618  0.6499733084904860
  0.6457080633995789  0.6469765007123230  0.5426978869294665
  0.5382007268289341  0.4048714890598017  0.8450701690068790

INCAR.init

SYSTEM   = CH3Cl and Cl anion
ISYM     = 0                    ! no symmetry imposed
POMASS   = 12.011 3.000 35.453  ! ionic masses
KSPACING = 1000                 ! Gamma point only

! MD
IBRION = 0        ! MD (treat ionic degrees of freedom)
NSW    = 5000     ! no of ionic steps
POTIM  = 2        ! MD time step in fs

MDALGO = 1             ! Andersen thermostat
ANDERSEN_PROB = 0.05   ! collision probability

TEBEG  = 300      ! temperature at beginning
ISIF   = 2        ! update positions; cell shape and volume fixed

! machine learning
ML_LMLFF  = T  ! use machine learned force fields
ML_MODE   = run  ! read ML_AB and ML_FF, no learning

POTCAR

Pseudopotentials of C, H and Cl.

ICONST

R 1 5 0
R 1 6 0
S 1. -1.  7

ML_FF

Force field parameters.

Recall the meaning of the tags in the INCAR file! Compared to Example 8, LBLUEOUT and INCREM are not set. That is the free energy gradients are not written to the REPORT file and the MD calculation is not forced along the reaction path. Instead this is an unconstrained, or free, MD calculation with machine-learned force fields.

In the ICONST file the STATUS tag of the complex coordinate in the last line is 7. What does this imply? Find the answer on the VASP Wiki!

9.3 Calculation¶

Open a terminal, navigate to this example's directory and run VASP by entering the following:

cd $TUTORIALS/MD/e09_*
bash run.sh

Click here to reveal the run.sh script!

#!/bin/bash

# ensure that this sequence of MD runs is reproducible
cp POSCAR.init POSCAR
cp INCAR.init INCAR
rseed="RANDOM_SEED =         311137787                0                0"
echo $rseed >> INCAR

backupfiles="REPORT vaspout.h5"

for i in 1 2
do

  # start vasp
  mpirun -np 3 vasp_gam

  # ensure that this sequence of MD runs is reproducible
  rseed=$(grep RANDOM_SEED REPORT |tail -1)
  cp INCAR.init INCAR
  echo $rseed >> INCAR

  # use the last configuration generated in the previous
  # run as initial configuration for the next run
  cp CONTCAR POSCAR

  # backup some important files
  mkdir run$i
  for f in $backupfiles
  do
      cp $f run$i/.
  done
done

This starts 10000 MD steps! As this will take some time, so let's learn about how to compute the reaction rate constant and the activation energy measured experimentally!

The reaction rate constant of a chemical reaction from reactant state R to product state P, that is measured in an experiment, is described by Eyring's Equation: $$ \tag{2} k_{\mathrm{R}\to \mathrm{P}} = \frac{k_{\mathrm{B}}T}{h} \exp\left( - \frac{\Delta A^\ddagger}{k_{\mathrm{B}}T} \right). $$ Here, $\Delta A^\ddagger$ is the phenomenological activation energy, which can experimentally be accessed via the rate constant vs. T measurements (Arrhenius plot), $h$ is the Planck constant, $T$ is the temperature and $k_{\mathrm{B}}$ is the Boltzmann constant. The rate constant can also be written down in terms of quantities available by MD simulations: $$\tag{3} k_{\mathrm{R}\to \mathrm{P}} = \frac{\langle|\dot{\xi}^*|\rangle}{2} P(\xi_{\mathrm{ref},\mathrm{R}}) \exp\left( - \frac{\Delta A_{\xi_{\mathrm{ref},\mathrm{R}\to \xi^*}}}{k_{\mathrm{B}}T} \right), $$ where $\Delta A_{\xi_{\mathrm{ref},\mathrm{R}\to \xi^*}}$ is the free-energy difference between a reference state in the R state at $\xi_{\mathrm{ref},\mathrm{R}}$ and the transition state at $\xi^*$. $\Delta A_{\xi_{\mathrm{ref},\mathrm{R}\to \xi^*}}$ can be directly obtained from the free-energy profile you have computed in Example 8!

The choice of the reference state is arbitrary as long as you are in the R state. The reaction rate in Eq (2), is independent of the choice of the reference state. That is, because it is also directly proportional to the probability to find the system in that reference state. Lastly, the reaction rate depends on the thermodynamical average of the absolute value of the generalized velocity at the transition state $\langle|\dot{\xi}^*|\rangle$, which you will compute in Example 10!

Using Eq. (2) and Eq. (3), you can find a relation between the phenomenological activation energy measured in experiment and the activation energy obtained from MD simulations: $$\tag{4} \Delta A^\ddagger = \Delta A_{\xi_{\mathrm{ref,R}} \to \xi} - k_{\mathrm{B}} T \ln \left( \frac{h}{k_{\mathrm{B}} T} \frac{\langle | \dot{\xi^*}(0)| \rangle}{2} P(\xi_{\mathrm{ref,R}}) \right). $$ It is a common mistake to forget about the second term on the right side of Eq. (4), when comparing experimental and theoretical results!

After the first 5000 MD steps are done, a directory called run1 appears in this example's directory. Check what is written to the REPORT file in this simulation! Could you extract the free-energy gradient?

Click to see the answer!

No, because LBLUEOUT is not set. Also note that the line corresponding to the monitored reaction coordinate corresponding in the REPORT file starts with mc> for monitored coordinate, in contrast to cc> in Example 8 where the coordinate was constrained.

========================================
         MD step No.       1
========================================

  Atomic velocities initialized by STEP_tb

  >Monit_coord
   mc> S         -1.80056

  >Energies
                    E_tot             E_pot             E_kin               EPS                ES
   e_b>   -0.27679404E+02   -0.27867234E+02    0.18782946E+00    0.00000000E+00
    0.00000000E+00

  >Metadynamics

  >Temperature
               T_sim     T_inst
   tmprt>    300.000    290.621

  >Thermostat, num. of collisions:                0

           RANDOM_SEED =         311137787               78                0

You can visualize the MD trajectories using py4vasp!

import py4vasp
my_calc = py4vasp.Calculation.from_path("./e09_free-MD-SN2/run1") 

my_calc.structure[:].plot()

After the calculation has completed, a second directory run2 appears in this example's directory. Change to this example's directory and grep the reaction coordinate from the REPORT files of both runs and write them to reaction_coordinate.dat!

Click to see the answer!

cd $TUTORIALS/MD/e09_*
    grep mc run1/REPORT | awk '{print $3}' > reaction_coordinate.dat
    grep mc run2/REPORT | awk '{print $3}' >> reaction_coordinate.dat

Note that the corresponding line in the REPORT file starts with mc> for monitored coordinate, in contrast to cc> in Example 8 where the coordinate was constrained.

import numpy as np
import plotly.express as px
import pandas as pd

filename = "./e09_free-MD-SN2/reaction_coordinate.dat"

# read the reaction coordinate xi from file
xi = np.loadtxt( filename )

# plot the probability density of xi as a histogram
df = pd.DataFrame({"reaction coordinate (Angstrom)" : xi})
px.histogram(df, x="reaction coordinate (Angstrom)", nbins=200, histnorm="probability density")

import numpy as np
from py4vasp import plot

filename = "./e09_free-MD-SN2/reaction_coordinate.dat"

# read the reaction coordinate xi from file
xi = np.loadtxt( filename )

# plot the probability density of xi as a histogram
rate, bins = np.histogram(xi, bins=200, density=True)
centers = 0.5 * (bins[1:] + bins[:-1])
plot(centers, rate, xlabel="reaction coordinate (Angstrom)", ylabel="probability density")

Does the reaction occur spontaneously in this simulation? What if you continued the simulation? What is the value of $P(\xi')$ at $\xi'=-1.5$?

Click to see the answer!

No, the reaction does not occur spontaneously. This is evident from the vanishing probability density in the product's region $\xi' > 0.1Å$. Since the reaction is an activated process with an free-energy barrier much higher than $k_BT$, it is very unlikely to observe even a single transition event in a straightforward MD simulation of affordable length. This is why reactions and other activated processes are called rare events. Considering the interval $[-1.5Å,-1.49Å]$, $P(\xi')\approx1.54$.

9.4 Questions¶

What does STATUS=7 in the ICONST file set?
What is an Arrhenius plot?

10 Average velocity at the transition state
$\uparrow$
¶

By the end of this tutorial, you will be able to:

perform a convergence study over number of molecular dynamics steps
compute the reaction rate and phenomenological activation energy
run a constrained molecular dynamics simulation

10.1 Task¶

Compute the average generalized velocity at the transition state, where chloromethane binds a chloride ion, and compute the reaction rate and phenomenological activation energy.

Recall that the reaction rate constant can be written as $$\tag{1} k_{\mathrm{R}\to \mathrm{P}} = \frac{\langle|\dot{\xi}^*|\rangle}{2} P(\xi_{\mathrm{ref},\mathrm{R}}) \exp\left( - \frac{\Delta A_{\xi_{\mathrm{ref},\mathrm{R}\to \xi^*}}}{k_{\mathrm{B}}T} \right). $$ And consequently the relation between the phenomenological free energy of activation $\Delta A^\ddagger$ and the reversible work $\Delta A_{\xi_{\mathrm{ref,R}} \to \xi^*}$ is given by $$\tag{2} \Delta A^\ddagger = \Delta A_{\xi_{\mathrm{ref,R}} \to \xi^*} - k_{\mathrm{B}} T \ln \left( \frac{h}{k_{\mathrm{B}} T} \frac{\langle | \dot{\xi}^*| \rangle}{2} P(\xi_{\mathrm{ref,R}}) \right). $$ You have computed all but one quantity to obtain $k_{\mathrm{R}\to \mathrm{P}}$ and $\Delta A^\ddagger$ from your MD simulations in Example 8 and Example 9. The missing quantity is the thermodynamical average of the generalized velocity at the transition state, which is defined as $$\tag{3} \dot{\xi} = \sum_{i=1}^N \sum_{\mu={x,y,z}} \frac{\partial \xi}{\partial R_{i,\mu}} {\dot R}_{i,\mu}, $$ where $R_{i,\mu}$ are the Cartesian coordinates of atom $i$.

10.2 Input¶

The input files to run this example are prepared at $TUTORIALS/MD/e10_reaction-rate-SN2, except for the ML_FF file. The machine learning file is quite big, so you can get it linked by typing the following into the terminal:

cd $TUTORIALS/MD/e10_*
ln -s $TUTORIALS/MD/e08_*/ML_FF .

POSCAR.init

transition state
1.0
12.00000000 0.00000000 0.00000000
0.00000000 12.00000000 0.00000000
0.00000000 0.00000000 12.00000000
C H Cl
1 3 2
direct
0.58713224 0.55301018 0.67971407
0.62447013 0.62816585 0.66438574
0.61071465 0.47956073 0.64324619
0.51527572 0.55388286 0.73866022
0.44165343 0.58799096 0.55554878
0.71833270 0.52000018 0.81936197

INCAR.init

SYSTEM   = CH3Cl and Cl anion
ISYM     = 0                    ! no symmetry imposed
POMASS   = 12.011 3.000 35.453  ! ionic masses
KSPACING = 1000                 ! Gamma point only

! MD
IBRION = 0        ! MD (treat ionic degrees of freedom)
NSW    = 600      ! no of ionic steps
POTIM  = 2        ! MD time step in fs

MDALGO = 1             ! Andersen thermostat
ANDERSEN_PROB = 0.05   ! collision probability

TEBEG  = 300      ! temperature at beginning
ISIF   = 2        ! update positions; cell shape and volume fixed

LBLUEOUT = T      ! write free-energy gradient to REPORT file 

! machine learning
ML_LMLFF  = T    ! use machine-learned force fields
ML_MODE   = RUN  ! read ML_AB and ML_FF, no learning

POTCAR

Pseudopotentials of C, H and Cl.

ICONST

R 1 5 0
R 1 6 0
S 1. -1.  0

ML_FF

Force field parameters.

The POSCAR file defines a structure that is exactly on the transition state.

The INCAR tags are the same as in Example 9, but INCREM is not set. Combined with the settings of the ICONST file, this is a constrained MD simulation at the transition state.

What is the reaction coordinate? What does POMASS do? What ensemble is defined? Why? Are Kohn-Sham orbitals computed during the simulation?

10.3 Calculation¶

Open a terminal, navigate to this example's directory and run VASP by entering the following:

cd $TUTORIALS/MD/e10_*
bash run.sh

Click here to reveal the run.sh script!

#!/bin/bash

# ensure that this sequence of MD runs is reproducible
cp POSCAR.init POSCAR
cp INCAR.init INCAR
rseed="RANDOM_SEED =         311137787                0                0"
echo $rseed >> INCAR

backupfiles="REPORT vaspout.h5"

for i in 1 2 3 4 5 6 7 8 9 10
do

  # start vasp
  mpirun -np 3 vasp_gam

  # ensure that this sequence of MD runs is reproducible
  rseed=$(grep RANDOM_SEED REPORT |tail -1)
  cp INCAR.init INCAR
  echo $rseed >> INCAR

  # use the last configuration generated in the previous
  # run as initial configuration for the next run
  cp CONTCAR POSCAR

  # backup some important files
  mkdir run$i
  for f in $backupfiles
  do
      cp $f run$i/.
  done
done

This will run 6000 MD steps. Meanwhile, revisit the data of your calculation in Example 8. Extract the value of $\Delta A_{\xi_{\mathrm{ref,R}} \to \xi^*}$! For instance, you might choose $\xi_{\mathrm{ref,R}}=-1.5$Å and use the following code to do that.

import numpy as np
from scipy import integrate

filename = "./e08_slow-growth-SN2/free-energy_gradient.dat"

# read the free-energy gradients and reaction coordinate from file
xi, dA = np.loadtxt(filename, unpack=True)

# integrate the free-energy gradients over the reaction coordinate as in Eq (1)
DA = integrate.cumulative_trapezoid(dA, xi, initial=0)

# this part is new
# specify the reference reaction coordinate
xi_reference = -1.5

# take the mean of all values of the free energy very close to the transition state at xi=0
DA_transition = np.mean([DA[i] for i in np.where(np.abs(xi) < 1e-3 )])
# same for the reference state 
DA_reference = np.mean([DA[i] for i in np.where(np.abs(xi-xi_reference) < 1e-3 )])

print("Work needed to get from the reference state (at -1.5 Angstrom) to the transition state: ")
print(DA_transition - DA_reference, "eV")

Work needed to get from the reference state (at -1.5 Angstrom) to the transition state: 
0.40579890836676114 eV

Also revisit Example 9 and get $P(\xi_{\mathrm{ref,R}})$!

Click to see the answer!

Considering the interval $[-1.5Å,-1.49Å]$, $P(\xi')\approx1.54$.

Within the SHAKE algorithm, the generalized velocity is computed as follows: From the inverse mass metric tensor $$\tag{4} Z = \sum_{i=1}^N \frac{1}{m_i} \sum_{\mu=x,y,z} \left( \frac{\partial \xi}{\partial R_{i,\mu}}\right)^2 $$ the generalized velocity reads: $$\tag{5} \langle | \dot{\xi}^*| \rangle = \sqrt{ \frac{ 2k_{ \mathrm{B} } T }{ \pi } } \frac{1}{ \langle Z^{-1/2} \rangle_{\xi^*} }. $$ This is done by executing the following script:

cd $TUTORIALS/MD/e10_*
bash velocity.sh 300 > velocity.dat

Click here to reveal the velocity.sh script!

#!/bin/bash

T=$1
rm -f tmp.tmp

for i in 1 2 3 4 5 6 7 8 9 10
do
  if test -f run$i/REPORT
  then
    grep b_m run$i/REPORT |awk '{print $3}' >> tmp.tmp
    # unbiased absolute value of velocity
    vAbs_u=$(awk 'BEGIN {t="'${T}'";a=0.;b=0.;f=(1.323296617200639e+23*t)^0.5} {a+=$1*$1;b+=$1} END {a/=NR;b/=NR;print 2*f/b}' tmp.tmp )
    echo $i $vAbs_u
  fi
done

rm tmp.tmp

Plot the computed generalized velocity over number of MD steps, in order to judge the convergence of your result!

import numpy as np
from py4vasp import plot

filename = "./e10_reaction-rate-SN2/velocity.dat"

# read the reaction coordinate xi from file
n, v = np.loadtxt(filename, unpack=True)

# plot the free-energy profile
plot(n*600, v, xlabel="Number of MD steps", ylabel="Generalized velocity")

# final value of the generalized velocity
print(v[-1])

7842540000000.0

For the purpose of this tutorial, the convergence is good enough. In your own calculations, you might want to closely monitor the convergence of all computed quantities with respect to the number of MD steps, the size of the MD step in fs and other parameters that can affect the quality of your result! For instance, you can use the Mann-Kendall trend test to ensure your data does not have a trend, i.e., the average does not drift in time.

The code below computes the reaction rate constant $k_{\mathrm{R}\to \mathrm{P}}$ defined in Eq. (1) and the free energy of activation $\Delta A^\ddagger$ defined in Eq. (2)

import numpy as np

# generalized velocity divided by 2
vhalf = 0.5* v[-1]

# probability density of the reference state
P = 1.54

# free energy difference between the reference state and the transition state (KJ/mol)
DA = DA_transition - DA_reference

# Boltzmann constant (eV/K)
kB = 8.617333262145e-5

# Boltzmann constant x temperature
kBT = kB*300

# Equation (1)
k = vhalf*P*np.exp(-DA/kBT)

# Planck constant (eVs)
h = 4.135667696e-15

# Equation (2)
DA_expt = DA - kBT*np.log(h/kBT*vhalf*P)

print("Reaction rate constant: ", k)
print("Free energy of activation: ", DA_expt)

Reaction rate constant:  920080.544706275
Free energy of activation:  0.40669186674065694

These values can be compared to values obtained by ab-initio MD simulations for this system found in literature. The experimental value of the activation energy is predicted to be 0.4182 eV by ab-initio MD using the Blue moon method. Note that small deviations in $\Delta A^\ddagger$ have a large effect on $k_{\mathrm{R}\to \mathrm{P}}$, so this is not usually compared.

10.4 Questions¶

How can the generalized velocity be extracted from VASP output?
How many molecular dynamics steps are necessary to get a good ensemble average?
What steps are necessary to estimate the experimental activation energy using a molecular dynamics simulation?

Impressive! You have finished Part 3!¶

Go to Top $\uparrow$

Part 3: Substitution reaction of chloromethane by a chloride ion¶

Content¶

8 Slow-growth simulation of the free-energy profile $\uparrow$¶

8.1 Task¶

8.2 Input¶

8.3 Calculation¶

8.4 Questions¶

9 Probability distribution of the reactant state $\uparrow$¶

9.1 Task¶

9.2 Input¶

9.3 Calculation¶

9.4 Questions¶

10 Average velocity at the transition state $\uparrow$¶

10.1 Task¶

10.2 Input¶

10.3 Calculation¶

10.4 Questions¶

Impressive! You have finished Part 3!¶

8 Slow-growth simulation of the free-energy profile
$\uparrow$
¶

9 Probability distribution of the reactant state
$\uparrow$
¶

10 Average velocity at the transition state
$\uparrow$
¶