I am a PhD student who started simulating aqueous KOH five months ago. My background is in molecular dynamics, mostly with LAMMPS, but I am now moving into QM simulations. This is mostly out of interest in MLFFs, which might provide my field with many useful insights. However, my question concerns a pure AIMD calculation, as the QM should be set up correctly before starting any ML work. To introduce my research first: I am modeling a liquid-phase system of 110 H2O molecules and 1 KOH. My goal is to calculate the transport properties of this mixture. This is difficult with traditional MD, as the OH- reacts with H2O molecules and reactive force fields are typically inaccurate. This reaction is similar to the more famous process that H+ undergoes in water; both are typically called Grotthuss transfer reactions. I want to stress that I am not after the most accurate QM calculation possible. My aim is to have atomic forces adequate for modeling the atomic diffusion and the reactions, while optimizing for speed, as AIMD simulations for transport properties need over 100 000 time steps and computation time is scarce.
Now let's get to my simulation settings. I have (as per the instructions) attached a *.zip directory with my simulation inputs. I am running my simulations with the following POTCARs: PAW_PBE H_h_GW 12Apr2008, PAW_PBE O_GW_new 19Mar2012, and PAW_PBE K_sv_GW 31Mar2010. I selected these because I aim to use two different functionals: the GGA functional RPBE+DFT-D3 (a good reference functional for which there is ample experience with water systems) and the metaGGA functional rVV10-r2SCAN (which the literature reports to be very good at modeling liquid systems). This means the O and K POTCARs must support metaGGA functionals, which these do. I added both INCARs to the *.zip. I set ENCUT to 550 eV after checking how the properties converged for both functionals; see the figure Determining_ENCUT.png. In the same manner, I set the smearing width and the number of k-points (Gamma-centered mesh with a single k-point). I believe I have set up both functionals correctly (each with its respective vdW correction), but please point out any mistake you notice here. I set the time step to 0.5 fs and applied the Nosé-Hoover thermostat. Although I want to model 298 K, I increased the temperature a bit: research indicates that the melting temperature of pure water is somewhat elevated with both RPBE-D3 and rVV10-r2SCAN, and I want to ensure I am modeling a liquid-phase system.
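For concreteness, here is a minimal sketch of the MD-related part of my RPBE+D3 INCAR. ENCUT, the time step, and the thermostat follow the values above; the remaining tags (SMASS, TEBEG, NSW, ALGO) are representative placeholders rather than my exact values, so please check the attached inputs for the real settings:

```
GGA    = RP        ! RPBE
IVDW   = 11        ! DFT-D3 dispersion correction (zero damping)
ENCUT  = 550
PREC   = Normal
ALGO   = Fast      ! placeholder
IBRION = 0         ! molecular dynamics
MDALGO = 2         ! Nose-Hoover thermostat
ISIF   = 2         ! NVT: cell fixed
SMASS  = 0         ! Nose mass (placeholder, tune the coupling)
POTIM  = 0.5       ! 0.5 fs time step
TEBEG  = 330       ! slightly above 298 K (placeholder)
NSW    = 100000
LREAL  = Auto      ! real-space projectors for a large cell
```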
During my simulations, I encountered 3 problems:
- The easiest one is probably the energy drift I see in my systems. With Nosé-Hoover, the pseudo-Hamiltonian (which includes the thermostat energies) should be conserved. However, I find a drift of approximately 5 eV for the entire system over 1.25 ps, and the drift continues for much longer simulation times. I added a figure showing these drifts as Pseudo_Hamiltionian.png. I assume this is caused by not converging the electronic (SCF) loop tightly enough; tightening EDIFF to 1e-5 or 1e-6 could help. To get adequate trajectories, could I get some advice on a good speed/accuracy tradeoff here?
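To make the drift comparable across system sizes, it helps to normalize it per atom and per picosecond; a quick back-of-envelope sketch (the atom count follows from 110 H2O + 1 KOH):

```python
# Normalize the observed pseudo-Hamiltonian drift per atom and per ps.
n_atoms = 110 * 3 + 3          # 110 H2O molecules + 1 KOH = 333 atoms
drift_total_eV = 5.0           # drift observed over the run
sim_time_ps = 1.25             # length of the run

drift_meV_per_atom_ps = drift_total_eV / n_atoms / sim_time_ps * 1000.0
print(f"{drift_meV_per_atom_ps:.1f} meV/atom/ps")  # prints "12.0 meV/atom/ps"
```

Whether roughly 12 meV/atom/ps is tolerable presumably depends on the target property, but it gives a concrete number to discuss.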
- A little trickier are the issues I have with the rVV10-r2SCAN functional: it does not want to converge, and I have tried multiple ALGO and PREC settings. An example output file is attached as r2SCAN.out. The strange part is that my pure H2O system does converge. That is a smaller system (64 H2O molecules), which I used to get my settings right. Somehow, the added species (potassium) and the larger system size really hurt the convergence. Are there tricks to solve this issue?
- The last issue is more specific to VASP usage. I can run my simulations on 3 different clusters: one with 48-core Intel Cascade Lake nodes, one with 128-core AMD Rome nodes, and one with 192-core AMD Genoa nodes (this feels like a luxury). I investigated the scaling of my VASP install using different numbers of cores and compilers. However, I was quite disappointed with the results and want to know whether I did something wrong. I added two figures showing the parallel efficiency and the calculation time for 1 ps as scaling_efficiency.pdf and scaling_time.pdf, respectively. Please do not be thrown off by the logarithmic axes. These scaling graphs were made using only 25 time steps; since the initial time steps are more expensive, I also indicated a calculation with 2500 time steps (1.25 ps). What the graphs show is that the parallel efficiency quickly drops below 30%. I compiled the code per platform, but all builds use at least the following packages: gcc 11, OpenMPI, OpenBLAS, netlib-scalapack, FFTW, and HDF5. Additionally, I set NCORE in the INCAR to values from 2 up to the number of cores per socket. Are further improvements possible, or are gamma-point-only calculations with a single k-point just not that easy to scale up?
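For reference, the parallelization-related tags I am experimenting with look like this (NCORE = 8 is just one of the values I scanned; LPLANE and LREAL are what I believe are the recommended settings for large real-space cells, please correct me if not):

```
NCORE  = 8         ! one of the scanned values (2 ... cores per socket)
LPLANE = .TRUE.    ! plane-wise data distribution for the FFTs
LREAL  = Auto      ! real-space projectors, recommended for large cells
```

(With a single k-point, KPAR offers no additional parallelism, so NCORE is the main knob here as far as I understand.)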
With kind regards,