ML FFN: Difference between revisions

From VASP Wiki
(6 intermediate revisions by 2 users not shown)
{{DISPLAYTITLE:ML_FFN}}
This binary file contains a newly created force field from machine learning runs with the options {{TAG|ML_MODE}}=<code>train</code>, <code>refit</code> or <code>select</code>. It's structure is identical to the {{TAG|ML_FF}} file. To be able to use the new force field in the {{FILE|ML_FFN}} file it has to be simply copied to {{FILE|ML_FF}} and the {{FILE|INCAR}} tag {{TAG|ML_MODE}}=<code>run</code> has to be set (see [[Machine_learning_force_field_calculations:_Basics#Step-by-step_instructions|here]] for the basic workflow).
The {{FILE|ML_FFN}} file is a binary file that contains a force field generated by [[Machine learning force field calculations: Basics|machine-learning runs]] with {{TAG|ML_MODE|train, refit or select}}. The {{FILE|ML_FFN}} and {{FILE|ML_FF}} files have identical structure; {{VASP}} writes its output under the former name and expects its input under the latter. To apply a force field, copy the {{FILE|ML_FFN}} file to {{FILE|ML_FF}} and set {{TAG|ML_MODE|run}} in the {{FILE|INCAR}} file. See [[Machine_learning_force_field_calculations:_Basics#Step-by-step_instructions|machine learning force field calculations: Basics]] for the basic workflow.
{{NB|mind|Available as of {{VASP}} 6.3.0. File header and fast prediction mode available as of {{VASP}} 6.4.0}}
== File header ==


Since VASP 6.4.0, the {{FILE|ML_FFN}} file starts with an ASCII header containing the most important INCAR tags in effect when the force field was generated. In a Linux shell, the header can be extracted with the following command:
  head -n 1 ML_FFN
The output may look like this:
  ML_FF 0.2.1 binary { "date" : "2023-03-16T13:49:44.829", "ML_LFAST" : False, "ML_DESC_TYPE" :  0, "types" : [ "Si" ], "training_structures" : 984, "local_reference_cfgs" : [ 110 ], "descriptors" : [ 142 ], "ML_IALGO_LINREG" : 3, "ML_RCUT1" :  6.0000E+00, "ML_RCUT2" :  6.0000E+00, "ML_W1" :  1.0000E-01, "ML_SION1" :  5.0000E-01, "ML_SION2" :  5.0000E-01, "ML_LMAX2" : 4, "ML_MRB1" : 8, "ML_MRB2" : 8, "ML_IWEIGHT" : 3, "ML_WTOTEN" :  1.0000E+00, "ML_WTIFOR" :  1.0000E+00, "ML_WTSIF" :  1.0000E-10 }
  ML_FF 0.2.1 binary { "date" : "2023-03-16T13:49:44.829", "ML_LFAST" : false, "ML_DESC_TYPE" :  0, "types" : [ "Si" ], "training_structures" : 984, "local_reference_cfgs" : [ 110 ], "descriptors" : [ 142 ], "ML_IALGO_LINREG" : 3, "ML_RCUT1" :  6.0000E+00, "ML_RCUT2" :  6.0000E+00, "ML_W1" :  1.0000E-01, "ML_SION1" :  5.0000E-01, "ML_SION2" :  5.0000E-01, "ML_LMAX2" : 4, "ML_MRB1" : 8, "ML_MRB2" : 8, "ML_IWEIGHT" : 3, "ML_WTOTEN" :  1.0000E+00, "ML_WTIFOR" :  1.0000E+00, "ML_WTSIF" :  1.0000E-10 }
followed by some extra spaces (because the header is always 4096 characters long). After the string <code>ML_FF</code> the header contains the file version number. {{VASP}} machine-learned force field files are versioned since {{VASP}} 6.4.0 to allow for compatibility checks. Typically, {{FILE|ML_FFN}} files are backward compatible, i.e., newer versions of {{VASP}} will be able to read files generated by older {{VASP}} versions. The opposite is generally not true. However, the {{FILE|ML_FFN}} version number does not automatically increase with the {{VASP}} version, so two or more consecutive {{VASP}} versions could create {{FILE|ML_FFN}} files with identical version number. The version number in the header is followed by the word <code>binary</code>. The remainder of the header contains a [https://www.json.org JSON] string with general information about the force field and values of selected {{FILE|INCAR}} tags. For example, the timestamp following the <code>"date"</code> key is also written to the <code>FFOUT</code> lines in {{FILE|ML_LOGFILE}} which helps to identify later on which force field files belong to which log files. Another important piece of information is <code>"ML_LFAST"</code> key which allows you to check whether this force field is ready for fast prediction mode.
followed by some extra spaces, because the header is always 4096 characters long. After the string <code>ML_FF</code>, the header contains the file version number. {{VASP}} machine-learned force field files have been versioned since {{VASP}} 6.4.0. Versioning is necessary because the contents of the generated files can change whenever a new feature related to machine-learned force fields becomes available. Hence, when reading an {{FILE|ML_FF}} file, a {{VASP}} executable uses the file's version number to deduce which features the file may contain, i.e., it performs a compatibility check. Typically, {{FILE|ML_FFN}} files are backward compatible, i.e., newer versions of {{VASP}} can read files generated by older {{VASP}} versions. The opposite is generally not true: if you provide an {{FILE|ML_FF}} file to an older {{VASP}} version, you may receive an error message stating that the file is too new and not supported. Note that the {{FILE|ML_FFN}} version number does not automatically increase with the {{VASP}} version, so two or more consecutive {{VASP}} versions can create {{FILE|ML_FFN}} files with identical version numbers. The following table gives an overview of existing {{FILE|ML_FFN}} file versions:
{| class="wikitable"
|-
! VASP<br />version
! ML_FFN<br />version
! Changes
|-
| 6.3.X
| 0.1.0
| Initial version (no file header)
|-
| 6.4.0
| 0.2.0
| Fast prediction mode, ASCII header
|-
| 6.4.1
| 0.2.1
| Support for {{TAG|ML_DESC_TYPE}}
|-
| 6.4.2
| 0.2.1
|
|-
| 6.4.3
| 0.2.2
| Changes for {{TAG|ML_DESC_TYPE}}
|-
| 6.5.0
| 0.2.4
| Support for [[Best_practices_for_machine-learned_force_fields#Spilling_factor:_error_estimates_during_production_runs|spilling factor]]
|-
| 6.5.1
| 0.2.4
|
|}
The version number in the header is followed by the word <code>binary</code>. The remainder of the header contains a [https://www.json.org JSON] string with general information about the force field and values of selected {{FILE|INCAR}} tags. {{NB|tip|The timestamp following the <code>"date"</code> key is also written to the <code>FFOUT</code> lines in {{FILE|ML_LOGFILE}} which helps to identify later on which force field files belong to which log files.}}
== Fast prediction mode ==
 
As of {{VASP}} 6.4.0, a crucial step for achieving optimal execution speed once a force field is deemed ready for production is to apply the "refit" mode ({{TAG|ML_MODE|refit}}; this is also outlined in the [[Machine_learning_force_field_calculations:_Basics#Step-by-step_instructions|step-by-step instructions]]). As in training (continuation) runs ({{TAG|ML_MODE|train}}) and selection runs ({{TAG|ML_MODE|select}}), a new {{FILE|ML_FFN}} file is generated. In contrast to those modes, however, the force field from refitting is ready for fast prediction mode in prediction-only runs ({{TAG|ML_MODE|run}}). The <code>"ML_LFAST"</code> key-value pair in the JSON part of the file header shows whether fast prediction mode is enabled:
* <code>"ML_LFAST" : true</code>: The {{FILE|ML_FFN}} file is ready for fast prediction mode. No Bayesian error estimates are possible (no <code>BEEF</code> lines in {{FILE|ML_LOGFILE}}).
* <code>"ML_LFAST" : false</code>: Fast prediction mode is not possible. Runs with {{TAG|ML_MODE|run}} are much slower but write Bayesian error estimates to {{FILE|ML_LOGFILE}}.
If fast prediction mode is available, {{VASP}} uses it automatically, because it is highly recommended for production runs. However, if refitting is performed for other reasons (e.g., [[Best_practices_for_machine-learned_force_fields#Retraining_with_re-selection_of_local_reference_configurations|hyper-parameter optimization]]), it may be useful to create a "slow" force field that still allows Bayesian error estimates to be computed. This can be achieved with the alternative refitting mode {{TAG|ML_MODE|refitbayesian}}.
 
== Related tags and articles ==
{{TAG|ML_LMLFF}}, {{TAG|ML_MODE}}, {{TAG|ML_LFAST}}, {{TAG|ML_ESTBLOCK}}, {{FILE|ML_LOGFILE}}
----


[[Category:Files]][[Category:Machine-learned force fields]][[Category:Output files]]

Latest revision as of 07:57, 24 October 2025

The ML_FFN file is a binary file that contains a force field generated by machine-learning runs with ML_MODE = train, refit or select. The ML_FFN and ML_FF files have identical structure; VASP writes its output under the former name and expects its input under the latter. To apply a force field, copy the ML_FFN file to ML_FF and set ML_MODE = run in the INCAR file. See machine learning force field calculations: Basics for the basic workflow.

Mind: Available as of VASP 6.3.0. File header and fast prediction mode available as of VASP 6.4.0
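
The copy-and-switch step described above can be sketched in Python. This is a minimal sketch, not part of VASP: the helper names set_incar_tag and apply_force_field are hypothetical, and the INCAR handling assumes one tag per line (lines that combine several tags with ';' are not handled).

```python
import re
import shutil
from pathlib import Path


def set_incar_tag(incar_text: str, tag: str, value: str) -> str:
    """Return INCAR text containing `tag = value`.

    An existing line for the tag is replaced, otherwise a new line is appended.
    Assumption: one tag per line (no ';'-separated tags).
    """
    pattern = re.compile(
        rf"^[ \t]*{re.escape(tag)}[ \t]*=.*$", re.IGNORECASE | re.MULTILINE
    )
    new_line = f"{tag} = {value}"
    if pattern.search(incar_text):
        return pattern.sub(lambda _match: new_line, incar_text)
    separator = "" if incar_text.endswith("\n") or not incar_text else "\n"
    return f"{incar_text}{separator}{new_line}\n"


def apply_force_field(workdir: str) -> None:
    """Copy ML_FFN to ML_FF and set ML_MODE = run in the INCAR of `workdir`."""
    run_dir = Path(workdir)
    shutil.copyfile(run_dir / "ML_FFN", run_dir / "ML_FF")
    incar = run_dir / "INCAR"
    text = incar.read_text() if incar.exists() else ""
    incar.write_text(set_incar_tag(text, "ML_MODE", "run"))
```

Calling apply_force_field(".") in a run directory leaves ML_FFN in place, so the original output file remains available for later comparison.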

File header

Since VASP 6.4.0, the ML_FFN file starts with an ASCII header containing the most important INCAR tags in effect when the force field was generated. In a Linux shell, the header can be extracted with the following command:

head -n 1 ML_FFN

The output may look like this:

ML_FF 0.2.1 binary { "date" : "2023-03-16T13:49:44.829", "ML_LFAST" : false, "ML_DESC_TYPE" :   0, "types" : [ "Si" ], "training_structures" : 984, "local_reference_cfgs" : [ 110 ], "descriptors" : [ 142 ], "ML_IALGO_LINREG" : 3, "ML_RCUT1" :  6.0000E+00, "ML_RCUT2" :  6.0000E+00, "ML_W1" :  1.0000E-01, "ML_SION1" :  5.0000E-01, "ML_SION2" :  5.0000E-01, "ML_LMAX2" : 4, "ML_MRB1" : 8, "ML_MRB2" : 8, "ML_IWEIGHT" : 3, "ML_WTOTEN" :  1.0000E+00, "ML_WTIFOR" :  1.0000E+00, "ML_WTSIF" :  1.0000E-10 }

followed by some extra spaces, because the header is always 4096 characters long. After the string ML_FF, the header contains the file version number. VASP machine-learned force field files have been versioned since VASP 6.4.0. Versioning is necessary because the contents of the generated files can change whenever a new feature related to machine-learned force fields becomes available. Hence, when reading an ML_FF file, a VASP executable uses the file's version number to deduce which features the file may contain, i.e., it performs a compatibility check. Typically, ML_FFN files are backward compatible, i.e., newer versions of VASP can read files generated by older VASP versions. The opposite is generally not true: if you provide an ML_FF file to an older VASP version, you may receive an error message stating that the file is too new and not supported. Note that the ML_FFN version number does not automatically increase with the VASP version, so two or more consecutive VASP versions can create ML_FFN files with identical version numbers. The following table gives an overview of existing ML_FFN file versions:

VASP version   ML_FFN version   Changes
6.3.X          0.1.0            Initial version (no file header)
6.4.0          0.2.0            Fast prediction mode, ASCII header
6.4.1          0.2.1            Support for ML_DESC_TYPE
6.4.2          0.2.1
6.4.3          0.2.2            Changes for ML_DESC_TYPE
6.5.0          0.2.4            Support for spilling factor
6.5.1          0.2.4
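
The backward-compatibility rule and the table above can be illustrated with a short Python sketch. It is not VASP's actual check: it simply assumes that a release can read every file version up to the one it writes itself. The names WRITTEN_VERSION, as_tuple and likely_readable are hypothetical, and the headerless 6.3.X files are left out.

```python
# ML_FFN file version written by each VASP release (taken from the table above).
WRITTEN_VERSION = {
    "6.4.0": "0.2.0",
    "6.4.1": "0.2.1",
    "6.4.2": "0.2.1",
    "6.4.3": "0.2.2",
    "6.5.0": "0.2.4",
    "6.5.1": "0.2.4",
}


def as_tuple(version: str) -> tuple:
    """Turn '0.2.1' into (0, 2, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def likely_readable(file_version: str, vasp_release: str) -> bool:
    """Assumption: a release reads files no newer than the version it writes."""
    return as_tuple(file_version) <= as_tuple(WRITTEN_VERSION[vasp_release])
```

Under this assumption, a file with version 0.2.1 would be accepted by VASP 6.5.0, while a file with version 0.2.4 would be rejected by VASP 6.4.2 as too new.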

The version number in the header is followed by the word binary. The remainder of the header contains a JSON string with general information about the force field and values of selected INCAR tags.

Tip: The timestamp following the "date" key is also written to the FFOUT lines in ML_LOGFILE which helps to identify later on which force field files belong to which log files.
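
For scripted workflows, the header layout described in this section (marker ML_FF, version number, the word binary, then a JSON string padded with spaces) can be parsed with a few lines of Python. The following is a minimal sketch under those assumptions; the function name parse_ml_ffn_header is hypothetical, and the sample data is an abbreviated copy of the example header, with the position of the terminating newline assumed.

```python
import json

HEADER_SIZE = 4096  # header length stated in the text above


def parse_ml_ffn_header(raw: bytes):
    """Split the ASCII header into (file_version, metadata dict).

    Returns None if the data does not start with the "ML_FF" marker,
    e.g. for files written by VASP 6.3.X, which have no header.
    """
    # Like `head -n 1`, keep only the first line within the header block.
    head = raw[:HEADER_SIZE].split(b"\n", 1)[0]
    if not head.startswith(b"ML_FF "):
        return None
    text = head.decode("ascii", errors="replace")
    _marker, version, kind, json_part = text.split(None, 3)
    if kind != "binary":
        raise ValueError(f"unexpected header keyword: {kind!r}")
    return version, json.loads(json_part.strip())


# Abbreviated sample header, padded with spaces and followed by fake binary data.
sample = (
    b'ML_FF 0.2.1 binary { "date" : "2023-03-16T13:49:44.829", '
    b'"ML_LFAST" : false, "types" : [ "Si" ], "ML_RCUT1" :  6.0000E+00 }'
).ljust(HEADER_SIZE - 1) + b"\n" + b"\x00\x01 binary payload"

version, meta = parse_ml_ffn_header(sample)
print(version, meta["date"], meta["ML_LFAST"])
```

For a real file, the first 4096 bytes can be read in binary mode (open("ML_FFN", "rb")) and passed to the function; the "date" value can then be matched against the FFOUT lines in ML_LOGFILE.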

Fast prediction mode

As of VASP 6.4.0, a crucial step for achieving optimal execution speed once a force field is deemed ready for production is to apply the "refit" mode (ML_MODE = refit; this is also outlined in the step-by-step instructions). As in training (continuation) runs (ML_MODE = train) and selection runs (ML_MODE = select), a new ML_FFN file is generated. In contrast to those modes, however, the force field from refitting is ready for fast prediction mode in prediction-only runs (ML_MODE = run). The "ML_LFAST" key-value pair in the JSON part of the file header shows whether fast prediction mode is enabled:

  • "ML_LFAST" : true: The ML_FFN file is ready for fast prediction mode. No Bayesian error estimates are possible (no BEEF lines in ML_LOGFILE).
  • "ML_LFAST" : false: Fast prediction mode is not possible. Runs with ML_MODE = run are much slower but write Bayesian error estimates to ML_LOGFILE.

If fast prediction mode is available, VASP uses it automatically, because it is highly recommended for production runs. However, if refitting is performed for other reasons (e.g., hyper-parameter optimization), it may be useful to create a "slow" force field that still allows Bayesian error estimates to be computed. This can be achieved with the alternative refitting mode ML_MODE = refitbayesian.
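
The "ML_LFAST" query described in this section can be scripted as well. The sketch below assumes that the first line of the file (as printed by head -n 1 ML_FFN) is already available as a string; the function name prediction_mode and its return values are hypothetical.

```python
import json


def prediction_mode(header_line: str) -> str:
    """Classify a force field from the first line of its ML_FFN file."""
    start = header_line.find("{")
    if start < 0:
        # No JSON part, e.g. a headerless file from VASP 6.3.X.
        return "unknown"
    meta = json.loads(header_line[start:].strip())
    # "fast": ready for fast prediction mode, no Bayesian error estimates.
    # "slow": Bayesian error estimates available, prediction much slower.
    return "fast" if meta.get("ML_LFAST") is True else "slow"
```

A result of "slow" for a force field intended for production suggests running ML_MODE = refit first; a result of "fast" means that no BEEF lines should be expected in ML_LOGFILE.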

Related tags and articles

ML_LMLFF, ML_MODE, ML_LFAST, ML_ESTBLOCK, ML_LOGFILE