ML tasks
Here we describe the key words controlling the settings of models in input.ml file.
- ml_type (integer)
- 
The type of of ML task. The following options are possible: Regression models ml_type=0Regression using basis functions (e.g. LML, QNML, polynomial chaos, etc.) ml_type=1Kernel ridge regression ml_type=2Reserved only for advanced users Other tasks ml_type=-1Compute descriptors only without any fit ml_type=-2Analyze the data and, if required, make a choice of kernel to use in ml_type=1Default is ml_type=0.
Descriptors only
The following keywords are useful for the tasks with ml_type=-1, where descriptors are computed and written into files without performing regression. Other types of ml_type do not exclude using these options. For the descriptor-specific settings, see the Descriptors section.
- desc_forces (logical)
- 
This key word controlls wheather the descriptors of forces are computed or not. When setting desc_forces=.true., please verify that descriptors of forces are implemented for the descriptor that you choose.Default is desc_forces=.false.
- write_desc (logical)
- 
This key word activates writing descriptors into files. When using ml_type=-1, consider settingwrite_desc=.true.Default is write_desc=.false.
Example: typical commands for the task of computing descriptors (e.g. for structural analysis)
ml_type=-1
write_desc=.true.
desc_forces=.false.
More examples can be found in the Examples section
Regression models
Type of regression
The following options are relevant for the ml_type=0 regression models with basis functions (linear in parameters models) namely:
linear (LML), quandratic (QNML, QML) or polynomial chaos.
The kernel regressions require separate treatment threfore the options for kernel models are presented in separate
section Kernel section.
- mld_order (integer)
- 
Define the type of regression the descriptor space: - mld_order = 1this enable a linear fit (known also as LML) and will gives \(1+D\) parameters
- mld_order = 2this enable a quandratic fit and will gives \(1 + D + D^2\) parameters. In this case there is a supplementary choice about the type of quandratic model by the option- snap_type_quadratic. About the quadratic fit see Goryaeva et al. 2021 for more details.
- mld_order = 3this enable a polynomial chaos type of fitting. Two others parameters should be set:- polyc_n_polyand- polyc_n_hermite. The number of paramters is very large given by this formulae: 1 +- polyc_n_hermite\(\times D\) +- polyc_n_hermite\(\times D^2\) + + … +- polyc_n_hermite\(\times D\) polyc_n_poly
 Default is mld_order = 1.
- mld_type_quadratic (integer)
- 
The type of quadratic fit. For the case 1 the solution is preconditionned by the linear fit i.e. the first 1 + \(D\) are exactelly set to the LML solution and only the remaining \(D^2\) parameters are fitted quadratically. For the case 2 the full quadratic solution is provided, all the parameters are fitted without preconditionning. Shortly speaking mld_type_quadratic=1is for QNML andmld_type_quadratic=2is for QML. See Goryaeva et al. 2021 for more details.Default is mld_type_quadratic = 1.
- polyc_n_poly (integer)
- 
active for polynomial chaos regression i.e. mld_order=3. Is the order of polynomial degree.Default is polyc_n_poly=3.
- polyc_n_hermite (integer)
- 
The maxiumum degree for Hermite polynomials for the polynomial chaos regression. Miladyhandle Hermite basis up to the 4 th order.Default is polyc_n_hermite=2.
Solving algorithm
- mld_fit_type (integer)
- 
The type of algorithm used in order to solve least square (LS) problem \(\mathbf{A} \beta= \mathbf{b}\). We recommend without any hesitation mld_fit_type = 4.\(\mathbf{A}\) is \(M \times D\) matrix, \(M\) being the number of observations and \(D\) the number of parameters (in the case of linear ML the dimension of descriptor + 1), \(\beta\) the parameter matrix \(D \times 1\) and \(\mathbf{b}\) the observations matrix \(M \times 1\). Actually in the MiLaDyimplementation we build \(\mathbf{Amat}\) matrix that has the dimensions \(d \times m\) being in fact \(\mathbf{A}^T\) (with the notation used for this documentation):- mld_fit_type=0: home made solver based on LU / QR decomposition for general matrix. There is no any particular assuption for the size or rank of \(\mathbf{A}\) matrix. All the cases \(M > D\), \(M < D\) and \(M = D\) are treated.
- mld_fit_type=1: solution based on QR decomposition for serial and ScaLapack version. Adapted for full rank matrix \(\mathbf{A}\) and use the assumption that \(\textrm{rank}(A) = \min(M,D)\), in other words, \(A\) has full rank. In serial version, if \(\mathbf{A}\) is not full rank the inversion will stop with a error. Uses a QR or LQ factorization of \(\mathbf{A}\).
- mld_fit_type=2: restricted only for advanced users Solution with constraints. The constraints are of form \(\mathbf{B}x=\mathbf{d}\). The matrix \(\mathbf{B}\) and the vector \(\mathbf{d}\) are filled with all the data (input and target for energy, force or stress) contained in the class fixed by- snap_class_constraints.
- 
mld_fit_type=3: For serial version this is adapted for the general case when we may have \(\textrm{rank}(\mathbf{A}) < \min(M,D)\), in other words, \(\mathbf{A}\) may be rank-deficient, we seek the minimum norm least squares solution \(\beta\) which minimizes both \(\left| \beta \right|^2\) and \(\left| b - A \beta \right|^2\). With this option a rank estimation is possible. The ScaLapack version uses Cholesky decomposition for symmetric and positive definite matrix consequnetly sometimes should be avoided.Warning Avoid this solution for Scalapack version. Is very likely to obtain weird results in the most favorable cases but probably you will have segmetation fault and or NaNas parameters :)).
- mld_fit_type=4: In the general case when we may have \(\textrm{rank}(\mathbf{A}) < \min(M,D)\), in other words, \(\mathbf{A}\) may be rank-deficient, we seek the minimum norm least squares solution \(\beta\) which minimizes both \(\left| \beta \right|^2\) and \(\left| b - A \beta \right|^2\). Is the slowest but is by far mathematically most complete solution based on SVD decomposition. With this option a rank estimation (via SVD and driven by the option- svd_rcond).
 
Default is mld_fit_type=4.
- svd_rcond (real)
- 
The value of the limit from which the singular eigenvalues of the design matrix \((\mathbf{A})\) (or any matrix) are zero. Any eigenvalue lower that this limit is treated as zero. Obviously the value of svd_rcondhas an impact on the \(\textrm{rank}(\mathbf{A})\). If a negative value is choosen thenMiladyfixsvd_rcond=100.d0*epsilon(1.d0)where epsilon is the Lapack machine precision function (around1.d-15in most of the cases).Default is svd_rcond=-1.
- snap_class_constraints (character)
- 
The class that imposes the constraints on fit. Is active only if mld_fit_type=2. All the configuration mentioned in this class will fill the constraints matrix \(\mathbf{B}\) and the target vector \(\mathbf{d}\).Default is "07".
- write_design_matrix (logical)
- 
Dump the design matrix, which contains the descriptors (energy, force and stress if mentioned in db_model.in) and the weigths. The design matrix is writen design in filedesign_matrix.dat. This matrix has the dimensionnumber_of_data\(\times\)dim_desc, where number_of_data are the number of data points (energy, force or stress) used to fit the potential anddim_descis the dimension of descriptor. And additional filedesign_matrix.infois written with information about the database used for trainning.This option is active for normal trainning ( ml_type >= 0).Warning The design matrix is written as the transpose of the matrix used internally by miladyFormat of the design_matrix.dat : - The first 3 lines are comments 
- Then are - number_of_datalines with- dim_desccolumns.
 Format of the design_matrix.info : - The first 5 lines are comments. 
- Then there are - number_of_datalines with- 4columns.
- The first column is a tag that indicates whether the corresponding line comes from energy (1), force (2), or stress (3). 
- The second column gives the target value (energy, force, or stress) used for training. 
- The third column indicates the weight used for that data point in the loss function. 
- The fourth column indicates the source file of that data point. 
 Default write_design_matrix=.false.
Regularization and loss
- mld_regularization_type (integer)
- 
- mld_regularization_type=0: no regularization
- 
mld_regularization_type=1: applies the regularized solution of parameters, \(\mathbf{w}(\lambda_{krr})\) found by the Moore-Penrose inversion:\[\mathbf{w}(\lambda_{krr})= \left( \mathbf{A}^T \mathbf{A} + \lambda_{rr} \mathbf{I} \right)^{-1} \mathbf{A}^T \mathbf{y}\]The properties of the logarithmic search grid of \(\lambda_{krr}\) are defined by the following parameters: - 
real,min_lambda_krrandmax_lambda_krrthe min and max of the logarithmic grid. If one of them is negative then an automatic grid with 21 points is set-up between1.d-10and1.d+10.Defaultmin_lambda_krr =1.d-10andmax_lambda_krr =1.d+10.
- 
integer,n_values_lambda_krrthe number of points on grid.Defaultn_values_lambda_krr=21
 
- 
 Default is 0.
- type_of_loss (integer)
- 
This option defines the type of the loss function. It can have values 1, 2 or 3. The loss function has the following four terms: \[J(\mathbf{w}) = J_E(\mathbf{w}) + J_F(\mathbf{w}) + J_S(\mathbf{w}) + R(\mathbf{w}, \lambda) \, ,\]for energy, forces, stress losses and regularization, respectively. We have implemented three types of losses. Here are the details for each of them. The energy part of loss: \[\begin{split}\begin{aligned} J_E^1(\mathbf{w}) & = & \frac{1}{2} \sum_{m_E=1}^{M_E} \omega_{m_E}\left( E_{m_E} - \hat{E}_{m_E}(\mathbf{w}) \right)^2 \\ J_E^2(\mathbf{w}) & = & \frac{1}{2} \sum_{m_E=1}^{M_E} \frac{ \omega_{m_E}} {M_E} \left( E_{m_E} - \hat{E}_{m_E}(\mathbf{w}) \right)^2 \\ J_E^3(\mathbf{w}) & = & \frac{1}{2} \sum_{m_E=1}^{M_E} \frac{ \omega_{m_E}} {M_E} \left( \frac{E_{m_E} - \hat{E}_{m_E}(\mathbf{w})}{N_{m_E}} \right)^2\end{aligned}\end{split}\]where \(M_E\) are the number of energy configuration included in the train database whilst \(\omega_{m_E}\) (the one which is defined in the db_model.infile) and \(N_{m_E}\) are the weights and the number of atoms of the \(m_E^{\textrm{th}}\) configuration.In the case of forces: \[\begin{split}\begin{aligned} J_F^1(\mathbf{w}) & = & \frac{1}{2} \sum_{m_F=1}^{M_F} \omega_{m_F}\left( f_{m_E} - \hat{f}_{m_E}(\mathbf{w}) \right)^2 \\ J_F^2(\mathbf{w}) & = & \frac{1}{2} \sum_{m_f=1}^{M_F} \frac{ \omega_{m_F}} {M_F} \left( f_{m_F} - \hat{f}_{m_F}(\mathbf{w}) \right)^2 \\ J_F^3(\mathbf{w}) & = & \frac{1}{2} \frac{1}{M_{F,S}} \sum_{s=1}^{M_{F,S}} \sum_{a=1}^{N_s} \frac{1}{3 N_{s}} \left( f_{s,a} - \hat{f}_{s,a}(\mathbf{w}) \right)^2\end{aligned}\end{split}\]where \(\omega_{m_F}\) is the weight of the \(m_F^{\textrm{th}}\) point in the forces database of a total of \(M_F\) datapoints. \(M_{F,S}\) is the number of systems, which contain forces that should be fitted, \(s\) is some order nummber of the system and \(N_s\) is the the number of atoms in that \(s\) system. In the case of stress: \[\begin{split}\begin{aligned} J_S^1(\mathbf{w}) & = & \frac{1}{2} \sum_{m_S=1}^{M_S} \omega_{m_S}\left( \sigma_{m_S} - \hat{\sigma}_{m_S}(\mathbf{w}) \right)^2 \\ J_S^2(\mathbf{w}) & = & \frac{1}{2} \sum_{m_S=1}^{M_S} \frac{ \omega_{m_S}} {M_S} \left( \sigma_{m_S} - \hat{\sigma}_{m_S}(\mathbf{w}) \right)^2 \\ J_F^3(\mathbf{w}) & = & \frac{1}{2} \frac{1}{M_{S,S}} \sum_{s=1}^{M_{S,S}} \sum_{\alpha=1}^{6} \frac{1}{6} \left( \sigma_{s, \alpha} - \hat{\sigma}_{s, \alpha}(\mathbf{w}) \right)^2\end{aligned}\end{split}\]where the above notations have the same meaning as in the case of forces. \(M_S\) denotes the number of datapoints of stress observables (6 points for each system). \(M_{S,S}\) denotes the number of systems that have stress information. and \(\sigma_{s, \alpha}\) is one component of the stress in the particular system \(s\). Default is 1.
- train_only (loginal)
- 
Only the trainning is performed. No tests at all. This option is tested only in the case of ml_type=0. target vector \(d\).Default is .false..