.. _`sec:models`: ML tasks ========= Here we describe the key words controlling the settings of models in ``input.ml`` file. .. option:: ml_type (integer) The type of of ML task. The following options are possible: +------------------------------------------------------------+ | **Regression models** | +---------------+--------------------------------------------+ | ``ml_type=0`` | Regression using basis functions | | | (e.g. LML, QNML, polynomial chaos, etc.) | +---------------+--------------------------------------------+ | ``ml_type=1`` | Kernel ridge regression | +---------------+--------------------------------------------+ | ``ml_type=2`` | *Reserved only for advanced users* | +---------------+--------------------------------------------+ | **Other tasks** | +---------------+--------------------------------------------+ | ``ml_type=-1``| Compute descriptors only without any fit | +---------------+--------------------------------------------+ | ``ml_type=-2``| Analyze the data and, if required, make | | | a choice of kernel to use in ``ml_type=1`` | +---------------+--------------------------------------------+ Default is ``ml_type=0``. .. _`sec: desc_only`: Descriptors only ^^^^^^^^^^^^^^^^ The following keywords are useful for the tasks with ``ml_type=-1``, where descriptors are computed and written into files without performing regression. Other types of ``ml_type`` do not exclude using these options. For the descriptor-specific settings, see the :ref:`Descriptors section`. .. option:: desc_forces (logical) This key word controlls wheather the descriptors of forces are computed or not. When setting ``desc_forces=.true.``, please verify that descriptors of forces are implemented for the :ref:`descriptor` that you choose. Default is ``desc_forces=.false.`` .. option:: write_desc (logical) This key word activates writing descriptors into files. When using ``ml_type=-1``, consider setting ``write_desc=.true.`` Default is ``write_desc=.false.`` **Example:** typical commands for the task of computing descriptors (e.g. for structural analysis) .. code-block:: fortran ml_type=-1 write_desc=.true. desc_forces=.false. More examples can be found in the :ref:`Examples section` .. _`sec: regression`: Regression models ================= Type of regression ^^^^^^^^^^^^^^^^^^ The following options are relevant for the ``ml_type=0`` regression models with basis functions (linear in parameters models) namely: linear (LML), quandratic (QNML, QML) or polynomial chaos. The kernel regressions require separate treatment threfore the options for kernel models are presented in separate section :ref:`Kernel section`. .. option:: mld_order (integer) Define the type of regression the descriptor space: - ``mld_order = 1`` this enable a linear fit (known also as LML) and will gives :math:`1+D` parameters - ``mld_order = 2`` this enable a quandratic fit and will gives :math:`1 + D + D^2` parameters. In this case there is a supplementary choice about the type of quandratic model by the option ``snap_type_quadratic``. About the quadratic fit see Goryaeva et al. 2021 for more details. - ``mld_order = 3`` this enable a polynomial chaos type of fitting. Two others parameters should be set: ``polyc_n_poly`` and ``polyc_n_hermite``. The number of paramters is very large given by this formulae: 1 + ``polyc_n_hermite`` :math:`\times D` + ``polyc_n_hermite`` :math:`\times D^2` + + ... + ``polyc_n_hermite`` :math:`\times D` :sup:`polyc\_n\_poly` Default is ``mld_order = 1``. .. option:: mld_type_quadratic (integer) The type of quadratic fit. For the case 1 the solution is preconditionned by the linear fit i.e. the first 1 + :math:`D` are exactelly set to the LML solution and only the remaining :math:`D^2` parameters are fitted quadratically. For the case 2 the full quadratic solution is provided, all the parameters are fitted without preconditionning. Shortly speaking ``mld_type_quadratic=1`` is for QNML and ``mld_type_quadratic=2`` is for QML. See Goryaeva et al. 2021 for more details. Default is ``mld_type_quadratic = 1``. .. option:: polyc_n_poly (integer) active for polynomial chaos regression i.e. ``mld_order=3``. Is the order of polynomial degree. Default is ``polyc_n_poly=3``. .. option:: polyc_n_hermite (integer) The maxiumum degree for Hermite polynomials for the polynomial chaos regression. ``Milady`` handle Hermite basis up to the 4 :sup:`th` order. Default is ``polyc_n_hermite=2``. Solving algorithm ^^^^^^^^^^^^^^^^^ .. option:: mld_fit_type (integer) The type of algorithm used in order to solve least square (LS) problem :math:`\mathbf{A} \beta= \mathbf{b}`. We recommend without any hesitation ``mld_fit_type = 4``. :math:`\mathbf{A}` is :math:`M \times D` matrix, :math:`M` being the number of observations and :math:`D` the number of parameters (in the case of linear ML the dimension of descriptor + 1), :math:`\beta` the parameter matrix :math:`D \times 1` and :math:`\mathbf{b}` the observations matrix :math:`M \times 1`. Actually in the ``MiLaDy`` implementation we build :math:`\mathbf{Amat}` matrix that has the dimensions :math:`d \times m` being in fact :math:`\mathbf{A}^T` (with the notation used for this documentation): #. ``mld_fit_type=0``: home made solver based on LU / QR decomposition for general matrix. There is no any particular assuption for the size or rank of :math:`\mathbf{A}` matrix. All the cases :math:`M > D`, :math:`M < D` and :math:`M = D` are treated. #. ``mld_fit_type=1``: solution based on QR decomposition for serial and ScaLapack version. Adapted for full rank matrix :math:`\mathbf{A}` and use the assumption that :math:`\textrm{rank}(A) = \min(M,D)`, in other words, :math:`A` has full rank. In serial version, if :math:`\mathbf{A}` is not full rank the inversion will stop with a error. Uses a QR or LQ factorization of :math:`\mathbf{A}`. #. ``mld_fit_type=2``: **restricted only for advanced users** Solution with constraints. The constraints are of form :math:`\mathbf{B}x=\mathbf{d}`. The matrix :math:`\mathbf{B}` and the vector :math:`\mathbf{d}` are filled with all the data (input and target for energy, force or stress) contained in the class fixed by ``snap_class_constraints``. #. ``mld_fit_type=3``: For serial version this is adapted for the general case when we may have :math:`\textrm{rank}(\mathbf{A}) < \min(M,D)`, in other words, :math:`\mathbf{A}` may be rank-deficient, we seek the minimum norm least squares solution :math:`\beta` which minimizes both :math:`\left| \beta \right|^2` and :math:`\left| b - A \beta \right|^2`. With this option a rank estimation is possible. The ScaLapack version uses Cholesky decomposition for symmetric and positive definite matrix consequnetly sometimes should be avoided. .. warning:: Avoid this solution for Scalapack version. Is very likely to obtain weird results in the most favorable cases but probably you will have segmetation fault and or ``NaN`` as parameters :)). #. ``mld_fit_type=4``: In the general case when we may have :math:`\textrm{rank}(\mathbf{A}) < \min(M,D)`, in other words, :math:`\mathbf{A}` may be rank-deficient, we seek the minimum norm least squares solution :math:`\beta` which minimizes both :math:`\left| \beta \right|^2` and :math:`\left| b - A \beta \right|^2`. Is the slowest but is by far mathematically most complete solution based on SVD decomposition. With this option a rank estimation (via SVD and driven by the option ``svd_rcond``). Default is ``mld_fit_type=4``. .. option:: svd_rcond (real) The value of the limit from which the singular eigenvalues of the design matrix :math:`(\mathbf{A})` (or any matrix) are zero. Any eigenvalue lower that this limit is treated as zero. Obviously the value of ``svd_rcond`` has an impact on the :math:`\textrm{rank}(\mathbf{A})`. If a negative value is choosen then ``Milady`` fix ``svd_rcond=100.d0*epsilon(1.d0)`` where epsilon is the Lapack machine precision function (around ``1.d-15`` in most of the cases). Default is ``svd_rcond=-1``. .. option:: snap_class_constraints (character) The class that imposes the constraints on fit. Is active only if ``mld_fit_type=2``. All the configuration mentioned in this class will fill the constraints matrix :math:`\mathbf{B}` and the target vector :math:`\mathbf{d}`. Default is ``"07"``. .. option:: write_design_matrix (logical) Dump the design matrix, which contains the descriptors (energy, force and stress if mentioned in ``db_model.in``) and the weigths. The design matrix is writen design in file ``design_matrix.dat``. This matrix has the dimension ``number_of_data`` :math:`\times` ``dim_desc``, where number_of_data are the number of data points (energy, force or stress) used to fit the potential and ``dim_desc`` is the dimension of descriptor. And additional file ``design_matrix.info`` is written with information about the database used for trainning. This option is active for normal trainning (``ml_type >= 0``). .. warning:: The design matrix is written as the transpose of the matrix used internally by ``milady`` **Format of the design_matrix.dat :** - The first 3 lines are comments - Then are ``number_of_data`` lines with ``dim_desc`` columns. **Format of the design_matrix.info :** - The first 5 lines are comments. - Then there are ``number_of_data`` lines with ``4`` columns. - The first column is a tag that indicates whether the corresponding line comes from energy (1), force (2), or stress (3). - The second column gives the target value (energy, force, or stress) used for training. - The third column indicates the weight used for that data point in the loss function. - The fourth column indicates the source file of that data point. Default ``write_design_matrix=.false.`` Regularization and loss ^^^^^^^^^^^^^^^^^^^^^^^ .. option:: mld_regularization_type (integer) #. ``mld_regularization_type``\ =0: no regularization #. ``mld_regularization_type``\ =1: applies the regularized solution of parameters, :math:`\mathbf{w}(\lambda_{krr})` found by the Moore-Penrose inversion: .. math:: \mathbf{w}(\lambda_{krr})= \left( \mathbf{A}^T \mathbf{A} + \lambda_{rr} \mathbf{I} \right)^{-1} \mathbf{A}^T \mathbf{y} The properties of the logarithmic search grid of :math:`\lambda_{krr}` are defined by the following parameters: - | ``real``, ``min_lambda_krr`` and ``max_lambda_krr`` the min and max of the logarithmic grid. If one of them is negative then an automatic grid with 21 points is set-up between ``1.d-10`` and ``1.d+10``. | Default ``min_lambda_krr =1.d-10`` and ``max_lambda_krr =1.d+10``. - | ``integer``, ``n_values_lambda_krr`` the number of points on grid. | Default ``n_values_lambda_krr=21`` Default is ``0``. .. option:: type_of_loss (integer) This option defines the type of the loss function. It can have values 1, 2 or 3. The loss function has the following four terms: .. math:: J(\mathbf{w}) = J_E(\mathbf{w}) + J_F(\mathbf{w}) + J_S(\mathbf{w}) + R(\mathbf{w}, \lambda) \, , for energy, forces, stress losses and regularization, respectively. We have implemented three types of losses. Here are the details for each of them. The energy part of loss: .. math:: \begin{aligned} J_E^1(\mathbf{w}) & = & \frac{1}{2} \sum_{m_E=1}^{M_E} \omega_{m_E}\left( E_{m_E} - \hat{E}_{m_E}(\mathbf{w}) \right)^2 \\ J_E^2(\mathbf{w}) & = & \frac{1}{2} \sum_{m_E=1}^{M_E} \frac{ \omega_{m_E}} {M_E} \left( E_{m_E} - \hat{E}_{m_E}(\mathbf{w}) \right)^2 \\ J_E^3(\mathbf{w}) & = & \frac{1}{2} \sum_{m_E=1}^{M_E} \frac{ \omega_{m_E}} {M_E} \left( \frac{E_{m_E} - \hat{E}_{m_E}(\mathbf{w})}{N_{m_E}} \right)^2\end{aligned} where :math:`M_E` are the number of energy configuration included in the train database whilst :math:`\omega_{m_E}` (the one which is defined in the ``db_model.in`` file) and :math:`N_{m_E}` are the weights and the number of atoms of the :math:`m_E^{\textrm{th}}` configuration. In the case of forces: .. math:: \begin{aligned} J_F^1(\mathbf{w}) & = & \frac{1}{2} \sum_{m_F=1}^{M_F} \omega_{m_F}\left( f_{m_E} - \hat{f}_{m_E}(\mathbf{w}) \right)^2 \\ J_F^2(\mathbf{w}) & = & \frac{1}{2} \sum_{m_f=1}^{M_F} \frac{ \omega_{m_F}} {M_F} \left( f_{m_F} - \hat{f}_{m_F}(\mathbf{w}) \right)^2 \\ J_F^3(\mathbf{w}) & = & \frac{1}{2} \frac{1}{M_{F,S}} \sum_{s=1}^{M_{F,S}} \sum_{a=1}^{N_s} \frac{1}{3 N_{s}} \left( f_{s,a} - \hat{f}_{s,a}(\mathbf{w}) \right)^2\end{aligned} where :math:`\omega_{m_F}` is the weight of the :math:`m_F^{\textrm{th}}` point in the forces database of a total of :math:`M_F` datapoints. :math:`M_{F,S}` is the number of systems, which contain forces that should be fitted, :math:`s` is some order nummber of the system and :math:`N_s` is the the number of atoms in that :math:`s` system. In the case of stress: .. math:: \begin{aligned} J_S^1(\mathbf{w}) & = & \frac{1}{2} \sum_{m_S=1}^{M_S} \omega_{m_S}\left( \sigma_{m_S} - \hat{\sigma}_{m_S}(\mathbf{w}) \right)^2 \\ J_S^2(\mathbf{w}) & = & \frac{1}{2} \sum_{m_S=1}^{M_S} \frac{ \omega_{m_S}} {M_S} \left( \sigma_{m_S} - \hat{\sigma}_{m_S}(\mathbf{w}) \right)^2 \\ J_F^3(\mathbf{w}) & = & \frac{1}{2} \frac{1}{M_{S,S}} \sum_{s=1}^{M_{S,S}} \sum_{\alpha=1}^{6} \frac{1}{6} \left( \sigma_{s, \alpha} - \hat{\sigma}_{s, \alpha}(\mathbf{w}) \right)^2\end{aligned} where the above notations have the same meaning as in the case of forces. :math:`M_S` denotes the number of datapoints of stress observables (6 points for each system). :math:`M_{S,S}` denotes the number of systems that have stress information. and :math:`\sigma_{s, \alpha}` is one component of the stress in the particular system :math:`s`. Default is ``1``. .. option:: train_only (loginal) Only the trainning is performed. No tests at all. This option is tested only in the case of ``ml_type=0``. target vector :math:`d`. Default is ``.false.``.