IFPEN - IFP Energies nouvelles
Preprint, Working Paper. Year: 2023

Dual-sPLS: a family of Dual Sparse Partial Least Squares regressions for feature selection and prediction with tunable sparsity; evaluation on simulated and near-infrared (NIR) data

French title: Dual-sPLS: une famille de régressions par PLS (moindres carrés partiels) duale parcimonieuse pour la sélection d'attributs et la prédiction, avec parcimonie accordable ; évaluation sur des données simulées et en proche infrarouge (PIR)

Abstract

Relating a set of variables X to a response y is crucial in chemometrics. A quantitative prediction objective can be enriched by qualitative data interpretation, for instance by locating the most influential features. When high-dimensional problems arise, dimension reduction techniques can be used. Most notable are projections (e.g. Partial Least Squares, or PLS) and variable selection (e.g. lasso). Sparse partial least squares combines both strategies by blending variable selection into PLS. The variant presented in this paper, Dual-sPLS, generalizes the classical PLS1 algorithm. It provides a balance between accurate prediction and efficient interpretation. It is based on penalizations inspired by classical regression methods (lasso, group lasso, least squares, ridge) and uses the notion of dual norm. The resulting sparsity is enforced by an intuitive shrinking ratio parameter. Dual-sPLS compares favorably with similar regression methods on simulated and real chemical data. Code is provided as an open-source R package (https://CRAN.R-project.org/package=dual.spls).
Main file: 2022_PREPRINT_j-chemometr-intell-lab-syst_dual-spls-oa-arxiv-generated.pdf (9.2 MB)
Origin: Files produced by the author(s)
License: CC BY - Attribution

Dates and versions

hal-03957532, version 1 (26-01-2023)

Cite

Louna Alsouki, Laurent Duval, Clément Marteau, Rami El Haddad, François Wahl. Dual-sPLS: a family of Dual Sparse Partial Least Squares regressions for feature selection and prediction with tunable sparsity; evaluation on simulated and near-infrared (NIR) data. 2023. ⟨hal-03957532⟩