We consider the inverse problem of spectroscopy of multi-component aqueous solutions: determining the concentrations of various ions in a solution from its spectral data (Raman, IR or optical absorption spectra). While the shape of the spectra is sensitive to the ion concentrations, the dependence of spectral intensities on ion concentrations in multi-component solutions is complex and non-linear, so many spectral channels must be analyzed at once. Such analysis may be performed using machine learning methods, e.g. neural networks. However, this type of analysis is sensitive to the amount and representativeness of the data. This study examines the possibility of augmenting the training dataset with extra spectra generated by a variational autoencoder (VAE), thus improving the representativeness of the data. The aim is to reduce the error of solving the inverse problem. There are several possible approaches to such generation. Using a conditional VAE (cVAE) trained on experimental data requires a strategy for selecting appropriate sets of ion concentrations for which spectra are generated. Other VAE approaches require determining the target ion concentrations for the generated spectra, which can be done using an ML regression model trained on experimental data, either in the feature space of the spectra or in the latent space of the VAE. The generated spectra can then be used in various ways alongside the experimental ones when training the regression neural networks that solve the inverse problem. This study compares these methods and discusses their merits and drawbacks. The study was supported by grant No. 24-11-00266 from the Russian Science Foundation, https://rscf.ru/en/project/24-11-00266/.
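
As a rough illustration of one augmentation route mentioned in the abstract (labeling generated spectra with a regression model trained on experimental data in the feature space of the spectra), the following minimal sketch may help. Everything in it is an assumption made for illustration: the data are synthetic and linear (real spectra depend on concentrations non-linearly), the regressor is a simple ridge model rather than a neural network, and `generate_spectra_placeholder` is a hypothetical stand-in that interpolates between spectra instead of sampling from a trained VAE decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for experimental data (assumption, not real spectra):
# 200 spectra, 64 spectral channels, 3 ion concentrations per sample.
n_exp, n_channels, n_ions = 200, 64, 3
conc_exp = rng.uniform(0.0, 1.0, size=(n_exp, n_ions))
basis = rng.normal(size=(n_ions, n_channels))
spectra_exp = conc_exp @ basis + 0.01 * rng.normal(size=(n_exp, n_channels))


def fit_ridge(X, Y, alpha=1e-3):
    """Closed-form ridge regression with a bias column; returns the weights."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)


def predict_ridge(W, X):
    """Apply the fitted ridge weights to new spectra."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W


def generate_spectra_placeholder(spectra, n_new, rng):
    """Hypothetical stand-in for VAE sampling: random convex combinations
    of pairs of experimental spectra. A trained VAE decoder would go here."""
    idx = rng.integers(0, spectra.shape[0], size=(n_new, 2))
    w = rng.uniform(0.0, 1.0, size=(n_new, 1))
    return w * spectra[idx[:, 0]] + (1.0 - w) * spectra[idx[:, 1]]


# 1) Generate extra spectra (placeholder for the VAE).
spectra_gen = generate_spectra_placeholder(spectra_exp, 100, rng)

# 2) Train a regressor on experimental data in the spectral feature space
#    and use it to assign target concentrations to the generated spectra.
W = fit_ridge(spectra_exp, conc_exp)
conc_gen = predict_ridge(W, spectra_gen)

# 3) Combine experimental and generated samples into one training set
#    for the model that solves the inverse problem.
spectra_aug = np.vstack([spectra_exp, spectra_gen])
conc_aug = np.vstack([conc_exp, conc_gen])
```

Simple concatenation in step 3 is only one of the possible ways to use generated spectra alongside experimental ones; the labeling regressor could equally be trained on latent codes rather than on raw spectral channels.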